Member of Technical Staff - Evaluations

via Ashby

About this role

OUR MISSION Reflection’s mission is to build open superintelligence and make it accessible to all. We’re developing open weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders come from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic and beyond. ABOUT THE ROLE - Conduct critical comparative analysis to advance our understanding of model capabilities - Build and refine evaluation systems and processes that create tight feedback loops between data, evals, and model behavior - Develop generalizable evaluation frameworks that capture what matters for reasoning, alignment, and usefulness.…

Read the full description on Reflectionai's site →

What we'd score you on

reqspace match rubric

Five dimensions, recruiter-grade. Upload your resume and we'll generate a written explanation of where you fit and where the gaps are.

1

Skills match

For this role: openai, anthropic, teams

2

Level fit

We check your title trajectory against the seniority signal of the role.

3

Domain experience

Your work in the role's domain matters more than your years total. We weight recent and direct experience.

4

Recency

A skill you used last quarter weighs more than one from five years ago. We grade on recency, not lifetime.

5

Location fit

This role is based in a specific location. We weight your proximity and willingness to relocate.

Score yourself on this role.
Free · no card · written explanation included
See if I'm a fit →

Skills in this role

Pulled from the job description. These are the keywords we'll weight when scoring your fit.

openaianthropicteams

More at Reflectionai

See all open jobs at Reflectionai