Opening
This role exists to improve how we specify and learn human preferences at scale.
What you'll do
As a Senior Research Scientist on our Reward Models team, you'll lead research efforts to improve how we specify and learn human preferences at scale. Your work will directly shape how our models understand and optimize for what humans actually want — enabling Claude to be more useful, more reliable, and better aligned with human values.
What you need
- Experience training and evaluating reward models for large language models