Opening. This role is a unique combination of executing directly on ML research, data operations, and project management to improve our models. You'll own the end-to-end process of creating RL environments for new capabilities: identifying high-value tasks, designing reward signals, managing vendor relationships, and measuring impact on model performance.
What you'll do
- Improve and execute our fine-tuning strategies for adapting Claude to new domains and tasks
- Manage technical relationships with external data vendors, including evaluation of data quality and reward design
What you need
- Experience with fine-tuning large language models for specific domains or real-world use cases and/or domain expertise in an area where we would like to make our models more useful