Opening
We are working on an arsenal of proprietary research, tools, and resources that serve all of our enterprise clients. As a Staff Agent Post-Training MLRE, you will build out our next-gen Agent RL training platform.
What you'll do
You'll integrate cutting-edge research into our training stack, enabling MLREs on the Enterprise AI team to deploy use cases ranging from next-generation AI cybersecurity firewall LLMs to foundation healthtech search models.
- Train state-of-the-art models, both internally developed and from the community, for deployment to our enterprise customers.
- Research cutting-edge algorithms to integrate directly into our training stack.
- Design solutions that enable complex multi-agent systems to learn directly from both process-based and outcome-based rewards.
What you need
- 5+ years of experience training LLMs in a production environment
- Experience with post-training methods such as RLHF and RLVR, and related algorithms such as PPO and GRPO
- Publications in top conferences such as NeurIPS, ICLR, or ICML within the last two years
- PhD or Master's degree in Computer Science or a related field