We are seeking passionate Research Scientists and Engineers to join our growing Pre-training team in Zurich. The team helps develop the next generation of large language models, with a primary focus on multimodal capabilities: giving LLMs the ability to understand and interact with modalities other than text.
What you'll do
In this role, you will work at the intersection of cutting-edge research and practical engineering, contributing to the development of safe, steerable, and trustworthy AI systems.
- Conduct research and implement solutions in areas such as model architecture, algorithms, data processing, and optimizer development
- Independently lead small research projects while collaborating with team members on larger initiatives
- Design, run, and analyze scientific experiments to advance our understanding of large language models
- Optimize and scale our training infrastructure to improve efficiency and reliability
- Develop and improve developer tooling to enhance team productivity
- Contribute to the entire stack, from low-level optimizations to high-level model design
What you need
- Degree (BA required, MS or PhD preferred) in Computer Science, Machine Learning, or a related field
- Strong software engineering skills with a proven track record of building complex systems
- Expertise in Python and deep learning frameworks
- Experience working on high-performance, large-scale ML systems, particularly in the context of language modeling
- Familiarity with ML accelerators, Kubernetes, and large-scale data processing
- Strong problem-solving skills and a results-oriented mindset
- Excellent communication skills and the ability to work in a collaborative environment