We're hiring a Machine Learning Engineer to bridge the gap between frontier research and real-world impact. As a key member of our GPS Engineering team, you will lead research into agent design, deep research, and AI safety and reliability, developing novel methodologies that not only power public sector applications but also set new standards across the entire Scale organisation.
Your mission is threefold:
- Frontier Research & Publication: Leading research into LLM/agent capabilities, reasoning, and safety, with the goal of publishing at top-tier venues (NeurIPS, ICML, ICLR).
- Cross-Org Impact: Developing generalised techniques in Agent design, AI Safety and Deep Research agents that scale across our commercial and government platforms.
- Mission-Critical Applications: Engineering high-stakes AI systems that impact millions of citizens globally.
You will:
- Pioneer Novel Architectures: Design and train state-of-the-art models and agents, moving beyond “off-the-shelf” solutions to create custom architectures for complex public sector reasoning tasks.
- Lead AI Safety Initiatives: Research and implement robust safety frameworks, including red teaming, alignment (RLHF/DPO), and bias mitigation strategies essential for sovereign AI.
- Drive Deep Research Capabilities: Develop agents capable of long-horizon reasoning and autonomous information synthesis to solve complex problems for national security and public policy.
- Publish and Contribute: Represent Scale in the broader research community by publishing high-impact papers and contributing to open-source breakthroughs.
- Consult as a Subject Matter Expert: Act as a technical authority for public sector leaders, advising on the theoretical limits and safety requirements of emerging AI.
- Build Evaluation Frontiers: Create new benchmarks and evaluation protocols that define what success looks like for high-stakes, non-commercial AI applications.
Ideally, you’d have:
- Advanced Degree: PhD or Master’s in Computer Science, Mathematics, or a related field with a focus on Deep Learning.
- Research Track Record: A portfolio of first-author publications at major conferences (NeurIPS, ICML, CVPR, EMNLP, etc.).
- Engineering Rigour: Strong proficiency in Python and deep learning frameworks (PyTorch/JAX), with the ability to write production-ready code that scales.
- Safety Expertise: Experience in alignment, robustness, or interpretability research.
Nice to haves:
- Experience with distributed training on large clusters.
- Experience building reliable agentic systems.
- Experience in Sovereign AI or working with highly regulated data environments.
- A zero-to-one mindset: Comfortable navigating ambiguity and defining research directions from scratch.