We're hiring a Machine Learning Engineer to bridge the gap between frontier research and real-world impact. As a key member of our GPS Engineering team, you will lead research into agent design, Deep Research, and AI safety and reliability, developing novel methodologies that not only power public sector applications but also set new standards across the entire Scale organisation.

Your mission is threefold:

Frontier Research & Publication: Leading research into LLM/agent capabilities, reasoning, and safety, with the goal of publishing at top-tier venues (NeurIPS, ICML, ICLR).
Cross-Org Impact: Developing generalised techniques for agent design, AI safety, and Deep Research agents that scale across our commercial and government platforms.
Mission-Critical Applications: Engineering high-stakes AI systems that impact millions of citizens globally.

You will:

Pioneer Novel Architectures: Design and train state-of-the-art models and agents, moving beyond “off-the-shelf” solutions to create custom architectures for complex public sector reasoning tasks.
Lead AI Safety Initiatives: Research and implement robust safety frameworks, including red teaming, alignment (RLHF/DPO), and bias mitigation strategies essential for sovereign AI.
Drive Deep Research Capabilities: Develop agents capable of long-horizon reasoning and autonomous information synthesis to solve complex problems for national security and public policy.
Publish and Contribute: Represent Scale in the broader research community by publishing high-impact papers and contributing to open-source breakthroughs.
Consult as a Subject Matter Expert: Act as a technical authority for public sector leaders, advising on the theoretical limits and safety requirements of emerging AI.
Build Evaluation Frontiers: Create new benchmarks and evaluation protocols that define what success looks like for high-stakes, non-commercial AI applications.

Ideally, you’d have:

Advanced Degree: PhD or Master’s in Computer Science, Mathematics, or a related field with a focus on Deep Learning.
Research Track Record: A portfolio of first-author publications at major conferences (NeurIPS, ICML, CVPR, EMNLP, etc.).
Engineering Rigour: Strong proficiency in Python and deep learning frameworks (PyTorch/JAX), with the ability to write production-ready code that scales.
Safety Expertise: Experience in alignment, robustness, or interpretability research.

Nice to haves:

Experience with large-scale distributed training on massive clusters.
Experience building reliable agentic systems.
Experience in Sovereign AI or working with highly regulated data environments.
A zero-to-one mindset: Comfortable navigating ambiguity and defining research directions from scratch.


Machine Learning Engineer, Global Public Sector at Scale
