Full-Time

Research Engineer, Multimodal Reinforcement Learning at Google DeepMind

Company Google DeepMind
Sector Technology
Posted 1 day ago

Job Description

Are you a Research Engineer with a passion for Reinforcement Learning and Multimodality? Join Google DeepMind's Frontier AI Unit! We are seeking a researcher to help us make learning efficient through conversational environments.

While text-based reasoning has shown immense promise, we are moving the frontier toward image-grounded, multimodal, and retrieval-augmented conversational setups. You will bridge the gap between conversational learning and the visual domain, applying the latest RL methods to create scalable, semi-verifiable environments that power the next generation of our models (e.g., Gemini).

As a Research Engineer, you will play a pivotal role in expanding Meta Reinforcement Learning to multimodal setups. You will help us leapfrog current industry benchmarks by extending our focus from verifiable domains to semi-verifiable, multimodal domains (e.g., Lens, Image-grounded reasoning).

This is an ecosystem play: you will leverage our advantages in autoraters and autousers to scale the creation of these conversational environments. You will be the bridge between the core conversational work and the specifics of grounding in the visual domain, moving our training infra from static data towards dynamic, multi-turn environments.

Key responsibilities:

* Design and implement novel RL algorithms that enable multi-turn reasoning and learning in multimodal (text + vision) environments.
* Contribute to the "ecosystem" of autoraters and autousers, building the infrastructure needed to generate high-quality, semi-verifiable training environments at scale.
* Apply state-of-the-art methods to solve strategic problems, specifically closing the gap between single-turn and multi-turn embeddings (retrieval-augmented reasoning).
* Track, interpret, and analyze complex experiments, providing scientific rigor to our training pipelines.
* Act as a connector between teams (Google Research, Core, GDM GenAI), helping to build shared pipelines for conversational infrastructure that serve product needs in Search, Lens, and YouTube.

What We Can Offer You:

* The opportunity to publish and contribute to the scientific community, specifically in the high-impact intersection of RL, Multimodality, and Reasoning.
* Access to world-class compute and the existing infrastructure of autoraters/autousers, allowing you to focus on innovation rather than building from scratch.
* Direct impact: your work will directly influence the reasoning capabilities of Google’s flagship models (Gemini), moving the needle on how models learn and interact with the world.
* Collaborative culture: work alongside world-leading experts in RL and Generative AI in a supportive, growth-oriented environment.

About you:

We are looking for a Research Engineer who is not just technically proficient but deeply curious about the mechanics of learning. You should be up to date with the latest methods in RL and eager to apply them to messy, ambiguous, and high-impact strategic problems. You are comfortable bridging the gap between abstract research and concrete implementation.

Essential Skills:

* PhD or Equivalent Experience: A PhD in Computer Science, AI, or related field, or equivalent practical experience, with a specific focus on Reinforcement Learning (RL).
* Proven Research Track Record: A history of scientific contributions (e.g., publications at NeurIPS, ICML, ICLR, CVPR) or significant contributions to state-of-the-art AI models.
* Multimodal Experience: Concrete experience working with multimodal models (vision + language) and understanding the specific challenges of grounding text in visual data.
* Engineering Excellence: Strong coding skills (Python, JAX/TensorFlow/PyTorch) and experience designing and executing complex experiments.

Useful Skills:

* Retrieval & Embeddings: Experience with retrieval-augmented generation (RAG), embedding spaces, or search infrastructure.
* Multi-Agent Systems: Familiarity with self-verification, introspection, reflection, or multi-agent negotiation frameworks.
* Infrastructure: Experience building or scaling training environments, autoraters, or reward models.
