Research Engineer, Human Understanding
Apply at source. Google DeepMind handles the application directly; Houtini doesn't take a fee from candidates or companies. We curate which companies appear; the listings come from yubhub.
What the team is looking for.
We are seeking a highly motivated Research Engineer with a strong background in multimodal modelling of humans, with a focus on speech and audio-visual data, to join the effort within Google DeepMind's Frontier AI unit.
This role is pivotal in developing foundational multimodal AI capabilities to understand, generate, and protect human likeness. As a key contributor, you will design and implement cutting-edge models and frameworks for human-centric understanding and generation.
This is a unique opportunity to contribute to impactful research and advance Google DeepMind's mission towards Artificial General Intelligence (AGI).
Key Responsibilities
- Advance multimodal human representations & understanding: Research and implement novel models and other multimodal techniques for a more holistic understanding of humans across visual, audio, and textual data.
- Conduct applied research: Run full experimental research cycles, from hypothesis to deployment.
- Drive technical projects: Take ownership of substantial technical projects within the effort, from ideation and design to implementation and evaluation, often involving cross-functional collaboration.
- Contribute to infrastructure: Inform and contribute to the development of scalable and efficient research infrastructure for multimodal human understanding models and datasets.
- Design and execute strategies for tuning and adapting vision language models (VLMs) and other foundation models for specific tasks.
Requirements
- PhD degree in Computer Science, Machine Learning, or a related technical field with 3+ years of relevant experience.
- Experience in developing machine learning models, such as audio & speech-visual models.
- Experience in working with and tuning large-scale vision language models.
- Strong programming skills in Python and experience with at least one major deep learning framework (e.g., JAX).
- Experience conducting independent research and development, including experimental design, implementation, and analysis.
Salary
The US base salary range for this full-time position is $174,000 to $252,000 USD, plus bonus, equity, and benefits.
- Python
- JAX
- Machine Learning
- Deep Learning
- Vision Language Models
- Audio & Speech-Visual Models
- Generative AI
- Reinforcement Learning
- Alignment Methods
- Multimodal Learning
- Privacy-Preserving Machine Learning