At Google DeepMind, we're a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The role of the Research Engineer will be to develop state-of-the-art methods for multimodal generative AI models, with a primary focus on image generation and editing. This role is for the team behind “Nano Banana”.

As a Research Engineer at Google DeepMind, you will lead our efforts in developing novel tools, infrastructure, and algorithms towards the end goal of solving and building Artificial General Intelligence. You will work collaboratively within and across Research fields, drawing on expertise from a variety of disciplines including deep learning, computer vision, language modeling, and advanced generative architectures.

Key responsibilities include designing, implementing, and evaluating cutting-edge deep learning algorithms, data curation, and evaluation infrastructure for multimodal generative AI, with a particular emphasis on image synthesis. You will report and present research findings and developments clearly and efficiently, both internally and externally, verbally and in writing. You will also propose and take part in team collaborations to meet ambitious research goals, while driving significant individual contributions.

To succeed as a Research Engineer at Google DeepMind, you will need the following skills and experience:

PhD in Computer Science, Artificial Intelligence, Machine Learning, Computer Vision, or equivalent practical experience.
Proven experience in deep learning research and development, particularly in generative AI for image synthesis, including diffusion models and autoregressive generative models. Experience with post-training is a plus.
Exceptional engineering skills in Python and deep learning frameworks (e.g., JAX, TensorFlow, PyTorch), with a track record of building high-quality research prototypes and systems.
Strong publication record at top-tier machine learning, computer vision, and graphics conferences (e.g., NeurIPS, ICLR, ICML, SIGGRAPH, CVPR, ICCV).

In addition, the following would be an advantage:

Demonstrated experience in multimodal generative modeling, especially combining large language models with visual generation (e.g., text-to-image/video systems, joint autoregressive and diffusion models).
A keen eye for visual aesthetics and detail, coupled with a passion for creating high-quality, visually compelling generative content.
A real passion for AI!


Research Engineer, Multimodal Generative AI (Image/Video) at Google DeepMind
