Opening. We are working on bringing capabilities and experiences from both 1P Google services (e.g. Maps, YouTube, Gmail, Calendar) and 3P services into Gemini, all in service of making Gemini the best LLM-supercharged universal assistant there is.
What you'll do
In this role, you will work on bringing conversational, multimodal agentic experiences powered by 1P and 3P tools into Gemini and Gemini Live. This team focuses on horizontal model quality and on the infrastructure for scaling to all APIs and MCPs out in the world.
- Improving the model’s ability to reason through complex requests and call various tools to fulfill them, including complex scenarios such as chaining multiple tools across multiple turns.
- Building new agentic experiences powered by our core tools with multimodal input in Gemini Live, using camera and screen sharing.
- Identifying gaps based on user feedback and improving the model’s function-calling capability.
- Building data collection and eval infrastructure to collect eval data scalably, along with autoraters, autousers, and automatic prompt optimization, to help scale hill-climbing and push the frontier of what Gemini can do with MCP integrations.
What you need
- Experience working on software engineering projects from proof-of-concept through to implementation.
- Proven knowledge of and experience with Python, C++, and GCL in production environments.
- Experience in applying experimental ideas to applied problems.
- Great communication and interpersonal skills.
- Knowledge of machine learning and statistics.
- Experience productionizing state-of-the-art large language and multimodal models a plus.
- Experience fine-tuning large models (e.g. SFT, RLHF, prompt optimization) a plus.