We're seeking a Research Scientist to join our team in developing novel algorithmic architectures, with the end goal of solving and building Artificial General Intelligence.
In this role, you will make key contributions to the latest research developed in the Gemini audio pillar, including:
Unlocking new audio capabilities within the model, both in pre-training and post-training.
Improving the quality of models for understanding and generation, including research on improving our tokenizers, developing better techniques for generation quality, and exploring joint audio and visual representations.
Developing better evaluation methods (human raters, auto raters, automated metrics) to measure quality on open-ended tasks.
To succeed in this role, you should have a PhD in Computer Science, Computer Vision, Speech Processing, Machine Learning, or a related field, experience working with LLMs, and experience in audio or video understanding and/or generation. A proven track record of research and publications in areas such as audio generation, video generation, and LLMs, as well as a real passion for AI, are also advantages.