What the team is looking for.

Perplexity is hiring builders to join our Multimodal AI group, an industry-leading team defining the next generation of human-AI interaction. Our team is creating experiences that move beyond the touch interface, allowing people to communicate with AI through the form factors that best meet their needs. These include voice, images, video, and new modalities we have yet to invent.

As an engineer on the Multimodal AI team, you will work across the stack to build the product experiences and platform systems that make this possible across our applications. Our stack spans immersive UIs, real-time audio processing, evaluation systems, backend infrastructure, and supporting libraries and SDKs. We’ll work with you to match your strengths and interests to the areas where you can drive the most impact, while collaborating as a team to turn ambiguous, bleeding-edge problems into reliable experiences for users.

Responsibilities

Design, build, and own product and multimodal platform systems for Perplexity.
Lead features, projects, and products end to end, from problem definition to technical design, implementation, and launch.
Hill-climb on hard problems, iterating continuously to improve our products for ourselves and our customers.
Partner closely with engineers, product managers, designers, data scientists, and go-to-market teams.
Build systems that take into account the nuances of multimodal AI.
Work closely with client verticals to integrate new features into their stack.

What we're looking for

Experience building and operating production systems at a meaningful scale.
Ability to work up and down the stack, from deep systems primitives to getting the pixels and prompts just right.
Strong product judgment and the ability to translate user problems into simple, effective technical solutions.
Genuine interest in, and adoption of, multimodal AI products, and a willingness to learn quickly.
Ability to think through novel problems and implement accompanying long-term solutions that scale.

Nice to have

Background including work with real-time audio or video processing.
Experience with audio stack technologies such as audio processing modules (APMs), echo cancellation, noise reduction/cancellation, and automatic gain control (AGC).
Experience with immersive UIs integrating with real-time data.
Some experience or familiarity with Rust or C++.

Skills mentioned

Multimodal AI
Real-time audio processing
Immersive UIs
Backend infrastructure
Supporting libraries and SDKs
Rust
C++
Audio stack technologies
Echo cancellation
Noise reduction/cancellation
Automatic gain control (AGC)

Member of Technical Staff (AI Software Engineer, Multimodal)
