Engineering Manager – Inference at Perplexity

Company Perplexity

Location San Francisco

Salary Competitive salary

Posted Posted 0 days ago

Job Description

We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity's products and APIs, serving millions of users with state-of-the-art AI capabilities.

What you'll do

You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes.

Lead and grow a high-performing team of AI inference engineers
Develop APIs for AI inference used by both internal and external customers
Architect and scale our inference infrastructure for reliability and efficiency

What you need

5+ years of engineering experience with 2+ years in a technical leadership or management role
Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)
Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers

Similar Jobs

Full-Time

Customer Success Associate (Comet Browser)

Perplexity

New York City, Belgrade, London

More Info

Full-Time

Data Scientist, Evals

Perplexity

London

More Info

Full-Time

Tech Lead Manager – Agents

Perplexity

San Francisco

More Info

Full-Time

Forward-Deployed Engineer – API Platform

Perplexity AI

New York City, London, San Francisco, Seattle

More Info

Contract

Customer Success Engineer

Perplexity AI

San Francisco, New York City

More Info

Full-Time

Business Development Representative

Perplexity

San Francisco, New York City

More Info

Job Description

Similar Jobs

Customer Success Associate (Comet Browser)

Data Scientist, Evals

Tech Lead Manager – Agents

Forward-Deployed Engineer – API Platform

Customer Success Engineer

Business Development Representative

Receive the latest articles in your inbox

Join the Houtini Newsletter

Building the Agentic Stack.