Full-Time

Engineering Manager – Inference at Perplexity

Company Perplexity
Location San Francisco
Sector Technology
Posted Posted on March 4, 2026

Job Description

We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity's products and APIs, serving millions of users with state-of-the-art AI capabilities.

What you'll do

You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes.

  • Lead and grow a high-performing team of AI inference engineers
  • Develop APIs for AI inference used by both internal and external customers
  • Architect and scale our inference infrastructure for reliability and efficiency

What you need

  • 5+ years of engineering experience with 2+ years in a technical leadership or management role
  • Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)
  • Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers

XML job scraping automation by YubHub

Similar Jobs

Full-Time

Model Behavior Tutor – Social Cognition & EQ

xAI
Remote
More Info
Full-Time

Model Behavior Tutor – Epistemic Rigor & Truthfulness

xAI
Remote
More Info
Full-Time

Member of Technical Staff – Grok Chat Model

xAI
Palo Alto, CA
More Info
Full-Time

Member of Technical Staff – X Platform Security

xAI
Palo Alto, CA
More Info
Full-Time

IT Systems Engineer

xAI
Palo Alto, CA
More Info
Full-Time

Member of Technical Staff – Internal Tools

xAI
Palo Alto, CA
More Info

Receive the latest articles in your inbox

Join the Houtini Newsletter

Practical AI tools, local LLM updates, and MCP workflows straight to your inbox.