We are looking for a talented individual to join our team as a Member of Technical Staff, Applied Inference. In this role, you will be responsible for designing and implementing scalable distributed infrastructure for model serving, ensuring the reliability of inference services, and creating custom tools to trace, replay, and fix issues or crashes across the entire stack.
What you'll do
- Architect and implement scalable distributed infrastructure for model serving, including load balancing, auto-scaling, batch scheduling, and global KV cache systems.
- Ensure the reliability of inference services, targeting 100% uptime, a 0% error rate, and low tail latency, through proactive monitoring, fault-tolerant design, and rigorous testing.
What you need
- Experience with large-scale, high-concurrency production serving.
- Experience with GPU inference engines.
- Experience testing, benchmarking, and ensuring the reliability of inference services.
- Experience with designing and implementing CI/CD infrastructure.