This role focuses on making model inference fast and reliable at scale: optimizing serving systems that reach billions of users, accelerating inference-heavy research, and co-designing models with next-generation hardware.
What you'll do
- Optimizing the latency and throughput of model inference.
- Building reliable and performant production serving systems that serve billions of users.
- Accelerating research on scaling test-time compute and rollouts in reinforcement learning training.
- Co-designing models and hardware for next-generation architectures.
What you need
- Experience with system optimizations for model serving, such as batching, caching, load balancing, and parallelism.
- Experience with low-level optimizations for inference, such as GPU kernels and code generation.
- Experience with algorithmic optimizations for inference, such as quantization, distillation, speculative decoding, and low-precision numerics.