We're looking for a Member of Technical Staff – Inference to join our team.
Our mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence.
As a Member of Technical Staff – Inference, you will be responsible for:
- Optimising the latency and throughput of model inference
- Building reliable, performant production serving systems that serve billions of users
- Accelerating research on scaling test-time compute and rollouts in reinforcement learning training
- Model-hardware co-design for next-generation architectures
To be successful in this role, you will need to have worked on:
- System optimisations for model serving, such as batching, caching, load balancing, and parallelism
- Low-level optimisations for inference, such as GPU kernels and code generation
- Algorithmic optimisations for inference, such as quantisation, distillation, speculative decoding, and low-precision numerics
- Large-scale inference engines or reinforcement learning frameworks
- Large-scale, highly concurrent production serving
- Testing, benchmarking, and reliability of inference services
The base salary range for this role is $180,000 – $440,000 USD. We also offer a comprehensive total rewards package, including equity; medical, vision, and dental coverage; access to a 401(k) retirement plan; short- and long-term disability insurance; life insurance; and various other discounts and perks.