As the Technical Lead for the Inference team, you will drive the architecture and optimization of our inference backbone, ensuring high performance, scalability, and efficiency in a dynamic environment.
The role involves architecting and optimizing inference for high-volume, low-latency, high-availability environments, leading the acquisition and automation of benchmarks, collaborating with cross-functional teams, and developing solutions that improve our AI-powered applications.
Key responsibilities include:
- Architecting and optimizing inference for high-volume, low-latency, and high-availability environments
- Leading the acquisition and automation of benchmarks at both micro and macro scales
- Introducing new techniques and tools to improve performance, latency, throughput, and efficiency in our model inference stack
- Building tools to identify bottlenecks and sources of instability, and designing solutions to address them
- Collaborating with machine learning researchers, engineers, and product managers to bring cutting-edge technologies into production
- Optimizing code and infrastructure to maximize hardware utilization and efficiency
- Mentoring and guiding team members, fostering a culture of collaboration, innovation, and continuous learning
Requirements include:
- Extensive experience in C++ and Python, with a strong focus on backend development and performance optimization
- Deep understanding of modern ML architectures and experience with performance optimization for inference
- Proven track record with large-scale distributed systems, particularly performance-critical ones
- Familiarity with PyTorch, TensorRT, CUDA, NCCL
- Strong grasp of infrastructure, continuous integration, and continuous delivery principles
- Ability to lead and mentor team members, driving projects from concept to implementation
- Results-oriented mindset with a bias towards flexibility and impact
- Passion for staying ahead of emerging technologies and applying them to AI-driven solutions
- Humble attitude, eagerness to help colleagues, and a desire to see the team succeed
Our Culture
We're driven to build a strong company culture and are looking for individuals with solid alignment with the following:
- Reason with rigor
- Are you audacious enough?
- Make our customers succeed
- Ship early and accelerate
- Leave your ego aside