As the Technical Lead for the Inference team, you will drive the architecture and optimization of our inference backbone, ensuring high performance, scalability, and efficiency in a dynamic environment.
What you'll do
- Architect and optimize the inference backbone for high-volume, low-latency, high-availability environments.
- Lead the collection and automation of benchmarks at both micro and macro scales.
What you need
- Extensive experience in C++ and Python, with a strong focus on backend development and performance optimization.