We are looking for an AI/HPC Network Development Engineer to join our team. In this role, you will design and operate large-scale networks with a focus on the Ethernet AI/HPC space, working closely with our team to develop and implement new network architectures and technologies that support our growing AI workloads.
What you'll do
- Design and operate large-scale networks with a focus on the Ethernet AI/HPC space
- Develop and implement new network architectures and technologies to support our growing AI workloads
What you need
- A minimum of 10 years designing and operating large-scale networks, including 5 years in the Ethernet AI/HPC space
- Deep understanding of congestion control on Ethernet; InfiniBand experience is an added bonus
- Deep understanding of AI training and inference workloads and how they operate on the network
- Expertise in creating a portfolio of metrics for performance and operations to optimize the fleet for training and inference traffic
- Experience using Python to automate repetitive tasks and to work with and analyze large data sets in your daily job