We are seeking a Visiting Researcher, Release to Production, to work closely with hardware/software co-design teams, hardware designers, networking teams, system manufacturers, component vendors, capacity engineering, production engineering, production services, and data center operations teams to enable new systems that will be deployed in our production data centers.
This role will drive and execute the end-to-end system validation strategy (hardware and software), with a focus on AI/HPC hardware systems in data center applications. You will lead the bring-up, validation, and deployment of cutting-edge hardware systems at large scale, with active hands-on participation.
Responsibilities:
- Drive and execute the end-to-end system validation strategy (hardware and software), with a focus on AI/HPC hardware systems in data center applications
- Lead the bring-up, validation, and deployment of cutting-edge hardware systems at large scale, with active hands-on participation
- Explore new use cases with customer teams and identify related test methodologies/test cases accordingly
- Work with cross-functional teams to investigate and troubleshoot complex failures potentially related to hardware systems, which may span different layers of the stack such as silicon, firmware, and software
- Triage failures and continue root-cause analysis while driving project development work forward
- Identify gaps and opportunities to improve test processes and methodologies across the New Product Introduction (NPI) space
- Guide automation efforts and data analysis for NPI projects through engagement with related cross-functional teams
- Communicate project progress and assessments to related internal and external teams
Minimum Qualifications:
- Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
- 8+ years of hands-on SW, FW, or HW engineering experience building any of the following products: AI silicon, GPUs, TPUs, autonomous cars, AI servers
- Experience in one or more domains such as ASIC development (silicon design, bring-up, characterization, validation), board-level debug, firmware validation, or system validation
- Experience leading silicon or system troubleshooting and debugging
- Experience in developing test specifications, procedures, and debug guides for test solutions
Preferred Qualifications:
- Experience in driving state-of-the-art data and ML/AI solutions for hardware reliability and scalability