Full-Time

Member of Technical Staff, Hardware Health at Microsoft AI

Company Microsoft AI
Location Mountain View
Salary Competitive salary
Posted 0 days ago

Job Description

Summary

Microsoft AI is looking for a talented Member of Technical Staff, Hardware Health, to ensure its AI clusters deliver sustained reliability, performance, and availability across exascale-class deployments.

About the Role

We work closely with research, hardware, datacenter, and platform engineering teams to develop predictive health models, failure detection frameworks, and autonomous remediation systems that keep our AI clusters operating at frontier scale. Our team is responsible for Copilot, Bing, Edge, and generative AI research. Join us and help shape the future of personal computing.

Accountabilities

  • Design and develop next-generation hardware health monitoring and diagnostic frameworks for large GPU clusters (NVL16/NVL72/GB200+ scale).
  • Build predictive analytics pipelines leveraging telemetry, power, and thermal data to anticipate hardware degradation and systemic issues.

The Candidate We're Looking For

Experience:

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.

Technical skills:

  • Proficiency in hardware telemetry, diagnostics, or failure analysis tools.
  • Experience with exascale-class systems or cloud-scale AI clusters.

Personal attributes:

  • Strong analytical and problem-solving skills.
  • Excellent communication and collaboration skills.

Benefits

  • Competitive salary range: $139,900 – $274,800 per year.
  • Comprehensive benefits package, including health insurance, retirement plan, and paid time off.
  • Opportunities for professional growth and development.
  • Collaborative and dynamic work environment.
