Full-Time

Principal Firmware Engineer – Server Manageability and Observability at NVIDIA

Company NVIDIA
Location Santa Clara
How You'll Work Onsite
Level Senior
Sector Technology
Posted March 9, 2026

Job Description

We're looking for a strong technical architect to own the end-to-end architecture of our data center systems, such as DGX and HGX, at the system software level. This includes firmware, kernel drivers, operating systems, and user-mode drivers. You will work with component leads internally and engage with industry-leading cloud service providers to take these products to market.

What you'll be doing:

  • Serve as the primary technical point of contact for major customers, leading technological discussions, defining KPIs, gathering requirements, and addressing complex technical queries.
  • As a system software architect, lead technical innovation and strategic collaborations with major hyperscalers to architect next-generation data center products.
  • Align NVIDIA's roadmap with major customers' requirements through direct engagement.
  • Develop and drive adoption of new technologies and protocols.
  • Make critical technical decisions in ambiguous situations, mitigating risks through left-shift strategies.

What we need to see:

  • Deep expertise in scalable and performant server system architecture, focusing on SW/HW interfaces.
  • Extensive experience with complex system software for accelerators (GPUs, DPUs, FPGAs).
  • Mastery of system firmware (SBIOS, OpenBMC), embedded systems, and Linux kernel internals.
  • Proficiency in Out-of-Band and In-Band management architectures, device management protocols (e.g., MCTP, PLDM, SPDM, RDE) and system management protocols (Redfish, IPMI).
  • Extensive knowledge of networking technologies and protocols, including TCP/IP, Ethernet, and InfiniBand, as well as advanced switching and routing concepts.
  • Experience collaborating with platform security experts to define tradeoffs between security and ease of use.
  • Demonstrated success in leading complex, cross-functional projects to completion, showcasing the ability to influence and achieve results without direct authority in large-scale, collaborative environments. Demonstrable experience implementing left-shift strategies to de-risk program execution.

Ways to stand out from the crowd:

  • Knowledge of cloud- and cluster-level deployment and management systems. Participation in and contributions to standards bodies such as OCP and DMTF.
  • Familiarity with NVIDIA HPC programming models and libraries (CUDA, cuDNN, DOCA).
  • Knowledge of enterprise storage architectures and distributed parallel processing paradigms.

You will be eligible for equity and benefits.
