Full-Time

Senior Systems Engineer, Artificial Intelligence Operations at NVIDIA

Company NVIDIA
Location Santa Clara, Boulder, Remote
How You'll Work Remote
Level Senior
Sector Technology
Posted 1 day ago

Job Description

You will work as a Senior Systems Engineer on our Artificial Intelligence Operations team. We're building AI platforms for operating AI factories, with the aim of making a lasting impact on the resilient operation of AI clusters.

What you'll be doing:

You will gather and understand internal and external customer requirements for improving AI cluster resiliency, and design AIOps-based solutions that address them. You will develop automated workflows for issue detection and root cause analysis, and collaborate closely with operators to debug sophisticated, full-stack AI cluster problems. You'll also deliver compelling technical presentations, lead hands-on demos and training, handle evaluation deployments (POC/POV), and ensure smooth, reliable installations by staying engaged and supportive throughout the customer journey.

Requirements:

  • Bachelor of Science or equivalent experience
  • 8+ years of networking experience in enterprise or service provider environments, with strong hands-on expertise in routing and switching
  • Proficient in scripting and automation using Python or similar languages, with strong Linux expertise
  • Proven experience working directly with customers to resolve issues and ensure success in Systems Engineer or SRE roles
  • Exceptional oral, written, and presentation skills for clearly communicating complex technical topics
  • Demonstrated ability to collaborate effectively across teams, partnering with operations, engineering, and product development

Nice to have:

  • Experience with data center infrastructure and cloud architectures
  • Background in network performance monitoring or observability
  • Previous experience working at a technology start-up

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world's most desirable employers. You will also be eligible for equity and benefits. Applications for this job will be accepted at least until March 13, 2026.

Skills & Requirements

networking, routing, switching, Python, Linux, scripting, automation, data center infrastructure, cloud architectures, network performance monitoring, observability, start-up experience

