Senior Systems Engineer, Artificial Intelligence Operations at NVIDIA

Company NVIDIA

Location Santa Clara, Boulder, Remote

How You'll Work Remote

Level Senior

Sector Technology

Posted 1 day ago

Job Description

You will be working as a Senior Systems Engineer in our Artificial Intelligence Operations team. We're building AI platforms for operating AI factories to make a lasting impact on resilient operations of AI clusters.

What you'll be doing:

You will gather and understand internal and external customer requirements for improving AI cluster resiliency, and design AIOps-based solutions that address those needs. You will develop automated workflows for issue detection and root cause analysis, and collaborate closely with operators to debug sophisticated, full-stack AI cluster problems. You'll also deliver compelling technical presentations, lead hands-on demos and training, handle evaluation deployments (POC/POV), and ensure smooth, reliable installations by staying engaged and encouraging throughout the customer journey.

Requirements:

Bachelor of Science or equivalent experience
8+ years of networking experience in enterprise or service provider environments, with strong hands-on expertise in routing and switching
Proficient in scripting and automation using Python or similar languages, with strong Linux expertise
Proven experience working directly with customers to resolve issues and ensure success in Systems Engineer or SRE roles
Exceptional oral, written, and presentation skills for clearly communicating complex technical topics
Demonstrated ability to collaborate effectively across teams, partnering with operations, engineering, and product development

Nice to have:

Experience with data center infrastructure and cloud architectures
Background in network performance monitoring or observability
Previous experience working at a technological start-up

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world's most desirable employers. You will also be eligible for equity and benefits. Applications for this job will be accepted at least until March 13, 2026.


Skills & Requirements

networking, routing, switching, Python, Linux, scripting, automation, data center infrastructure, cloud architectures, network performance monitoring, observability, start-up experience

