Full-Time

Software Engineer, Agent Infrastructure at OpenAI

Company OpenAI
Location San Francisco; New York City
Salary Competitive salary
Posted 1 day ago

Job Description

Software Engineer, Agent Infrastructure

Location

San Francisco; New York City

Employment Type

Full time

Department

Scaling

Compensation

  • $230K – $385K • Offers Equity

The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.

Benefits

  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts

  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)

  • 401(k) retirement plan with employer match

  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)

  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees

  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)

  • Mental health and wellness support

  • Employer-paid basic life and disability coverage

  • Annual learning and development stipend to fuel your professional growth

  • Daily meals in our offices, and meal delivery credits as eligible

  • Relocation support for eligible employees

  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.

About the Team

The Agent Infrastructure team at OpenAI is responsible for building systems that enable training and deployment of highly useful AI agents, both internally and for the world.

We work hand-in-hand with researchers to design and scale the environment in which agentic models are trained – providing a workspace for AI models to execute code, debug issues, and develop software just as human software engineers do. Our training environment for agentic models operates at extremely high scale and has the flexibility to emulate any environment in which an agent might work.

At the same time, our team builds and maintains OpenAI’s core platform for the deployment and execution of agents in production. Our systems power products such as Codex, Operator, tool use in ChatGPT, and future agentic products.

About the Role

As a Software Engineer on the Agent Infrastructure team, you will have the opportunity to work closely with both research and product at OpenAI – building and scaling systems to train highly capable agentic models, and building the platform and integrations to launch new agents to hundreds of millions of users worldwide.

Your work will consist of both building new capabilities – standing up the infrastructure and integrations needed to train more complex agentic models – and rapidly scaling these new capabilities to some of the largest compute clusters in the world. At the same time, you’ll be instrumental to the launch of agentic products at OpenAI – building, maintaining, and scaling the production platform on which all agents run.

Responsibilities

  • Push massive compute clusters to their limits. You will be a core contributor to a novel container orchestration platform built in-house by our team to scale far beyond what’s possible with systems like Kubernetes.

  • Develop and maintain FastAPI and gRPC APIs that serve as the interface for our agentic infrastructure used both in training and production.

  • Use Terraform to stand up and evolve complex infrastructure for both research and production.

  • Collaborate with research teams to stand up and optimize systems for novel AI training runs and experimental applications.
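The API surface behind these responsibilities is internal, but the general shape of a code-execution endpoint for agents can be sketched. The sketch below is hypothetical — the names `ExecRequest`, `ExecResult`, and `run_in_sandbox` are illustrative, not OpenAI's actual interface — and uses plain dataclasses and `subprocess` so it stays self-contained; a real FastAPI or gRPC service would wrap a handler like this in a route.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class ExecRequest:
    """A request for an agent to run a command in an isolated workspace."""
    command: list[str]       # argv to execute
    timeout_s: float = 30.0  # hard wall-clock limit


@dataclass
class ExecResult:
    """What the agent (and the training loop) sees back."""
    exit_code: int
    stdout: str
    stderr: str


def run_in_sandbox(req: ExecRequest) -> ExecResult:
    # In production this would dispatch to an isolated microVM or
    # container runtime; running the command locally with a timeout
    # is only a stand-in for illustration.
    proc = subprocess.run(
        req.command,
        capture_output=True,
        text=True,
        timeout=req.timeout_s,
    )
    return ExecResult(proc.returncode, proc.stdout, proc.stderr)
```

A FastAPI route would then be a thin wrapper (e.g. a `POST /exec` handler returning `run_in_sandbox(req)` serialized as JSON), with a gRPC variant exposing the same contract as a protobuf service.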

Requirements

In this role, you:

  • Have deep experience working on large-scale machine learning infrastructure. You know how to reason about training at scale, identifying bottlenecks and engineering solutions to optimize system performance in training environments.

  • Know how to build new things from 0-1 quickly, and then scale them 1,000,000x.

  • Have a keen eye for performance and optimization. You know how to squeeze the most performance out of complex, globally-distributed systems.

  • Know your way around cloud platforms and work with infrastructure-as-code tech like Terraform.

  • Are driven by solving complex, ambiguous problems at the intersection of infrastructure scalability, virtualization efficiency, and agentic capabilities.

  • Have deep technical expertise in virtualization and containerization technologies (e.g. Kata, Firecracker, gVisor, Sysbox) and are passionate about optimizing runtime performance.

What We Offer

  • Competitive salary and equity package

  • Opportunity to work on cutting-edge AI infrastructure

  • Collaborative and dynamic team environment

  • Flexible work arrangements

  • Professional development opportunities

  • Access to the latest technology and tools

How to Apply

If you are a motivated and experienced software engineer looking to join a dynamic team and work on cutting-edge AI infrastructure, please submit your application. We look forward to hearing from you!
