We're seeking a Technical Program Manager for Compute Infrastructure to join our engineer-first TPM team. As a Technical Program Manager, you will own the end-to-end delivery of large-scale GPU clusters, partnering with engineers to bring clusters online across external providers and partners. You'll run a broad, parallel portfolio spanning hardware, networking, power, and cooling, driving execution, risk management, and crisp alignment from working teams through leadership to deliver production-ready capacity at scale.
In this role, you will:
- Lead end-to-end delivery of both new compute SKUs and large-scale GPU clusters across an external partner ecosystem while supporting capacity planning for training and inference.
- Drive multi-threaded bring-up programs spanning hardware, networking, power, and cooling, owning plans, dependencies, and critical paths.
- Interface with chip providers to derisk long-term onboarding to new hardware platforms by working across kernels, comms, hardware, and scheduling engineering teams.
- Build and operationalize program mechanisms (roadmaps, milestones, risk registers, runbooks) that make delivery predictable at massive scale.
- Partner with engineering to improve cluster turn-up reliability, repeatability, and automation, reducing time-to-serve for new capacity.
- Support network operations and end-to-end physical and logical bring-up of OpenAI network Points-of-Presence (PoPs), including on-site deployment, rack cabling, and close collaboration with engineering teams.
- Coordinate cross-functional readiness (security, finance, operations, product/research stakeholders) to ship production-ready compute.
- Manage integration and handoffs across teams and partners, ensuring consistent execution, clear communication, and fast issue resolution.
- Identify bottlenecks and systemic gaps, then drive durable fixes across tooling, process, and partner interfaces.
- Provide crisp executive visibility on progress, tradeoffs, and risks across a large portfolio of concurrent programs.
You might thrive in this role if you:
- Possess a degree in a hard science, or have a demonstrated track record of engineering expertise.
- Have 5+ years of experience in program management for major projects including capital projects or hyperscaler infrastructure deployment.
- Demonstrate the ability to serve as the go-to person solely responsible for driving and delivering complex projects.
- Are comfortable managing cross-functional and cross-company teams, and have experience driving information and decision hygiene.
- Have an extensive track record of successfully delivering high-profile, technical projects against tight deadlines.
- Are technically adept and have effectively partnered with engineering or fundamental research teams of the highest caliber.
- Have experience interfacing with and leading external vendors including engineering firms, equipment suppliers, and/or construction firms.
- Have expertise in designing and implementing simple, scalable processes that solve complex problems.
- Have experience managing complicated dependencies such as logistics and/or supply chains.
- Are relentlessly resourceful and thrive in ambiguous, fast-paced environments.
- Are interested in and thoughtful about the impacts of AGI.