About the Role
The Sandbox service team at xAI builds and maintains a secure, scalable system that gives our models safe, controlled access to computational environments.
This infrastructure powers critical workloads across training and product, enabling models to run code, build software, interact with tools, and even control applications with user interfaces.
We provision containers and virtual machines on large-scale clusters, granting models interactive control over these remote environments.
Our work spans the full stack, from orchestrating massive jobs and scheduling resources at the cluster level to tuning filesystem performance on nodes.
Responsibilities
- Build and maintain a secure, scalable system that gives our models safe, controlled access to computational environments
- Provision containers and virtual machines on large-scale clusters, granting models interactive control over these remote environments
- Work across the full stack, from orchestrating massive jobs and scheduling resources at the cluster level to tuning filesystem performance on nodes
Requirements
- Expert knowledge of Rust, C++, or Go
- Familiarity with Python
- Deep experience with either Linux or Windows systems (familiarity with both is a strong plus)
- Experience with virtualisation and containerisation technologies (e.g., cgroups, KVM, gVisor, QEMU)
- Solid knowledge of the networking stack