Site Reliability Engineer – US Government at xAI

Company xAI

Location Palo Alto, CA; Washington, D.C.

Sector Technology

Posted 1 day ago

Job Description

We are seeking a Site Reliability Engineer to join our team. In this position, you will play a critical role in ensuring the reliability and scalability of our AI systems. You will work closely with our development team to identify and resolve issues and to implement solutions that improve system performance.

Responsibilities:

Design, implement, and maintain scalable and reliable systems
Collaborate with development teams to identify and resolve issues
Develop and maintain monitoring and alerting systems
Analyse system performance and identify areas for improvement
Implement changes to improve system reliability and scalability

Requirements:

Strong understanding of Linux and Unix operating systems
Experience with containerisation and orchestration tools such as Docker and Kubernetes
Familiarity with cloud-based infrastructure and services
Strong problem-solving and analytical skills
Excellent communication and collaboration skills

Benefits: