Houtini.
Mistral AI

Applied AI Engineer, Site Reliability Engineer - EMEA

Paris | Applied AI engineering | Senior | Posted 3d ago

Apply at source. Mistral AI handles the application directly; Houtini doesn't take a fee from candidates or companies. We curate which companies appear; the listings come from yubhub.

Role description

What the team is looking for.

At Mistral AI, we're looking for an experienced Site Reliability Engineer to join our Applied AI team. You will build and operate the framework that keeps our solution delivery reliable and sustainable across all our accounts.

Your mission will be to design, build, and operate the infrastructure to support our AI solutions, ensuring they are scalable, secure, and aligned with customer needs. You will work closely with our development team to identify and resolve issues, and collaborate with our technical support team to provide excellent customer service.

In this role, you will operate in four concurrent modes:

  • BUILD: Design for a fleet of Mistral platforms and apps. Invest in proactive engineering to reduce reactive work. Productize reliability, author runbooks, create SLO templates, implement observability.
  • RUN: Operate the Tier-1 customer environments that Mistral are contracted to operate. Ensure SLO compliance, own on-call and incident response, manage drift, partner with Technical Support as L3 escalation, champion high-signal post-mortems.
  • ENABLE: Productize how Mistral deploy, secure, and scale our Applied AI solutions. Engineer on-demand provisioning, author security baseline packages, embed security guardrails, automate everything.
  • SECURE: Own the security operations layer for our customer-side deployments. Lead CVE response across the fleet, ship supply-chain integrity controls (SBOM, signed images, provenance), co-page with InfoSec on security incidents, enforce secure-config baselines.

This is a framework-first, fleet-management role at heart. If you're excited by the difference between solving one customer's problem and structurally solving that class of problem for every customer, this is the role.

Skills mentioned
  • multi-tenant Kubernetes
  • namespace segmentation
  • network policy
  • RBAC
  • admission control
  • operations at scale
  • observability stack
  • Prometheus
  • Grafana
  • OpenTelemetry
  • Loki
  • Tempo
  • Signoz
  • infrastructure as code
  • Terraform
  • Ansible
  • Python
  • Golang
  • security mindset
  • secure-SDLC
  • CVE response
  • supply-chain integrity