About Mistral AI
At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.
We are a company that democratizes AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise needs, whether on-premises or in cloud environments.
Our offerings include le Chat, the AI assistant for life and work. We are a team passionate about AI and its potential to transform society.
Role Summary
Our compute footprint is growing fast to support our science and engineering teams. We’re hiring a Datacenter HW Engineer to maintain, troubleshoot, and scale our GPU/CPU clusters safely and reliably.
What you will do
- Diagnose & operate core server/cluster components – Investigate and handle compute/storage hardware issues (CPU, memory, drives, NICs, GPUs, PSUs) and interconnect problems (switches, cables, transceivers; Ethernet/InfiniBand).
- Safety & procedures – Apply lockout/tagout (LOTO) and ESD discipline; follow pre/post-work checklists; maintain tidy, safe work areas.
- First-line diagnostics – Triage using LEDs, POST, beep codes and basic tests; capture evidence (photos, serials, results); open/update/close tickets with clear notes.
- Preventive maintenance – Provide feedback and ideas to improve proactive activities, monitoring, and targeted follow-ups on recurring or specific anomalies; help turn ad-hoc checks into SOPs, alerts, and dashboards.
- Parts & logistics – Receive and track parts, keep labeled inventory accurate, manage simple RMAs, and coordinate with vendors.
- Collaboration & escalation – Partner with senior hardware/firmware owners on complex or multi-node issues; communicate status and next steps crisply.
- Documentation & quality – Keep SOPs/checklists current; ensure zero undocumented changes and consistent, audit-ready records.
About you
- Hands-on mindset in datacenters/server hardware: you can install/re-seat/swap GPU/PCIe cards, NICs, PSUs, drives, and work cleanly in racks (rails, cabling, labeling).
- Disciplined and meticulous: you follow checklists and ESD/LOTO procedures, avoid rough handling, and take care with all high-value server components.
- Practical electrical basics: power-off, PPE, short-circuit risk awareness.
- Comfortable in racks: cooling, network, storage, PDU, cable management; can lift/mount safely (within HSE limits).
- Clear communicator: short factual updates; reliable teammate; punctual and process-minded.
- Hardware-passionate, professionally grounded: strong curiosity and craft mindset.
Nice to have
- Experience with HPC/AI/cloud at scale in production environments, including large-fleet server installation and maintenance in datacenters.
- Basic networking (Ethernet/InfiniBand) and basic Linux (boot/check; no coding needed).
- Coding/automation skills (Python/Bash): small tools/scripts to improve checklists, photo/serial capture, inventory sync, or simple monitoring/reporting.
- Experience with inventory/RMA tools and vendor coordination.
- Exposure to HPC/research/industrial environments.
What we offer
- Competitive salary and equity package
- Health insurance
- Transportation allowance
- Sport allowance
- Meal vouchers
- Private pension plan
- Generous parental leave policy