ML Platform Engineer
Apply at source. Synthesia handles the application directly; Houtini doesn't take a fee from candidates or companies. We curate which companies appear; the listings come from yubhub.
What the team is looking for.
Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100.
As AI continues to shape the way we live and work, Synthesia develops products to enhance visual communication and enterprise skill development, helping people work better and stay at the center of successful organizations.
We're looking for an Engineer to join the ML Platform team at Synthesia.
Our team builds and operates the systems that allow researchers and product teams to train, serve, and deploy generative models reliably and efficiently. This includes research infrastructure, production serving systems, internal tooling, and the platform interfaces that connect them.
What you’ll do
- Design and improve the platform systems that support model training, evaluation, and production serving.
- Build infrastructure and tooling that make ML workloads more reliable, scalable, and cost-efficient.
- Develop internal tools and workflows that are easy for both humans and agents to operate.
- Work on the architecture behind how models are deployed, served, and operated across research and product environments.
- Improve how we schedule, monitor, and debug workloads running on GPUs and cloud infrastructure.
- Develop internal tools, abstractions, and agentic systems that reduce operational overhead for researchers and engineers.
- Drive improvements across observability, automation, reliability, and developer experience.
- Collaborate closely with researchers and product engineers to understand pain points and turn them into robust platform capabilities.
- Contribute to technical direction and make pragmatic architectural tradeoffs as the platform grows.
You’ll thrive in this role if you have
- Strong experience building or operating production systems with a focus on reliability, scalability, and maintainability.
- A systems mindset: you naturally think in terms of bottlenecks, failure modes, interfaces, resource usage, and long-term operability.
- Solid hands-on experience with cloud infrastructure, Linux, and infrastructure automation.
- Experience with Kubernetes and operating distributed workloads in production.
- Strong coding skills, ideally in Python or similar languages used for backend systems and tooling.
- Strong judgment around where automation adds leverage, and where human control and reliability matter most.
- Experience building internal platforms, developer tooling, or infrastructure abstractions used by other engineers.
- Comfort working in ambiguous environments and taking ownership of open-ended technical problems.
- A pragmatic approach: you care about solving the right problem well, not over-engineering.
Skills: Python, cloud infrastructure, Linux, infrastructure automation, Kubernetes, distributed systems, observability, debugging, automation, agentic systems, LLM-powered internal tools, workflow orchestration systems, performance optimization, scheduling, resource allocation