Full-Time

Research Engineer, Interpretability at Anthropic

Company Anthropic
Location San Francisco
Salary Competitive salary
Posted 0 days ago

Job Description

The goal of this role is to reverse-engineer how trained models work. We believe that a mechanistic understanding is the most robust way to make advanced systems safe.

What you'll do

The Interpretability team at Anthropic is working to reverse-engineer how trained models work. We're looking for researchers and engineers to join our efforts.

  • Implement and analyze research experiments, both quickly in toy scenarios and at scale in large models
  • Set up and optimize research workflows to run efficiently and reliably at large scale
  • Build tools and abstractions to support a rapid pace of research experimentation
  • Develop and improve tools and infrastructure to support other teams in using Interpretability's work to improve model safety

What you need

  • 5-10+ years of experience building software
  • High proficiency in at least one programming language (e.g., Python, Rust, Go, Java) and productivity in Python
  • Some experience contributing to empirical AI research projects
  • A strong ability to prioritize and direct effort toward the most impactful work, and comfort operating with ambiguity and questioning assumptions

