This role is focused on reverse-engineering how trained models work, because we believe that a mechanistic understanding is the most robust way to make advanced systems safe.
What you'll do
The Interpretability team at Anthropic is working to reverse-engineer how trained models work. We're looking for researchers and engineers to join our efforts.
- Implement and analyze research experiments, both quickly in toy scenarios and at scale in large models
- Set up and optimize research workflows to run efficiently and reliably at large scale
- Build tools and abstractions to support a rapid pace of research experimentation
- Develop and improve tools and infrastructure to support other teams in using Interpretability's work to improve model safety
What you need
- Have 5-10+ years of experience building software
- Are highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and are productive in Python
- Have some experience contributing to empirical AI research projects
- Have a strong ability to prioritize and direct effort toward the most impactful work, and are comfortable operating with ambiguity and questioning assumptions