Member of Technical Staff, Model Evaluation at xAI

Company xAI

Location Palo Alto

Salary Competitive salary

Posted 0 days ago

Job Description

As a Member of Technical Staff, Model Evaluation, you will provide complete assessments of models, dive deep into model training and data to identify weaknesses, and work with the modeling and data teams to develop plans to improve model quality.

What you'll do

Provide complete assessments of models.
Dive deep into model training and data to identify the weaknesses revealed in evaluation.
Communicate with the modeling and data teams to develop plans to improve model quality.

What you need

Experience with model assessment and evaluation task development (including public and in-house benchmarking).
Experience collecting and synthesizing data for new evals.
Ability to build infrastructure and frameworks for easy-to-use model evaluation, and familiarity with inference frameworks such as SGLang and vLLM.

Why this matters

This role is critical to the success of xAI's mission to create accurate and reliable AI systems. By evaluating and improving our models, you will be contributing directly to the advancement of AI technology and its potential to aid humanity.

Similar Jobs

Full-Time

Customer Success Associate (Comet Browser)

Perplexity

New York City, Belgrade, London

Full-Time

Data Scientist, Evals

Perplexity

London

Full-Time

Tech Lead Manager – Agents

Perplexity

San Francisco

Full-Time

Forward-Deployed Engineer – API Platform

Perplexity AI

New York City, London, San Francisco, Seattle

Full-Time

Business Development Representative

Perplexity

San Francisco, New York City

Full-Time

Engineering Site Lead

Perplexity

London
