As a Member of Technical Staff, Model Evaluation, you will be responsible for providing complete assessments of models, deep diving into model training and data to identify weaknesses, and working with the modeling and data teams to develop plans to improve model quality.
What you'll do
- Provide complete assessments of models.
- Deep dive into model training and data to identify the weaknesses revealed in evaluation.
- Communicate with the modeling and data teams to develop plans to improve model quality.
What you need
- Experience with model assessment and evaluation task development (including public and in-house benchmarking).
- Experience collecting and synthesizing data for new evals.
- Experience building infrastructure and frameworks for easy-to-use model evaluation, and familiarity with inference frameworks such as SGLang and vLLM.
Why this matters
This role is critical to xAI's mission to create accurate and reliable AI systems. By evaluating and improving our models, you will contribute directly to the advancement of AI technology and its potential to aid humanity.