Opening. This role is a high-impact position for a technically curious operator who is equally comfortable debating a complex evaluation rubric with an engineer and communicating strategy to Fortune 500 customers. You must be obsessed with the gold standard for AI performance, from the high-level approach to the granular details of data quality.
What you'll do
As a Strategic Projects Lead (SPL) for Enterprise Evaluations, you will oversee the evaluations that determine whether an application is ready for the real world. You will define "what good looks like" for complex GenAI apps, curate the data needed to measure performance, and serve as one of the final gatekeepers for production readiness.
What you need
- Strong technical background. A degree in computer science and working knowledge of Python are ideal; at a minimum, the role requires the ability to do data analytics using SQL or Python. You should be comfortable using tools to automate tasks, generate synthetic data, and analyze evaluation results.