
Research, Evals

Exa
Exa is seeking an ML Evals Engineer to design and build the evaluation stack for its AI-driven search engine. The role centers on measuring search quality in the context of large language models (LLMs) and involves close collaboration with engineering teams to optimize search performance. The position is based in San Francisco, with sponsorship available for international candidates.
Qualifications
- Hands-on experience with machine learning, particularly in training, fine-tuning, or evaluating models.
- Strong engineering fundamentals with proficiency in Python and Rust.
- Experience building reliable systems and distributed pipelines.
- Familiarity with GPU/cluster jobs and high-performance computing.
- Ability to analyze data and design creative measurement strategies.
Responsibilities
- Design and build the evaluation stack for Exa's search engine.
- Investigate methods to evaluate search engines in the context of LLMs.
- Create comprehensive and effective evaluation suites for search optimization.
- Develop scalable and reliable evaluation pipelines to track regressions and quality signals.
- Collaborate with ML researchers and engineers to enhance search models.
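As a rough illustration of the kind of evaluation pipeline described above, a minimal regression check might average a retrieval metric such as recall@k over a fixed golden query set and compare the score across runs. This is a hypothetical sketch in Python; the function names, metric choice, and golden-set format are illustrative, not Exa's actual stack.

```python
from typing import Callable

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant documents found in the top-k retrieved results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)

def run_eval(search_fn: Callable[[str], list[str]],
             golden_set: list[tuple[str, set[str]]],
             k: int = 10) -> float:
    """Average recall@k over a golden query set.

    Storing this score per run lets a pipeline flag regressions when a
    new search model drops below a previous baseline.
    """
    scores = [recall_at_k(search_fn(query), relevant, k)
              for query, relevant in golden_set]
    return sum(scores) / len(scores)

# Toy usage with a stub search function (illustrative only):
golden = [("rust async runtime", {"doc1", "doc3"})]
stub_search = lambda q: ["doc1", "doc2", "doc3"]
print(run_eval(stub_search, golden, k=2))  # one of two relevant docs in top-2 -> 0.5
```

In practice a suite like this would track many metrics and query slices, but the core loop of scoring a fixed golden set and diffing against a baseline is the same.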




