
Member of Technical Staff, Synthetic Data

Cohere
Cohere is seeking a Machine Learning Engineer specializing in synthetic data to develop and manage synthetic data pipelines for advanced language models. The role involves optimizing data processes, conducting data analysis, and collaborating with cross-functional teams to enhance natural language processing capabilities. Cohere values diverse perspectives and offers a remote-friendly work environment.
Qualifications
- Strong software engineering skills, with proficiency in Python.
- Experience building data pipelines.
- Familiarity with data processing and analysis techniques.
- Understanding of machine learning concepts and natural language processing.
- Ability to work collaboratively in a fast-paced environment.
Responsibilities
- Design and build scalable inference pipelines that run on large GPU clusters.
- Conduct data ablations to assess data quality and experiment with data mixtures to enhance model performance.
- Research and implement innovative synthetic data curation methods.
- Collaborate with cross-functional teams, including researchers and engineers, to ensure data pipelines meet the demands of cutting-edge language models.
- Maintain and optimize the synthetic data pipeline.



