
Member of Technical Staff - ML Research Engineer, Foundation Model Data

Liquid AI
Liquid AI, a company spun out of MIT, is seeking a Member of Technical Staff - ML Research Engineer to focus on foundation model data. The role involves developing high-quality text data for AI model training and requires expertise in dataset engineering and machine learning. The position offers the opportunity to work with cutting-edge technology in a collaborative environment.
Qualifications
- B.S. with 5 years of experience, M.S. with 3 years of experience, or Ph.D. with 1 year of experience
- Expertise in data curation, cleaning, augmentation, and synthetic data generation techniques
- Ability to write and debug models in popular ML frameworks and experience with LLMs
- Strong programming skills in Python, focusing on clean, maintainable, and scalable code
- M.S. or Ph.D. in Computer Science, Electrical Engineering, Math, or a related field
Responsibilities
- Create and maintain data cleaning, filtering, and selection pipelines for handling over 100TB of data
- Monitor public dataset releases on platforms like Hugging Face
- Develop web crawlers to gather datasets when public data is insufficient
- Write and maintain synthetic data generation pipelines
- Conduct ablation studies to evaluate new datasets and judging pipelines




