Staff Research Engineer, Model Efficiency

Cohere • New York

Full-time • Remote • Tags: machine-learning, ai, python, tensorflow, pytorch, data-science


Cohere is seeking a Staff Research Engineer for its Model Efficiency team, which focuses on making Large Language Models (LLMs) more efficient. The role involves developing and deploying techniques that improve model performance while maintaining quality, in a diverse, inclusive, and remote-friendly environment.

Qualifications

PhD in Machine Learning or a related field.
Understanding of LLM architecture and optimization techniques.
Significant experience with model efficiency enhancement techniques.
Strong software engineering skills.
Publications at top-tier conferences (ICLR, ACL, NeurIPS).
Ability to work in a fast-paced, high-ambiguity start-up environment.
Passion for mentoring others.

Responsibilities

Develop, prototype, and deploy techniques to improve model efficiency in production.
Optimize LLM inference given resource constraints.
Explore model architecture and MoE routing optimization.
Implement decoding and inference-time algorithm improvements.
Collaborate on software/hardware co-design for GPU acceleration.


Similar Jobs

Research Engineer, Machine Learning (Horizons)

Data Scientist

Systems Engineer, Open Architecture, Active Clearance

ML Infrastructure Engineer, Safeguards

Software Engineer I / II
