Member of Technical Staff, Model Efficiency

Cohere • New York

Full-time • Remote • python, c++, machine-learning, gpu, cuda, ai

Cohere is seeking a Member of Technical Staff for its Model Efficiency team, which focuses on improving the performance of machine learning systems, particularly large language models (LLMs). The role involves optimizing model execution, improving performance metrics, and collaborating with other teams to implement new solutions. The company values diversity and offers a remote-friendly work environment.

Qualifications

5+ years of experience writing high-performance, production-quality code
Strong programming skills in C++ or Python (Rust/Go also welcome)
Experience with large language models and the LLM inference ecosystem
Ability to diagnose and resolve performance bottlenecks
Strong bias for action and ability to iterate quickly

Responsibilities

Improve core performance metrics of ML models
Identify bottlenecks in model execution
Develop optimizations for lower latency and higher throughput
Collaborate with modeling and systems teams
Experiment, measure, and ship improvements in inference efficiency

Similar Jobs

Systems Engineer, Open Architecture, Active Clearance

Data Scientist

Software Engineer I / II

Tech Lead, LLM & Generative AI (Full Remote - Poland)

Software Engineer, ML Tools
