Software Engineer, Internal Infrastructure (Europe & UK)

Cohere, United Kingdom
Full-time · Kubernetes · Go · Python

Cohere is seeking a Software Engineer for its Internal Infrastructure team to build and operate Kubernetes GPU superclusters across multiple clouds and to help AI researchers optimize infrastructure for AI workloads. The role emphasizes collaboration and the stability, scalability, and observability of the systems used to develop AI models.

Qualifications

  • Deep experience running Kubernetes clusters at scale
  • Strong programming skills in Go or Python
  • Experience with Cloud Native infrastructure and Infrastructure as Code
  • Preference for contributing to Open Source solutions
  • Self-directed and adaptable, with strong problem-solving skills

Responsibilities

  • Build and operate Kubernetes compute superclusters across multiple clouds
  • Partner with cloud providers to optimize infrastructure costs, performance, and reliability for AI workloads
  • Work closely with research teams to understand their infrastructure needs and improve stability, performance, and efficiency of model training techniques
  • Design and build resilient, scalable systems with intuitive user interfaces for training AI models
  • Encourage software best practices and participate in team processes such as knowledge sharing, reviews, and on-call duties