Senior Cloud Support Engineer

CoreWeave • Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA

Full Time

kubernetes docker python devops ci-cd ai machine-learning customer-support

CoreWeave is a specialized AI-focused cloud provider delivering Kubernetes-powered HPC infrastructure for GPU workloads. The Senior Cloud Support Engineer role centers on hands-on troubleshooting, incident response, customer success, and mentoring a 24/7/365 Engineering CX team to ensure high performance and reliability for AI training workloads.

Qualifications

5+ years of experience in cloud/support engineering or a related field.
Hands-on experience with Kubernetes-based production environments and container orchestration.
Experience supporting GPU-enabled AI workloads and high-performance computing clusters.
Strong troubleshooting, incident management, and root-cause analysis capabilities.
Excellent customer-facing communication skills with proven ability to coach and mentor others.
Experience mentoring or leading a team or junior engineers in a fast-paced environment.
Ability to work in a 24/7 on-call shift environment with flexible scheduling.
Proficiency in scripting/automation (e.g., Python or Bash) for diagnostics and tooling.

Responsibilities

Provide hands-on troubleshooting for GPU HPC cloud platforms powered by Kubernetes, resolving issues impacting AI training workloads and mission-critical applications.
Lead and mentor team members across CoreWeave disciplines, helping them develop technical skills and strengthen troubleshooting capabilities.
Deliver real-time feedback and coaching, review tickets, and identify opportunities for process and performance improvements.
Manage incident response and participate in on-call rotations to maintain service levels and minimize downtime.
Collaborate with data center, hardware, software engineering, and research teams to maintain platform integrity across data centers and client workloads.
Contribute to postmortems, root-cause analyses, and knowledge-base documentation to prevent recurrence of issues.
Optimize performance, reliability, and scalability of GPU workloads on Kubernetes-based HPC infrastructure; suggest and implement automation and tooling improvements.
