
Software Engineer, Inference – AMD GPU Enablement

OpenAI · San Francisco
Full-time · gpu, cuda, hip +5 more

OpenAI is seeking a Software Engineer for its Inference team to optimize and scale inference infrastructure on AMD GPU platforms. The role spans low-level kernel performance and high-level distributed execution, with a focus on improving inference performance and collaborating with partner teams to ensure AI models run smoothly on new hardware.

Qualifications

  • Experience writing or porting GPU kernels using HIP, CUDA, or Triton.
  • Familiarity with communication libraries like NCCL/RCCL.
  • Experience with distributed inference systems and scaling models across accelerators.
  • Problem-solving skills for end-to-end performance challenges across hardware and system libraries.
  • Excitement to work in a fast-moving team building new infrastructure from first principles.
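To make the first qualification concrete, here is a minimal sketch of the thread-indexing pattern a HIP or CUDA vector-add kernel uses, simulated on the host in plain Python. The names `block_dim`, `block_idx`, and `thread_idx` mirror the GPU builtins (`blockDim`, `blockIdx`, `threadIdx`) and are used here only for illustration; on a real device each (block, thread) pair executes in parallel rather than in a loop.

```python
# Host-side Python simulation of the indexing a HIP/CUDA vector-add
# kernel performs. On a GPU, each (block_idx, thread_idx) pair runs
# concurrently; here we iterate over them sequentially.

def vector_add(a, b, block_dim=256):
    n = len(a)
    out = [0.0] * n
    # Ceil-divide, as in a kernel launch: enough blocks to cover n elements.
    grid_dim = (n + block_dim - 1) // block_dim
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            i = block_idx * block_dim + thread_idx  # global element index
            if i < n:  # bounds guard, needed whenever n % block_dim != 0
                out[i] = a[i] + b[i]
    return out

print(vector_add([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))  # → [11.0, 22.0, 33.0]
```

Porting such a kernel between CUDA and HIP is largely mechanical (the indexing model is identical); the performance work the role describes lies in tuning launch geometry and memory access patterns for the target architecture.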

Responsibilities

  • Own bring-up, correctness, and performance of the OpenAI inference stack on AMD hardware.
  • Integrate internal model-serving infrastructure (e.g., vLLM, Triton) into GPU-backed systems.
  • Debug and optimize distributed inference workloads across memory, network, and compute layers.
  • Validate correctness, performance, and scalability of model execution on large GPU clusters.
  • Collaborate with teams to design and optimize high-performance GPU kernels using HIP, Triton, or other frameworks.
  • Build, integrate, and tune collective communication libraries (e.g., RCCL) for parallel model execution.
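As an illustration of the collective-communication work in the last bullet, here is a host-side sketch of the ring all-reduce algorithm that libraries such as RCCL and NCCL implement. Plain Python lists stand in for per-GPU buffers; this is a simulation of the algorithm's data movement, not the RCCL API.

```python
# Simulated ring all-reduce (sum) over len(buffers) ranks.
# Each buffer is split into one chunk per rank; chunks circulate around
# the ring twice: once accumulating partial sums (reduce-scatter), once
# broadcasting the finished chunks (all-gather).

def ring_allreduce(buffers):
    world = len(buffers)
    n = len(buffers[0])
    assert n % world == 0, "buffer must split evenly into one chunk per rank"
    chunk = n // world

    def span(c):
        c %= world
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1: reduce-scatter. At step t, rank r sends chunk (r - t) to
    # its ring neighbor. After world-1 steps, rank r holds the fully
    # summed chunk (r + 1) % world.
    for step in range(world - 1):
        for r in range(world):
            dst = (r + 1) % world
            for i in span(r - step):
                buffers[dst][i] += buffers[r][i]

    # Phase 2: all-gather. The finished chunks circulate once more,
    # overwriting stale data on each rank.
    for step in range(world - 1):
        for r in range(world):
            dst = (r + 1) % world
            for i in span(r + 1 - step):
                buffers[dst][i] = buffers[r][i]
    return buffers
```

Each rank sends and receives only one chunk per step, so bandwidth per link stays constant as the ring grows; tuning that schedule against real interconnect topology is the kind of work "build, integrate, and tune" implies.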
