
GPU Kernel Engineer

Baseten
Baseten is seeking a GPU Kernel Engineer to accelerate AI workloads by optimizing GPU kernels for machine learning operations. The role centers on high-performance computing work that directly affects AI model performance, in a fast-paced environment.
Qualifications
- Strong experience in GPU programming, particularly with CUDA and PTX assembly.
- Deep understanding of machine learning operations and performance optimization techniques.
- Experience with performance analysis tools such as Nsight Systems and PyTorch Profiler.
- Familiarity with advanced GPU features like tensor cores and quantization methods.
- Ability to work collaboratively in a fast-paced, innovative environment.
Responsibilities
- Design and implement high-performance GPU kernels for key ML operations, including matrix multiplications and attention mechanisms.
- Write and optimize code using CUDA, PTX assembly, and architecture-specific techniques.
- Apply advanced performance optimization methods such as memory coalescing and tensor core acceleration.
- Implement cutting-edge features like quantization (FP8/FP4) and compute/communication overlap.
- Identify and resolve performance bottlenecks using tools like Nsight Systems and PyTorch Profiler.
- Collaborate with research teams to productionize theoretical advancements.




