Staff Software Engineer - GenAI inference

Databricks, San Francisco, California
Full Time | ai, machine-learning, python

The Staff Software Engineer for GenAI inference will lead the architecture and development of the inference engine for the Databricks Foundation Model API, with a focus on optimizing performance for large-scale LLMs. The role involves collaborating with researchers and cross-functional teams, and representing the team externally through benchmarks, whitepapers, and open-source contributions.

Qualifications

  • BS/MS/PhD in Computer Science or a related field
  • Strong software engineering background (6+ years) in performance-critical systems
  • Proven track record of owning complex system components and driving architectural decisions
  • Deep understanding of ML inference internals including attention, MLPs, and quantization
  • Hands-on experience with CUDA, GPU programming, and key libraries (cuBLAS, cuDNN, NCCL)
  • Strong background in distributed systems design including RPC frameworks and memory partitioning
  • Demonstrated ability to uncover and solve performance bottlenecks across layers

Responsibilities

  • Own and drive the architecture, design, and implementation of the inference engine, optimized for large-scale LLM inference
  • Partner closely with researchers to integrate new model architectures or features into the engine
  • Lead end-to-end optimization for latency, throughput, memory efficiency, and hardware utilization across GPUs and accelerators
  • Define and guide standards for instrumentation, profiling, and tracing tooling to identify bottlenecks
  • Architect scalable routing, batching, scheduling, memory management, and dynamic loading mechanisms for inference workloads
  • Ensure reliability, reproducibility, and fault tolerance in inference pipelines, including A/B launches and model versioning
  • Collaborate on integration with federated, distributed inference infrastructure
  • Drive cross-team collaboration with platform engineers, cloud infrastructure, and security/compliance teams
  • Represent the team externally through benchmarks, whitepapers, and open-source contributions
