
Staff Software Engineer, Foundational Model Serving

Databricks
Databricks is seeking a Staff Software Engineer for its Foundation Model Serving team, which hosts and serves inference for both open-source and proprietary AI models. The role involves designing and building systems for high-throughput, low-latency inference on GPU workloads, and collaborating with product, platform, and research teams to improve the product experience and infrastructure.
Qualifications
- 10+ years of experience building and operating large-scale distributed systems.
- Experience with customer-facing APIs, Edge Gateways, or ML Inference services.
- Strong interest in building LLM APIs and runtimes at scale.
- Ability to influence architectural direction and make trade-offs for performance optimization.
- Experience collaborating cross-functionally with other teams.
Responsibilities
- Design and implement core systems and APIs for Databricks Foundation Model Serving, ensuring scalability and reliability.
- Partner with product and engineering leadership to define the technical roadmap and architecture for serving workloads.
- Drive architectural decisions to optimize performance, throughput, and operational efficiency for GPU serving workloads.
- Contribute to key components across the serving infrastructure, ensuring smooth operations at scale.
- Collaborate with product, platform, and research teams to translate customer needs into reliable systems.
- Establish best practices for code quality, testing, and operational readiness, mentoring other engineers.
- Represent the team in cross-organizational technical discussions and influence the broader AI platform strategy.



