Software Engineer - GenAI inference

Databricks • San Francisco, California

Full Time

python machine-learning cuda gpu distributed-systems performance-optimization ai tensorflow

The Software Engineer for GenAI inference at Databricks will design, develop, and optimize the inference engine for the Foundation Model API, focusing on large language model (LLM) serving systems. The role involves collaboration with researchers and cross-functional teams to enhance performance and scalability of the inference stack.

Qualifications

BS/MS/PhD in Computer Science or a related field
3+ years of experience in performance-critical systems
Solid understanding of ML inference internals including attention, MLPs, and quantization
Hands-on experience with CUDA and GPU programming
Comfortable designing and operating distributed systems including RPC frameworks and memory partitioning
Ability to uncover and solve performance bottlenecks across various layers
Experience building instrumentation and profiling tools for ML models
Ability to collaborate closely with ML researchers

Responsibilities

Contribute to the design and implementation of the inference engine optimized for large-scale LLM inference
Collaborate with researchers to integrate new model architectures and features into the engine
Optimize latency, throughput, memory efficiency, and hardware utilization across GPUs and accelerators
Build and maintain instrumentation, profiling, and tracing tools to identify bottlenecks
Develop scalable routing, batching, scheduling, memory management, and dynamic loading mechanisms for inference workloads
Support reliability, reproducibility, and fault tolerance in inference pipelines
Integrate with federated, distributed inference infrastructure and manage communication overhead
Document and share learnings, contributing to internal best practices and open-source efforts

Similar Jobs

Research Engineer, Machine Learning (Horizons)

Software Engineer, Machine Learning

Lead Machine Learning Engineer

Machine Learning Engineer, Simulation Realism

AI Engineer
