
AI Inference Engineer (London)
Perplexity
We are seeking an AI Inference Engineer to join our team in London, focused on deploying machine learning models for real-time inference. The role involves developing APIs, optimizing the inference stack, and improving system reliability, working with technologies such as Python, Rust, C++, and PyTorch.
Qualifications
- Experience with ML systems and deep learning frameworks (e.g., PyTorch, TensorFlow, ONNX)
- Familiarity with LLM architectures and inference optimization techniques
- Understanding of GPU architectures
- Experience with GPU kernel programming using CUDA
- Proficiency in Python, Rust, and C++
Responsibilities
- Develop APIs for AI inference for internal and external customers
- Benchmark and address bottlenecks in the inference stack
- Improve reliability and observability of systems
- Respond to system outages
- Explore and implement LLM inference optimizations
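To give a flavor of the benchmarking work described above, the sketch below times each stage of a toy inference pipeline to locate the slowest one. It is purely illustrative: the stage names and stand-in computations are assumptions, not Perplexity's actual stack.

```python
import time
from contextlib import contextmanager

# Accumulate wall-clock time spent in each named stage of the pipeline.
@contextmanager
def timed(name, results):
    start = time.perf_counter()
    yield
    results[name] = results.get(name, 0.0) + (time.perf_counter() - start)

def run_request(results):
    # Each stage is a stand-in for real work (tokenization, model
    # forward pass, decoding) in a hypothetical inference pipeline.
    with timed("tokenize", results):
        tokens = list("hello world")
    with timed("forward", results):
        _ = sum(ord(t) for t in tokens)
    with timed("detokenize", results):
        _ = "".join(tokens)

results = {}
for _ in range(100):
    run_request(results)

# The stage with the largest accumulated time is the first
# optimization target.
bottleneck = max(results, key=results.get)
print(bottleneck, results[bottleneck])
```

In practice, per-stage timing like this is only a starting point; production inference stacks typically add percentile latency tracking and GPU-side profiling, since CPU wall-clock timers miss asynchronous device work.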
