
Lead Machine Learning Engineer

Thoughtworks
The Lead Machine Learning Engineer at Thoughtworks focuses on optimizing AI model inference for efficiency, speed, and cost-effectiveness across cloud, on-premises, and edge environments. The role combines technical expertise with leadership: guiding teams through complex challenges and ensuring high-performing, sustainable AI solutions.
Qualifications
- Deep technical capability in machine learning and inference optimization.
- Experience with advanced optimization techniques and model deployment.
- Strong leadership skills with the ability to mentor and guide teams.
- Proficiency in designing scalable inference systems and architectures.
- Familiarity with cloud, on-premises, and edge deployment environments.
Responsibilities
- Lead the design and implementation of advanced model optimization pipelines, including quantization, pruning, and distillation.
- Architect and tune inference runtimes and serving frameworks to achieve optimal performance across deployments.
- Guide teams in implementing high-throughput serving strategies (continuous batching, KV caching, speculative decoding, asynchronous scheduling).
- Develop benchmarks and performance dashboards to measure and communicate system-level efficiency improvements (throughput, latency, GPU utilization, cost).
- Evaluate trade-offs across accuracy, performance, and cost, and design architectures to meet target SLAs across varied hardware environments (cloud, on-premises, edge).
- Collaborate with infrastructure, MLOps, and product teams to embed inference optimization into production workflows and platform designs.
- Provide technical leadership and mentorship to engineers, fostering a culture of experimentation, rigor, and continuous performance improvement.
- Engage with clients to translate optimization outcomes into business value and articulate the ROI of technical improvements.




