Senior AI Inference Engineer

Monks | LATAM; NAMER
Full Time | python, aws, kubernetes

The Senior AI Inference Engineer at Monks will design and deliver advanced AI systems for clients in Media, Entertainment, Gaming, and Sports. This role involves transforming complex business problems into scalable AI architectures, focusing on real-time video processing and multi-modal content interpretation. The engineer will manage the full lifecycle of AI inference solutions, from discovery to optimization on cloud infrastructure.

Qualifications

  • Proficiency in Python and experience with AI inference services.
  • Strong understanding of Vision Language Models and their integration into workflows.
  • Experience with LLM/agent orchestration frameworks.
  • Familiarity with Kubernetes and cloud infrastructure, particularly AWS.
  • Knowledge of modern NVIDIA GPU architectures and optimization techniques.

Responsibilities

  • Architect, implement, and optimize end-to-end AI inference services and agentic pipelines in Python.
  • Design autonomous agents that can interpret, reason about, and act on video and multi-modal content.
  • Integrate Vision Language Models (e.g., GPT-4o, Gemini Pro Vision, LLaVA) into robust, production-grade workflows.
  • Leverage LLM/agent orchestration frameworks (e.g., LangGraph, AutoGen, Semantic Kernel) to coordinate complex visual AI tasks.
  • Deploy and operate services on Kubernetes, ensuring reliability and scalability under heavy media workloads.
  • Architect distributed systems on AWS, making informed trade-offs across performance, cost, and resilience.
  • Optimize workloads for modern NVIDIA GPU architectures, focusing on real-time and high-throughput media use cases.
  • Collaborate directly with clients in Media, Entertainment, Gaming, and Sports (MEGS), participating in pre-sales discussions to validate feasibility and shape solutions.
