
AI Engineer, AIOps & Infrastructure

Eloquent AI
Eloquent AI is seeking a Senior Software Engineer specializing in AIOps & Infrastructure to design and optimize scalable AI infrastructure for enterprise AI agents. The role focuses on automating LLMOps and MLOps workflows, optimizing GPU workloads, and ensuring the reliability of AI systems in production. The company is a fast-growing AI firm based in San Francisco, transforming financial services with innovative AI solutions.
Qualifications
- 5+ years of experience in software engineering, MLOps, or infrastructure development.
- Strong expertise in Kubernetes and experience managing containerized ML workloads.
- Deep understanding of cloud platforms (AWS, GCP, Azure) and distributed computing.
- Proficiency in Python, with experience in developing ML applications.
- Experience with monitoring and observability tools for AI systems.
Responsibilities
- Design and build scalable ML infrastructure for deploying and maintaining AI agents in production.
- Automate LLMOps and MLOps workflows, ensuring seamless model training, fine-tuning, deployment, and monitoring.
- Optimize GPU and cloud compute workloads, improving efficiency and reducing latency for large-scale AI systems.
- Develop Kubernetes-based solutions, including custom operators for ML model orchestration.
- Improve system observability and reliability, implementing logging, monitoring, and performance tracking for AI models.
- Work with ML and engineering teams to streamline data pipelines, model serving, and inference optimizations.
- Ensure security, compliance, and reliability in AI infrastructure, maintaining high availability and scalability.
- Participate in on-call rotations, ensuring 24/7 reliability of critical AI systems.




