
ML Infrastructure Engineer, Safeguards

Anthropic
Anthropic is seeking a Machine Learning Infrastructure Engineer to join its Safeguards organization. The role focuses on building and scaling infrastructure for AI safety systems so that AI operates reliably and aligns with human values. The engineer will design ML infrastructure, optimize its performance, and collaborate with research teams to implement safety measures.
Qualifications
- 5+ years of experience building production ML infrastructure, ideally in safety-critical domains like fraud detection, content moderation, or risk assessment
- Proficient in Python and experienced with ML frameworks like PyTorch, TensorFlow, or JAX
- Hands-on experience with cloud platforms (AWS, GCP) and container orchestration (Kubernetes)
- Understanding of distributed systems principles and experience building systems that handle high-throughput, low-latency workloads
Responsibilities
- Design and build scalable ML infrastructure to support real-time and batch classifier and safety evaluations across the model ecosystem
- Build monitoring and observability tools to track model performance, data quality, and system health for safety-critical applications
- Collaborate with research teams to productionize safety research, translating experimental safety techniques into robust, scalable systems
- Optimize inference latency and throughput for real-time safety evaluations while maintaining high reliability standards
- Implement automated testing, deployment, and rollback systems for ML models in production safety applications
- Partner with Safeguards, Security, and Alignment teams to understand requirements and deliver infrastructure that meets safety and production needs
- Contribute to the development of internal tools and frameworks that accelerate safety research and deployment




