
Senior Software Engineer, Post-Training & RL Frameworks

Waymo
Waymo is seeking a Senior Software Engineer for its ML Frameworks & Efficiency team to enhance pre-trained models for autonomous driving. The role involves developing core training systems for reinforcement learning, collaborating with various teams, and improving model performance through innovative strategies.
Qualifications
- B.S. in Computer Science or Math, or 8+ years of equivalent real-world experience
- Proficient in distributed systems design with an understanding of ML efficiency
- Experience with ML frameworks and libraries
- Strong programming skills in Python or similar languages
- Familiarity with reinforcement learning algorithms and techniques
Responsibilities
- Report to the Head of ML Frameworks & Efficiency
- Develop the core training system for adapting RL techniques to unprecedented scales and heterogeneous environments (e.g., CPU/GPU/TPU)
- Collaborate with teams to integrate the latest rollout strategies, policies, and RL algorithms (e.g., REINFORCE, DPO, PPO) into the system
- Improve the end-to-end RL training pipeline, including efficient and scalable learners/actors and low-latency distributed replay buffers that persist the data produced by rollouts
- Build evaluations, analyze experimental results, and iterate quickly to improve model performance and training workflows
- Stay current with the latest research in RL, Vision-Language-Action (VLA) models, and world models to inform and inspire new programs



