
Sr. IT Linux Site Reliability Engineer

SpaceX
SpaceX is seeking a Sr. Linux Site Reliability Engineer to join its Information Technology Linux Infrastructure team. The role focuses on Kubernetes and containerized technologies and requires expertise in designing, maintaining, scaling, and optimizing them to support critical business functions. The ideal candidate thrives in a fast-paced environment and demonstrates self-motivation and ingenuity.
Qualifications
- Bachelor’s degree in Computer Science or another STEM discipline and 5+ years of systems engineering experience; OR 7+ years of systems engineering experience in lieu of a degree.
- Experience deploying and supporting Kubernetes in production environments.
- Strong knowledge of containerization technologies and orchestration tools.
- Experience with automation tools such as Ansible and Terraform.
- Ability to work collaboratively in a diverse team environment and communicate effectively with internal business units.
Responsibilities
- Install, manage, scale and optimize Kubernetes and RKE clusters using Ansible, Terraform and adjacent technologies in production environments.
- Work closely with other SpaceX engineers to gather requirements, research, evaluate, design, plan, deploy, and support software platforms and related technologies running in Kubernetes.
- Build highly resilient, high-performance, scalable, and robust systems.
- Recommend, justify, and implement improvements using an accepted change control methodology.
- Define, document and follow standards and best practices for systems design, testing, and implementation.
- Foster an environment of collaboration and cross-training, upskilling the team in Kubernetes expertise.
- Drive scripting, self-service, and automation to reduce administrative overhead and toil.
- Participate in an on-call rotation to handle urgent after-hours work when necessary.




