
Staff SRE Engineer

RDCCareers
Realtor.com® is seeking a Staff Site Reliability Engineer to join its Operations Excellence team. The role focuses on improving the reliability and operational excellence of the platform infrastructure that serves millions of users. It involves technical leadership, mentoring, and establishing best practices in a dynamic environment.
Qualifications
- Proven experience in Site Reliability Engineering or related field
- Strong knowledge of AWS services, particularly EKS and Fargate
- Experience with CI/CD tools and practices
- Familiarity with observability tools such as New Relic
- Understanding of reliability patterns and chaos engineering
- Ability to mentor and lead technical discussions
- Strong analytical skills for cost optimization and infrastructure management
Responsibilities
- Design and maintain highly available AWS infrastructure including EKS clusters and Fargate (ECS)
- Own reliability of critical services such as Skyway (CI/CD), Frontdoor (Tyk), and Pantheon (Apollo GraphQL)
- Establish SLIs, SLOs, and error budgets for Tier 1/2/3 systems
- Lead architectural reviews for reliability and cost-efficiency
- Drive adoption of reliability patterns including circuit breakers and automated failover
- Build comprehensive observability using New Relic for APM and distributed tracing
- Create actionable dashboards and alerts to reduce MTTD and MTTR
- Analyze infrastructure spend and implement FinOps practices




