Site Reliability Engineer (SRE) - Media Production Infrastructure

Monks, Cupertino, California, United States
Full Time, on-site

Monks is looking for a highly skilled Site Reliability Engineer (SRE) to join their Platform Engineering team, supporting a media production environment for a global technology company. The role focuses on ensuring high availability, performance, and resilience of critical systems, with responsibilities including infrastructure management, storage expertise, and monitoring.

Qualifications

  • 14+ years of experience in Site Reliability Engineering or related fields.
  • Strong expertise in Storage Area Network (SAN) management and troubleshooting.
  • Proficient in networking and system administration, including DNS and directory services.
  • Experience with monitoring tools and creating custom dashboards for system observability.
  • Ability to provide on-site and remote support in a 24/7 operational environment.

Responsibilities

  • Maintain and troubleshoot all production hardware, servers, and storage infrastructure, focusing on the Storage Area Network (SAN).
  • Execute maintenance and support for the SAN environment, including firmware/software updates for fiber switches, RAIDs, and tape systems.
  • Manage directory services and network services (DNS, static IPs, subnet masks), and configure shares and permissions on the SAN.
  • Manage and improve custom dashboards for 24/7 monitoring of systems, RAIDs, temperature sensors, and backup/archive processes.
  • Contribute to the development and maintenance of custom applications and dashboards that support media workflows.
  • Provide active on-site support and participate in a 24/7 on-call rotation for critical interventions.
  • Manage the Backup and Archive environment, maintain tape systems, and prepare projects for archiving to the cloud.
