
Senior Engineer, Network Observability

CoreWeave
CoreWeave is seeking a Senior Engineer for Network Observability to improve the reliability of its GPU cloud network through advanced monitoring and analytics. The role involves developing observability platforms in Python and Golang, collaborating with other teams to unify network data, and implementing scalable telemetry solutions.
Qualifications
- Proficiency in Python and Golang for developing observability solutions.
- Experience with network observability tools and platforms.
- Familiarity with protocols such as gNMI and SNMP.
- Knowledge of streaming analytics and telemetry solutions.
- Experience with monitoring tools like Prometheus and Grafana.
Responsibilities
- Develop, optimize, and maintain network observability platforms using Python and Golang.
- Create and automate collectors, exporters, and dashboards for network health visibility.
- Collaborate with Network Engineering and Platform teams to unify logs, metrics, and events into a single observability pipeline.
- Design and implement scalable telemetry solutions using protocols like gNMI and SNMP, together with streaming analytics.
- Implement advanced alerting and anomaly detection with tools such as Prometheus, Grafana, and Alertmanager.
- Work with network developers, site reliability engineers, and security teams to integrate observability solutions across the broader infrastructure.




