

Software Engineer, Distributed Data Systems

Exa
Exa is building a search engine from scratch to serve every AI application. We build massive-scale infrastructure to crawl the web, train state-of-the-art embedding models to index it, and develop high-performance vector databases in Rust to search over it. We also own a $5M H200 GPU cluster that regularly lights up tens of thousands of machines.
In this role, you'll architect and build the data infrastructure that powers everything we do, from crawling billions of pages to training our embedding models to serving real-time search. You'll have enormous autonomy in designing systems that scale to hundreds of petabytes. If you've ever wanted to build data pipelines at a scale that most companies only dream about, this is your chance.
Desired Experience
- Deep understanding of lakehouse architectures (Delta Lake, Iceberg, Hudi) and when to use them
- Experience building and operating large-scale distributed data processing pipelines
- Hands-on experience with streaming data systems (Kafka, Flink, or similar)
- Familiarity with Ray, Spark, or ClickHouse at production scale
- An obsessive focus on reliability and building systems that don't page you at 3am
Bonus Points
- Experience with Lance or other vector-native storage formats
- Background in GPU-accelerated data processing (RAPIDS, cuDF)
Example Projects
- Design a lakehouse architecture that handles 100+ PB of web crawl data
- Build streaming pipelines that process billions of documents per day for real-time indexing
- Architect the data layer for our embedding training infrastructure on Ray
- Scale our ClickHouse deployment to handle analytical queries across petabytes of search logs
This is an in-person opportunity in San Francisco. We're happy to sponsor international candidates (e.g., STEM OPT, OPT, H1B, O1, E3). In addition to premium healthcare benefits (medical, dental, vision), we also offer fertility benefits and a monthly wellness stipend to all of our employees.



