Data Engineer (Founding Team)

Fabrion, San Francisco Bay Area

The Data/ETL Engineer joins the founding team of an 8VC-backed startup building an AI-native platform that transforms enterprise data into actionable insights. The role involves building scalable data pipelines, connector frameworks, and knowledge graphs to support advanced data workflows and AI applications.

Qualifications

  • 5+ years building large-scale data infrastructure in production environments
  • Deep experience with ingestion frameworks (Kafka, Airbyte, Meltano, Fivetran) and data pipeline orchestration (Airflow, Dagster, Prefect)
  • Comfortable processing unstructured and semi-structured data: PDFs, Excel files, emails, logs, CSVs, web API responses
  • Experience working with columnar stores, object storage, and lakehouse formats (Iceberg, Delta, Parquet)
  • Strong background in knowledge graphs or semantic modeling (e.g., Neo4j, RDF, Gremlin, PuppyGraph)
  • Familiarity with GraphQL, RESTful APIs, and designing developer-friendly data access layers
  • Experience implementing data governance: RBAC, ABAC, data contracts, lineage, data quality checks

Responsibilities

  • Build highly reliable, scalable data ingestion and transformation pipelines across structured, semi-structured, and unstructured data sources
  • Develop and maintain a connector framework for ingesting from enterprise systems (ERPs, PLMs, CRMs, legacy data stores, email, Excel, docs, etc.)
  • Design and maintain the data fabric layer, including a knowledge graph enriched with ontologies, metadata, and relationships
  • Normalize and vectorize data for downstream AI/LLM workflows, enabling retrieval-augmented generation (RAG), summarization, and alerting
  • Create and manage data contracts, access layers, lineage, and governance mechanisms
  • Build and expose secure APIs for downstream services, agents, and users to query enriched semantic data
  • Collaborate with ML/LLM teams to feed high-quality enterprise data into model training and tuning pipelines