
Data Engineer (Founding Team)

Fabrion
The Data/ETL Engineer role is part of a founding team at a startup backed by 8VC, focused on creating an AI-native platform that transforms enterprise data into actionable insights. The position involves building scalable data pipelines, connector frameworks, and knowledge graphs to support advanced data workflows and AI applications.
Qualifications
- 5+ years building large-scale data infrastructure in production environments
- Deep experience with ingestion frameworks (Kafka, Airbyte, Meltano, Fivetran) and data pipeline orchestration (Airflow, Dagster, Prefect)
- Comfortable processing unstructured and semi-structured data formats: PDFs, Excel files, emails, logs, CSVs, web API responses
- Experience working with columnar stores, object storage, and lakehouse formats (Iceberg, Delta, Parquet)
- Strong background in knowledge graphs or semantic modeling (e.g. Neo4j, RDF, Gremlin, PuppyGraph)
- Familiarity with GraphQL, RESTful APIs, and designing developer-friendly data access layers
- Experience implementing data governance: RBAC, ABAC, data contracts, lineage, data quality checks
Responsibilities
- Build highly reliable, scalable data ingestion and transformation pipelines across structured, semi-structured, and unstructured data sources
- Develop and maintain a connector framework for ingesting from enterprise systems (ERPs, PLMs, CRMs, legacy data stores, email, Excel, docs, etc.)
- Design and maintain the data fabric layer, including a knowledge graph enriched with ontologies, metadata, and relationships
- Normalize and vectorize data for downstream AI/LLM workflows, enabling retrieval-augmented generation (RAG), summarization, and alerting
- Create and manage data contracts, access layers, lineage, and governance mechanisms
- Build and expose secure APIs for downstream services, agents, and users to query enriched semantic data
- Collaborate with ML/LLM teams to feed high-quality enterprise data into model training and tuning pipelines



