Data Engineer (Founding Team)

Fabrion

San Francisco Bay Area, California, United States (Permanent)

Posted: August 11, 2025

Quick Summary

We're building a world-class team to tackle one of the industry’s most critical infrastructure problems. As a Data Engineer, you'll be responsible for designing and implementing data pipelines, ensuring data quality and integrity, and collaborating with cross-functional teams to drive business outcomes.

Job Description

Data/ETL Engineer (Founding Team)

Location: San Francisco Bay Area

Type: Full-Time

Compensation: Competitive salary + early-stage equity

Backed by 8VC, we're building a world-class team to tackle one of the industry’s most critical infrastructure problems.

About the Role

We’re building a multi-tenant, AI-native platform where enterprise data becomes actionable through semantic enrichment, intelligent agents, and governed interoperability. At the heart of this architecture lies our Data Fabric — an intelligent, governed layer that turns fragmented and siloed data into a connected ontology ready for model training, vector search, and insight-to-action workflows.

We're looking for engineers who enjoy hard data problems at scale: messy unstructured data, schema drift, multi-source joins, security models, and AI-ready semantic enrichment. You’ll build the backend systems, data pipelines, connector frameworks, and graph-based knowledge models that fuel agentic applications.

If you've worked on streaming pipelines for unstructured data, built connectors into ugly legacy systems, or mapped knowledge graphs that scale, this role will feel like home.

Responsibilities

• Build highly reliable, scalable data ingestion and transformation pipelines across structured, semi-structured, and unstructured data sources

• Develop and maintain a connector framework for ingesting from enterprise systems (ERPs, PLMs, CRMs, legacy data stores, email, Excel, docs, etc.)

• Design and maintain the data fabric layer, including a knowledge graph (Neo4j or PuppyGraph) enriched with ontologies, metadata, and relationships

• Normalize and vectorize data for downstream AI/LLM workflows — enabling retrieval-augmented generation (RAG), summarization, and alerting

• Create and manage data contracts, access layers, lineage, and governance mechanisms

• Build and expose secure APIs for downstream services, agents, and users to query enriched semantic data

• Collaborate with ML/LLM teams to feed high-quality enterprise data into model training and tuning pipelines

What We’re Looking For

Core Experience:

• 5+ years building large-scale data infrastructure in production environments

• Deep experience with ingestion frameworks (Kafka, Airbyte, Meltano, Fivetran) and data pipeline orchestration (Airflow, Dagster, Prefect)

• Comfortable processing unstructured and semi-structured data: PDFs, Excel files, emails, logs, CSVs, web API responses

• Experience working with columnar stores, object storage, and lakehouse formats (Iceberg, Delta, Parquet)

• Strong background in knowledge graphs or semantic modeling (e.g. Neo4j, RDF, Gremlin, PuppyGraph)

• Familiarity with GraphQL, RESTful APIs, and designing developer-friendly data access layers

• Experience implementing data governance: RBAC, ABAC, data contracts, lineage, data quality checks

Mindset & Culture Fit:

• You’re a systems thinker: you want to model the real world, not just process it

• Comfortable navigating ambiguous data models and building from scratch

• Passionate about enabling AI systems with real-world, messy enterprise data

• Pragmatic about scalability, observability, and schema evolution

• Value autonomy, high trust, and meaningful ownership over infrastructure

Bonus Skills

• Prior work with vector DBs (e.g. Weaviate, Qdrant, Pinecone) and embedding pipelines

• Experience building or contributing to enterprise connector ecosystems

• Knowledge of ontology versioning, graph diffing, or semantic schema alignment

• Familiarity with data fabric patterns (e.g. Palantir Ontology, Linked Data, W3C standards)

• Familiarity with fine-tuning LLMs or enabling RAG pipelines using enterprise knowledge

• Experience enforcing data access policy with tools like OPA, Keycloak, or Snowflake row-level security

Why This Role Matters

Agents are only as smart as the data they operate on. This role builds the foundation — the semantic, governed, connected substrate — that makes autonomous decision-making and agent action possible. From factory ERP records to geopolitical news alerts, the data fabric unifies it all.

If you're excited to tame complexity, unify chaos, and power intelligent systems with trusted data — we’d love to hear from you.
