A skeleton dbt project implementing the VibeOps analytics engineering governance approach — a framework for AI-assisted analytics engineering where the environment governs agent behaviour, not the prompt.
Based on: VibeOps: A Governance and Optimisation Framework for Agentic Coding Environments — Natu Lauchande, Sanlam Fintech, 2026.
Standard dbt projects have SQL and config. This template adds a governance layer — three files and four slash commands that ensure an AI coding assistant (Claude Code, Cursor, etc.) builds models consistently, correctly, and with awareness of your data's specific quirks.
The core insight: the quality of AI-assisted analytics work is determined by the environment the agent operates in, not the model version or prompt style. When the agent knows that your CRM deduplicates on max(modified_at) and your analytics platform's sessions can span midnight, it stops making mistakes that compile and pass tests but produce wrong numbers.
```
your-dbt-project/
├── AGENTS.md                 ← What is true about THIS project
├── ANALYTICS_ENG_SKILL.md    ← How to work in ANY dbt project
└── DBT_STYLE_GUIDE.md        ← How to write SQL in THIS project
```
These three files are the entire governance system. Everything else follows from them.
| File | What goes in it | Who changes it |
|---|---|---|
| `AGENTS.md` | Hard constraints, layer rules, tribal knowledge about your sources | You + agent (with approval) |
| `ANALYTICS_ENG_SKILL.md` | Universal methodology: explore → contract → build → grade | Rarely — it's stack-agnostic |
| `DBT_STYLE_GUIDE.md` | CTE patterns, materialization per layer, naming rules | You, when conventions change |
`AGENTS.md` is the highest-value file. It has three sections:
- Hard Constraints — rules the agent must never break (layer boundaries, sacred tests, build protocol)
- Architecture — your four-layer setup, databases, naming conventions
- Tribal Knowledge — data truths that would otherwise live only in your team's heads
The Tribal Knowledge section is where most of the value lives. Example entries:
```markdown
### Salesforce CRM
- Contact records are soft-deleted: `is_deleted = true`, not physically removed
- `account_id` can be NULL for leads that haven't been converted — always LEFT JOIN
- Fivetran syncs create duplicate rows during schema changes; deduplicate on max(systemmodstamp)

### Google Analytics 4
- Sessions can span midnight. Never use event_date for session boundaries — use event_timestamp
- user_pseudo_id is a device identifier, not a user identifier
```
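Tribal knowledge like the Fivetran deduplication rule above can be encoded once in a staging model instead of being re-derived each session. A minimal sketch, assuming Snowflake's `QUALIFY` clause and a hypothetical `salesforce.contact` source (the names are illustrative, not from this repo):

```sql
-- Hypothetical staging model applying the dedup rule from Tribal Knowledge.
-- Assumes Snowflake (QUALIFY) and a salesforce.contact source definition.
with source as (
    select * from {{ source('salesforce', 'contact') }}
)

select
    id as contact_id,
    account_id,
    is_deleted,
    systemmodstamp as modified_at
from source
where not is_deleted
-- Fivetran schema changes can duplicate rows: keep the latest per id
qualify row_number() over (
    partition by id
    order by systemmodstamp desc
) = 1
```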
Every time a session discovers something surprising about a source system, the agent proposes adding it here. You approve it. The next session starts with that knowledge.
The universal workflow — same regardless of what you're building:
- Explore — query the source before writing any SQL
- Data Contract — write the spec (grain, primary key, guarantees) before writing the model
- Build — one model at a time, run and test before moving on
- Grade — AnalyticsCheck self-score at session end
You generally don't need to change this file. The methodology travels; configuration stays in AGENTS.md.
Canonical SQL patterns for this project: staging CTE structure, incremental model config, multi-platform union pattern, incremental lookback, Snowflake-specific functions in use. When the agent wants to write a fact table, it reads this file to know the exact config block format.
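For example, the incremental fact pattern might look like the following — a sketch of a typical config block with illustrative column names, not this repo's exact file:

```sql
-- Illustrative incremental fact model; column names are hypothetical.
{{
    config(
        materialized='incremental',
        unique_key='order_id'
    )
}}

select
    order_id,
    customer_id,
    order_total,
    updated_at
from {{ ref('int_orders_enriched') }}

{% if is_incremental() %}
-- Lookback window guards against late-arriving updates (Snowflake dateadd)
where updated_at >= (select dateadd('day', -3, max(updated_at)) from {{ this }})
{% endif %}
```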
```
.claude/commands/
├── explore-data.md     ← Understand a source. No building.
├── new-source.md       ← Integrate a new source end-to-end
├── edit-model.md       ← Modify existing models with lineage awareness
└── analytics-check.md  ← Self-grade the session
```
Usage in Claude Code (or any tool that supports slash commands mapped to markdown files):
- `/explore-data` — "What does the orders table look like? What's the grain?"
- `/new-source` — "Integrate the Stripe payments source"
- `/edit-model` — "Add refund_amount to fact_orders"
- `/analytics-check` — run at end of session
Each command is a markdown file that loads a structured workflow. The agent reads it and follows the steps. /new-source delegates its exploration phase to /explore-data — methodology doesn't duplicate, it delegates.
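A command file can be as simple as a numbered workflow. A hypothetical skeleton for `explore-data.md` (the actual file in this template may differ):

```markdown
# /explore-data

1. Ask which source table to explore, if not specified.
2. Query row count, column list, and min/max of timestamp columns.
3. Propose the grain and verify it: count(*) vs count(distinct <candidate key>).
4. Record findings in analyses/ using the exploration template.
5. Do NOT create or modify any models.
```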
Before writing any model, the agent must produce a data contract from actual queries, not assumptions:
```yaml
model: stg_stripe__charges
grain: "1 row per Stripe charge"
primary_key: charge_id
source_table: raw.stripe.charges
row_count_range: [50000, 500000]
guarantees:
  - charge_id is unique and not null
  - created_at is not null
  - amount is not null and > 0
known_issues:
  - ~2% of records have status = 'pending' indefinitely (Stripe quirk, not failures)
join_keys:
  - customer_id → dim_customers.stripe_customer_id (match rate: ~95%)
```

You cannot write a correct data contract without understanding the data. That's the point. The contract becomes the basis for dbt tests that enforce it on every run.
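For instance, a contract like the one above could translate into a schema file along these lines (a sketch; `dbt_utils.expression_is_true` assumes the dbt-utils package is installed):

```yaml
version: 2

models:
  - name: stg_stripe__charges
    tests:
      # Enforces the "amount > 0" guarantee from the contract
      - dbt_utils.expression_is_true:
          expression: "amount > 0"
    columns:
      - name: charge_id
        tests:
          - unique
          - not_null
      - name: created_at
        tests:
          - not_null
      - name: amount
        tests:
          - not_null
```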
```
AnalyticsCheck = (0.25 × env_health)
              + (0.30 × output_quality)
              + (0.30 × (1 − architecture_distance))
              + (0.15 × process_compliance)
```
Target: ≥ 0.90 | Acceptable: ≥ 0.80 | Review required: < 0.80
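A worked example with illustrative sub-scores:

```
AnalyticsCheck = (0.25 × 0.90) + (0.30 × 0.95) + (0.30 × (1 − 0.10)) + (0.15 × 1.00)
              = 0.225 + 0.285 + 0.270 + 0.150
              = 0.93   → above the 0.90 target
```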
| Dimension | What it measures |
|---|---|
| Environment Health | Did tribal knowledge cover the sources worked on? |
| Output Quality | Do dbt run and dbt test pass? Row counts within contract? |
| Architecture Distance | Layer boundaries respected? Correct patterns per layer? |
| Process Compliance | Queried before writing? Contract first? Incremental build? |
Low scores map directly to specific improvements: low Architecture Distance → update style guide; low Environment Health → add tribal knowledge to AGENTS.md.
```shell
git clone <this-repo> my-dbt-project
cd my-dbt-project
```

In `AGENTS.md`, replace the placeholder content with your actual project details:
- Your data warehouse (Snowflake, BigQuery, Databricks, Redshift)
- Your source systems
- Your database/schema names
- Start the Tribal Knowledge section empty — it fills itself as you work
In `dbt_project.yml`, replace `your_project_name`, database names, and schema names.
In `DBT_STYLE_GUIDE.md`, replace the Snowflake-specific functions with your warehouse's equivalents if needed. The patterns are generic — only the functions in the "Warehouse-Specific Patterns" section need updating.
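As an illustration, a Snowflake lookback expression and one possible BigQuery equivalent (verify against your warehouse's documentation):

```sql
-- Snowflake
where updated_at >= dateadd('day', -3, current_date)

-- BigQuery (one possible equivalent)
-- where updated_at >= date_sub(current_date(), interval 3 day)
```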
`ANALYTICS_ENG_SKILL.md` is universal. The only thing you might update is the data contract format section, if your use case requires additional fields.
Place the .claude/commands/ files where your AI tool expects slash commands. In Claude Code, these are read automatically from .claude/commands/ at the project root.
```
your-dbt-project/
│
├── AGENTS.md                  # Edit this for your project
├── ANALYTICS_ENG_SKILL.md     # Universal — leave mostly unchanged
├── DBT_STYLE_GUIDE.md         # Edit warehouse-specific patterns
│
├── dbt_project.yml            # Standard dbt config
├── packages.yml               # dbt packages
├── profiles.yml               # Local connection (gitignored)
│
├── .claude/
│   └── commands/              # Slash commands
│       ├── explore-data.md
│       ├── new-source.md
│       ├── edit-model.md
│       └── analytics-check.md
│
├── models/
│   ├── staging/               # views — column rename only
│   │   └── <source_name>/
│   │       ├── _sources.yml
│   │       └── stg_<source>__<table>.sql
│   │
│   ├── intermediate/          # table/incremental — business logic
│   │   └── int_<name>.sql
│   │
│   ├── marts/                 # dim_* (table), fact_* (incremental)
│   │   ├── dim_<name>.sql
│   │   └── fact_<name>.sql
│   │
│   └── products/              # prod_* — BI-ready aggregations
│       └── prod_<name>.sql
│
├── macros/
│   ├── generate_database_name.sql  # Multi-env database routing
│   └── get_custom_schema.sql
│
├── analyses/                  # Exploration docs (markdown + SQL)
│   └── exploration_template.md
│
├── seeds/
├── snapshots/
└── tests/
```
```
Session runs
    ↓
New data truth discovered
    ↓
Agent proposes addition to AGENTS.md Tribal Knowledge
    ↓
You approve
    ↓
Next session starts with that knowledge
    ↓
AnalyticsCheck score rises over time
```
The environment gets smarter with every session. That's the system.
This template includes minimal working examples in each layer using a fictitious e-commerce scenario (orders, customers, products). They demonstrate the correct structure for each layer — not production-ready models.
See:
- models/staging/example_crm/ — staging pattern
- models/intermediate/int_orders_enriched.sql — intermediate pattern
- models/marts/dim_customers.sql — dimension pattern
- models/marts/fact_orders.sql — fact/incremental pattern
- models/products/prod_monthly_revenue.sql — product pattern
- VibeOps paper: VibeOps: A Governance and Optimisation Framework for Agentic Coding Environments — Natu Lauchande, Sanlam Fintech, 2026
- Reference software implementation: github.com/nlauchande/fastapi-vibeops-template
- dbt docs: docs.getdbt.com