Skip to content

santoshkumar-in/schema-drift

Repository files navigation

schema-drift

Database schema drift detector for polyglot Node.js stacks — snapshots and diffs MySQL and MongoDB schemas over time.

schema-drift diff output


Why schema-drift?

Your database silently changes over time. A developer runs ALTER TABLE directly on production. A new code path starts inserting MongoDB documents with a different field structure. Someone renames a field but doesn't migrate existing documents.

schema-drift catches all of this by snapshotting your actual database state and diffing it over time — across both MySQL and MongoDB simultaneously.

Install

npm install -g schema-drift

Quick start

schema-drift snapshot         # capture baseline
schema-drift snapshot         # capture again after changes
schema-drift diff             # see what changed

Commands

Command Description
snapshot Capture current schema state
diff Compare two snapshots, show drift
status Quick health summary
list List all saved snapshots

Configuration

Create .schemadriftrc in your project root:

{
  "label": "production",
  "sampleSize": 500,
  "connections": {
    "mysql": { "host": "localhost", "user": "root", "password": "pass", "database": "myapp" },
    "mongodb": { "uri": "mongodb://localhost:27017", "database": "myapp" }
  }
}

How MongoDB sampling works

MongoDB is schemaless — there is no information_schema. Instead, schema-drift samples up to sampleSize documents per collection and builds a statistical field profile: which fields exist, how frequently, and what types they contain.

Drift is reported when type distributions shift significantly between snapshots.


Local development

Prerequisites

  • Node.js ≥ 18
  • npm ≥ 9
  • Docker (for running MySQL and MongoDB locally)

Setup

git clone https://github.com/santoshkumar-in/schema-drift.git
cd schema-drift
npm install

Run locally without installing globally

# Run directly
node src/index.js snapshot

# Or link it so `schema-drift` resolves as a command globally on your machine
npm link
schema-drift snapshot

npm link creates a symlink from your global bin to src/index.js. Run npm unlink -g schema-drift to remove it.

Project structure

src/
├── index.js              # CLI entry point — Commander.js wiring
├── commands/
│   ├── snapshot.js       # schema-drift snapshot
│   ├── diff.js           # schema-drift diff
│   ├── status.js         # schema-drift status
│   └── list.js           # schema-drift list
├── core/
│   ├── mysql.js          # MySQL introspection via information_schema
│   ├── mongo.js          # MongoDB statistical schema inference
│   ├── differ.js         # Snapshot comparison engine
│   ├── store.js          # Snapshot file persistence (.schema-drift/)
│   └── suggestions.js    # Migration code generator
└── utils/
    ├── config.js          # cosmiconfig loader for .schemadriftrc
    └── render.js          # terminal output — drift report, tables, summary

Key modules to know

src/core/mongo.js is the most interesting module. inferCollectionSchema() samples documents via $sample, walks every document recursively with walkDocument() to extract field paths, and builds a frequency and type distribution map. This is the statistical inference engine that makes MongoDB schema tracking possible without a fixed schema.

src/core/differ.js compares two snapshots. For MySQL it diffs column types, nullability, and indexes. For MongoDB it detects type distribution shifts, frequency changes, and field renames using camelCase/snake_case heuristics.

src/core/store.js manages snapshot persistence. Snapshots are stored as JSON files in .schema-drift/ with an index.json tracking the latest per label. You can commit these files to get a full audit trail in git history.

Running tests

npm test

Tests use Jest with --experimental-vm-modules for ESM support. Test files live in __tests__/ and require no real database — MySQL and MongoDB are fully mocked.

To test a specific suite:

node --experimental-vm-modules node_modules/.bin/jest differ
node --experimental-vm-modules node_modules/.bin/jest mongo

Testing against real databases

The unit tests mock all database calls. The most useful local test is running the full end-to-end flow against live databases — real connections, real snapshots, real drift detection.

1. Start databases with Docker

Create docker-compose.yml in a scratch folder (e.g. ~/schema-drift-testenv/):

version: "3.8"
services:
  mysql:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: testpass
      MYSQL_DATABASE: testapp
    ports:
      - "3306:3306"
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-ptestpass"]
      interval: 5s
      retries: 10
  mongodb:
    image: mongo:7
    ports:
      - "27017:27017"
    healthcheck:
      test: ["CMD", "mongosh", "--eval", "db.runCommand('ping')"]
      interval: 5s
      retries: 10
cd ~/schema-drift-testenv
docker compose up -d
docker compose ps   # wait until both show healthy (~15s)

2. Seed test data

Save as seed.sh in the same folder and run it:

#!/usr/bin/env bash
set -e

MYSQL_CTR=$(docker ps --filter "ancestor=mysql:8.0" --format "{{.Names}}" | head -1)
MONGO_CTR=$(docker ps --filter "ancestor=mongo:7"   --format "{{.Names}}" | head -1)

docker exec "$MYSQL_CTR" mysql -uroot -ptestpass testapp -e "
CREATE TABLE IF NOT EXISTS users (
  id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(100) NOT NULL,
  email VARCHAR(255) NOT NULL UNIQUE,
  created_at DATETIME DEFAULT NOW()
);
CREATE TABLE IF NOT EXISTS orders (
  id INT PRIMARY KEY AUTO_INCREMENT,
  user_id INT NOT NULL,
  total DECIMAL(10,2),
  status VARCHAR(20) DEFAULT 'pending',
  created_at DATETIME DEFAULT NOW()
);
INSERT IGNORE INTO users (name, email) VALUES
  ('Priya Sharma', 'priya@example.com'),
  ('Rahul Verma',  'rahul@example.com');
INSERT INTO orders (user_id, total, status) VALUES
  (1, 1299.00, 'delivered'), (1, 450.50, 'pending');
"

docker exec "$MONGO_CTR" mongosh testapp --eval "
db.products.insertMany([
  { name: 'Laptop', price: 55000, stock: 10, tags: ['electronics'] },
  { name: 'Phone',  price: 18000, stock: 45, tags: ['mobile']      }
]);
db.events.insertMany([
  { type: 'page_view', userId: 'u1', url: '/home',    timestamp: new Date() },
  { type: 'click',     userId: 'u2', url: '/product', timestamp: new Date() }
]);
"

echo "✓ MySQL seeded"
echo "✓ MongoDB seeded"
chmod +x seed.sh && bash seed.sh

3. Configure schema-drift

In your schema-drift project root, create .schemadriftrc:

{
  "label": "local",
  "sampleSize": 100,
  "connections": {
    "mysql":   { "host": "localhost", "user": "root", "password": "testpass", "database": "testapp" },
    "mongodb": { "uri": "mongodb://localhost:27017", "database": "testapp" }
  }
}

4. Take a baseline snapshot

node src/index.js snapshot --label local

5. Introduce real drift

MYSQL_CTR=$(docker ps --filter "ancestor=mysql:8.0" --format "{{.Names}}" | head -1)
MONGO_CTR=$(docker ps --filter "ancestor=mongo:7"   --format "{{.Names}}" | head -1)

docker exec "$MYSQL_CTR" mysql -uroot -ptestpass testapp -e "
  ALTER TABLE users ADD COLUMN phone VARCHAR(15);
  ALTER TABLE orders MODIFY COLUMN total VARCHAR(20);
  CREATE INDEX idx_orders_status ON orders(status);
"

docker exec "$MONGO_CTR" mongosh testapp --eval '
  db.products.updateMany({ name: "Phone" }, { $set: { price: "18000" } });
  db.products.insertOne({ name: "Keyboard", cost: 1500, stock: 15 });
  db.events.updateMany({ type: "click" }, { $set: { sessionId: "abc123" } });
'

6. Snapshot again and diff

node src/index.js snapshot --label local
node src/index.js diff --label local

The report catches all 6 changes: the new phone column, the total type change (error), the new index, MongoDB price type drift (error), the pricecost possible rename, and the new sessionId field.

7. Try the other commands

node src/index.js status --label local          # health summary
node src/index.js list   --label local          # list snapshots on disk
node src/index.js diff   --label local --db mysql  # MySQL only
node src/index.js diff   --label local --ci        # exits 1 on errors

Teardown

docker compose down -v    # -v wipes volumes and all seeded data

Contributing

Bug reports and pull requests are welcome. For significant changes, open an issue first to discuss the approach.

When contributing:

  • Tests required for all new behaviour — npm test must pass
  • No real database required for tests — keep all DB calls mocked
  • Keep commits focused — one logical change per commit

License

MIT © Santosh Kumar

About

Database schema drift detector for polyglot Node.js stacks — snapshots and diffs MySQL and MongoDB schemas over time.

Topics

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors