ADR-040: Database Schema Migration Management
Status: Proposed Date: 2025-10-20 Deciders: Development Team Tags: #database #schema #migrations #devops
Overview
Every software system evolves over time. You add a new feature that needs a new database table, or you realize an existing column needs a different data type, or you need to add an index for better performance. This is normal and healthy—but it creates a practical problem: how do you safely apply these changes to existing databases without breaking things?
In the early days of our system, we used a single schema/init.sql file that worked great for fresh installations but became a coordination nightmare as the project grew. When a developer added a new table, they'd manually edit this monolithic file. If you already had a database running, you'd have to figure out which parts of the file you'd already applied and which were new, then carefully run just the new bits. Miss a dependency? Your database breaks. Run something twice? You might get errors or, worse, silent data corruption.
The core insight is simple: databases need version control just like code. When you pull the latest code from git, you want a command that safely applies any new schema changes—whether it's your first time setting up or you've been running the system for months. The solution is a migration system: numbered SQL files (001, 002, 003...) combined with a tracking table that remembers which migrations have already been applied.
We chose to build a simple bash-based migration runner rather than adopting heavyweight tools like Flyway or Alembic. It's about 100 lines of code, has zero external dependencies beyond the PostgreSQL tools already in our Docker containers, and perfectly fits our linear development model. Each migration is an idempotent SQL file that can safely run multiple times, and the system automatically skips migrations you've already applied. It's simple, transparent, and exactly powerful enough for our needs without the complexity of enterprise migration frameworks.
Context
As the knowledge graph system evolves, we're adding incremental schema changes (new tables, columns, constraints, etc.). Currently, we use a monolithic schema/init.sql file that's executed on fresh database initialization.
The Problem:
- Each feature adds schema patches directly to init.sql
- No tracking of which migrations have been applied
- No safe way to apply schema changes to existing databases
- Manual coordination required to merge patches into stable schema versions
- Risk of applying patches out of order or duplicating changes
Recent Example:
Adding kg_api.embedding_config table for ADR-039 required manual insertion into init.sql. If a developer has an existing database, they must:
1. Manually run the new SQL
2. Hope they don't miss any dependencies
3. Track which patches they've applied vs. which are missing
This doesn't scale.
Decision
Implement a simple, bash-based migration system using a schema_migrations tracking table and numbered migration files.
Key Components:
- schema_migrations table - Tracks applied migrations
- schema/migrations/ directory - Ordered SQL migration files
- scripts/migrate-db.sh - Migration runner script
- Numbered migrations -
001_baseline.sql,002_add_embedding_config.sql, etc.
Migration Filename Convention:
{version}_{description}.sql
Examples:
001_baseline.sql
002_add_embedding_config.sql
003_add_user_preferences.sql
Migration Runner Behavior:
./scripts/database/migrate-db.sh
# Checks schema_migrations table
# Applies only unapplied migrations in order
# Records each migration in schema_migrations
# Idempotent: safe to run multiple times
Example Migration File:
-- Migration: 002_add_embedding_config.sql
-- Description: Add embedding configuration table for ADR-039
BEGIN;
CREATE TABLE IF NOT EXISTS kg_api.embedding_config (
id SERIAL PRIMARY KEY,
provider VARCHAR(50) NOT NULL,
model_name VARCHAR(200),
...
);
-- Insert default config
INSERT INTO kg_api.embedding_config (provider, active)
VALUES ('openai', TRUE);
COMMIT;
Alternatives Considered
Alternative 1: Flyway (Java-based)
Pros: - Industry standard - Robust feature set - Good tooling (Flyway Desktop in 2025) - SQL-based migrations (familiar)
Cons: - Requires Java runtime (adds dependency to Docker image) - Heavyweight for our simple use case - Rollback only in paid version - More complex than we need
Verdict: ❌ Too heavyweight for current needs
Alternative 2: Liquibase (Java-based)
Pros: - Very flexible (SQL, XML, YAML, JSON formats) - Excellent for complex branching/merging migrations - Built-in rollback support - 2025 flow enhancements for orchestration
Cons: - Requires Java runtime - Steeper learning curve than Flyway - XML/YAML overhead for simple migrations - Higher resource consumption
Verdict: ❌ Overengineered for our use case
Alternative 3: Alembic (Python-based)
Pros: - Python-native (matches our stack) - SQLAlchemy integration - Programmatic migrations (Python code) - Active Python community support
Cons: - Requires SQLAlchemy (we use direct psycopg2) - PostgreSQL + Apache AGE may not map well to SQLAlchemy ORM - AGE graph structures don't fit ORM paradigm - Adds Python dependency overhead
Verdict: ⚠️ Good option, but SQLAlchemy mismatch with AGE
Alternative 4: shmig (BASH tool)
Pros: - Simple BASH script (~400 lines) - POSIX-compatible - Supports PostgreSQL, MySQL, SQLite3 - Minimal dependencies (just psql)
Cons: - External dependency (another tool to install) - Feature set beyond our needs - Would need to maintain if project becomes inactive
Verdict: ⚠️ Good, but custom script is simpler
Alternative 5: Custom Bash Script (CHOSEN)
Pros: - Zero external dependencies (uses Docker's psql) - ~100 lines of code (easy to understand) - Matches our existing bash script patterns - Simple schema_migrations tracking table - Perfect for linear schema evolution - Can evolve to Alembic/Flyway later if needed
Cons: - No rollback support (forward-only) - Manual SQL writing (no ORM abstraction) - Less battle-tested than Flyway/Liquibase
Verdict: ✅ Best fit for current requirements
Implementation Plan
Phase 1: Migration Infrastructure (This ADR)
1. Create schema_migrations table
CREATE TABLE IF NOT EXISTS public.schema_migrations (
version INTEGER PRIMARY KEY,
name TEXT NOT NULL,
applied_at TIMESTAMP NOT NULL DEFAULT NOW()
);
2. Create schema/migrations/ directory structure
schema/
├── init.sql # Keep for backward compat (calls migrate-db.sh)
└── migrations/
├── 001_baseline.sql # Current schema as of ADR-040
├── 002_add_embedding_config.sql
└── README.md # Migration conventions
3. Build migration runner: scripts/migrate-db.sh
#!/bin/bash
# Migration runner for PostgreSQL schema changes
#
# Usage:
# ./scripts/database/migrate-db.sh # Apply all pending migrations
# ./scripts/database/migrate-db.sh --dry-run # Show what would be applied
#
# Idempotent: Safe to run multiple times
set -e
# Check which migrations have been applied
# Apply pending migrations in order
# Record in schema_migrations table
4. Update schema/init.sql
#!/bin/bash
# Initialize database schema
# This script now delegates to the migration runner
./scripts/database/migrate-db.sh
5. Update docker-compose.yml (if needed) - Ensure init.sql still runs on container creation - No changes needed if init.sql just calls migrate-db.sh
Phase 2: Extract Baseline (Immediate)
Extract current schema as 001_baseline.sql:
# Capture current schema/init.sql as baseline migration
cp schema/init.sql schema/migrations/001_baseline.sql
Create 002_add_embedding_config.sql:
-- Migration: Add embedding configuration table
-- ADR: ADR-039 Local Embedding Service
-- Date: 2025-10-20
BEGIN;
-- embedding_config table already in baseline
-- This migration is a no-op for fresh installs
-- But allows existing DBs to add the table
CREATE TABLE IF NOT EXISTS kg_api.embedding_config (
-- ... (full table definition from ADR-039)
);
COMMIT;
Phase 3: Future Migrations
Workflow for adding schema changes:
- Create new migration file:
schema/migrations/003_description.sql - Write SQL changes (use BEGIN/COMMIT for safety)
- Test on fresh database:
./scripts/database/migrate-db.sh - Test on existing database:
./scripts/database/migrate-db.sh - Commit migration file to git
Periodically consolidate:
- When schema stabilizes (e.g., release milestones)
- Merge all migrations into new baseline: schema/migrations/stable_v2_baseline.sql
- Archive old migrations in schema/migrations/archived/
Consequences
Positive
✅ Safe schema evolution - Track which migrations have been applied - Idempotent migrations (safe to re-run) - Clear audit trail of schema changes
✅ Developer productivity
- ./scripts/database/migrate-db.sh works on fresh and existing databases
- No manual SQL coordination
- Git history shows schema evolution
✅ Simple and maintainable - ~100 lines of bash code - No external dependencies - Easy to debug and modify
✅ Compatible with current workflow - Docker container init still works - Existing databases can migrate forward - Backward compatible with init.sql approach
Negative
⚠️ No automatic rollback - Forward-only migrations - Manual rollback SQL required if needed - Mitigation: Use BEGIN/COMMIT and test thoroughly
⚠️ Manual SQL writing - No ORM abstraction for schema changes - Developer must write PostgreSQL SQL - Mitigation: We already do this, no change
⚠️ Linear migration path only - No branching/merging support (unlike Liquibase) - Works for our current development model - Mitigation: Can switch to Alembic/Flyway later if needed
Risks and Mitigation
Risk: Migration fails mid-execution
- Mitigation: Wrap migrations in BEGIN/COMMIT transactions
- Mitigation: Test on fresh database before production
Risk: Developer forgets to create migration - Mitigation: Code review checklist includes migration check - Mitigation: CI/CD could compare schema vs. migrations
Risk: Migration file numbering conflicts - Mitigation: Use timestamp prefixes if parallel development - Mitigation: Currently single developer, low risk
Migration File Conventions
Naming
{version}_{description}.sql
Version: 001, 002, 003, ... (zero-padded 3 digits)
Description: snake_case, descriptive (e.g., add_user_roles)
Examples:
001_baseline.sql
002_add_embedding_config.sql
003_add_concept_metadata.sql
010_consolidate_auth_tables.sql
Structure
-- Migration: {version}_{description}
-- Description: Brief explanation of changes
-- ADR: Link to related ADR (if applicable)
-- Date: YYYY-MM-DD
BEGIN;
-- Schema changes here
-- Use CREATE TABLE IF NOT EXISTS for safety
-- Use ALTER TABLE ADD COLUMN IF NOT EXISTS (PostgreSQL 9.6+)
-- Data migrations (if needed)
-- Insert default data
-- Update existing rows
COMMIT;
Best Practices
- Idempotent migrations
- Use
IF NOT EXISTS/IF EXISTS -
Check before altering
-
Transactional
- Wrap in
BEGIN/COMMIT -
All-or-nothing execution
-
Self-documenting
- Comment purpose and ADR reference
-
Explain complex changes
-
Tested
- Test on fresh database
-
Test on database with existing data
-
Atomic
- One migration = one logical change
- Don't combine unrelated changes
Comparison to Other Projects
Ruby on Rails: Uses ActiveRecord migrations with timestamps Django: Uses numbered migrations per app (0001, 0002, ...) Node.js (Knex): Uses timestamp prefixes (20250120_add_users.js) Flyway: Uses V1__, V2__ version prefixes
Our approach: Closest to Django's numbered migrations, adapted for PostgreSQL with bash runner.
Related ADRs
- ADR-039: Local Embedding Service (triggered need for migration system)
- ADR-016: Apache AGE Migration (major schema change that would have benefited from migrations)
References
- Tutorial: https://www.sheshbabu.com/posts/demystifying-postgres-schema-migrations/
- pg_migrate.sh: https://github.com/maxpoletaev/pg_migrate.sh (inspiration)
- shmig: https://github.com/mbucc/shmig (reference implementation)
- Flyway vs. Liquibase comparison: https://www.bytebase.com/blog/flyway-vs-liquibase/
Notes
Why not just version control init.sql? - Git history shows what changed, not what's applied - Developer with existing DB can't tell if they need to run a snippet - No atomic "apply this change" operation
When to upgrade to Alembic/Flyway? - When we need programmatic migrations (Python logic in migrations) - When we need branching/merging support (multiple parallel dev branches) - When we need automated rollback - Not needed for current development pace
Migration vs. Seed Data? - Migrations: Schema structure (tables, columns, constraints) - Seed data: Default/initial data (admin user, default config) - Migrations can include seed data if required for schema to work