Database Migrations

This guide covers everything you need to know about managing database migrations with Alembic.

Overview

The 2Sigma Backend uses Alembic for database schema migrations. Alembic tracks schema changes over time and allows you to:

  • Version control your database schema
  • Apply changes incrementally
  • Roll back changes if needed
  • Share schema changes with the team

Migration Basics

Current Migration Status

Check which migrations have been applied:

alembic current

View migration history:

alembic history --verbose

Applying Migrations

Apply all pending migrations:

alembic upgrade head

Apply up to a specific migration:

alembic upgrade abc123  # Use revision ID

Apply one migration forward:

alembic upgrade +1

Rolling Back Migrations

Roll back one migration:

alembic downgrade -1

Roll back to a specific migration:

alembic downgrade abc123

Roll back all migrations (WARNING: drops all data):

alembic downgrade base

Creating New Migrations

Auto-generating Migrations

Alembic can detect model changes and generate migrations automatically:

# 1. Make changes to your models in app/models/
# 2. Generate migration
alembic revision --autogenerate -m "add user profile fields"

# 3. Review the generated file in alembic/versions/
# 4. Edit if needed (see below)
# 5. Apply the migration
alembic upgrade head
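
Autogenerate works by comparing the live database schema against the SQLAlchemy metadata that alembic/env.py exposes as target_metadata. A minimal sketch of that wiring, assuming the models share a declarative Base importable from app.models (the module path is an assumption, not necessarily this project's layout):

# alembic/env.py (excerpt; module path is illustrative)
from app.models import Base  # assumed declarative Base that all models inherit from

# Alembic diffs this metadata against the database during --autogenerate
target_metadata = Base.metadata

If a model module is never imported by env.py, its tables will not appear in the diff.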

Manual Migrations

For complex changes, create an empty migration:

alembic revision -m "add custom indexes"

Edit the generated file in alembic/versions/:

def upgrade() -> None:
    op.create_index(
        'ix_users_email_active',
        'users',
        ['email', 'is_active'],
        unique=False
    )

def downgrade() -> None:
    op.drop_index('ix_users_email_active', table_name='users')

Migration Best Practices

Always Review Auto-Generated Migrations

Alembic's auto-generation is powerful but not perfect. Always review:

# After creating migration
cat alembic/versions/XXXXX_description.py

Common issues to check:

  • Dropping columns: Ensure you actually want to delete data
  • Renaming operations: Alembic may detect a rename as a drop + add pair (see the example after this list)
  • Type changes: Verify data compatibility
  • Default values: Ensure they're appropriate for existing rows
  • Nullable changes: Check if existing data allows NOT NULL
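
For example, a renamed column is often auto-generated as a drop followed by an add, which would discard that column's data. A hand-edited fix, shown on a hypothetical users.name to full_name rename:

def upgrade() -> None:
    # Auto-generated version (data-destructive):
    # op.drop_column('users', 'name')
    # op.add_column('users', sa.Column('full_name', sa.String(), nullable=True))

    # Hand-edited to preserve the data:
    op.alter_column('users', 'name', new_column_name='full_name')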

Use Descriptive Names

# Good
alembic revision --autogenerate -m "add user streak tracking"
alembic revision --autogenerate -m "add chat session sharing"

# Bad
alembic revision --autogenerate -m "update"
alembic revision --autogenerate -m "changes"

Test Migrations in Both Directions

Always test upgrade AND downgrade:

# Apply migration
alembic upgrade head

# Test rollback
alembic downgrade -1

# Re-apply
alembic upgrade head
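
To automate this round trip, you can drive Alembic from a test through its Python API. A minimal sketch, assuming alembic.ini sits at the project root and DATABASE_URL points at a disposable database:

# test_migrations.py -- illustrative round-trip check, not part of the repo
from alembic import command
from alembic.config import Config

def test_migrations_roundtrip():
    cfg = Config("alembic.ini")   # assumes alembic.ini at the project root
    command.upgrade(cfg, "head")  # apply everything
    command.downgrade(cfg, "-1")  # roll back the newest migration
    command.upgrade(cfg, "head")  # confirm it re-applies cleanly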

Handle Data Migrations Carefully

When migrating data, use batch operations:

from alembic import op
import sqlalchemy as sa

def upgrade() -> None:
    # Add column with default
    op.add_column('users', sa.Column('locale', sa.String(), nullable=True))

    # Populate existing rows
    op.execute("UPDATE users SET locale = 'en' WHERE locale IS NULL")

    # Make non-nullable
    op.alter_column('users', 'locale', nullable=False)

def downgrade() -> None:
    op.drop_column('users', 'locale')
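
As an alternative to raw SQL, you can define a lightweight table object inside the migration (rather than importing application models, which may drift over time). A sketch of the same locale backfill in that style; the column list is illustrative:

import sqlalchemy as sa
from alembic import op

def upgrade() -> None:
    # Migration-local view of the users table (only the columns this migration touches)
    users = sa.table('users', sa.column('locale', sa.String()))

    op.add_column('users', sa.Column('locale', sa.String(), nullable=True))
    op.execute(users.update().where(users.c.locale.is_(None)).values(locale='en'))
    op.alter_column('users', 'locale', nullable=False)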

Use Batch Operations for Large Tables

For tables with millions of rows:

def upgrade() -> None:
    with op.batch_alter_table('large_table') as batch_op:
        batch_op.add_column(sa.Column('new_field', sa.String()))
        batch_op.create_index('ix_new_field', ['new_field'])

Common Migration Scenarios

Adding a Column

def upgrade() -> None:
    op.add_column('users',
        sa.Column('phone_number', sa.String(20), nullable=True)
    )

def downgrade() -> None:
    op.drop_column('users', 'phone_number')

Adding a Column with Default

def upgrade() -> None:
    op.add_column('users',
        sa.Column('is_verified', sa.Boolean(), 
                  nullable=False, server_default='false')
    )

def downgrade() -> None:
    op.drop_column('users', 'is_verified')

Renaming a Column

def upgrade() -> None:
    op.alter_column('users', 'name', new_column_name='full_name')

def downgrade() -> None:
    op.alter_column('users', 'full_name', new_column_name='name')

Adding a Foreign Key

def upgrade() -> None:
    op.add_column('posts', sa.Column('user_id', sa.Integer(), nullable=False))
    op.create_foreign_key(
        'fk_posts_user_id',
        'posts', 'users',
        ['user_id'], ['id'],
        ondelete='CASCADE'
    )

def downgrade() -> None:
    op.drop_constraint('fk_posts_user_id', 'posts', type_='foreignkey')
    op.drop_column('posts', 'user_id')

Adding an Index

def upgrade() -> None:
    op.create_index('ix_users_email', 'users', ['email'], unique=True)

def downgrade() -> None:
    op.drop_index('ix_users_email', table_name='users')

Adding a Check Constraint

def upgrade() -> None:
    op.create_check_constraint(
        'ck_users_age_positive',
        'users',
        'age >= 0'
    )

def downgrade() -> None:
    op.drop_constraint('ck_users_age_positive', 'users', type_='check')

Creating a New Table

def upgrade() -> None:
    op.create_table(
        'notifications',
        sa.Column('id', sa.Integer(), primary_key=True),
        sa.Column('user_id', sa.Integer(), nullable=False),
        sa.Column('message', sa.Text(), nullable=False),
        sa.Column('read', sa.Boolean(), server_default=sa.text('false')),
        sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.func.now()),
        sa.ForeignKeyConstraint(['user_id'], ['users.id'], ondelete='CASCADE')
    )

def downgrade() -> None:
    op.drop_table('notifications')

Adding an Enum

from alembic import op
import sqlalchemy as sa

def upgrade() -> None:
    # Create enum type
    status_enum = sa.Enum('DRAFT', 'PUBLISHED', 'ARCHIVED', name='post_status')
    status_enum.create(op.get_bind())

    # Add column using enum
    op.add_column('posts',
        sa.Column('status', status_enum, nullable=False, server_default='DRAFT')
    )

def downgrade() -> None:
    op.drop_column('posts', 'status')
    sa.Enum(name='post_status').drop(op.get_bind())

Handling Migration Conflicts

Merge Conflicts

When multiple developers create migrations:

# You may see multiple head revisions
alembic heads

# Create a merge migration
alembic merge -m "merge migrations" heads
alembic upgrade head
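
The generated merge revision normally has an empty body; its only job is to give the two branches a common descendant by listing both heads in down_revision. A sketch with illustrative revision IDs:

"""merge migrations

Revision ID: 789merge
Revises: abc123, def456
"""
revision = '789merge'
down_revision = ('abc123', 'def456')  # the two heads being merged
branch_labels = None
depends_on = None

def upgrade() -> None:
    pass

def downgrade() -> None:
    pass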

Reset Migration History (Development Only)

If migrations are messy in development:

# WARNING: This drops all data!

# 1. Drop database
dropdb -U postgres ai_tutor_dev
createdb -U postgres ai_tutor_dev

# 2. Delete migration files (keep __init__.py)
find alembic/versions -name '*.py' ! -name '__init__.py' -delete

# 3. Create fresh migration from models
alembic revision --autogenerate -m "initial schema"

# 4. Apply
alembic upgrade head

Production Warning

Never reset migration history in production! Always use forward migrations.

Debugging Migration Issues

View SQL Without Applying

alembic upgrade head --sql > migration.sql

Review the SQL before applying it.

Stamp Current State

If migrations and database are out of sync:

# Mark database as being at a specific revision without running migrations
alembic stamp head

Use this when manually applying changes or fixing broken state.

Revision ID Not Found

If you get "Can't locate revision":

# Check what's in the database
psql -U ai_tutor_user -d ai_tutor_dev -c "SELECT * FROM alembic_version;"

# Check available revisions
alembic history

# Stamp to correct revision
alembic stamp <correct_revision_id>

Migration File Structure

Understanding a migration file:

"""add user profile

Revision ID: abc123def456
Revises: previous_revision_id
Create Date: 2026-01-20 10:30:00.000000
"""
from alembic import op
import sqlalchemy as sa

# revision identifiers
revision = 'abc123def456'
down_revision = 'previous_revision_id'  # Previous migration
branch_labels = None
depends_on = None

def upgrade() -> None:
    """Apply the migration"""
    # Your schema changes here
    pass

def downgrade() -> None:
    """Revert the migration"""
    # Undo the changes
    pass

Environment-Specific Migrations

Testing Migrations

# Create test database
createdb ai_tutor_test

# Update DATABASE_URL in .env temporarily
DATABASE_URL=postgresql+asyncpg://user:pass@localhost/ai_tutor_test

# Run migrations
alembic upgrade head

# Clean up
dropdb ai_tutor_test

Production Migrations

Before deploying:

  1. Backup the database:

    pg_dump -U postgres -d ai_tutor_prod > backup_$(date +%Y%m%d).sql
    

  2. Test on staging:

    # Restore production data to staging
    psql -U postgres -d ai_tutor_staging -f backup.sql
    
    # Apply migrations
    alembic upgrade head
    

  3. Deploy during low-traffic window

  4. Monitor for errors

Advanced Topics

Branched Migrations

Create a branch for experimental features:

alembic revision -m "experimental feature" --branch-label=experimental

Depending on Multiple Revisions

# In migration file
depends_on = ('revision1', 'revision2')

Offline SQL Generation

Generate SQL scripts for DBA review:

alembic upgrade head --sql > deploy_migration.sql

The DBA can review and apply manually.

Troubleshooting Checklist

  • Is PostgreSQL running? (pg_isready)
  • Is DATABASE_URL correct in .env?
  • Have you reviewed the auto-generated migration?
  • Did you test both upgrade and downgrade?
  • Are there merge conflicts? (alembic heads)
  • Is the database state in sync? (alembic current)

Next Steps