Applying Database Migrations in Production on Kubernetes

Unlike regular application code updates, database schema migrations in production require a maintenance window with downtime: we have not built infrastructure to switch seamlessly between schemas while the application is running.

Prerequisites

  • This procedure should only be followed for application updates that change the DB schema and include an Alembic migration. See Developing database migrations for information about creating migrations with Alembic.
  • A new Docker image containing the schema change must already be pushed to Docker Hub.
  • Check whether additional steps are needed to apply the migration, such as seeding new data into the DB.

Procedure

  1. Bring the deployment down:

    kubectl delete deployment keeper-deployment
    

    Watch for the pods to be deleted with kubectl get pods.
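    Rather than watching by hand, a small shell loop can block until the pods are gone. This is a sketch: it assumes the keeper pods carry an app=keeper label, which is hypothetical — substitute whatever label selector your deployment actually uses.

    ```shell
    # Block until no keeper pods remain. The app=keeper label is an assumption;
    # adjust -l to match your deployment's pod labels.
    wait_for_pods_gone() {
      while [ -n "$(kubectl get pods -l app=keeper -o name 2>/dev/null)" ]; do
        echo "waiting for pods to terminate..."
        sleep 2
      done
    }
    ```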

  2. Deploy the maintenance pod.

    First update keeper-mgmt-pod.yaml with the new uWSGI container’s image name. Look for the lines:

    - name: uwsgi
      image: "lsstsqre/ltd-keeper:latest"  # update this
    

    Then deploy the pod:

    kubectl create -f keeper-mgmt-pod.yaml
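    Before exec'ing into the pod in the next step, it is worth confirming that it has actually become Ready. One way (requires kubectl 1.11 or newer; otherwise watch kubectl get pods):

    ```shell
    # Block until the maintenance pod reports Ready, or fail after two minutes.
    kubectl wait --for=condition=Ready pod/keeper-mgmt --timeout=120s
    ```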
    
  3. Log into the maintenance pod and apply the migration:

    kubectl exec keeper-mgmt -c uwsgi -i -t -- /bin/bash
    

    To apply the migration:

    ./run.py db upgrade
    

    When the upgrade is complete, log out of the management pod’s shell:

    exit
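    The upgrade step can also be wrapped so it reports the revision change, using the run.py db current command mentioned below. This is an illustrative sketch: the CLI path is parameterized only so the function is testable; inside the pod it is simply ./run.py.

    ```shell
    # Apply pending migrations and report the before/after revision.
    apply_migration() {
      run="${1:-./run.py}"            # path to the app CLI (./run.py in the pod)
      before="$("$run" db current)"   # revision the database is at now
      "$run" db upgrade               # apply pending migrations
      after="$("$run" db current)"    # should now be the new head revision
      echo "migrated: ${before} -> ${after}"
    }
    ```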
    
  4. Delete the management pod:

    kubectl delete pod keeper-mgmt
    
  5. Deploy the application.

    First update the uWSGI container’s image name in keeper-deployment.yaml to match the one used by the maintenance pod, and deploy it:

    kubectl create -f keeper-deployment.yaml
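
    Before ending the maintenance window, confirm that the new deployment actually came up:

    ```shell
    # Wait for the rollout to finish, then inspect the pods.
    kubectl rollout status deployment/keeper-deployment
    kubectl get pods
    ```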
    

Troubleshooting

Unexpected branched state

It’s possible for Alembic to get into an unexpected branched state, producing an error message like the following during a run.py db upgrade:

alembic.util.exc.CommandError: Requested revision 1ba709663f26 overlaps with other requested revisions 0c0c70d73d4b

The run.py db heads, run.py db branches, and run.py db current commands may still report a normal, linear version history, so the only true validation is to inspect the alembic_version table in the database itself.

Following Connect to the Cloud SQL Instance and Create a keeper Database, log into the database and show the alembic_version table:

use keeper;
select * from alembic_version;

If more than one version row is present, then the table can be easily reset. First, drop the alembic_version table:

drop table alembic_version;

Then in the management pod, stamp the database version:

./run.py db stamp $VERSION

where $VERSION is the revision ID of the migration the database is actually at. This creates a new alembic_version table with a single row specifying the current version. Now the database upgrade can be retried.
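
The stamp-and-retry sequence can be sketched as a small function, run inside the management pod after the alembic_version table has been dropped via MySQL as shown above. The CLI path is parameterized only for illustration; inside the pod it is ./run.py.

```shell
# Recreate alembic_version at a known-good revision, then retry the upgrade.
reset_and_retry() {
  rev="$1"                 # known-good revision ID, e.g. from ./run.py db heads
  run="${2:-./run.py}"     # path to the app CLI (./run.py in the pod)
  "$run" db stamp "$rev"   # recreate alembic_version with a single row
  "$run" db upgrade        # retry the pending migration
}
```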