-
Notifications
You must be signed in to change notification settings - Fork 192
Modifying the database schema
This is a temporary section with instructions for developers to have them test the database migrations that will be released with v2.0. The instructions below hopefully make it as easy as possible to test this.
- Checkout the latest
develop
branch:git checkout develop && git pull
- Install latest dependencies:
pip install -U -e .[tests,pre-commit]
- Run
verdi status
: this will update your configuration to the latest schema version
-
Create a clone of the PostgreSQL database you want to migrate
- Login as the
postgres
user:sudo su - postgres
- Load the postgres program:
psql
- If it is already loaded in postgres, you can clone it in
psql
directly:CREATE DATABASE aiida_clone WITH TEMPLATE aiida_original_db OWNER aiida;
Make sure to change the names of the databases and the owner of course. IMPORTANT: if you are setting a new database owner, you'll need also to assign permissions for each table. Check here to know how to do it. - If the database is on another machine and you want to test the migration on your workstation.
- Go to the remote machine and dump the database:
pg_dump --no-privileges --no-owner -h localhost -d aiida_original_db -U aiida -W > aiida_original_db.psql
- Copy over the
aiida_original_db.psql
file to your workstation - Create a new database in
psql
:CREATE DATABASE aiida_clone OWNER aiida;
- Load the database dump:
psql -h localhost -d aiida_original_db -U aiida -W < aiida_original_db.psql
- Go to the remote machine and dump the database:
- Login as the
-
Check statistics of the database (this information should be kept for reporting):
- Note whether it is Django or SqlAlchemy. If you don't know, run
SELECT * FROM alembic_version;
inpsql
. If it returns a value, it is SqlAlchemy, if it errors withERROR: relation "alembic_version" does not exist
it is Django - Get database node count:
SELECT count(*) FROM db_dbnode;
- Get database size:
SELECT pg_size_pretty(pg_database_size('aiida_clone'));
- Get database revision:
- For SqlAlchemy:
SELECT * FROM alembic_version;
- For Django:
SELECT name FROM django_migrations WHERE app = 'db' ORDER BY id DESC LIMIT 1;
- For SqlAlchemy:
- Note whether it is Django or SqlAlchemy. If you don't know, run
-
Create a clone of the repository (Note: this is only necessary if your database revision is below a certain revision; the migrations above it will not affect the repository, including the repository migration itself, as it will leave the original repo intact and simply write the new disk object store in parallel.)
- Django: if you have revision 0027 or above, there is no need to clone the repo
- SqlAlchemy: if your revision is in the following list, there is no need to clone the repo:
['1de112340b16', '1de112340b17', '1de112340b18', '34a831f4286d', '535039300e4a', '1feaea71bd5a', '7536a82b2cc4', '0edcdd5a30f0', 'bf591f31dd12', '118349c10896', '91b573400be5', '7b38a9e783e7', 'e734dd5e50d7', 'e797afa09270', '26d561acd560', '07fac78e6209', 'de2eaf6978b4', '1830c8430131', '1b8ed3425af9', '3d6190594e19', '5a49629f0d45', '5ddd24e52864', 'd254fdfed416', '61fc0913fae9', 'ce56d84bcc35']
-
Create a profile with the correct database and repository configured
- Easiest is to open
config.json
and clone an entry and simply update the name of the database and the location of the repository - For the
storage.backend
key you should simply put"psql_dos"
regardless whether the database has a Django or SqlAlchemy schema. The migrations will automatically detect this from the database itself. -
IMPORTANT: if the database has an old schema version (see the point above) you should have made a clone of the repository and you should make sure that the
storage.config.repository_uri
key points to the correct path
- Easiest is to open
-
Before running any migration, you should be able to run
verdi -p PROFILE status
andverdi -p PROFILE storage version
- Make sure the daemon is not running
- Run
time verdi -p aiida-profile storage migrate -f
. IMPORTANT do not forget thetime
in front. We would like to gather this information to get an idea of how long the migrations typically take. - Copy the log messages from the migrations printed to stdout.
Note: You can use verdi devel run-sql "SQL_TEXT"
, to run any SQL against the profile's database
- Rerun the statistics database size and node count in
psql
:SELECT count(*) FROM db_dbnode;
SELECT pg_size_pretty(pg_database_size('aiida_clone'));
- Run
verdi status
and check that storage connection is green - Run
verdi storage info --statistics
and check the outcome is as expected - Open
verdi shell
and do some tests: queries, opening repository files of some nodes etc.
For each database for which you test the migration, please report the following:
- Database backend (Django or SqlAlchemy)
- Starting revision
- Node count before migration
- Node count after migration
- Database size before migration
- Database size after migration
- Time taken for the actual migration
- Messages printed to stdout by the migrations
- Any errors you encountered or problems you noticed afterwards when manually inspecting the data
- Output of the following command:
verdi devel run-sql "SELECT pt.tablename AS TableName, t.indexname AS IndexName, pc.reltuples, pg_size_pretty(pg_relation_size(quote_ident(pt.tablename)::text)), pg_size_pretty(pg_relation_size(quote_ident(t.indexrelname)::text)), t.idx_scan FROM pg_tables AS pt LEFT OUTER JOIN pg_class AS pc ON pt.tablename=pc.relname LEFT OUTER JOIN (SELECT pc.relname AS TableName, pc2.relname AS IndexName, psai.idx_scan, psai.indexrelname FROM pg_index AS pi JOIN pg_class AS pc ON pc.oid = pi.indrelid JOIN pg_class AS pc2 ON pc2.oid = pi.indexrelid JOIN pg_stat_all_indexes AS psai ON pi.indexrelid = psai.indexrelid )AS T ON pt.tablename = T.TableName WHERE pt.schemaname='public';"
The Django database schema can be found in aiida.backends.djsite.db.models <https://github.com/aiidateam/aiida-core/tree/develop/aiida/backends/djsite/db/models.py>
_.
If you need to change the database schema follow these steps:
-
Make all the necessary changes to
aiida.backends.djsite.db.models <https://github.com/aiidateam/aiida-core/tree/develop/aiida/backends/djsite/db/models.py>
_. -
Create a new migration file by running::
$ python aiida/backends/djsite/manage.py makemigrations
This will create the migration file in
aiida/backends/djsite/db/migrations
whose name begins with a number followed by some description. If the description is not appropriate then change it to something better but retain the number. -
Open the generated file and make the following changes:
.. code-block:: python
from aiida.backends.djsite.db.migrations import upgrade_schema_version ... REVISION = # choose an appropriate version number (hint: higher than the last migration!) DOWN_REVISION = # the revision number of the previous migration ... class Migration(migrations.Migration): ... operations = [ ... upgrade_schema_version(REVISION, DOWN_REVISION) ]
-
The migration file now contains some migrations steps that were generated automatically. Please make sure that they are correct. Also, if you want to add some changes that affect the content of the database -- you should do it "manually" by adding some sql commands that will run directly on your database:
.. code-block:: python
forward_sql = [ """UPDATE db_dbgroup SET type_string = 'auto.import' WHERE type_string = 'aiida.import';""", """UPDATE db_dbgroup SET type_string = 'auto.run' WHERE type_string = 'autogroup.run';""", ] reverse_sql = [ """UPDATE db_dbgroup SET type_string = 'aiida.import' WHERE type_string = 'auto.import';""", """UPDATE db_dbgroup SET type_string = 'autogroup.run' WHERE type_string = 'auto.run';""", ] ... operations = [ ... migrations.RunSQL( sql='\n'.join(forward_sql), reverse_sql='\n'.join(reverse_sql)), upgrade_schema_version(REVISION, DOWN_REVISION), ... ]
As you can see here, you should not only provide the SQL commands to upgrade your database, but also the commands to revert these changes in case you want to perform a downgrade (see:
sql=forward_sql
,reverse_sql=reverse_sql
) -
Change the
LATEST_MIGRATION
variable inaiida/backends/djsite/db/migrations/__init__.py
to the name of your migration file:.. code-block:: python
LATEST_MIGRATION = '0003_my_db_update'
This allows AiiDA to get the version number from your migration and make sure the database and the code are in sync.
-
Migrate your database to the new version using
verdi
and specifying the correct profile::$ verdi -p {profile} database migrate
In case you want to (and, most probably, you should) test the downgrade operation, please check the list of available versions of the database::
$ python aiida/backends/djsite/manage.py showmigrations db
The output will look something like the following::
db [X] 0001_initial [X] 0002_db_state_change [X] 0003_add_link_type [X] 0004_add_daemon_and_uuid_indices [X] 0005_add_cmtime_indices [X] 0006_delete_dbpath [X] 0007_update_linktypes [X] 0008_code_hidden_to_extra [X] 0009_base_data_plugin_type_string [X] 0010_process_type [X] 0011_delete_kombu_tables [X] 0012_drop_dblock [X] 0013_django_1_8 [X] 0014_add_node_uuid_unique_constraint [X] 0015_invalidating_node_hash [X] 0016_code_sub_class_of_data [X] 0017_drop_dbcalcstate [X] 0018_django_1_11
Choose the previous migration step and migrate to it::
$ python aiida/backends/djsite/manage.py migrate db 0017_drop_dbcalcstate
Check that both upgrade and downgrade changes are successful and if yes, go to the next step.
-
Add tests for your migrations to the
aiida-core/aiida/backends/djsite/db/subtests/migrations
module.
| Note | Such a test can only be applied to the migration of the database content. | For example, you can not test modifications of the database column names.
The SQLAlchemy database schema can be found in aiida.backends.sqlalchemy.models <https://github.com/aiidateam/aiida-core/tree/develop/aiida/backends/sqlalchemy/models>
_.
If you need to change the database schema follow these steps:
-
Make all the necessary changes to the model than you would like to modify located in the
aiida/backends/sqlalchemy/models
directory. -
Create a new migration file by going to
aiida/backends/sqlalchemy
and executing::$ python aiida/backends/sqlalchemy/manage.py revision "This is the description for the next revision"
This will create a new migration file in
aiida/backends/sqlalchemy/migrations/versions
whose names begins with an automatically generated hash and the provided message for this new migration. Modify the migration message to accurately describe the purpose of the migration. -
Have a look at the generated migration file and ensure the migration is correct. The file should contain automatically generated hashes that point to the previous and to the current revision:
.. code-block:: python
revision = 'e72ad251bcdb' down_revision = 'b8b23ddefad4'
Also
upgrade()
anddowngrade()
function definitions should be present in the file:.. code-block:: python
def upgrade(): # some upgrade operations def downgrade(): # some downgrade operations
If you want to add some changes that affect the content of the database -- you should do it "manually" by adding some SQL commands that will run directly on your database. See the following example and adapt it for your needs:
.. code-block:: python
from sqlalchemy.sql import text forward_sql = [ """UPDATE db_dbgroup SET type_string = 'auto.import' WHERE type_string = 'aiida.import';""", """UPDATE db_dbgroup SET type_string = 'auto.run' WHERE type_string = 'autogroup.run';""", ] reverse_sql = [ """UPDATE db_dbgroup SET type_string = 'aiida.import' WHERE type_string = 'auto.import';""", """UPDATE db_dbgroup SET type_string = 'autogroup.run' WHERE type_string = 'auto.run';""", ] def upgrade(): conn = op.get_bind() statement = text('\n'.join(forward_sql)) conn.execute(statement) def downgrade(): conn = op.get_bind() statement = text('\n'.join(reverse_sql)) conn.execute(statement)
If you want to learn more about the migration operations, you can have a look at the
Alembic documentation <https://alembic.sqlalchemy.org/en/latest/>
_. -
Migrate your database to the new version using
verdi
and specifying the correct profile::$ verdi -p {profile} database migrate
-
Add tests for your migrations to
aiida-core/aiida/backends/sqlalchemy/tests/test_migrations.py
The alembic_manage.py provides several options to control your SQLAlchemy migrations. By executing::
$ python aiida/backends/sqlalchemy/manage.py --help
you will get a full list of the available arguments that you can pass and commands. Briefly, the available commands are:
- upgrade: Upgrade to the later version.
- downgrade: Downgrade the version of your database.
- history: Lists the available migrations in chronological order.
- current: Displays the current version of the database.
- revision: Creates a new migration file based on the model changes.
Alembic migrations should work automatically and migrate your database to the latest version. However, if you were using SQLAlchemy before we introduced Alembic, you may get a message like to following during the first migration::
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation
"db_dbuser" already exists [SQL: '\nCREATE TABLE db_dbuser (\n\tid SERIAL
NOT NULL, \n\temail VARCHAR(254), \n\tpassword VARCHAR(128),
\n\tis_superuser BOOLEAN NOT NULL, \n\tfirst_name VARCHAR(254),
\n\tlast_name VARCHAR(254), \n\tinstitution VARCHAR(254), \n\tis_staff
BOOLEAN, \n\tis_active BOOLEAN, \n\tlast_login TIMESTAMP WITH TIME ZONE,
\n\tdate_joined TIMESTAMP WITH TIME ZONE, \n\tCONSTRAINT db_dbuser_pkey
PRIMARY KEY (id)\n)\n\n']
In this case, you should manually create the Alembic table in your database and add a line with the database version number. To do so, use psql to connect to the desired database::
$ psql aiidadb_sqla
where you should replace aiidadb_sqla
with the name of the database that you would like to modify.
Then, execute the following commands:
.. code-block:: sql
CREATE TABLE alembic_version (version_num character varying(32) not null, PRIMARY KEY(version_num));
INSERT INTO alembic_version VALUES ('e15ef2630a1b');
GRANT ALL ON alembic_version TO aiida;