This document describes the basics of how to set up the TraceBase Project repository in order to start developing/contributing.
Install Python (version 3.9.1 or later) and make sure it is in your PATH. Test
that the python command shows your new install:
$ python --version
Python 3.9.1
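The same version check can also be done from inside the interpreter; a minimal sketch (the minimum version here simply matches the 3.9.1 noted above):

```python
import sys

# Fail fast if the interpreter is older than the version this project expects.
MIN_VERSION = (3, 9, 1)
assert sys.version_info >= MIN_VERSION, (
    f"Python {'.'.join(map(str, MIN_VERSION))}+ required, "
    f"found {sys.version.split()[0]}"
)
print("Python version OK:", sys.version.split()[0])
```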
Install Postgres via the package installer from https://www.postgresql.org. Be
sure to make note of where it installs the psql command-line utility, so you
can add it to your PATH. For example, if you see:
Command Line Tools Installation Directory: /Library/PostgreSQL/13
then add this to your PATH:
/Library/PostgreSQL/13/bin
Configuration:
Username: postgres
Password: tracebase
Port: 5432
In the Postgres app interface, you can find where the postgresql.conf file is
located. Open it in a text editor and make sure these settings are uncommented
and correct:
client_encoding = 'UTF8'
default_transaction_isolation = 'read committed'
log_timezone = 'America/New_York'
Manually create the tracebase database in postgres:
createdb -U postgres tracebase
Create a tracebase postgres user (in a psql session):
> create user tracebase with encrypted password 'mypass';
> ALTER USER tracebase CREATEDB;
> grant all privileges on database tracebase to tracebase;
Clone the repository:
git clone https://github.com/Princeton-LSI-ResearchComputing/tracebase.git
cd tracebase
Create a virtual environment (from a bash shell) and activate it, for example:
python3 -m venv .venv
source .venv/bin/activate
Install Django and psycopg2 dependencies as well as linters and other
development-related tools. Use requirements/prod.txt for production
dependencies.
python -m pip install -U pip
python -m pip install -r requirements/dev.txt
Verify that Django installed correctly by checking its version:
python3 -m django --version
4.2.16
Create a new secret:
python -c "import secrets; print(secrets.token_urlsafe())"
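The one-liner above uses the standard-library secrets module. A slightly expanded sketch of the same idea, for reference:

```python
import secrets

# token_urlsafe() defaults to 32 bytes of entropy, encoded as a
# URL-safe base64 string (43 characters, padding stripped).
secret_key = secrets.token_urlsafe()
print(secret_key)
```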
Database and secret key information should not be stored directly in settings
that are published to the repository. We use environment variables to store
configuration data. This makes it possible to easily change between
environments of a deployed application (see The Twelve-Factor App). The .env
file you create here is pre-configured to be ignored by the repository, so do
not explicitly check it in.
Copy the TraceBase environment example:
cp TraceBase/.env.example TraceBase/.env
Update the .env file to reflect the new secret key and the database credentials you used when setting up Postgres.
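For illustration only, the filled-in file might look something like the fragment below. The variable names here are placeholders; use the ones actually defined in TraceBase/.env.example.

```
# Hypothetical .env contents -- the real variable names are defined in
# TraceBase/.env.example; these are illustrative placeholders.
SECRET_KEY=<output of the secrets command above>
DATABASE_NAME=tracebase
DATABASE_USER=tracebase
DATABASE_PASSWORD=mypass
DATABASE_HOST=localhost
DATABASE_PORT=5432
```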
Set up the project's postgres database:
python manage.py migrate
python manage.py createcachetable
To be able to access the admin page, run the following on the command line:
python manage.py createsuperuser
and supply your desired account credentials for testing.
Load the initial data types and formats, plus the example studies:
python manage.py loaddata data_types data_formats
python manage.py load_study --infile DataRepo/data/examples/compounds_tissues_treatments_lcprotocols/study.xlsx
python manage.py load_study --infile DataRepo/data/examples/13C_Valine_and_PI3Ki_in_flank_KPC_mice/study.xlsx
python manage.py load_study --infile DataRepo/data/examples/obob_fasted/study.xlsx
python manage.py load_study --infile DataRepo/data/examples/obob_fasted_ace_glycerol_3hb_citrate_eaa_fa_multiple_tracers/study.xlsx
python manage.py load_study --infile DataRepo/data/examples/obob_fasted_glc_lac_gln_ala_multiple_labels/study.xlsx
To run the development server in your sandbox, execute:
python manage.py runserver
Then go to this site in your web browser:
http://127.0.0.1:8000/
All pull requests must pass linting prior to being merged.
Currently, all pushes are linted using GitHub's Super-Linter. The configuration files for the most commonly used linters have been set up in the project root to facilitate linting on developers' machines.
Linting for this project runs automatically on GitHub when a PR is submitted, but this section describes how to lint your changes locally.
For the most commonly used linters (e.g. for Python, HTML, and Markdown files), it is recommended to install the linters locally and run them in your editor. Some linters that may be useful to install locally include:
- Code
- Python
- JavaScript
- HTML
- CSS
- Markdown
- Config
It is recommended to run Super-Linter (described below) routinely or automatically before submitting a PR, but if you want a quick check while developing, you can run these example linting commands on the command line, using each linter's config that we've set up for Super-Linter:
find . \( -type f -not -path '*/\.*' -not -path "*bootstrap*" \
-not -path "*__pycache__*" \) -exec jscpd {} \;
flake8 --config .flake8 --extend-exclude migrations,.venv .
pylint --rcfile .pylintrc --load-plugins pylint_django \
--django-settings-module TraceBase.settings -d E1101 \
TraceBase DataRepo *.py
black --exclude '\.git|__pycache__|migrations|\.venv' .
isort --sp .isort.cfg -c -s migrations -s .venv -s .git -s __pycache__ .
mypy --config-file .mypy.ini --disable-error-code annotation-unchecked .
find . \( ! -iname "*bootstrap*" -not -path '*/\.*' -iname "*.js" \) \
-exec standard --fix --verbose {} \;
htmlhint -c .htmlhintrc .
stylelint --config .stylelintrc.json --ip '**/bootstrap*' **/*.css
markdownlint --config .markdown-lint.yml .
textlint -c .textlintrc.json **/*.md
editorconfig-checker -v -exclude '__pycache__|\.DS_Store|\~\$.*' TraceBase DataRepo
Note, some of these linter installs can be rather finicky, so if you have trouble, consider running Super-Linter locally, as described below.
In addition to linting files as you write them, developers may wish to run Super-Linter on the entire repository locally. This is most easily accomplished using Docker. Create a script outside of the repository that runs Super-Linter via Docker and run it from the repository root directory. Example script:
#!/usr/bin/env sh
docker pull github/super-linter:slim-v6
docker run \
-e FILTER_REGEX_EXCLUDE="(\.pylintrc|migrations|static\/bootstrap.*)" \
-e LINTER_RULES_PATH="/" \
-e IGNORE_GITIGNORED_FILES=true \
-e RUN_LOCAL=true \
-v /full/path/to/tracebase/:/tmp/lint github/super-linter:slim-v6
Note: The options FILTER_REGEX_EXCLUDE, LINTER_RULES_PATH, and
IGNORE_GITIGNORED_FILES should match the settings in the GitHub Action in
.github/workflows/superlinter.yml.
All pull requests must include tests of the implemented changes before being
merged. Each app should contain either tests.py or a tests directory
containing multiple test scripts. Currently, all tests are implemented using
Django's TestCase framework.
See these resources for help implementing tests:
- Testing in Django (Part 1) - Best Practices and Examples
- Django Tutorial Part 10: Testing a Django web application
All pull requests must pass new and all previous tests, and pass a migration check before merging. Run the following locally before submitting a pull request:
python manage.py test
python manage.py makemigrations --check --dry-run
Any pull request that includes changes to the models must include an update to the migrations, and the resulting auto-generated migration scripts must be checked in.
Create the migration scripts:
python manage.py makemigrations
Check for unapplied migrations:
python manage.py showmigrations
Apply migrations to the postgres database:
python manage.py migrate
TraceBase has an ArchiveFile class that is used to store data files on the
file system. The files are stored locally using the MEDIA_ROOT and MEDIA_URL
settings. A FileField is used to store the files and to track the storage
location in the database.
Archived files are stored in
{MEDIA_ROOT}/archive_files/{YYYY-MM}/{DATA_TYPE}/{FILENAME}. Duplicate file
names are made unique by Django's Storage.save() method.
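The exact uniquing behavior belongs to Django; as a simplified illustration of the idea only (not Django's actual code), a name collision can be resolved by appending a random suffix before the extension and retrying:

```python
import os
import secrets

def get_available_name(existing: set[str], name: str) -> str:
    """Return a name not present in `existing`. Simplified sketch of the
    idea behind Django's Storage.get_available_name(): when the requested
    name is taken, append a short random suffix before the file extension
    and retry. Not Django's actual implementation."""
    root, ext = os.path.splitext(name)
    while name in existing:
        name = f"{root}_{secrets.token_hex(4)}{ext}"
    return name

# Example: a second save of "study.xlsx" gets a uniquified name.
taken = {"archive_files/2024-01/study.xlsx"}
print(get_available_name(taken, "archive_files/2024-01/study.xlsx"))
```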
When running tests, it is desirable that the stored files do not remain on the
file system after testing is complete. This is accomplished in TraceBase by
using a custom test runner, Tracebase/runner.py. The test runner changes the
MEDIA_ROOT and DEFAULT_FILE_STORAGE settings during test runs to use a
temporary location on local file storage.
Per the FileField.delete() documentation, when a model is deleted, related
files are not deleted. If you need to clean up orphaned files, you'll need to
handle it yourself (for instance, with a custom management command that can be
run manually or scheduled to run periodically via e.g. cron). See Management
command to list orphaned files in MEDIA_ROOT #718.