data submission workflow
Details of which scripts need to be called and which tables each script needs to update.
- pip install ihm
- pip install biopython
- yum install cmake
- pip install mmcif
- pip install rcsb.utils.io
- pip install rcsb.utils.chemref
- pip install rcsb.utils.ec
- pip install rcsb.utils.seq
- pip install rcsb.utils.struct
- pip install rcsb.utils.taxonomy
- pip install rcsb.utils.multiproc
- pip install rcsb.utils.validation
- pip install rcsb.utils.config
Note: The above packages can be installed from PyPI. In addition, the latest version of the py-rcsb_db.tar.gz file, which contains the rcsb/db directory already configured to run the required scripts, is available on the salilab server (managed by Arthur).
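As a quick sanity check after installation, the following minimal sketch reports which of the Python packages listed above are not yet installed. It assumes the distribution names match the `pip install` names in the list; `cmake` is a system package and is not checked here.

```python
from importlib import metadata

# Distribution names copied from the install list above.
# cmake is installed through yum and is therefore not included.
REQUIRED = [
    "ihm", "biopython", "mmcif",
    "rcsb.utils.io", "rcsb.utils.chemref", "rcsb.utils.ec", "rcsb.utils.seq",
    "rcsb.utils.struct", "rcsb.utils.taxonomy", "rcsb.utils.multiproc",
    "rcsb.utils.validation", "rcsb.utils.config",
]


def missing_packages(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_packages(REQUIRED):
        print(f"missing: {name}")
```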
- Follow the instructions from RCSB Software Tools (only the first two steps, download and build), OR
- Install from source:
git clone --recurse-submodules https://github.com/rcsb/cpp-dict-pack.git
then
cd cpp-dict-pack
mkdir build
cd build
cmake .. -DMINIMAL_DICTS=ON
make
# This process generates a bin folder under build
Deploying the IHMValidation pipeline requires several actions:
- Download the pre-built binary image with third-party dependencies
- Pull the IHMValidation code from the GitHub repo
- Create the necessary directory structure
The exact commands are available in the IHMValidation deployment script and have already been incorporated into the dev and prod deployment scripts.
- Convert partial mmCIF (user uploaded file) to mmCIF using python-ihm:
# From the scripts/make-mmCIF directory run:
python3 make-mmcif.py input.cif
Note: This package is used to produce an mmCIF file that can be converted to JSON and loaded into ERMrest. It is not used to create mmCIF in the submission workflow.
Requirements for this step:
- Biopython
- make-mmcif.py (provided by Brinda)
- Input CIF file (e.g., input.cif) uploaded by the user
- Copy output.cif from the previous step to py-rcsb_db/rcsb/db/tests-validate/test-output/ihm-files
- Convert mmCIF to JSON using py-rcsb_db:
# From the scripts/make-json/py-rcsb_db directory run:
python3 rcsb/db/tests-validate/testSchemaDataPrepValidate-ihm.py
Note: The output JSON files are located in rcsb/db/tests-validate/test-output
Requirements for this step:
- Brinda will provide the following files, which need to be properly installed:
- a Python script, i.e., rcsb/db/tests-validate/testSchemaDataPrepValidate-ihm.py
- a YAML file, i.e., rcsb/db/config/exdb-config-example-ihm.yml
- a JSON file, i.e., CACHE/data_type_and_coverage/scan-ihm_dev-type-map.json
- the IHM dictionary file, i.e., ihm-extension.dic in CACHE/dictionaries
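The conversion steps above (partial mmCIF to mmCIF, copy, mmCIF to JSON) can be chained in a small driver. This is only a sketch: `REPO_ROOT` is a hypothetical checkout location, and it assumes that make-mmcif.py writes output.cif into the directory it is run from.

```python
import shutil
import subprocess
from pathlib import Path

# REPO_ROOT is a hypothetical checkout location; adjust as needed.
# The sub-paths are taken from the steps above.
REPO_ROOT = Path(".")
MAKE_MMCIF_DIR = REPO_ROOT / "scripts" / "make-mmCIF"
PY_RCSB_DB_DIR = REPO_ROOT / "scripts" / "make-json" / "py-rcsb_db"
IHM_FILES_DIR = (
    PY_RCSB_DB_DIR / "rcsb" / "db" / "tests-validate" / "test-output" / "ihm-files"
)


def make_mmcif_command(input_cif):
    """Command for converting the partial mmCIF to mmCIF."""
    return ["python3", "make-mmcif.py", str(input_cif)]


def make_json_command():
    """Command for converting mmCIF to JSON."""
    return ["python3", "rcsb/db/tests-validate/testSchemaDataPrepValidate-ihm.py"]


def run_conversion(input_cif):
    # Resolve first, because the subprocess runs in a different directory.
    input_cif = Path(input_cif).resolve()
    # Convert the user-uploaded file.
    subprocess.run(make_mmcif_command(input_cif), cwd=MAKE_MMCIF_DIR, check=True)
    # Copy the result (assumed to be written to the working directory).
    IHM_FILES_DIR.mkdir(parents=True, exist_ok=True)
    shutil.copy(MAKE_MMCIF_DIR / "output.cif", IHM_FILES_DIR / "output.cif")
    # Convert mmCIF to JSON.
    subprocess.run(make_json_command(), cwd=PY_RCSB_DB_DIR, check=True)
```

With `check=True`, a failing step raises `subprocess.CalledProcessError`, so later steps do not run on stale output.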
- Use the JSON file to populate the following tables:
- struct (editable)
- entity (editable)
- entity_poly (not editable)
- entity_poly_seq (not editable)
- pdbx_poly_seq_scheme (not editable)
- chem_comp (not editable)
- atom_type (not editable)
- struct_asym (not editable)
- ihm_entity_poly_segment (editable)
- ihm_struct_assembly (editable)
- ihm_struct_assembly_details (editable)
- ihm_model_representation (editable)
- ihm_model_representation_details (editable)
- ihm_modeling_protocol (editable)
- ihm_model_list (not editable)
- ihm_model_group (editable)
- ihm_model_group_link (editable)
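The editability flags above can be kept in one lookup table so that loading code treats editable and non-editable tables differently. The sketch below assumes the JSON content has already been loaded into a mapping from table name to rows; the actual layout of the py-rcsb_db output may differ.

```python
# Table names and editability flags copied from the list above.
TABLE_EDITABLE = {
    "struct": True,
    "entity": True,
    "entity_poly": False,
    "entity_poly_seq": False,
    "pdbx_poly_seq_scheme": False,
    "chem_comp": False,
    "atom_type": False,
    "struct_asym": False,
    "ihm_entity_poly_segment": True,
    "ihm_struct_assembly": True,
    "ihm_struct_assembly_details": True,
    "ihm_model_representation": True,
    "ihm_model_representation_details": True,
    "ihm_modeling_protocol": True,
    "ihm_model_list": False,
    "ihm_model_group": True,
    "ihm_model_group_link": True,
}


def split_by_editability(document):
    """Split a {table_name: rows} mapping into (editable, read_only).

    Tables that are not in TABLE_EDITABLE are skipped.
    """
    editable, read_only = {}, {}
    for table, rows in document.items():
        if table not in TABLE_EDITABLE:
            continue
        target = editable if TABLE_EDITABLE[table] else read_only
        target[table] = rows
    return editable, read_only
```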
- Check out a file from the Entry_Related_File table that hasn't been processed yet.
- Retrieve the file from hatrac.
- Populate the file's corresponding table (using the File_Type) with the file content. Make sure that each row gets a foreign key to its Entry_Related_File record.
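The three steps above can be sketched independently of the storage layer. In the sketch below, the record keys `RID` and `File_URL`, the foreign-key column name, and the `fetch_rows` / `insert_rows` callables are all hypothetical placeholders for the real catalog and hatrac access code; only `File_Type` comes from the description above.

```python
def attach_file_reference(rows, file_rid, fk_column="Entry_Related_File"):
    """Return copies of rows, each with a foreign-key column set to file_rid.

    fk_column is a hypothetical column name.
    """
    return [{**row, fk_column: file_rid} for row in rows]


def process_next_file(pending_files, fetch_rows, insert_rows):
    """Process the first unprocessed file record, if there is one.

    pending_files: list of dicts with keys "RID", "File_Type", "File_URL"
                   (RID and File_URL are assumed names).
    fetch_rows:    callable(url) -> list of row dicts parsed from the file.
    insert_rows:   callable(table_name, rows) that writes rows to a table.
    Returns the RID of the processed record, or None if nothing is pending.
    """
    if not pending_files:
        return None
    record = pending_files[0]
    rows = fetch_rows(record["File_URL"])
    insert_rows(record["File_Type"], attach_file_reference(rows, record["RID"]))
    return record["RID"]
```

Passing the access functions in as arguments keeps the foreign-key logic testable without a running catalog.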
- Get the mmCIF dictionary software suite from the RCSB software tools website.
- Follow steps 1 and 2 in the instructions for installation.
- The serialized sdb file (mmcif_ihm_vx.xx.sdb) can be obtained from the IHM-dictionary Git repository. Brinda will provide the version that needs to be used, since the Deriva data model is a few versions behind the current dictionary version.
- Execute the command for validating the mmCIF file (step 4):
./bin/CifCheck -f mmCIF_filename -dictSdb sdb_filename
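When the check is driven from a script, the same command can be assembled and run from Python. This is a minimal sketch: the function names are illustrative, and the working directory is assumed to be the build directory that contains bin/CifCheck.

```python
import subprocess


def cifcheck_command(mmcif_path, sdb_path, executable="./bin/CifCheck"):
    """Build the CifCheck argument list shown above."""
    return [executable, "-f", str(mmcif_path), "-dictSdb", str(sdb_path)]


def run_cifcheck(mmcif_path, sdb_path, cwd="."):
    """Run CifCheck from cwd and return the completed process.

    stdout and stderr are captured as text for the caller to inspect.
    """
    return subprocess.run(
        cifcheck_command(mmcif_path, sdb_path),
        cwd=cwd,
        capture_output=True,
        text=True,
    )
```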
To generate a validation report, run the following command as the pdbihm user from the /mnt/vdb1/pdbihm folder:
singularity exec --pid --bind IHMValidation/:/opt/IHMValidation,input:/ihmv/input,output:/ihmv/output,cache:/ihmv/cache ihmv_20231222.sif /opt/IHMValidation/ihm_validation/ihm_validator.py --output-root /ihmv/output --cache-root /ihmv/cache --force -f input/mmCIF_filename
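Because the command above is long, it can help to assemble it from its parts. The sketch below rebuilds exactly the same argument list in Python; the function names are illustrative, and it still has to be run as the pdbihm user.

```python
import subprocess

# Host path -> container path, copied from the --bind option above.
BINDS = [
    ("IHMValidation/", "/opt/IHMValidation"),
    ("input", "/ihmv/input"),
    ("output", "/ihmv/output"),
    ("cache", "/ihmv/cache"),
]


def validation_command(mmcif_filename, image="ihmv_20231222.sif"):
    """Build the singularity command for one file in the input folder."""
    bind_arg = ",".join(f"{src}:{dst}" for src, dst in BINDS)
    return [
        "singularity", "exec", "--pid", "--bind", bind_arg, image,
        "/opt/IHMValidation/ihm_validation/ihm_validator.py",
        "--output-root", "/ihmv/output",
        "--cache-root", "/ihmv/cache",
        "--force",
        "-f", f"input/{mmcif_filename}",
    ]


def run_validation(mmcif_filename, workdir="/mnt/vdb1/pdbihm"):
    """Run the validator from workdir; raises if the command fails."""
    subprocess.run(validation_command(mmcif_filename), cwd=workdir, check=True)
```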