Update index.md
mbaudis committed Oct 21, 2024
1 parent 12e29a0 commit 7f760a1
Showing 1 changed file with 21 additions and 1 deletion.
22 changes: 21 additions & 1 deletion docs/importers/index.md
@@ -64,6 +64,26 @@ Use the provided variants table template to reformat your input data accordingly

Some considerations:

* the upstream ids (`analysis_id`, `biosample_id`, `individual_id`) have to match the
ones in the metadata tables
* for `sequence_id` please use the RefSeq ids for GRCh38 (e.g. `refseq:NC_000005.10`
instead of `chr5` or such); see `bycon/rsrc/genomes/grch38/refseq_chromosomes.yaml`
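
The two considerations above can be checked before an import with a small script. The following is only a sketch, not part of `bycon`: it assumes tab-separated files with a header row, a combined metadata table that carries all three id columns, and a `refseq:` prefix as the only test for `sequence_id`. The function names `read_tsv` and `check_variants` and the example ids are hypothetical.

```python
import csv
import io

# Id columns named in the text; the variants table must reuse values
# that already occur in the metadata table.
UPSTREAM_IDS = ("analysis_id", "biosample_id", "individual_id")


def read_tsv(handle):
    """Read a tab-separated table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def check_variants(variant_rows, metadata_rows):
    """Return a list of human-readable problems (empty if none were found)."""
    known = {key: {row[key] for row in metadata_rows} for key in UPSTREAM_IDS}
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(variant_rows, start=2):
        for key in UPSTREAM_IDS:
            if row[key] not in known[key]:
                problems.append(f"line {line_no}: unknown {key} {row[key]!r}")
        if not row["sequence_id"].startswith("refseq:"):
            problems.append(
                f"line {line_no}: sequence_id {row['sequence_id']!r} is not a refseq id"
            )
    return problems


# Made-up example data; real files would be opened with open(path, newline="").
metadata = read_tsv(io.StringIO(
    "individual_id\tbiosample_id\tanalysis_id\n"
    "ind-1\tbios-1\tana-1\n"
))
variants = read_tsv(io.StringIO(
    "analysis_id\tbiosample_id\tindividual_id\tsequence_id\n"
    "ana-1\tbios-1\tind-1\trefseq:NC_000005.10\n"
    "ana-2\tbios-1\tind-1\tchr5\n"
))

problems = check_variants(variants, metadata)
for problem in problems:
    print(problem)
```

In this example the second variant row is reported twice: once for an `analysis_id` that is missing from the metadata, and once for using `chr5` instead of a RefSeq id.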

...

### 4. Import the data

Importing data requires that all upstream entities already exist. For example, when
biosamples are imported, each one is checked for an `individual_id` value and for an
existing record matching that value. A complete import sequence when using a
combined `metadata.tsv` file will therefore look like:

1. `importers/individualsInserter.py -d myOwnDatasetName -i wherever/metadata.tsv`
2. `importers/biosamplesInserter.py -d myOwnDatasetName -i wherever/metadata.tsv`
3. `importers/analysisInserter.py -d myOwnDatasetName -i wherever/metadata.tsv`
4. `importers/variantsInserter.py -d myOwnDatasetName -i wherever/variants.tsv`

The `-d` option provides the name of the dataset, which has to exist already (see
further information how to get there...).
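
Since the order matters, the four steps can be wrapped in a small driver that stops at the first failure. This is a minimal sketch rather than a `bycon` tool: the script paths and the `-d`/`-i` options are taken from the list above, while `build_import_commands` and `run_imports` are hypothetical helper names. It also assumes the importer scripts are executable from the current working directory.

```python
import subprocess

# Import order from the list above: upstream entities first, variants last.
# The second element says which input file the step reads.
IMPORT_STEPS = (
    ("importers/individualsInserter.py", "metadata"),
    ("importers/biosamplesInserter.py", "metadata"),
    ("importers/analysisInserter.py", "metadata"),
    ("importers/variantsInserter.py", "variants"),
)


def build_import_commands(dataset, metadata_path, variants_path):
    """Build one argument list per import step, in the required order."""
    inputs = {"metadata": metadata_path, "variants": variants_path}
    return [
        [script, "-d", dataset, "-i", inputs[kind]]
        for script, kind in IMPORT_STEPS
    ]


def run_imports(commands):
    """Run the commands in order; check=True raises on the first failing step."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = build_import_commands(
    "myOwnDatasetName", "wherever/metadata.tsv", "wherever/variants.tsv"
)
# Print the commands for review; call run_imports(commands) to execute them.
for cmd in commands:
    print(" ".join(cmd))
```

Stopping at the first error keeps a failed individuals or biosamples import from being followed by imports whose upstream records are missing.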

