Merge pull request #27 from rvandewater/main
Adding AUMCdb to MEDS-DEV
mmcdermott authored Oct 29, 2024
2 parents 96a5bf6 + 65c6c27 commit 8d7547c
Showing 6 changed files with 101 additions and 4 deletions.
34 changes: 34 additions & 0 deletions src/MEDS_DEV/datasets/AUMCdb/README.md
@@ -0,0 +1,34 @@
# AUMCdb

## Description

AUMCdb (AmsterdamUMCdb) is the first freely accessible intensive care database from within the European Union. It contains de-identified health data on tens of thousands of European intensive care unit admissions, including demographics, vital signs, laboratory tests, and medications.

## Access Requirements

Taken from the [official website](https://amsterdammedicaldatascience.nl/amsterdamumcdb/#requesting-access):

- **Access Policy**: Fill out and sign the combined Access and End User License form.
- **License (for files)**: Specify the license under which the dataset files are distributed.
- **Data Use Agreement**: Agreement found [here](https://amsterdammedicaldatascience.nl/content/uploads/sites/2/2022/12/arfeula_v1.6.pdf).
- **Required training**: Valid training courses include the Data or Specimens Only Research (DSOR) course from CITI, the Basic Course for Clinical Investigators (BROK) from NFU or an equivalent course. The DSOR course may be taken free of charge and is also needed to gain access to the MIMIC and eICU intensive care databases from the USA.

## Supported Tasks

- `tasks/mortality/in_icu/first_24h.yaml`

## MEDS-transformation

[MEDS_transforms](https://github.com/mmcdermott/MEDS_transforms) now includes the AUMCdb example. Please refer to the [guide](https://github.com/mmcdermott/MEDS_transforms/tree/main/AUMC_Example).

## Sources

1. [AUMCdb dataset](https://amsterdammedicaldatascience.nl/amsterdamumcdb/)
2. [AUMCdb Research Paper](https://journals.lww.com/ccmjournal/fulltext/2021/06000/sharing_icu_patient_data_responsibly_under_the.16.aspx)
3. [AUMCdb Data Use Agreement](https://amsterdammedicaldatascience.nl/content/uploads/sites/2/2022/12/arfeula_v1.6.pdf)
4. [Data Repository](https://easy.dans.knaw.nl/ui/home)
5. [Code Repository](https://github.com/AmsterdamUMC/AmsterdamUMCdb)

## Disclaimer

Please refer to the data owners and the most up-to-date information when using this dataset in your research.
8 changes: 8 additions & 0 deletions src/MEDS_DEV/datasets/AUMCdb/predicates.yaml
@@ -0,0 +1,8 @@
predicates:
  icu_admission:
    code: { regex: "^ICU_ADMISSION//.*" }
  icu_discharge:
    code: { regex: "^ICU_DISCHARGE//.*" }

  death:
    code: MEDS_DEATH
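
As an illustration only (not part of this commit), the sketch below shows which event codes the three predicates above would select. The sample codes and the `classify_code` helper are hypothetical, not taken from AUMCdb or ACES, and Bash's `=~` operator stands in for whatever regex engine ACES uses internally, so edge-case behavior may differ.

```shell
#!/bin/bash
# Illustrative only: classify event codes the way the predicates above describe.
# classify_code is a hypothetical helper, not an ACES or MEDS-DEV function.

classify_code() {
  local code=$1
  # Same patterns as in predicates.yaml; both are anchored at the start of the code.
  local admission_re='^ICU_ADMISSION//.*'
  local discharge_re='^ICU_DISCHARGE//.*'

  if [[ "$code" =~ $admission_re ]]; then
    echo "icu_admission"
  elif [[ "$code" =~ $discharge_re ]]; then
    echo "icu_discharge"
  elif [[ "$code" == "MEDS_DEATH" ]]; then
    # The death predicate is an exact code, not a regex.
    echo "death"
  else
    echo "none"
  fi
}

# Hypothetical sample codes; real AUMCdb codes may look different.
for code in "ICU_ADMISSION//IC" "ICU_DISCHARGE//IC" "MEDS_DEATH" "LAB//ICU_ADMISSION//x"; do
  printf '%s -> %s\n' "$code" "$(classify_code "$code")"
done
```

Because of the `^` anchor, a code that merely contains `ICU_ADMISSION` somewhere in the middle is not treated as an admission event.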
5 changes: 3 additions & 2 deletions src/MEDS_DEV/datasets/README.md
@@ -6,8 +6,9 @@ To contribute a new dataset:

1. Fork this repository
2. Add your dataset predicates file in its respective folder (see `MIMIC-IV/predicates.yaml` for an example of predicate structure)
-3. Test locally to ensure your dataset works correctly
-4. Create a pull request with your changes
+3. Test locally to ensure your dataset works correctly. Ideally, specify the packages and versions used in the dataset information.
+4. Specify the dataset information (including supported and custom tasks) in the template README.md file in the dataset's folder.
+5. Create a pull request with your changes
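
For illustration only (not part of this commit), the folder setup implied by the steps above could be scripted roughly as follows. The `scaffold_dataset` helper and the `MyDataset` name are hypothetical. The paths follow the files touched in this commit (`src/MEDS_DEV/templates/dataset.md` and `src/MEDS_DEV/datasets/`), and the sketch assumes it is run against a local fork.

```shell
#!/bin/bash
# Illustrative helper (hypothetical, not part of MEDS-DEV): scaffold a dataset folder
# containing a README.md copied from the template and an empty predicates.yaml.

scaffold_dataset() {
  local repo_dir=$1 name=$2
  local template="$repo_dir/src/MEDS_DEV/templates/dataset.md"
  local dest="$repo_dir/src/MEDS_DEV/datasets/$name"

  if [[ ! -f "$template" ]]; then
    echo "Error: template not found at $template" >&2
    return 1
  fi

  mkdir -p "$dest"
  # Copy the README template only if no README exists yet (never overwrite).
  [[ -f "$dest/README.md" ]] || cp "$template" "$dest/README.md"
  # Create predicates.yaml if missing; touch leaves existing content intact.
  touch "$dest/predicates.yaml"
  echo "$dest"
}

# Example (run from the fork's root directory; "MyDataset" is a placeholder):
#   scaffold_dataset . MyDataset
```

The predicates and README still have to be filled in by hand; the helper only creates the expected layout.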

## Notes

25 changes: 25 additions & 0 deletions src/MEDS_DEV/helpers/extract_task.sh
@@ -1,4 +1,27 @@
#!/bin/bash
print_help() {
    echo "Usage: $(basename "$0") <MEDS_ROOT_DIR> <MEDS_DATASET_NAME> <MEDS_TASK_NAME> [additional parameters]"
    echo
    echo "Arguments:"
    echo "  MEDS_ROOT_DIR       The root directory of the MEDS dataset to be used."
    echo "  MEDS_DATASET_NAME   The name of the dataset to be used."
    echo "  MEDS_TASK_NAME      The name of the task to be executed."
    echo
    echo "Additional parameters can be passed to the aces-cli command."
    echo
    echo "Example:"
    echo "  $(basename "$0") /path/to/meds/root dataset_name task_name --some-parameter=value"
}

if [[ "$1" == "--help" || "$1" == "-h" ]]; then
    print_help
    exit 0
fi

if [[ $# -lt 3 ]]; then
    echo "Error: Missing required arguments. See --help for usage."
    exit 1
fi

export MEDS_ROOT_DIR=$1
export MEDS_DATASET_NAME=$2
@@ -11,5 +34,7 @@ export MEDS_DEV_REPO_DIR

SHARDS=$(expand_shards "$MEDS_ROOT_DIR"/data)

echo "Running task $MEDS_TASK_NAME on dataset $MEDS_DATASET_NAME with MEDS_ROOT_DIR=$MEDS_ROOT_DIR and SHARDS=$SHARDS"

aces-cli --config-path="$MEDS_DEV_REPO_DIR"/configs --config-name="_ACES_MD" \
    "hydra.searchpath=[pkg://aces.configs]" "data.shard=$SHARDS" -m "$@"
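
As a dry-run illustration (not part of this commit), the sketch below applies the same "at least three arguments" check as the script and reports how the arguments would be split, without calling `aces-cli`. The `preview_extract_call` helper is hypothetical. The task name `mortality/in_icu/first_24h` is an assumption derived from the task path listed in the AUMCdb README, and because the middle of the script is elided in this diff, the way it consumes positional arguments is also assumed here.

```shell
#!/bin/bash
# Dry-run sketch (hypothetical helper): validate the arguments like extract_task.sh,
# then print how they would be interpreted instead of running aces-cli.

preview_extract_call() {
  if [[ $# -lt 3 ]]; then
    echo "Error: Missing required arguments." >&2
    return 1
  fi
  local root=$1 dataset=$2 task=$3
  # Assumption: everything after the third argument is forwarded to aces-cli.
  shift 3
  echo "root=$root dataset=$dataset task=$task extra_args=$#"
}

# Placeholder path and assumed task name; adjust both for a real run.
preview_extract_call /path/to/meds/root AUMCdb mortality/in_icu/first_24h
preview_extract_call /path/to/meds/root AUMCdb mortality/in_icu/first_24h --some-parameter=value
```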
3 changes: 1 addition & 2 deletions src/MEDS_DEV/tasks/criteria/mortality/in_icu/first_24h.yaml
@@ -22,10 +22,9 @@ description: >-
 predicates:
   icu_admission: ???
   icu_discharge: ???
-  hospital_discharge: ???
   death: ???
   discharge_or_death:
-    expr: or(icu_discharge, death, hospital_discharge)
+    expr: or(icu_discharge, death)

trigger: icu_admission

30 changes: 30 additions & 0 deletions src/MEDS_DEV/templates/dataset.md
@@ -0,0 +1,30 @@
# New Dataset Template

This is a template for creating a new dataset in MEDS-DEV. The dataset should be stored in a directory named after the dataset in the `src/MEDS_DEV/datasets` directory.

## Description

Describe the dataset in a few sentences. A link to the dataset's homepage and/or repository or a research paper is recommended.

## Access Requirements

Describe any access requirements for the dataset (e.g., human subjects research training). If the dataset is publicly available, state that here. If the dataset is not publicly available, describe the process for obtaining access. We recommend covering the following topics:

- **Access Policy**: Describe the access policy for the dataset, including any restrictions or permissions required.
- **License (for files)**: Specify the license under which the dataset files are distributed.
- **Data Use Agreement**: Specify any data use agreement that must be signed to access the dataset.
- **Required training**: Specify any training or certification required to access the dataset.

## Supported Tasks

List the tasks already present in MEDS-DEV that this dataset supports. If new tasks can be added, describe them here. Also note the `predicates.yaml` file that specifies the dataset's predicates.

## MEDS-transformation

Briefly describe the process of transforming this dataset to the MEDS format. If the dataset is already in the MEDS format when downloaded, specify that here.

## Sources

Summarize the sources of the dataset. If the dataset is a combination of multiple sources, list them here.

1. https://link-to-dataset.org
