Merge pull request #15 from mmcdermott/build_task_cmd
A quick helper to help build ACES task labels.
mmcdermott authored Sep 3, 2024
2 parents a124f02 + 77e8af5 commit e7119d1
Showing 28 changed files with 83 additions and 20 deletions.
24 changes: 24 additions & 0 deletions README.md
@@ -25,3 +25,27 @@ TODO
### To Add Results

TODO

## Helpers

### To extract a task

First, clone the repo and install it locally with `pip install .`. Then, make sure the desired task
criteria and dataset predicates YAML files are in their respective locations in the repo.

Finally, run the following:

```bash
./src/MEDS_DEV/helpers/extract_task.sh $MEDS_ROOT_DIR $DATASET_NAME $TASK_NAME
```

E.g.,

```bash
./src/MEDS_DEV/helpers/extract_task.sh ../MEDS_TAB_COMPL_TEST/MIMIC-IV/ MIMIC-IV mortality/in_icu/first_24h
```

This will use the `datasets/MIMIC-IV/predicates.yaml` predicates file and the
`tasks/criteria/mortality/in_icu/first_24h.yaml` task criteria, and will run over the dataset in the root
directory at `../MEDS_TAB_COMPL_TEST/MIMIC-IV`, reading data from the `data` subdir of that root dir and
writing labels to the `task_labels` subdir of that root dir, at a location that depends on the task name.
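
As a reading aid (not part of the commit), the path conventions described in the README text can be sketched in Python. The names `TaskPaths` and `resolve_task_paths` are hypothetical, and the sketch assumes the predicates and criteria files live under the installed package directory, which is passed in as a parameter rather than discovered automatically.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TaskPaths:
    """Hypothetical container for the four locations the helper relies on."""

    predicates: Path
    criteria: Path
    data_dir: Path
    labels_dir: Path


def resolve_task_paths(package_dir, meds_root_dir, dataset_name, task_name) -> TaskPaths:
    """Map the helper's arguments to file locations (illustrative only)."""
    package_dir = Path(package_dir)
    meds_root_dir = Path(meds_root_dir)
    return TaskPaths(
        # Dataset-specific predicates file.
        predicates=package_dir / "datasets" / dataset_name / "predicates.yaml",
        # Task criteria file; the task name may contain slashes.
        criteria=package_dir / "tasks" / "criteria" / f"{task_name}.yaml",
        # Input data is read from the `data` subdir of the root dir.
        data_dir=meds_root_dir / "data",
        # Labels are written somewhere under `task_labels`; the exact file
        # name depends on the task name and is not modeled here.
        labels_dir=meds_root_dir / "task_labels",
    )


paths = resolve_task_paths(
    "src/MEDS_DEV",
    "../MEDS_TAB_COMPL_TEST/MIMIC-IV/",
    "MIMIC-IV",
    "mortality/in_icu/first_24h",
)
print(paths.criteria.as_posix())
```

Keeping the package directory as an explicit argument makes the mapping easy to check without installing anything.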
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -3,7 +3,7 @@ requires = ["setuptools>=64", "setuptools-scm>=8.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
-name = "MEDS-DEV"
+name = "MEDS_DEV"
dynamic = ["version"]
authors = [
{name="Matthew B. A. McDermott", email="[email protected]"},
@@ -28,7 +28,7 @@ classifiers = [
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
]
-dependencies = ["meds==0.3", "es-aces==0.3.0"]
+dependencies = ["meds==0.3.3", "es-aces==0.5.0"]

[tool.setuptools_scm]

17 changes: 0 additions & 17 deletions src/MEDS-DEV/datasets/MIMIC-IV/predicates.yaml

This file was deleted.

2 changes: 1 addition & 1 deletion src/MEDS-DEV/__init__.py → src/MEDS_DEV/__init__.py
@@ -1,6 +1,6 @@
from importlib.metadata import PackageNotFoundError, version

-__package_name__ = "MEDS-DEV"
+__package_name__ = "MEDS_DEV"
try:
    __version__ = version(__package_name__)
except PackageNotFoundError:
23 changes: 23 additions & 0 deletions src/MEDS_DEV/configs/_ACES_MD.yaml
@@ -0,0 +1,23 @@
defaults:
- _aces
- override data: sharded
- _self_

dataset_name: ${oc.env:MEDS_DATASET_NAME}
task_name: ${oc.env:MEDS_TASK_NAME}
output_dir: "${oc.env:MEDS_ROOT_DIR}/task_labels"

# TODO: find a nice way to have this be inferred automatically
MEDS_DEV_dir: "${oc.env:MEDS_DEV_REPO_DIR}"

data:
  standard: meds
  root: "${oc.env:MEDS_ROOT_DIR}/data"

# Cohort directory and name: used for automatically loading configs, saving results, and logging.
cohort_dir: ${output_dir}
cohort_name: ${task_name}

# Path to the task configuration file
config_path: ${MEDS_DEV_dir}/tasks/criteria/${task_name}.yaml
predicates_path: ${MEDS_DEV_dir}/datasets/${dataset_name}/predicates.yaml
Empty file.
File renamed without changes.
18 changes: 18 additions & 0 deletions src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml
@@ -0,0 +1,18 @@
predicates:
  hospital_admission:
    code: { regex: "^HOSPITAL_ADMISSION//.*" }
  hospital_discharge:
    code: { regex: "^HOSPITAL_DISCHARGE//.*" }

  ED_registration:
    code: { regex: "^ED_REGISTRATION//.*" }
  ED_discharge:
    code: { regex: "^ED_OUT//.*" }

  icu_admission:
    code: { regex: "^ICU_ADMISSION//.*" }
  icu_discharge:
    code: { regex: "^ICU_DISCHARGE//.*" }

  death:
    code: MEDS_DEATH
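
To illustrate what these predicates select, here is a rough Python sketch (not part of the commit). It assumes that a regex predicate behaves like Python's `re.search` on the event code and that a plain `code` value means exact equality; the matching ACES actually performs may differ. The helper `matching_predicates` and the sample codes are hypothetical.

```python
import re

# Regex predicates copied from the predicates file above.
REGEX_PREDICATES = {
    "hospital_admission": r"^HOSPITAL_ADMISSION//.*",
    "hospital_discharge": r"^HOSPITAL_DISCHARGE//.*",
    "ED_registration": r"^ED_REGISTRATION//.*",
    "ED_discharge": r"^ED_OUT//.*",
    "icu_admission": r"^ICU_ADMISSION//.*",
    "icu_discharge": r"^ICU_DISCHARGE//.*",
}

# Predicates given as a plain code, assumed here to mean an exact match.
EXACT_PREDICATES = {"death": "MEDS_DEATH"}


def matching_predicates(code: str) -> list[str]:
    """Return the names of all predicates that a single event code satisfies."""
    names = [name for name, pattern in REGEX_PREDICATES.items() if re.search(pattern, code)]
    names += [name for name, value in EXACT_PREDICATES.items() if code == value]
    return names


# Hypothetical event codes, for illustration only.
for sample in ["ICU_ADMISSION//MICU", "ED_OUT//", "MEDS_DEATH", "HOSPITAL_ADMISSION"]:
    print(sample, matching_predicates(sample))
```

Note that `ED_discharge` is keyed on codes starting with `ED_OUT//`, and that every regex requires the `//` separator, so a bare `HOSPITAL_ADMISSION` code matches nothing under these assumptions.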
File renamed without changes.
Empty file.
15 changes: 15 additions & 0 deletions src/MEDS_DEV/helpers/extract_task.sh
@@ -0,0 +1,15 @@
#!/bin/bash

export MEDS_ROOT_DIR=$1
export MEDS_DATASET_NAME=$2
export MEDS_TASK_NAME=$3

shift 3

MEDS_DEV_REPO_DIR=$(python -c "from importlib.resources import files; print(files(\"MEDS_DEV\"))")
export MEDS_DEV_REPO_DIR

SHARDS=$(expand_shards "$MEDS_ROOT_DIR"/data)

aces-cli --config-path="$MEDS_DEV_REPO_DIR"/configs --config-name="_ACES_MD" \
  "hydra.searchpath=[pkg://aces.configs]" "data.shard=$SHARDS" -m "$@"
File renamed without changes.
File renamed without changes.
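
The `extract_task.sh` helper finds its config directory by asking Python where the `MEDS_DEV` package is installed. A minimal sketch of that lookup follows (not part of the commit). The helper name `package_dir` is hypothetical, and the lookup is demonstrated on the standard-library `json` package, since `MEDS_DEV` may not be installed where this runs.

```python
from importlib.resources import files


def package_dir(package_name: str) -> str:
    """Return the location of an importable package as a string."""
    # `files` returns a traversable object rooted at the package; for a
    # package installed on the filesystem, str() gives its directory path.
    return str(files(package_name))


# The script performs the same lookup with "MEDS_DEV" and then appends
# "/configs" to locate the Hydra config directory.
json_dir = package_dir("json")
print(json_dir)
```

Because the lookup goes through the import system, it only works after the package has been installed, which matches the README's instruction to run `pip install .` first.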
