add omop teva module (#61)
svittoz authored Jun 11, 2024
1 parent da3ed10 commit 3a70b96
Showing 21 changed files with 30,670 additions and 2 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -120,3 +120,6 @@ ENV/
Biology_summary/*
my_custom_config.csv
eds_scikit/biology/viz_other/

# Plot test
omop_teva/*
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -38,7 +38,7 @@ repos:
hooks:
- id: blacken-docs
additional_dependencies: [black==20.8b1]
exclude: notebooks/
exclude: '(notebooks/|(.*)configuration-omop)'
- repo: https://github.com/pycqa/flake8
rev: 4.0.1
hooks:
3 changes: 3 additions & 0 deletions changelog.md
@@ -6,6 +6,9 @@
- Pyarrow fix now works on Spark executors.
- Fix OMOP _date columns issue

### Added
- omop teva module

## v0.1.7 (2024-04-12)
### Changed
- Support for pyarrow > 0.17.0
86 changes: 86 additions & 0 deletions docs/functionalities/omop-teva/configuration-omop.md
@@ -0,0 +1,86 @@
# OMOP Teva - Config

All plots generated by ```generate_omop_teva``` are based on the configuration file ```eds_scikit.plot.default_omop_teva_config```.

## Table configuration

A table configuration is defined by three parameters:

- a list of __category columns__
- a __date column__
- a __mapping__ applied to the category columns

Here are two possible configurations for the OMOP condition table:

=== "Default condition teva configuration"

```python
"condition_occurrence": {
    "category_columns": [
        "visit_occurrence_id",
        "care_site_short_name",
        "condition_source_value",
        "stay_source_value",
        "visit_source_value",
        "admission_reason_source_value",
        "visit_type_source_value",
        "destination_source_value",
        "cdm_source",
    ],
    "date_column": "condition_start_datetime",
    "mapper": {
        "visit_occurrence_id": {"not NaN": ".*"},
        "condition_source_value": {"not NaN": ".*"},
    },
},
```

=== "Custom diabetes condition teva configuration"

```python
"condition_occurrence": {
    # (1) Some columns were removed.
    "category_columns": [
        "visit_occurrence_id",
        "care_site_short_name",
        "condition_source_value",
        "visit_source_value",
        "visit_type_source_value",
        "cdm_source",
    ],
    # (2) The date column remains the same.
    "date_column": "condition_start_datetime",
    "mapper": {
        "visit_occurrence_id": {"not NaN": ".*"},
        # (3) Map condition codes to diabetic conditions.
        "condition_source_value": {"has_diabete": r"^E10|^E11|^E12|^E13|^E14|O24"},
    },
},
```
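
To make the effect of a mapper entry concrete, here is a minimal sketch of how a label/regex pair could classify source values. It assumes `re.search` semantics and uses a hypothetical helper, `map_category`, which is not part of eds-scikit; the library's internal matching may differ.

```python
import re

# Mapper fragment copied from the custom configuration above.
mapper = {"has_diabete": r"^E10|^E11|^E12|^E13|^E14|O24"}


def map_category(value, mapper):
    """Return the first label whose pattern matches `value`, else None.

    Hypothetical helper for illustration only; it assumes re.search
    semantics, which may not be what eds-scikit uses internally.
    """
    for label, pattern in mapper.items():
        if re.search(pattern, value):
            return label
    return None


codes = ["E11.9", "O24.4", "I10", "XO24"]
labels = {code: map_category(code, mapper) for code in codes}
```

Under these assumptions, `E11.9` and `O24.4` are labelled `has_diabete` and `I10` is left unmapped. `XO24` is also labelled, because the `O24` alternative carries no `^` anchor; anchor it if only codes starting with `O24` should match.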


## Specifying table configuration

To customize the configuration, copy ```default_omop_teva_config```, update the copy, and pass it to ```generate_omop_teva```.

```python
import copy

from eds_scikit.plot import generate_omop_teva
from eds_scikit.io.omop_teva_default_config import default_omop_teva_config

# Work on a deep copy so the shared default configuration is not mutated.
omop_teva_config = copy.deepcopy(default_omop_teva_config)

# Replace the default mapping of condition_source_value with a diabetes pattern.
condition_mapper = {
    "condition_source_value": {"has_diabete": r"^E10|^E11|^E12|^E13|^E14|O24"}
}

omop_teva_config["condition_occurrence"]["mapper"].update(condition_mapper)

# `data` is assumed to be an OMOP data object loaded beforehand.
start_date, end_date = "2021-01-01", "2021-12-01"
generate_omop_teva(
    data=data,
    start_date=start_date,
    end_date=end_date,
    teva_config=omop_teva_config,
)
```

!!! warning "Adding a new table to default_omop_teva_config"
    Feel free to add any new table to the configuration. Just make sure the table has a ```visit_occurrence_id``` column.
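
As an illustration of adding a table, the sketch below extends a trimmed stand-in for the default configuration with a `procedure_occurrence` entry. The stand-in dict, the column choices for the new table, and the `check_table_config` helper are assumptions made for this example; they are not part of eds-scikit. The check mirrors the examples above by expecting `visit_occurrence_id` among the category columns.

```python
import copy

# Trimmed stand-in for default_omop_teva_config (assumption: a plain dict
# keyed by table name, shaped like the examples above).
base_config = {
    "condition_occurrence": {
        "category_columns": ["visit_occurrence_id", "condition_source_value"],
        "date_column": "condition_start_datetime",
        "mapper": {"visit_occurrence_id": {"not NaN": ".*"}},
    },
}

REQUIRED_KEYS = {"category_columns", "date_column", "mapper"}


def check_table_config(table_config):
    """Hypothetical sanity check for one table entry (not an eds-scikit API)."""
    missing = REQUIRED_KEYS - set(table_config)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if "visit_occurrence_id" not in table_config["category_columns"]:
        raise ValueError("visit_occurrence_id must be a category column")
    unknown = set(table_config["mapper"]) - set(table_config["category_columns"])
    if unknown:
        raise ValueError(f"mapper refers to unknown columns: {sorted(unknown)}")


# Copy first so the stand-in default is left untouched.
config = copy.deepcopy(base_config)

# Illustrative new table; column names follow the OMOP procedure table.
config["procedure_occurrence"] = {
    "category_columns": ["visit_occurrence_id", "procedure_source_value"],
    "date_column": "procedure_datetime",
    "mapper": {
        "visit_occurrence_id": {"not NaN": ".*"},
        "procedure_source_value": {"not NaN": ".*"},
    },
}

for table_config in config.values():
    check_table_config(table_config)
```

Running a check of this kind before calling ```generate_omop_teva``` catches a missing key or a mapper that targets an unlisted column early, rather than partway through plot generation.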