UCLA DATA Predicates file #23
Merged
9 commits
999e209 anemia.yaml UCLA dataset (Simonlee711)
08a191a TODO (Simonlee711)
b612691 Merge branch 'refs/heads/main' into ucla-data (kamilest)
c85f8c1 Update .gitignore with OS- and IDE-specific files (kamilest)
15fcf4a Run pre-commit (kamilest)
065b20b UCLA 24 HR ICU Morality Task (Simonlee711)
fcd76e5 README (Simonlee711)
d75ad72 code rabbit fixes (Simonlee711)
f0b3af5 code rabbit fixes (Simonlee711)
@@ -0,0 +1,33 @@
# UCLA DDR MEDS Dataset

UCLA Health serves well over half a million patients annually and, as a result, maintains a vast database of Electronic Health Records (EHR). The resulting Discovery Data Repository (DDR) is an invaluable resource for researchers. The DDR's medical record concepts and organization, together with the UCLA ATLAS Precision Health Biobank, which integrates genetic information with de-identified medical records, enable precision health research.

## Access Requirements

To gain access, you must obtain HIPAA compliance clearance. *HIPAA compliance is the practice of following the Health Insurance Portability and Accountability Act (HIPAA) to protect the privacy and security of health information.* The dataset is not publicly available, but researchers within UCLA can run analyses for your study if you send them your **source code** and, for a pre-trained architecture, your **model weights**.

- **License (for files)**: Specify the license under which the dataset files are distributed. (Pending)
- **Data Use Agreement**: Specify any data use agreement that must be signed to access the dataset. (Pending)

## Supported Tasks

The UCLA-MEDS format currently supports the following tasks:

- `tasks/mortality/in_icu/first_24h.yaml`

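To make the task name concrete, here is a minimal sketch of how an in-ICU mortality label might be derived for one patient. It assumes (this is not taken from the task YAML itself) that features come from the first 24 hours after ICU admission and that the label is whether death is recorded after that window but before ICU discharge. The function name, the event tuple layout, and the example unit suffix `MICU` are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Event = Tuple[datetime, str]  # (timestamp, code); layout is an assumption


def label_in_icu_mortality(events: List[Event]) -> Optional[bool]:
    """Label the first ICU stay, or return None if it is not eligible.

    Assumed reading of the task: observe 24 hours after ICU admission,
    then check whether death or ICU discharge comes first.
    """
    events = sorted(events, key=lambda e: e[0])
    admit = next((t for t, c in events if c.startswith("ICU_ADMISSION//")), None)
    if admit is None:
        return None  # patient never entered the ICU
    window_end = admit + timedelta(hours=24)
    for t, code in events:
        if t <= admit:
            continue
        is_end = code.startswith("ICU_DISCHARGE//") or code == "MORTALITY"
        if not is_end:
            continue
        if t <= window_end:
            return None  # stay ended inside the 24-hour observation window
        return code == "MORTALITY"
    return None  # neither discharge nor death recorded after admission
```

Stays that end within the observation window are excluded rather than labeled, which is one common design choice; the actual task file may handle them differently.
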
## MEDS-transformation

Researchers at UCLA have transformed the DDR database into the MEDS format, making it quick and easy to run analyses at our institution.

## Sources

Summarize the sources of the dataset. If the dataset is a combination of multiple sources, list them here.

1. https://link-to-dataset.org

## Contact

For queries or questions, feel free to email:

- Simon A. Lee ([email protected])
- Jeffrey N. Chiang ([email protected])
@@ -0,0 +1,18 @@
predicates:
  hospital_admission:
    code: { regex: "^HOSPITAL_ADMISSION//.*" }
  hospital_discharge:
    code: { regex: "^HOSPITAL_DISCHARGE//.*" }

  ED_registration:
    code: { regex: "^ED_REGISTRATION//.*" }
  ED_discharge:
    code: { regex: "^ED_DISCHARGE//.*" }

  icu_admission:
    code: { regex: "^ICU_ADMISSION//.*" }
  icu_discharge:
    code: { regex: "^ICU_DISCHARGE//.*" }

  death:
    code: MORTALITY
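
As a rough illustration of how these predicates select events, the sketch below applies the same patterns with Python's `re` module. The example codes (such as `ICU_ADMISSION//MICU`) are made up, the helper name `matching_predicates` is hypothetical, and treating the bare `MORTALITY` entry as an exact string match is an assumption of this sketch rather than documented behavior.

```python
import re
from typing import List

# Predicate definitions copied from the YAML above. A dict with a "regex" key
# is treated as a pattern; a plain string is treated as an exact code match
# (an assumption of this sketch).
PREDICATES = {
    "hospital_admission": {"regex": r"^HOSPITAL_ADMISSION//.*"},
    "hospital_discharge": {"regex": r"^HOSPITAL_DISCHARGE//.*"},
    "ED_registration": {"regex": r"^ED_REGISTRATION//.*"},
    "ED_discharge": {"regex": r"^ED_DISCHARGE//.*"},
    "icu_admission": {"regex": r"^ICU_ADMISSION//.*"},
    "icu_discharge": {"regex": r"^ICU_DISCHARGE//.*"},
    "death": "MORTALITY",
}


def matching_predicates(code: str) -> List[str]:
    """Return the names of all predicates that a single event code satisfies."""
    matched = []
    for name, spec in PREDICATES.items():
        if isinstance(spec, dict):
            if re.search(spec["regex"], code):
                matched.append(name)
        elif code == spec:
            matched.append(name)
    return matched
```

Note that each pattern requires the `//` separator, so a bare `ICU_ADMISSION` code would not match, while `ICU_ADMISSION//` followed by any suffix (including an empty one) would.
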
💡 Codebase verification
Documentation needed for predicate pattern format
The regex patterns for hospital events are consistently implemented across multiple datasets (UCLA, MIMIC-IV), but there's no documentation explaining:
This documentation would be valuable since:
🔗 Analysis chain
Consider documenting the expected format after "//"
The regex patterns for hospital events look good, but it would be helpful to document:
Let's check if there are any existing examples or documentation:
The previous search revealed that this pattern is consistently used across multiple datasets (MIMIC-IV, UCLA, AUMCdb). Let's check for documentation and example usage to make a more informed decision about whether documentation is really needed.
Let's check the README content and actual usage examples to make a final determination about whether documentation is needed.
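One way to gather material for such documentation would be to tally which suffixes actually follow `//` in a dataset's codes. The sketch below is a minimal, hypothetical helper (`suffix_counts` is not part of this repository), and the sample codes in the usage note are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def suffix_counts(codes: Iterable[str]) -> Dict[str, Counter]:
    """Group codes by the prefix before "//" and count each suffix after it."""
    counts: Dict[str, Counter] = {}
    for code in codes:
        prefix, sep, suffix = code.partition("//")
        if not sep:
            continue  # codes such as MORTALITY carry no "//" suffix
        counts.setdefault(prefix, Counter())[suffix] += 1
    return counts
```

For example, calling `suffix_counts(["ICU_ADMISSION//MICU", "ICU_ADMISSION//SICU", "MORTALITY"])` would group the two ICU codes under `ICU_ADMISSION` and skip `MORTALITY`.
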
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
Length of output: 6849
Script:
Length of output: 591
Script:
Length of output: 807