Provenience of discharge summaries

This repository contains the source code for the paper Hospital Discharge Summarization Data Provenance. If you only want to use the annotations, see the Inclusion in Your Projects section. The code for the unsupervised automated method is located in a separate repository.

This repository contains the annotations used in the paper and the classes for creating data provenience annotations for discharge summaries. The project aims to show where physicians copy, paste, or summarize content from previous medical records when writing discharge summaries.

The project also used an automated method for note matching and an automated method for note segmentation.

Inclusion in Your Projects

The purpose of this repository is to reproduce the results in the paper. If you want to use the annotations and/or use the pretrained model, please refer to the zensols.dsprov repository. This repository also provides a Docker image. If you use our annotations and/or code, please cite our paper.

Reproducing Results

The source annotation files are necessary to reproduce our results. They can be obtained by requesting them from the authors.

Important: your email request for the source annotations must include proof that you have been granted MIMIC-III access.

Dependencies:

A macOS machine.
Microsoft Word (used to annotate spans across notes).
GNU make. The default version that ships with macOS should be sufficient. However, Homebrew (brew) might be necessary to install the GNU versions of some system tools.

Steps to reproduce:

Clone this repository and change into it: git clone https://github.com/uic-nlp-lab/dsprov && cd dsprov
Optionally create a virtual environment: python -m venv <Python install dir>
Install Python dependencies: SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True make deps
Copy the source annotations compressed file to the current directory.
Install the source annotation files in the corpus/completed directory: unzip dsprov-source-annotations.zip
Load MIMIC-III by following the Postgres instructions. Also see the zensols.mimic instructions for an SQLite alternative.
Edit etc/db.conf using the parameters of the installed database from the previous step.
Tell programs where to find the database configuration (assuming Bash): export MIMICSIDRC=./etc/db.conf
Create the corpus and matching statistics (also confirms everything is installed and working): ./harness.py excel -o match.xlsx
Check for errors and confirm the data in the generated file is sound: open match.xlsx
Run the hyperparameter optimization: ./src/bin/opthyper.py opt -e 500
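
Before running the corpus creation step, it can help to confirm that the earlier steps left the checkout in the expected state. The following is a minimal sketch and not part of this repository: the function name check_setup is hypothetical, and it only assumes the paths and the MIMICSIDRC variable named in the steps above. It also assumes a relative MIMICSIDRC value is meant relative to the repository root.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def check_setup(root: Path, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return human-readable problems found in the checkout at ``root``.

    Hypothetical helper: it checks only what the steps above describe.
    """
    env = os.environ if env is None else env
    problems: List[str] = []

    # The source annotations should be unzipped under corpus/completed.
    completed = root / "corpus" / "completed"
    if not completed.is_dir() or not any(completed.iterdir()):
        problems.append(f"no annotation files found in {completed}")

    # MIMICSIDRC should point at the edited database configuration.
    rc = env.get("MIMICSIDRC")
    if not rc:
        problems.append("MIMICSIDRC is not set")
    else:
        rc_path = Path(rc)
        # Assumption: relative values are resolved against the repository root.
        if not rc_path.is_absolute():
            rc_path = root / rc_path
        if not rc_path.is_file():
            problems.append(f"MIMICSIDRC points to a missing file: {rc_path}")

    # The entry point used to build the corpus should be present.
    if not (root / "harness.py").is_file():
        problems.append("harness.py not found; is this the repository root?")

    return problems
```

Calling check_setup(Path.cwd()) from the repository root returns an empty list when these checks pass. It does not verify that the database is reachable or that the annotations are complete; running ./harness.py excel -o match.xlsx remains the real confirmation.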

Citation

If you use this project in your research, please use the following BibTeX entry:

@inproceedings{landesHospitalDischargeSummarization2023,
  title = {Hospital {{Discharge Summarization Data Provenance}}},
  booktitle = {The 22nd {{Workshop}} on {{Biomedical Natural Language Processing}} and {{BioNLP Shared Tasks}}},
  author = {Landes, Paul and Chaise, Aaron and Patel, Kunal and Huang, Sean and Di Eugenio, Barbara},
  date = {2023-07},
  pages = {439--448},
  publisher = {{Association for Computational Linguistics}},
  location = {{Toronto, Canada}},
  url = {https://aclanthology.org/2023.bionlp-1.41},
  urldate = {2023-07-10},
  eventtitle = {{{BioNLP}} 2023}
}

Please also cite the Zensols Framework:

@article{Landes_DiEugenio_Caragea_2021,
  title={DeepZensols: Deep Natural Language Processing Framework},
  url={http://arxiv.org/abs/2109.03383},
  note={arXiv: 2109.03383},
  journal={arXiv:2109.03383 [cs]},
  author={Landes, Paul and Di Eugenio, Barbara and Caragea, Cornelia},
  year={2021},
  month={Sep}
}

License

MIT License
