This repository contains code and pre-trained knowledge adapters for the paper "Parameter-efficient domain knowledge integration from multiple sources for biomedical pre-trained language models". The framework uses knowledge-specific adapters to enhance the performance of pre-trained language models (e.g., BERT, ALBERT) on biomedical NLP tasks. These adapters are pre-trained on different biomedical knowledge sources and can be integrated into BERT-like models in a parameter-efficient way.
Figure: DAKI architecture.
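Adapters of this kind are typically small bottleneck modules inserted into a frozen transformer: a down-projection, a non-linearity, and an up-projection with a residual connection. The PyTorch sketch below illustrates that general pattern; the class name, bottleneck size, and activation are assumptions for illustration, and `model.py` contains the repository's actual implementation.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Minimal bottleneck adapter sketch (illustrative; see model.py for DAKI's code)."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # project down
        self.act = nn.GELU()                                 # non-linearity (assumed)
        self.up = nn.Linear(bottleneck_size, hidden_size)    # project back up

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection preserves the frozen PLM's representation.
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```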
The repository is organized as follows:

- `MEDIQA-2019/`, `MEDNLI/`, `NCBI/`, `TRECQA-2017/`, `i2b2_2006/`: dataset-specific evaluation scripts for different biomedical NLP tasks.
- `pretrained_adaptermodel/`: pre-trained adapter models for knowledge sources such as UMLS, Wikipedia disease articles, and semantic groupings.
- `model.py`, `data.py`: core model and data-handling logic for training adapters.
- `train_disease.py`, `train_graph.py`, `train_sgrouping.py`: scripts to pre-train adapters on the different domain knowledge sources.
- `requirements.txt`: dependencies needed to run the scripts.
- Clone this repository:

```bash
git clone https://github.com/qiuhaolu/DAKI.git
cd DAKI
```
- Install dependencies:

```bash
pip install -r requirements.txt
```
For each dataset, navigate to its corresponding directory and run the evaluation script, for example:

```bash
cd MEDIQA-2019
python run_daki.py
```
You can equip various base PLMs with the pre-trained knowledge adapters.
Adapters are small neural networks that have been pre-trained on domain-specific knowledge sources such as:
- UMLS Metathesaurus
- Wikipedia Disease Articles
- Semantic Grouping for Medical Concepts
They are stored in `pretrained_adaptermodel/` and are integrated into selected transformer layers during fine-tuning, as the sketch below illustrates.
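Here is a rough sketch of loading a frozen base PLM and restoring a pre-trained adapter, assuming the checkpoints are PyTorch state dicts. The file name `umls_adapter.pt` and the reuse of the `BottleneckAdapter` class from the earlier sketch are hypothetical; `model.py` defines the actual classes and loading logic.

```python
import torch
from transformers import AutoModel

# Load a frozen base PLM; any BERT-like encoder works.
base_model = AutoModel.from_pretrained("bert-base-uncased")
for param in base_model.parameters():
    param.requires_grad = False  # only adapter parameters are updated downstream

# Hypothetical restore step: the checkpoint name and the BottleneckAdapter
# class (from the sketch above) are illustrative, not the repository's API.
adapter = BottleneckAdapter(hidden_size=base_model.config.hidden_size)
adapter.load_state_dict(torch.load("pretrained_adaptermodel/umls_adapter.pt"))
```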
To train adapters on new domain knowledge, refer to the following scripts as examples (a generic training sketch follows the list):

- `train_disease.py`
- `train_graph.py`
- `train_sgrouping.py`
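These scripts differ mainly in the knowledge source and pre-training objective. The snippet below sketches the shared pattern, updating only the adapter while the PLM outputs stay fixed; the classification head, random stand-in tensors, and hyperparameters are illustrative assumptions, not a transcription of the scripts.

```python
import torch
import torch.nn as nn

hidden_size = 768
adapter = BottleneckAdapter(hidden_size)  # from the earlier sketch
head = nn.Linear(hidden_size, 2)          # hypothetical knowledge-prediction head
optimizer = torch.optim.AdamW(
    list(adapter.parameters()) + list(head.parameters()), lr=1e-4
)

# Random tensors stand in for frozen PLM outputs and knowledge-derived labels;
# a real script would build these from UMLS, Wikipedia, or semantic groupings.
frozen_hidden = torch.randn(8, 128, hidden_size)
labels = torch.randint(0, 2, (8,))

logits = head(adapter(frozen_hidden).mean(dim=1))  # pool tokens, then classify
loss = nn.functional.cross_entropy(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```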
The evaluation scripts in this repository are adapted from diseaseBERT. We gratefully acknowledge their contributions, which have significantly aided the development of this project.
If you find this code useful, please consider citing our works:
@inproceedings{lu2021parameter,
title={Parameter-efficient domain knowledge integration from multiple sources for biomedical pre-trained language models},
author={Lu, Qiuhao and Dou, Dejing and Nguyen, Thien Huu},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2021},
pages={3855--3865},
year={2021}
}
@article{lu2024enhancing,
title={Enhancing Clinical Relevance of Pretrained Language Models Through Integration of External Knowledge: Case Study on Cardiovascular Diagnosis From Electronic Health Records},
author={Lu, Qiuhao and Wen, Andrew and Nguyen, Thien and Liu, Hongfang and others},
journal={JMIR AI},
volume={3},
number={1},
pages={e56932},
year={2024},
publisher={JMIR Publications Inc., Toronto, Canada}
}