Introduction

LMDisorder is a fast and accurate protein disorder predictor that uses embeddings generated by unsupervised pretrained language models as features. We showed that LMDisorder outperformed single-sequence-based methods by more than 6.0% and 18.0% in AUROC on two independent test sets, respectively. Furthermore, LMDisorder performed as well as or better than the state-of-the-art profile-based method SPOT-Disorder2.

System requirements

python 3.7.9
numpy 1.19.1
pandas 1.1.0
pytorch 1.10.0
sentencepiece 0.1.96
transformers 4.18.0
tqdm 4.48.2
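
To confirm that an environment matches the versions listed above, a small check script can help. The following is a minimal sketch and not part of LMDisorder: the function name `check_versions` is hypothetical, and it assumes each package exposes a `__version__` attribute and that the import name for pytorch is `torch`.

```python
import importlib

# Pinned versions copied from the list above.
# Assumption: "pytorch" is imported under the module name "torch".
REQUIRED = {
    "numpy": "1.19.1",
    "pandas": "1.1.0",
    "torch": "1.10.0",
    "sentencepiece": "0.1.96",
    "transformers": "4.18.0",
    "tqdm": "4.48.2",
}


def check_versions(required=REQUIRED):
    """Return {module_name: (status, installed_version_or_None)}."""
    report = {}
    for name, wanted in required.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = ("missing", None)
            continue
        found = getattr(module, "__version__", None)
        # Builds such as "1.10.0+cu113" carry a local suffix; compare the base part.
        base = found.split("+")[0] if found else None
        status = "ok" if base == wanted else "mismatch"
        report[name] = (status, found)
    return report


if __name__ == "__main__":
    for name, (status, found) in check_versions().items():
        print(f"{name:15s} required={REQUIRED[name]:8s} installed={found} [{status}]")
```

A "mismatch" does not necessarily mean LMDisorder will fail; the versions above are simply the ones listed for this repository.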

Pretrained language model

To run LMDisorder, you need the pretrained ProtTrans language model. Download the pretrained ProtT5-XL-UniRef50 model (guide). It takes about 11.3 GB on disk (download size: 5.3 GB).
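
For orientation, the sketch below shows one common way to obtain per-residue ProtT5 embeddings with the `transformers` library. It illustrates the general technique only and is not necessarily how LMDisorder_predict.py does it: the function names `prepare_sequence` and `embed` and the `model_dir` argument (the local folder holding the downloaded model) are assumptions made for this example.

```python
import re


def prepare_sequence(seq):
    """Format a protein sequence the way ProtT5 tokenizers expect."""
    seq = seq.strip().upper()
    # Rare or ambiguous residues (U, Z, O, B) are mapped to X,
    # following the usage examples published with ProtTrans.
    seq = re.sub(r"[UZOB]", "X", seq)
    # Each residue is one token, so residues are separated by spaces.
    return " ".join(seq)


def embed(seq, model_dir, device="cpu"):
    """Return a (sequence_length, 1024) tensor of per-residue embeddings.

    model_dir is a hypothetical path to the downloaded ProtT5-XL-UniRef50 files.
    """
    # Imported lazily so prepare_sequence() works without these packages.
    import torch
    from transformers import T5EncoderModel, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_dir, do_lower_case=False)
    model = T5EncoderModel.from_pretrained(model_dir).to(device).eval()

    spaced = prepare_sequence(seq)
    n_residues = len(spaced.split())
    batch = tokenizer([spaced], return_tensors="pt")
    with torch.no_grad():
        out = model(
            input_ids=batch["input_ids"].to(device),
            attention_mask=batch["attention_mask"].to(device),
        )
    # Drop the trailing end-of-sequence token so one row remains per residue.
    return out.last_hidden_state[0, :n_residues]
```

Loading the model once and reusing it across sequences is much cheaper than reloading it per call; the single-call form above is kept only for brevity.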

Run LMDisorder for prediction

Simply run:

python LMDisorder_predict.py --fasta ./example/demo.fasta --device 'cpu' --model_path ./model/model.pkl

The prediction results will be saved in:

./example/result

We also provide the corresponding canonical prediction results in ./example/demo_result for your reference.
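
To run the command above over several FASTA files, a thin wrapper can assemble the same arguments for each input. This is a sketch under stated assumptions: `build_command` and `run_all` are hypothetical helpers, only the `--fasta`, `--device` and `--model_path` flags shown above are used, and where each run writes its output is decided by LMDisorder_predict.py itself.

```python
import subprocess
import sys
from pathlib import Path


def build_command(fasta, device="cpu", model_path="./model/model.pkl",
                  script="LMDisorder_predict.py"):
    """Assemble the argument list for one prediction run."""
    return [
        sys.executable, script,
        "--fasta", str(fasta),
        "--device", device,
        "--model_path", str(model_path),
    ]


def run_all(fasta_dir, **kwargs):
    """Run the predictor once per *.fasta file found in fasta_dir."""
    for fasta in sorted(Path(fasta_dir).glob("*.fasta")):
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(build_command(fasta, **kwargs), check=True)
```

For example, `run_all("./example")` would invoke the script for every `.fasta` file in `./example`, one file at a time.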

Dataset and model

We provide the datasets and the trained LMDisorder models here for those interested in reproducing our paper. The datasets used in this study are stored in ./datasets/. The trained LMDisorder models can be found under ./model/.

Contact

Yidong Song ([email protected])
Yuedong Yang ([email protected])

