Commit
Merge pull request #4 from jrurogers/master
Updates for corrected HSM models
Showing 62 changed files with 26,107 additions and 45 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
A | ||
C | ||
D | ||
E | ||
F | ||
G | ||
H | ||
I | ||
K | ||
L | ||
M | ||
N | ||
P | ||
Q | ||
R | ||
S | ||
T | ||
V | ||
W | ||
Y | ||
y |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,4 +18,4 @@ T | |
V | ||
W | ||
Y | ||
y | ||
y |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,8 @@ | ||
-PTP,phosphosite,146,15,1,PTP.npz | ||
-PTB,phosphosite,94,15,1,PTB.npz | ||
-Kinase_TK,phosphosite,190,15,1,Kinase_TK.npz | ||
-SH3,polyproline,51,16,0,SH3.npz | ||
-SH2,phosphosite,92,15,1,SH2.npz | ||
-PDZ,c-terminus,75,6,1,PDZ.npz | ||
-WW,polyproline,32,13,0,WW.npz | ||
-WH1,polyproline,105,8,0,WH1.npz | ||
+SH2,phosphosite,92,15,1,SH2.npz | ||
+WH1,polyproline,105,8,0,WH1.npz | ||
+WW,polyproline,32,13,0,WW.npz | ||
+PTB,phosphosite,94,15,1,PTB.npz | ||
+PDZ,c-terminus,75,8,1,PDZ.npz | ||
+Kinase_TK,phosphosite,192,15,1,Kinase_TK.npz | ||
+SH3,polyproline,51,16,0,SH3.npz | ||
+PTP,phosphosite,146,15,1,PTP.npz |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# Description | ||
|
||
This directory contains code used for the publication, specifically to evaluate external models and PSSMs, analyze HSM models and HSM/P predictions, and create all figures. Detailed descriptions are provided in each subdirectory. | ||
|
||
## External models | ||
|
||
Code used to evaluate NetPhorest 2.1 predictions on HSM datasets is provided in `netphorest/`. Code used to evaluate PepInt predictions on HSM datasets is provided in `pepint/`. | ||
|
||
## PSSMs | ||
|
||
Code to construct benchmark PSSMs for each PBD family is provided in `pssm/`. | ||
|
||
## Analysis and results presented in manuscript | ||
|
||
All code used to perform analysis presented in the manuscript and create figures is provided in `results/`. Data underlying the figures is released on [figshare (doi:10.6084/m9.figshare.22105529)](https://doi.org/10.6084/m9.figshare.22105529), with a directory structure paralleling that of this directory. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# NetPhorest predictions | ||
|
||
NetPhorest 2.1 was used to make predictions for PTB, SH2, PTP, and TK domains for comparison to HSM. See `driver.sh` for the commands executed. To use the bash script, set the following variables: | ||
|
||
* `domain`: Specify the PBD family to make predictions for. Must be one of `PTB`, `SH2`, `Kinase_TK`, or `PTP`. | ||
* `data_path`: Specify path to data to make predictions for. This must be raw, unaligned data provided in csv format for a single PBD and must contain the columns (with header): | ||
``` | ||
Domain UniProt ID,Domain Sequence,Peptidic Sequence,Bound | ||
``` | ||
Raw HSM data (`/data/data_without_processed_duplicates/raw_data/`) can be downloaded from [figshare (doi:10.6084/m9.figshare.22105529)](https://doi.org/10.6084/m9.figshare.22105529). | ||
* `NETPHOR`: Specify the NetPhorest 2.1 executable. Download `NetPhorest_human_2.1.zip` from [http://netphorest.science/download.shtml](http://netphorest.science/download.shtml) and compile per instructions. | ||
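
The column requirement above can be checked before running the driver. The following is a minimal sketch, assuming only the four column names listed in this README; the helper name `check_columns` and the example row are hypothetical and not part of the repository.

```python
import csv
import io

# Column names taken from the README text above.
REQUIRED_COLUMNS = [
    "Domain UniProt ID",
    "Domain Sequence",
    "Peptidic Sequence",
    "Bound",
]


def check_columns(handle):
    """Return the required columns missing from the CSV header (hypothetical helper)."""
    reader = csv.reader(handle)
    header = next(reader, [])  # empty file -> every column is reported missing
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Made-up example content; real files come from the figshare release.
example = io.StringIO(
    "Domain UniProt ID,Domain Sequence,Peptidic Sequence,Bound\n"
    "P00000,MKV,AAAYAAA,1\n"
)
missing = check_columns(example)
print("missing columns:", missing)
```

For a real file, the same function could be called on `open(path, newline="")`, where `path` would be `$data_path/$domain.csv` as built in `driver.sh`.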
|
||
To make predictions with the HSM data, each domain was mapped to its NetPhorest model; these mappings are provided in `map_domain_netphorest_model`. The mapping uses either UniProt IDs (if only one domain of the specified PBD family is found in the protein) or raw domain sequences. For further information about the NetPhorest models, see [Miller, et al. Linear Motif Atlas for Phosphorylation-Dependent Signaling (2008)](https://www.science.org/doi/10.1126/scisignal.1159433) and [Horn, et al. KinomeXplorer: an integrated platform for kinome biology studies (2014)](https://www.nature.com/articles/nmeth.2968). |
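
The mapping described above can be illustrated with a short lookup. This is a sketch under stated assumptions, not the logic of `process_netphorest_predictions.py`: the column names come from the mapping files in `map_domain_netphorest_model`, but the interpretation of `Use sequence` (1 means match on the domain sequence, otherwise match on the UniProt ID), the helper names, and the third example row are all hypothetical.

```python
import csv
import io


def load_mapping(handle):
    """Build lookup tables from a mapping CSV (hypothetical helper).

    Assumption: rows with `Use sequence` == 1 are keyed by domain sequence,
    and all other rows are keyed by UniProt ID.
    """
    by_id, by_sequence = {}, {}
    for row in csv.DictReader(handle):
        if row["Use sequence"].strip() == "1":
            by_sequence[row["Sequence"]] = row["NetPhorest Model"]
        else:
            by_id[row["Domain UniProt ID"]] = row["NetPhorest Model"]
    return by_id, by_sequence


def lookup_model(uniprot_id, sequence, by_id, by_sequence):
    """Prefer a sequence match, then fall back to the UniProt ID."""
    if sequence in by_sequence:
        return by_sequence[sequence]
    return by_id.get(uniprot_id)  # None when the domain has no mapped model


# The first two rows appear in PTB_netphorest_model.csv; the last row is made up
# to show the sequence-based case.
example = io.StringIO(
    "Domain UniProt ID,NetPhorest Model,Use sequence,Sequence\n"
    "Q9UKG1,APPL,0,\n"
    "Q6S5L8,SHC4,0,\n"
    "X00000,Example_group,1,MKVLAA\n"
)
by_id, by_sequence = load_mapping(example)
print(lookup_model("Q6S5L8", "", by_id, by_sequence))
print(lookup_model("X00000", "MKVLAA", by_id, by_sequence))
```

Keeping two separate tables avoids ambiguity when a protein contains more than one domain of the same family, which is the case the README gives for using sequences instead of IDs.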
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
#!/bin/bash | ||
|
||
domain= # name of PBD to make predictions for | ||
data_path= # path to location of raw (unaligned) data | ||
|
||
NETPHOR= # netphorest 2.1 executable | ||
|
||
if [ ! -d "$domain" ]; then mkdir "$domain"; fi | ||
cd "$domain" | ||
# write fasta for netphorest predictions | ||
# using raw data file since has UniProt IDs & netphorest independently aligns | ||
data=$data_path/$domain.csv | ||
python ../write_fasta.py $data | ||
|
||
# make predictions with netphorest | ||
if [ "$domain" == "Kinase_TK" ]; then | ||
classifier='KIN' | ||
else | ||
classifier=$domain | ||
fi | ||
cat peptides.fasta | $NETPHOR | grep $classifier > netphorest_predictions.tab | ||
|
||
# process predictions | ||
python ../process_netphorest_predictions.py netphorest_predictions.tab $data ../map_domain_netphorest_model/${domain}_netphorest_model.csv | ||
cd .. |
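
The classifier selection and the `grep` step in the script above can be expressed in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and the example lines are placeholders that do not reproduce the real NetPhorest output format.

```python
def classifier_for(domain):
    """Mirror the if/else in driver.sh: Kinase_TK predictions are tagged KIN."""
    return "KIN" if domain == "Kinase_TK" else domain


def filter_predictions(lines, classifier):
    """Keep lines that contain the classifier string.

    This matches `grep $classifier` for the plain names used here
    (KIN, SH2, PTB, PTP), which contain no regex metacharacters.
    """
    return [line for line in lines if classifier in line]


# Placeholder lines, not real NetPhorest output.
lines = ["row-a\tKIN\tplaceholder", "row-b\tSH2\tplaceholder"]
kept = filter_predictions(lines, classifier_for("Kinase_TK"))
print(kept)
```

Like `grep`, this is a substring match over the whole line, so a line would also be kept if the classifier string appeared in another field.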
57 changes: 57 additions & 0 deletions
publication_analysis/netphorest/map_domain_netphorest_model/Kinase_TK_netphorest_model.csv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
Domain UniProt ID,NetPhorest Model,Use sequence,Sequence | ||
P21709,Eph_group,0, | ||
P29317,Eph_group,0, | ||
P29320,Eph_group,0, | ||
P54764,Eph_group,0, | ||
P54756,Eph_group,0, | ||
Q9UF33,Eph_group,0, | ||
Q15375,Eph_group,0, | ||
P29322,Eph_group,0, | ||
Q5JZY3,Eph_group,0, | ||
P54762,Eph_group,0, | ||
P29323,Eph_group,0, | ||
P54753,Eph_group,0, | ||
P54760,Eph_group,0, | ||
O15197,Eph_group,0, | ||
P00519,Abl_group,0, | ||
P42684,Abl_group,0, | ||
P07333,FLT3_CSF1R_Kit_PDGFR_group,0, | ||
P10721,FLT3_CSF1R_Kit_PDGFR_group,0, | ||
P36888,FLT3_CSF1R_Kit_PDGFR_group,0, | ||
P16234,FLT3_CSF1R_Kit_PDGFR_group,0, | ||
P09619,FLT3_CSF1R_Kit_PDGFR_group,0, | ||
O60674,JAK2,0, | ||
O60674,JAK2,0, | ||
P35968,KDR_FLT1_group,0, | ||
P17948,KDR_FLT1_group,0, | ||
P08581,Met_group,0, | ||
Q04912,Met_group,0, | ||
P43405,Syk_group,0, | ||
P43403,Syk_group,0, | ||
Q06187,Tec_group,0, | ||
P51813,Tec_group,0, | ||
Q08881,Tec_group,0, | ||
P42680,Tec_group,0, | ||
P42681,Tec_group,0, | ||
P04629,Trk_group,0, | ||
Q16620,Trk_group,0, | ||
Q16288,Trk_group,0, | ||
P29597,Tyk2,0, | ||
P00533,EGFR_group,0, | ||
P04626,EGFR_group,0, | ||
P21860,EGFR_group,0, | ||
Q15303,EGFR_group,0, | ||
P06213,InsR_group,0, | ||
P08069,InsR_group,0, | ||
P14616,InsR_group,0, | ||
P51451,Src_group,0, | ||
P08631,Src_group,0, | ||
P07948,Src_group,0, | ||
P06239,Src_group,0, | ||
P09769,Src_group,0, | ||
P42685,Src_group,0, | ||
P06241,Src_group,0, | ||
P12931,Src_group,0, | ||
P07947,Src_group,0, | ||
Q13882,Src_group,0, | ||
Q9H3Y6,Src_group,0, |
7 changes: 7 additions & 0 deletions
publication_analysis/netphorest/map_domain_netphorest_model/PTB_netphorest_model.csv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
Domain UniProt ID,NetPhorest Model,Use sequence,Sequence | ||
Q9UKG1,APPL,0, | ||
Q8NEU8,APPL,0, | ||
P29353,SHC1_SHC2_SHC3_group,0, | ||
P98077,SHC1_SHC2_SHC3_group,0, | ||
Q92529,SHC1_SHC2_SHC3_group,0, | ||
Q6S5L8,SHC4,0, |
23 changes: 23 additions & 0 deletions
publication_analysis/netphorest/map_domain_netphorest_model/PTP_netphorest_model.csv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
Domain UniProt ID,NetPhorest Model,Use sequence,Sequence | ||
Q12923,PTPN13,0, | ||
Q9H3S7,PTPN23,0, | ||
P26045,PTPN3,0, | ||
P29074,PTPN4,0, | ||
P43378,PTPN9,0, | ||
P23468,R2A_group,0, | ||
P10586,R2A_group,0, | ||
Q13332,R2A_group,0, | ||
P23467,R3_group,0, | ||
Q9HD43,R3_group,0, | ||
Q12913,R3_group,0, | ||
Q16827,R3_group,0, | ||
Q9UMZ3,R3_group,0, | ||
P18433,R4_group,0, | ||
P23469,R4_group,0, | ||
P18031,NT1_group,0, | ||
P17706,NT1_group,0, | ||
P29350,NT2_group,0, | ||
Q06124,NT2_group,0, | ||
Q05209,NT4_group,0, | ||
Q99952,NT4_group,0, | ||
Q9Y2R2,NT4_group,0, |