Releases: antigenomics/vdjdb-db
[CONTENT] Mus Musculus TCRs
Description
We've added a large dataset of paired alpha-beta TCR sequences from mouse and human subjects. The database now contains more than a thousand records for Mus Musculus, a species that was nearly absent in previous versions.
Current statistics
species | gene | records | paired.records | epitopes |
---|---|---|---|---|
HomoSapiens | TRA | 1372 | 608 | 78 |
HomoSapiens | TRB | 4991 | 590 | 117 |
MacacaMulatta | TRA | 74 | 0 | 1 |
MacacaMulatta | TRB | 1313 | 0 | 3 |
MusMusculus | TRA | 1286 | 1286 | 22 |
MusMusculus | TRB | 1309 | 1309 | 22 |
[INFRASTRUCTURE] [CONTENT] Getting DOI from Zenodo
- First DOI release
- Added datasets submitted by Sewell et al. (2k+ sequences)
- Minor change to scoring (lower frequency thresholds)
[INFRASTRUCTURE] [PROOFREADING] Minor fixes & info added
- Proofreading fixes
- Fixed V/J mapping report messages
- Added info on the number of samples/papers in which a given TCR:pMHC was found
[INFRASTRUCTURE] [PROOFREADING] VDJdb web interface compatibility
- Modified some database output files for better interoperability with VDJdb-server
- Fixed several chunks (removed trailing residues after the conserved Phe, etc.)
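The trailing-residue fix can be pictured with the minimal sketch below, which cuts a CDR3 amino acid sequence just after the conserved Phe (or Trp) that opens the J-segment `[FW]GXG` motif. The function name `trim_cdr3`, the choice of this motif as the anchor, and the example sequence are assumptions made for illustration; they are not taken from the repository's code.

```python
import re

# Assumption: the conserved Phe/Trp is located via the J-segment [FW]GXG motif.
J_MOTIF = re.compile(r"[FW]G.G")


def trim_cdr3(cdr3aa):
    """Drop residues that follow the conserved Phe/Trp, if the motif is present."""
    matches = list(J_MOTIF.finditer(cdr3aa))
    if not matches:
        # No motif found: leave the sequence untouched.
        return cdr3aa
    # Use the last occurrence, since the J-segment motif sits at the 3' end.
    last = matches[-1]
    # Keep everything up to and including the conserved residue.
    return cdr3aa[: last.start() + 1]


# Illustrative input with residues trailing the conserved Phe.
print(trim_cdr3("CASSLGQAYEQYFGPGTRL"))
```

Sequences that already end at the conserved residue pass through unchanged, so the function can be applied to every record without checking first.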
[INFRASTRUCTURE] V/J re-mapping
A better algorithm for CDR3AA markup, V/J segment mapping and V/J segment deduction (for cases where the segment was not provided).
[CONTENT] Winter release
Papers added
PMID:15753288 #144
PMID:16982909 #166
PMID:17121793 #152
PMID:18802118 #156
PMID:19349463 #140
PMID:21135165 #142
PMID:21752903 #138
PMID:22278241 #134
PMID:24600035 #146
PMID:27111229 #150
PMID:27645996 #148
Current statistics
species | gene | records | paired.records | epitopes |
---|---|---|---|---|
HomoSapiens | TRA | 619 | 327 | 77 |
HomoSapiens | TRB | 3452 | 325 | 113 |
MacacaMulatta | TRA | 74 | 0 | 1 |
MacacaMulatta | TRB | 1313 | 0 | 3 |
MusMusculus | TRA | 16 | 16 | 16 |
MusMusculus | TRB | 17 | 17 | 16 |
[CONTENT] Monthly release
[INFRASTRUCTURE] Updated VDJdb confidence scoring system
Improvements
- Completely reworked VDJdb confidence scoring (a more transparent record confidence system)
- Four confidence levels: 0 (no data), 1 (moderate), 2 (high) and 3 (very high confidence)
- Now accounts for both sequencing and specificity
- Changed how identification and in vitro validation methodology is scored; extensively validated records receive the highest score
- Added several new `method` annotations
- More flexible parsing of `method.frequency` (`x/X` is now optional)
- Small update to patches (correction of typos and formatting issues in gene and species naming)
See README for more details and additional guidelines.
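One way to picture the more permissive `method.frequency` parsing is the sketch below. It accepts a count pair in the `x/X` form, a percentage, or a bare decimal fraction. The set of accepted formats and the function name `parse_frequency` are assumptions made for this example only; the README remains the reference for what the database actually accepts.

```python
import re

# Assumption: "x/X" means "x cells or clonotypes out of X".
_COUNT_PAIR = re.compile(r"^(\d+)\s*/\s*(\d+)$")


def parse_frequency(value):
    """Return the frequency as a float, or None if the value cannot be parsed."""
    text = value.strip()
    if not text:
        return None
    pair = _COUNT_PAIR.match(text)
    if pair:
        numerator, denominator = int(pair.group(1)), int(pair.group(2))
        # A zero denominator carries no usable information.
        return numerator / denominator if denominator > 0 else None
    try:
        if text.endswith("%"):
            return float(text[:-1]) / 100.0
        return float(text)
    except ValueError:
        return None


for raw in ("5/20", "50%", "0.5", "n/a"):
    print(raw, "->", parse_frequency(raw))
```

Returning `None` instead of raising keeps unparseable entries visible to the caller, which can then decide whether to skip or flag the record.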
[CONTENT] [INFRASTRUCTURE] Another five papers added, some additions to infrastructure
Papers added:
PMID_10756006 #23
PMID_10925283 #22
PMID_11756174 #27
PMID_22314361 #75
PMID_24069285 #20
Papers withheld:
PMID_7964506 #4
Infrastructure
- Added a "correct" step that fixes ambiguous antigen species/gene naming
- Basic database summary generation
- A draft of an algorithm to optimize the VDJdb search scoring system
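A summary of the kind shown in the "Current statistics" tables could be produced along the lines of the sketch below. It assumes each flattened row is a dict with the hypothetical keys `species`, `gene`, `complex.id` and `antigen.epitope`, and that a row counts as paired when its complex id is non-zero. These names and the counting rule are assumptions for illustration, not the repository's implementation.

```python
from collections import defaultdict


def summarize(rows):
    """Count records, paired records and distinct epitopes per (species, gene)."""
    stats = defaultdict(lambda: {"records": 0, "paired": 0, "epitopes": set()})
    for row in rows:
        entry = stats[(row["species"], row["gene"])]
        entry["records"] += 1
        # Assumption: a non-zero complex id marks a paired record.
        if row["complex.id"] != 0:
            entry["paired"] += 1
        entry["epitopes"].add(row["antigen.epitope"])
    return {
        key: {
            "records": entry["records"],
            "paired.records": entry["paired"],
            "epitopes": len(entry["epitopes"]),
        }
        for key, entry in sorted(stats.items())
    }


def to_table(summary):
    """Render the summary in the pipe-separated layout used by the release notes."""
    lines = ["species | gene | records | paired.records | epitopes |",
             "---|---|---|---|---|"]
    for (species, gene), counts in summary.items():
        lines.append(
            f"{species} | {gene} | {counts['records']} | "
            f"{counts['paired.records']} | {counts['epitopes']} |"
        )
    return "\n".join(lines)
```

Calling `print(to_table(summarize(rows)))` on a list of such rows yields one table line per species and gene combination.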
[CONTENT] [INFRASTRUCTURE] Five more papers & db management fixes/modifications
Papers added
PMID_12504586.txt
PMID_16326979.txt
PMID_24512815.txt
PMID_27252176.txt
PMID_7964506.txt
Minor
- Additional keyword for "direct validation" records, i.e. single cell affinity measurement & sequencing
- Also fixed a bug that resulted in single-chain records having a non-zero complex id in the flattened database
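The intended behaviour after the complex id fix can be sketched as follows: chains from a paired record share one non-zero complex id, while a single-chain record gets complex id 0. The field names `cdr3.alpha`, `cdr3.beta`, `complex.id`, `gene` and `cdr3`, and the function name `flatten`, are hypothetical and chosen only for this example.

```python
def flatten(records):
    """Turn per-record dicts into one row per chain with a complex id."""
    rows = []
    next_complex_id = 1
    for record in records:
        # Collect the chains that are actually present in this record.
        chains = [
            (gene, record[key])
            for gene, key in (("TRA", "cdr3.alpha"), ("TRB", "cdr3.beta"))
            if record.get(key)
        ]
        if len(chains) == 2:
            # Paired record: both chains share a fresh non-zero id.
            complex_id = next_complex_id
            next_complex_id += 1
        else:
            # Single-chain record: complex id must stay 0.
            complex_id = 0
        for gene, cdr3 in chains:
            rows.append({"complex.id": complex_id, "gene": gene, "cdr3": cdr3})
    return rows
```

Keeping 0 reserved for single-chain records lets downstream code detect pairing with a simple non-zero check.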