Background variants
This page describes the use of "background" variants when building your own panels for AMR and/or lineage calling. This can be essential, depending on the species of interest, to prevent missed calls.
If you are missing calls from mykrobe after making your own panel, the most likely reason is that background SNPs must be used. Read this page to understand and learn how to fix the problem.
Mykrobe works by looking for perfectly matching kmers. Suppose we have a variant of interest `C100T` (a `C` to `T` DNA change at position 100 in the genome). If a sample has a SNP within one kmer length of position 100, say `G95T`, then kmers spanning both positions no longer match, resulting in a false negative call from mykrobe. Mykrobe solves this problem by allowing you to supply a catalog of "background variants" that are used when running `make-probes`. Mykrobe will then generate combinations of probes that do or do not have the background variants. For example, supplying the `G95T` variant would result in two probes for `C100T`: one that has a `G` at position 95 and the other a `T` at position 95.
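The effect of a nearby SNP on kmer matching can be illustrated with a toy example. The sketch below is not mykrobe's implementation: the 11-base sequences, the value k=5, and the helper `shared_kmers` are all made up for illustration. It counts how many kmers of a probe occur exactly in a sample that carries both the variant of interest and a background SNP.

```shell
# Toy illustration only: made-up 11-base sequences and k=5.
# This is not mykrobe's implementation; shared_kmers is a hypothetical helper.

# Count how many kmers of a probe occur exactly in a sample sequence.
# Arguments: probe sequence, sample sequence, kmer length.
shared_kmers() {
    awk -v probe="$1" -v sample="$2" -v k="$3" 'BEGIN {
        n = 0
        for (i = 1; i + k - 1 <= length(probe); i++)
            if (index(sample, substr(probe, i, k)) > 0) n++
        print n
    }'
}

# Variant of interest at position 7 (C to T); background SNP at position 4 (G to T).
probe_variant_only=ACGGTATCTAG      # alt allele at 7, reference G at 4
probe_with_background=ACGTTATCTAG   # alt allele at 7, background T at 4
sample=ACGTTATCTAG                  # sample carries both changes

echo "variant-only probe:    $(shared_kmers "$probe_variant_only" "$sample" 5) of 7 kmers match"
echo "with-background probe: $(shared_kmers "$probe_with_background" "$sample" 5) of 7 kmers match"
```

With these toy sequences, the variant-only probe loses every kmer that overlaps position 4 (3 of 7 kmers still match), while the probe that includes the background SNP matches along its full length.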
If you are building a panel for TB, then you can skip the part below about making your own VCF files. Instead, get VCF files with background variants from here: https://figshare.com/articles/dataset/Mykrobe_TB_panel_background_variants/19582597.
You will need MongoDB installed, and your background SNPs in one or more VCF files. The idea is that the SNPs are added to a database, which is then used when running `make-probes`.
The VCF file must have the `GT` and `GT_CONF` fields present in every record. Only records with a non-reference genotype and with `GT_CONF` > 1 are used (there may be other requirements - to be documented). Here is an example of a record that will be used:
ref_name 42 12 G T 255 PASS SVTYPE=SNP GT:GT_CONF 1/1:100
That VCF record would add the variant `G42T` to the backgrounds database.
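Before loading a VCF file, it can help to preview which records meet the two criteria stated above. The following is a minimal sketch, not part of mykrobe: `preview_background_records` is a hypothetical helper, it assumes a tab-delimited single-sample VCF, and it checks only for a non-reference `GT` and `GT_CONF` > 1. Mykrobe may apply further requirements, so treat the output as a rough guide.

```shell
# Hypothetical helper, not part of mykrobe. Assumes a tab-delimited,
# single-sample VCF. Prints REF, POS and ALT (for example G42T) for each
# record that has a non-reference GT and GT_CONF greater than 1.
preview_background_records() {
    awk -F'\t' '
        /^#/ { next }                        # skip header lines
        {
            n = split($9, keys, ":")         # FORMAT column, e.g. GT:GT_CONF
            split($10, vals, ":")            # sample column, e.g. 1/1:100
            gt = ""; conf = ""
            for (i = 1; i <= n; i++) {
                if (keys[i] == "GT") gt = vals[i]
                if (keys[i] == "GT_CONF") conf = vals[i]
            }
            if (gt == "" || conf == "") next # a required field is missing
            nonref = 0
            m = split(gt, alleles, "[/|]")   # handles 1/1 and 1|1
            for (i = 1; i <= m; i++)
                if (alleles[i] != "0" && alleles[i] != ".") nonref = 1
            if (nonref && conf + 0 > 1) print $4 $2 $5
        }
    ' "$1"
}
```

Running `preview_background_records variants.vcf` on a file containing the example record above would list `G42T`.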
You will need MongoDB running in the background, and then run `mykrobe variants add` once for each VCF file to add background variants. Here are example commands, where the variants are in `variants.vcf`:
# Start the database
ref_fa=NC_012345.fasta
db_name=my_db
db=$PWD/mongo-db/
mkdir -p "$db"
mongod --quiet --dbpath $db &
sleep 5
# Add variants. Run this command once for each VCF file.
# Note: you can put anything in the -m option, it is just
# the name of the source of the variants. Here we put 'samtools'
mykrobe variants add -f --db_name $db_name variants.vcf $ref_fa -m samtools
# Run make probes - note the --db_name option
mykrobe variants make-probes \
--db_name $db_name \
-k21 \
-t amino_acid_variants.txt \
-g NC_012345.gbk \
$ref_fa > probes.fa
mongod --shutdown --dbpath $db
The final `make-probes` command is described in detail in the custom panels help page. Essentially, use it as described there, but add in the `--db_name foo` option to include background variants.
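When the background variants are spread over several VCF files, the "once for each VCF file" step can be wrapped in a loop. The sketch below is one possible way to do that, using only the `mykrobe variants add` options shown on this page. The function name `add_background_vcfs` is hypothetical, and using each file's base name as the `-m` label is an arbitrary choice (any label is accepted there). It assumes MongoDB is already running.

```shell
# Hypothetical wrapper, not part of mykrobe. Assumes mongod is already running.
# Usage: add_background_vcfs DB_NAME REF_FASTA FILE.vcf [FILE.vcf ...]
add_background_vcfs() {
    bg_db_name=$1
    bg_ref_fa=$2
    shift 2
    for bg_vcf in "$@"; do
        # Skip files that are missing or empty instead of failing part-way.
        if [ ! -s "$bg_vcf" ]; then
            echo "skipping missing or empty file: $bg_vcf" >&2
            continue
        fi
        # Label each batch with the file's base name (arbitrary choice for -m).
        mykrobe variants add -f --db_name "$bg_db_name" "$bg_vcf" "$bg_ref_fa" \
            -m "$(basename "$bg_vcf" .vcf)" || return 1
    done
}
```

For example, `add_background_vcfs my_db NC_012345.fasta backgrounds/*.vcf` would add every VCF file in a `backgrounds` directory (a made-up path) to the same database before running `make-probes`.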