If I give themisto build a file containing just a sequence and its reverse complement, extract-unitigs generates two different unitigs. Is this the expected behavior?
For example, if 80.fna contains such a pair and I then run
themisto build -k 31 -i 80.fna -o 80.k31 --temp-dir .
themisto extract-unitigs -i 80.k31 --colors-out 80.k31.colors --gfa-out 80.k31.gfa
I get two lines in the colors file and two segments in the GFA file:
H VN:Z:1.0
S 86 ATCAGCAGCGACATGGCGGTCATCACCGTAGTCGAGGCAAGCAATAATGGACGGCGCCCGACGTGGTCGATGATCGCAGA
S 77 TCTGCGATCATCGACCACGTCGGGCGCCGTCCATTATTGCTTGCCTCGACTACGGTGATGACCGCCATGTCGCTGCTGAT
Yes, this is expected. Our index structure is not aware of reverse complements.
We could add a flag to extract-unitigs to compute the bidirected de Bruijn graph for better interoperability with other tools. Meanwhile, you can work around this by concatenating the input with its reverse complement before building the index. This will create two copies of each unitig, one for the forward strand and one for the reverse complement (except for unitigs that are their own reverse complement, which appear only once). You can extract the bidirected de Bruijn graph from this with some post-processing.
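To make the workaround concrete, here is a minimal Python sketch of its two ends: appending reverse complements to the input before building, and collapsing reverse-complement pairs of unitigs afterwards. None of this is part of Themisto. The function names and the `_rc` header suffix are made up for illustration, and the sketch does not address how colors end up assigned to the added records or how GFA links would be merged.

```python
# Hypothetical helper functions, not part of Themisto.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (non-ACGT characters are left unchanged)."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def with_reverse_complements(records):
    """Yield each record followed by its reverse complement.

    The "_rc" header suffix is an arbitrary choice made for this sketch.
    """
    for header, seq in records:
        yield header, seq
        yield header + "_rc", reverse_complement(seq)


def canonical_unitigs(gfa_lines):
    """Group GFA segment IDs by canonical sequence.

    The canonical form is the lexicographically smaller of a sequence and
    its reverse complement, so both orientations of a unitig share one key.
    """
    groups = {}
    for line in gfa_lines:
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "S":
            seq = fields[2]
            canon = min(seq, reverse_complement(seq))
            groups.setdefault(canon, []).append(fields[1])
    return groups
```

To use it, write the output of `with_reverse_complements(read_fasta(open("80.fna")))` to a new FASTA file, build the index from that file, and pass the lines of the resulting GFA file to `canonical_unitigs`. Each key of the returned dictionary then stands for one unitig of the bidirected graph, with the segment IDs of its forward and reverse copies as the value.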