Python program containing classes that solve the shortest common superstring (SCS) problem for a set of string fragments.
scs=ShortestCommonSuperstring(list) #e.g. list=['ATGC', 'TGCC', 'GCCA']
obj1=ShortestCommonSuperstring()
obj1.load_seq(sequence, k) #k is an integer for length of k-mers
scs=obj1.scs()
This class takes a sequence and breaks it down into k-mers. It can also return the unique k-mers.
dna=DNA(sequence)
all_kmers=dna.all_kmer(k) #k is an integer for length of k-mers
unique_kmers=dna.unique_kmers() #all_kmers() must be run first
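The k-mer step described above can be sketched as follows. This is a minimal illustration, not the DNA class itself: the stand-alone functions kmers_of and dedupe_kmers are hypothetical names, and the sketch assumes that "all k-mers" means every overlapping window of length k and that "unique" means duplicates are dropped while first-occurrence order is kept.

```python
def kmers_of(sequence, k):
    """Return every length-k substring of sequence, in order, duplicates included.

    Hypothetical helper for illustration; not part of the DNA class.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def dedupe_kmers(kmers):
    """Drop repeated k-mers while keeping first-occurrence order (assumed behaviour)."""
    return list(dict.fromkeys(kmers))


all_kmers = kmers_of("ATGCATG", 3)
print(all_kmers)                # ['ATG', 'TGC', 'GCA', 'CAT', 'ATG']
print(dedupe_kmers(all_kmers))  # ['ATG', 'TGC', 'GCA', 'CAT']
```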
This class builds a directed graph whose edge weights are the overlaps between k-mers. It recursively and greedily merges the k-mers with the maximum overlap, reducing the list until either the SCS is found or there are no further overlaps.
scs=ShortestCommonSuperstring(list) #e.g. list=['ATGC', 'TGCC', 'GCCA']
scs.kmers is a list object that contains the shortest common superstring. It may contain multiple strings if the program is unable to resolve the fragments into one.
obj1=ShortestCommonSuperstring()
You will get a message: Warning! No kmers provided. You can load sequences using load_seq() function.
obj1.load_seq(sequence, k) #k is an integer for length of k-mers
Finding SCS:
scs=obj1.scs()
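The greedy merging that ShortestCommonSuperstring is described as doing can be sketched as below. This is an independent illustration under stated assumptions, not the class's actual code: the functions overlap and greedy_merge are hypothetical names, the loop is iterative rather than recursive, ties are broken by taking the first best pair found, and fragments fully contained inside another fragment are not treated specially. Greedy merging is a heuristic, so the result is a common superstring but is not guaranteed to be the shortest one.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_merge(fragments):
    """Merge the pair with the largest overlap until no overlaps remain.

    Hypothetical helper for illustration. Returns a list, which holds more
    than one string when the fragments cannot be joined into one.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        # Every ordered pair (i, j) is a directed edge weighted by its overlap.
        for i, j in permutations(range(len(frags)), 2):
            n = overlap(frags[i], frags[j])
            if n > best_len:
                best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no further overlaps, stop with several strings left
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


print(greedy_merge(['ATGC', 'TGCC', 'GCCA']))  # ['ATGCCA']
print(greedy_merge(['AAA', 'CCC']))            # ['AAA', 'CCC']
```

The second call shows the unresolved case: with no overlap between the fragments, the list is returned with both strings still in it.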