Skip to content

A generalized Karkkainen-Sanders implementation in Python for building suffix arrays and trees for many strings in O(n)

Notifications You must be signed in to change notification settings

mmaz/FastGeneralizedSuffixArrays

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 

Repository files navigation

FastGeneralizedSuffixArrays

Version 0.1 markmaz.com

A generalized Karkkainen-Sanders implementation in Python for building suffix arrays and trees for many strings in O(n)

Roadmap

(Further scaling exploration (perhaps introduction of Maniscalco and Puglisi, cython porting, distributed implementation, etc) to be attempted in 1.x)

  • 0.5 Optimizations of existing code based on profiling results

  • 0.4 Library consolidation/cleanup for importability. Profiling at memory/swapping levels.

  • 0.3 Building suffix trees from suffix arrays - O(n) naturally

  • 0.2 Adding LCP/log(n) suffix lookups and generalization code (currently to be done by concatenating strings together, separated by monotonically increasing negative numbers)

  • 0.1 Initial Release: Rough and rapid protoyping of algorithm guts to spit out a suffix array for an ASCII/maybe UTF-8 string. Silly long variable names provide a quick mapping to the algorithm description

Contact me for any suggestions/additions/corrections/requests: mark at markmaz.com

About

A generalized Karkkainen-Sanders implementation in Python for building suffix arrays and trees for many strings in O(n)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages