Skip to content

Latest commit

 

History

History

dsa

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 

Data Structures and Algorithms

Collection of data structures and algorithms implemented in Python.

The minimum python version required is 3.9. Scripts of this project must be run as a module:

cd <project-root>
$ python -m src.<module>.<file> # without .py

$ # examples:
$ python -m src.sorting.quicksort
$ python -m src.tree.avl
$ python -m src.graph.path.ssp

Notes

  • The n value in most asymptotic complexity descriptions refer to the main input size, which may be a list or string size, the absolute value of a numeric parameter, the size of a data structure, etc. Other complexity variables are usually described in the comments or in the code.

The time complexity in big-O notation is shown beside algorithms names. Space complexity is available in algorithms files.


Sorting Algorithms

  • bubblesort - O(n2)
  • insertionsort - O(n2)
  • selectionsort - O(n2)
  • heapsort - O(n*log(n))
  • mergesort - O(n*log(n))
  • quicksort
    • quicksort hoare's partition - average: O(n*log(n)), worst: O(n2)
    • quicksort lomuto's partition - average: O(n*log(n)), worst: O(n2)
    • quicksort dual pivot partition - average: O(n*log(n)), worst: O(n2)
  • treesort (see data structures trees section) - O(n*Tree.put + Tree.__iter__))
    • treesort binary search tree - average: O(n*log(n)), worst: O(n2)
    • treesort avl - O(n*log(n))
    • treesort red-black tree - O(n*log(n))
    • treesort van emde boas - O(n*log(log(u)))
  • shellsort
    • shellsort Shell1959 - O(n2)
    • shellsort FrankLazarus1960 - O(n3/2)
    • shellsort Hibbard1963 - O(n3/2)
    • shellsort PapernovStasevich1965 - O(n3/2)
    • shellsort Knuth1973 - O(n3/2)
    • shellsort Sedgewick1982 - O(n4/3)
    • shellsort Tokuda1992 - unknown
    • shellsort Ciura2001 - unknown
  • countingsort - O(n + k)
  • bucketsort - best: O(n), average: O(n + (n2/k) + k), worst O(n2)
  • radixsort
    • radixsort least-significant-digit - O(n*w)
    • radixsort most-significant-digit - O(n*w)
  • stoogesort - O(n2.7)
  • slowsort - O(T(n)), where T(n) = T(n-1) + T(n/2)*2 + 1
  • bogosort
    • bogosort random - unbounded
    • bogosort deterministic - O((n + 1)!)

Data Structures

  • Linked[T] abstract
    • __str__ - O(Linked.__iter__)
    • __len__ - abstract
    • __iter__ - abstract
    • __contains__ - O(Linked.__iter__)
    • index - O(Linked.__iter__)
    • LinkedList[T] extends Linked[T] - space: O(n)
      • Linked.__len__ - O(1)
      • Linked.__iter__ - O(n)
      • push - O(n)
      • pop (index deletion) - O(n)
      • remove (value deletion) - O(n)
      • get (same as Linked.index, but faster) - O(n)
      • reverse - O(n)
    • Queue[T] extends Linked[T] - space: O(n)
      • Linked.__len__ - O(1)
      • Linked.__iter__ - O(n)
      • offer - O(1)
      • poll - O(1)
      • peek - O(1)
    • Stack[T] extends Linked[T] - space: O(n)
      • Linked.__len__ - O(1)
      • Linked.__iter__ - O(n)
      • push - O(1)
      • pop - O(1)
      • peek - O(1)
  • Priority[T] abstract
    • __str__ - O(Priority.__iter__)
    • __len__ - abstract
    • __iter__ - abstract
    • __contains__ - O(Priority.__iter__)
    • offer - abstract
    • poll - abstract
    • peek - abstract
    • Heap[T] extends Priority[T] - space: O(n)
      • utility
        • sift_up - O(log(n))
        • sift_down - O(log(n))
        • heapify_top_down - O(n*log(n))
        • heapify_bottom_up - O(n)
      • __init__ (top-down) - O(n*log(n))
      • __init__ (bottom-up) - O(n)
      • Priority.__len__ - O(1)
      • Priority.__iter__ - O(n*log(n))
      • Priority.offer - O(log(n))
      • Priority.poll - O(log(n))
      • Priority.peek - O(1)
    • KHeap[T] extends Priority[T] - space: O(n)
      • utility
        • sift_up - O(k*logk(n))
        • sift_down - O(k*logk(n))
        • heapify_top_down - O(n*k*logk(n))
        • heapify_bottom_up - O(n*k)
      • __init__ (top-down) - O(n*k*logk(n))
      • __init__ (bottom-up) - O(n*k)
      • __str__ (override Priority.__str__) - O(Priority.__iter__)
      • Priority.__len__ - O(1)
      • Priority.__iter__ - O(n*k*logk(n))
      • Priority.offer - O(k*logk(n))
      • Priority.poll - O(k*logk(n))
      • Priority.peek - O(1)
    • benchmark - includes trees, see data structures trees section
  • Map[K, V] abstract
    • implemented probers
      • linear probing
      • quadratic prime probing
      • quadratic triangular probing
    • __str__ - O(Map.__iter__)
    • __len__ - abstract
    • __iter__ - abstract
    • __contains__ - O(Map.get)
    • keys - O(Map.__iter__)
    • values - O(Map.__iter__)
    • put - abstract
    • take - abstract
    • get - abstract
    • contains_value - O(Map.__iter__)
    • OpenAddressingHashtable[K, V] extends Map[K, V] - space: O(n)
      • Map.__len__ - O(1)
      • Map.__iter__ - O(n)
      • Map.put - O(1) amortized
      • Map.take - O(1) amortized
      • Map.get - O(1) amortized
    • SequenceChainingHashtable[K, V] extends Map[K, V] - space: O(n)
      • Map.__len__ - O(1)
      • Map.__iter__ - O(n)
      • Map.put - O(1) amortized
      • Map.take - O(1) amortized
      • Map.get - O(1) amortized
    • benchmark - includes trees, see data structures trees section
  • Tree[K = Comparable, V] abstract extends Map[K, V] Heap[K]
    • minimum - abstract
    • maximum - abstract
    • predecessor - abstract
    • successor - abstract
    • Heap.offer - O(Map.put)
    • Heap.poll - O(Tree.minimum + Map.take)
    • Heap.peek - O(Tree.minimum)
    • Binary Search Tree - BST[K = Comparable, V] extends Tree[K, V] - space: O(n)
      • __str__ (override Map.__str__ and Priority.__str__) - O(traverse)
      • Map.__len__ , Priority.__len__ - O(1)
      • Map.__iter__ , Priority.__iter__ - O(traverse)
      • traverse (pre, in, post, breadth) - - O(n)
      • Map.put - average: O(log(n)), worst: O(n)
      • Map.take - average: O(log(n)), worst: O(n)
      • Map.get - average: O(log(n)), worst: O(n)
      • Tree.minimum - average: O(log(n)), worst: O(n)
      • Tree.maximum - average: O(log(n)), worst: O(n)
      • Tree.predecessor - average: O(log(n)), worst: O(n)
      • Tree.successor - average: O(log(n)), worst: O(n)
    • Adelson Velsky and Landis - AVL[K = Comparable, V] extends BST[K, V] - space: O(n)
      • put (override Map.put in BST) - O(log(n))
      • take (override Map.take in BST) - O(log(n))
      • all functions worst case drop to O(log(n))
    • Red-Black Tree - RBT[K = Comparable, V] extends BST[K, V] - space: O(n)
      • put (override Map.put in BST) - O(log(n))
      • take (override Map.take in BST) - O(log(n))
      • all functions worst case drop to O(log(n))
    • van Emde Boas - VEB[V] extends Tree[int, V] - space: O(n*log(log(u)))
      • __str__ (override Map.__str__ and Priority.__str__) - O(traverse)
      • Map.__len__ , Priority.__len__ - O(1)
      • Map.__iter__ , Priority.__iter__ - O(traverse)
      • traverse (pre, in, post, breadth) - O(n*log(log(u)))
      • Map.put - O(log(log(u)))
      • Map.take - O(log(log(u))
      • Map.get - O(log(log(u))
      • Tree.minimum - O(1)
      • Tree.maximum - O(1)
      • Tree.predecessor - O(log(log(u)))
      • Tree.successor - O(log(log(u)))
    • benchmark
  • DisjointSet and HashDisjointSet[T] - space: O(n)
    • __init__ - O(n)
    • __str__ - O(n)
    • __len__ - O(1)
    • sets (number of unique sets) - O(1)
    • set_size (size of a set) - O(1)
    • make_set O(1)
    • find - O(1)
    • union - O(1)
    • connected - O(1)
  • Binary Index Tree (Fenwick Tree) - BIT - space: O(n)
    • __init__ - O(n)
    • __str__ - O(n)
    • __len__ - O(1)
    • sum (prefix sum) - O(log(n))
    • sum_range (prefix sum range) - O(log(n))
    • add - O(log(n))
    • set - O(log(n))
  • Graph[V, E] (adjacency list) - see graph theory algorithms section - space: O(v + e)
    • __init__ - O(1)
    • __str__ - O(v + e)
    • __len__ - O(1)
    • __iter__ - O(v)
    • traverse (depth, breadth) - O(v + e)
    • vertices (not traverse) - O(v)
    • edges (not traverse) - O(v + e)
    • vertices_count - O(1)
    • edges_count - O(1)
    • unique_edges_count - O(1)
    • is_undirected - O(1)
    • is_directed - O(1)
    • has_directed_edges - O(1)
    • has_edge_cycles - O(1)
    • make_vertex - O(1)
    • make_edge - O(1)
    • get_vertex - O(1)
    • get_vertices - O(v)
    • get_edges - O(v + e)
    • copy - O(v + e)
    • transposed - O(v + e)
    • adjacency_matrix - O(v2)
    • factory
      • complete
      • random undirected
      • random directed
      • random undirected paired (all vertices have even degree)
      • random directed paired (all vertices have out-degree - in-degree = 0)
      • random directed acyclic
      • random flow (for max-flow/min-cut)

Graph Theory

  • minimum spanning tree
    • prim - O(e*log(v))
    • kruskal - O(e*log(v))
    • boruvka - O(e*log(v))
  • connectivity
    • undirected graphs
      • connected depth first search - O(v + e)
      • connected breadth first search - O(v + e)
      • connected disjoint set - O(v + e)
      • articulations, bridges and biconnected tarjan - O(v + e)
    • directed graphs
      • strongly connected tarjan - O(v + e)
      • strongly connected kosaraju - O(v + e)
  • topological sorting
    • khan - O(v + e)
    • depth first search - O(v + e)
    • strongly connected tarjan (from connectivity algorithms, used as topsort algorithm) - O(v + e)
    • strongly connected kosaraju (from connectivity algorithms, used as topsort algorithm) - O(v + e)
  • path finding
    • shortest path
      • directed acyclic graphs
        • single source - O(v + e)
        • single source (longest) - O(v + e)
      • all graphs
        • single source dijkstra - O(e*log(v))
        • single source dijkstra (optimized, visited + skip stale) - O(e*log(v))
        • single source bellman ford - O(v*e)
        • all pairs floyd warshall - O(v3)
    • traveling salesman problem
      • brute force - O(v!)
      • held-karp bitset - O(2v*v2)
      • held-karp hashset - O(2v*v2)
      • nearest neighbors (heuristic) - O(v2)
    • eulerian cycle/path
      • undirected graphs
        • fleury - O(e2)
        • hierholzer recursive - O(v + e)
        • hierholzer iterative - O(v + e)
      • directed graphs
        • hierholzer recursive - O(v + e)
        • hierholzer iterative - O(v + e)
    • hamiltonian cycle/path
      • brute force (from traveling salesman, used as hamiltonian path algorithm) - O(v!)
      • held-karp bitset (from traveling salesman, used as hamiltonian path algorithm) - O(2v*v2)
      • held-karp hashset (from traveling salesman, used as hamiltonian path algorithm) - O(2v*v2)
  • max-flow/min-cut
    • ford fulkerson
      • depth first search - O(f*e)
      • edmonds karp - O(v*e2)
      • depth first search with capacity scaling - O(e2*log(u))
      • dinic - O(v2*e)
  • cover
    • vertex cover
      • brute force - O(2k*v*e)
      • greedy (heuristic) - O(v + e)
      • greedy double (heuristic) - O(v + e)
      • weighted brute force - O(2v*v*e)
      • weighted greedy (heuristic) - O(v + e)
      • weighted pricing method (heuristic) - O(v + e)
      • weighted pricing sorted method (heuristic) - O(e*log(e) + v)

Enumeration Combinatorics

  • factorial
    • factorial recursive - O(n)
    • factorial iterative - O(n)
    • stirling's factorial approximation - O(1)
    • ramanujan's factorial approximation - O(1)
  • permutations
    • count permutations - O(n)
    • permutations cycles - O(nk) ~> O(n!) when k ~ n
    • permutations heap - O(n!)
  • combinatorics
    • count combinations recursive - O(min(nk, nn-k))
    • count combinations iterative - O(n)
    • combinations - O(n choose k)
    • bit combinations range - O(n choose k)
    • bit combinations branch - O(n choose k)

Searching Algorithms

  • array search
    • binary search - O(log(n))
    • k-ary search - O(k*logk(n))
    • interpolation search - O(log(log(n))) uniformly distributed arrays, worst: O(n)
    • exponential search - O(log(i))
  • range minimum query (rmq) <-> lowest common ancestor (lca) RangeMinimumQuery[T = Comparable] abstract
  • exact string search
    • brute force - O(n*p)
    • rabin karp - O(n + p), worst: O(n*p)
    • knuth morris pratt - O(n + p)
    • baeza yates gonnet (shift-or) - O(n + p)
    • boyer moore - O(n + p)
    • boyer moore (optimized, extended bad char table) - O(n + p)
    • aho corasick - O(n + p)
  • string edit distance
    • brute force - O(3n + m)
    • wagner fischer - O(n*m)
    • wagner fischer (optimized, reuse distance table) - O(n*m)
  • approximate/fuzzy string search
    • sellers - O(n*p)
    • ukkonen - O(n + (p*min(p, d)*c))
    • wu manber - O(n*min(p, d) + p)
  • offline string search StringOffline abstract
    • __init__ (naive) - abstract
    • __str__ - abstract
    • occurrences - abstract
    • occurrences_count - abstract
    • longest_repeated_substring - abstract
    • longest_common_prefix - abstract
    • utility
      • match_slices - O(min(a, b))
      • compare_slices - O(min(a, b))
    • SuffixArray extends StringOffline - space: O(n)
      • StringOffline.__init__ (naive) - O(n2*log(n))
      • StringOffline.__init__ (karp miller rosenberg) - O(n*log(n)2)
      • StringOffline.__init__ (karp miller rosenberg optimized) - O(n*log(n)) TODO
      • StringOffline.__init__ (lcp: naive) - + O(n2)
      • StringOffline.__init__ (lcp: kasai) - + O(n*log(n))
      • StringOffline.__str__ - O(n2)
      • StringOffline.occurrences TODO
      • StringOffline.occurrences_count TODO
      • StringOffline.longest_repeated_substring TODO
      • StringOffline.longest_common_prefix TODO
    • SuffixTree extends StringOffline - space: O(n)
      • StringOffline.__init__ (naive) - O(n2)
      • StringOffline.__init__ (ukkonen) - O(n)
      • StringOffline.__str__ - O(n2)
      • StringOffline.occurrences - O(p+q)
      • StringOffline.occurrences_count - O(p)
      • StringOffline.longest_repeated_substring - O(n)
      • StringOffline.longest_common_prefix (constant time achieved using RangeMinimumQueryV4) - O(1)

Encoding and Compression

  • base coding
    • base64 rfc4648 (printable) - O(n)
    • base32 rfc4648 (printable) - O(n)
    • base16 rfc4648 (printable) - O(n)
    • ascii85 (printable) - O(n)
    • base85 rfc1924 (printable) - O(n)
    • base85 zeromq (printable) - O(n)
  • integer coding
    • alphabet base (printable) - O(loga(n))
    • little endian base 128 variable (non-printable, semi-compression) - O(log128(n))
  • huffman coding
    • huffman coding (non-printable, compression) - O(n)

TODO

  • knapsack
  • trees: b-tree
  • linear programming: simplex
  • graph: maximum matching, edge cover, facility location
  • heaps: fibonacci heap, pairing heap
  • compression: lz77, lz78
  • string search: suffix array