-
Notifications
You must be signed in to change notification settings - Fork 4
Benchmark
This benchmark compares the memory usage and the performance of the trie when storing n entries in the trie then trimming the trie/optimizing it for read-only operations, an then performing a lookup of every key in an order which is different from the insertion order. Times are given in milliseconds and memory usage is given in bytes.
All implementations are instantiated with the default constructor.
The CompositeTrie implementation uses a FastArrayTrie for the first 2 levels and CompactArrayTries then.
The BloomFilteredTrie implementation wraps a CompactArrayTrie and uses the same hashing function as Strings with a bloom filter of 1024*64 bits.
Since keys are looked-up as Strings, HashMap is given an advantage because the hash is pre-computed, but the hash computation is very fast, so HashMap would still be faster if it had to compute the hash every time. If you want to have an idea of how slower HashMap would be without caching the hash code, you can look at the difference in time between CompactArrayTrie and BloomFilteredTrie given that a lookup costs the same plus one hash computation and one BitSet lookup.
This test has been done on the aspell dictionary for the english language (generated with aspell -l en dump master | aspell expand | sort | uniq
then randomized) which contains 138349 words.
Insertion time | Trimming/read optimization time | Read time | Size in memory after insertion | Size in memory after trimming/optimization | |
---|---|---|---|---|---|
HashMap | 33 | 0 | 54 | 8590936 | 8590936 |
TreeMap | 231 | 0 | 242 | 9961192 | 9961192 |
CompactArrayTrie | 129 | 180 | 81 | 7297072 | 4365696 |
BloomFilteredTrie | 142 | 185 | 92 | 7319728 | 4388352 |
FastArrayTrie | 174 | 397 | 129 | 23628576 | 16583784 |
CompositeTrie | 95 | 89 | 75 | 9145248 | 4544416 |
CompactArrayRadixTrie | 247 | 218 | 112 | 8143592 | 2876896 |
As you can see, the best trie implementations (CompositeTrie and CompactArrayTrie) are not that slow compared to HashMap. Regarding memory, except FastArrayTrieMap, Trie implementations are 2 to 3 times as memory efficient as HashMap.
The same test has been done with a dictionary whose content is A AA AAA AAB AAC ... 997 998 999
It contains 242234 words whose length varies from 1 to 3.
Insertion time | Trimming/read optimization time | Read time | Size in memory after insertion | Size in memory after trimming/optimization | |
---|---|---|---|---|---|
HashMap | 163 | 0 | 193 | 16288152 | 16288152 |
TreeMap | 435 | 0 | 447 | 17440912 | 17440912 |
CompactArrayTrie | 203 | 157 | 97 | 3655792 | 3655792 |
BloomFilteredTrie | 213 | 155 | 103 | 3664048 | 3664048 |
FastArrayTrie | 66 | 99 | 49 | 4999808 | 4641664 |
CompositeTrie | 111 | 56 | 96 | 14286384 | 3949792 |
CompactArrayRadixTrie | 604 | 347 | 123 | 9991640 | 3877304 |
This time, most of the trie implementations behave much better than HashMap while being more memory-efficient at the same time.
This dictionary has been generated with all the combinations of 1 to 6 words from the list: "the", "brown", "fox", "jumps", "over", "the", "lazy" and "dog". This dictionary contains 299592 entries.
Insertion time | Trimming/read optimization time | Read time | Size in memory after insertion | Size in memory after trimming/optimization | |
---|---|---|---|---|---|
HashMap | 121 | 0 | 150 | 8551184 | 8551264 |
TreeMap | 643 | 0 | 670 | 9882496 | 9882496 |
CompactArrayTrie | 453 | 368 | 248 | 14622832 | 7686448 |
BloomFilteredTrie | 486 | 378 | 278 | 14631088 | 7680304 |
FastArrayTrie | 685 | 895 | 406 | 33132752 | 19451232 |
CompositeTrie | 424 | 315 | 256 | 12799800 | 7687704 |
CompactArrayRadixTrie | 760 | 272 | 227 | 8256464 | 2510136 |
HashMap becomes the fastest lookup structure again, but CompactArrayRadixTrie, which is the fastest Trie implementation is only 1.5 times slower, but 3.4 times as memory-efficient and is then a good compromise.