-
Notifications
You must be signed in to change notification settings - Fork 153
Eviction Strategies
In Twemcache, all slabs are of the same size, while items can be different in size. While memory allocation always happen at the slab level, data eviction can happen at two levels, item or slab. Each has its merits and drawbacks. Item eviction eliminates the minimum amount of data possible to fulfill a request, but since it cannot change the slab class a slab belongs to, a slab can never be used to store data not belonging to the same slab class. Slab eviction has the freedom to re-assign a slab to a different class, but before doing so it wipes out all items within it to fulfill just one storage request, although the other slots may be used in the future.
LRU item eviction (-M 1) works at the item level. When a new item needs to be stored and cannot be accommodated by recycling an expired item or allocating more memory, the least recently used item in the same slab class the new item belongs to gets evicted.
Random slab eviction (-M 2) works at the slab level. When a new item needs to be inserted and it cannot be accommodated by recycling an expired item or allocating more memory, a random slab is picked from the list of all allocated slabs, all items within it evicted, and reassigned to the slab class that fits the new item.
LRA (least recently accessed, also known as Slab LRU) slab eviction (-M 4) works at the slab level. It updates slab ordering the same way LRU item eviction updates item ordering, but the objects being ordered are slabs instead of items, and there is only one global, slab-class agonostic list. It evicts the slab at the head of the list when a new item cannot be accommondated otherwise.
LRC (least recently created) slab eviction (-M 8) works at the slab level. It tracks slab ordering by the time they were created or reassigned. One noticeable behavior of LRC slab eviction is that it disables reusing memory at the item level- it no longer tries to find an expired or deleted item. As a result, it evicts the slab at the head of the list when a new item cannot be accommodated by allocating more memory.
The best choice among these eviction strategies is heavily use-case dependent. The two most important aspects are size distribution over time, and whether updates to the cache happen on the write or read path.
If the size distribution of all key-value pairs are very stable over time, LRU item eviction should work great as long as temporal locality applies to the workload, and regardless of whether updates happen on the write or read path. On the other hand, if the size distribution changes, the lack of slab-level adjustment leads to slab calcification- a mismatch between slab distribution and item size distribution among slab classes. The result of this is the loss of effective capacity, lower hit rate, and in some cases, even server errors (https://github.com/twitter/twemcache/wiki/Random-Eviciton-vs-Slab-Automove).
When slab calcification becomes a real threat to caching effectiveness in the long term, slab eviction is a better choice, but which one of them? This leads us to the second aspect- update paradigms. When key-value pairs are inserted into cache when they are created, the cache is called a write-update cache; when they are inserted after being requested and missed in the cache, the cache is called a read-update cache. For write-update caches, the assumption behind the paradigm is that data that are recently created are more likely to be visited, one can imagine anything that resembles news fits nicely into this assumption. For such cases, LRC slab eviction is likely to be a very good choice. For read-update caches, the assumption is that data that are recently accessed are more likely to be revisited, regardless of when they were created. For such workload, LRA slab eviction might be good, although LRA slab eviction has some drawbacks that might defeat the purpose (see Notes section).
So far we have ignored random slab eviction, but as it turns out, this is an extremely applicable algorithm. It is a dumb strategy, but that is also its strength- by being workload agnostic, it is ressilient to unknown patterns and changes in the workload. It is also strongly recommended to start with random slab eviction to learn about the size distribution over time (statistically, slab distribution will converge to data size distribution over time), and to establish a base line for hit rate before trying out more workload-specific eviction strategies.
Some curious readers may wonder why we chose to implement LRC eviction so differently. The problem with item-level memory recycle is, it hurts the temporal locality of items within a slab. If we only insert item to the end of the most recently allocated slab, going through the list of slabs, and their items, gives us a list of increasingly recent items. However, when we insert a new item into a slab of very old items with a free 'hole' in it, the uniformity in item age no longer holds for that slab. This makes it hard to make a decision when it comes to eviction. In fact, on of the drawbacks of LRA slab eviction is that it often promotes a slab with a few new, but mostly old items to the end of list, resulting in slabs that are younger overall being evicted, losing large number of more recent items, hurting the hit rate in many cases even compared to random eviction. But the choice also comes with a small price- if a high percentage of items are randomly deleted before expiration, LRC will leave them in place until the entire slab is evicted, and the majority of slabs in their life time will have many holes. The net result is a lower effective capacity compared to eviction strategies in which items are reused in every possible way.