
Add CacheMaintainer class to perform pending cache maintenance every minute #2308

Draft · wants to merge 3 commits into main

Conversation

owenhalpert

Description

Adds a CacheMaintainer class that takes in a generic Guava cache and calls cleanUp periodically (every minute). This will perform any pending maintenance (such as evicting expired entries) which was previously only performed when the cache was accessed. The maintenance thread is created whenever a NativeMemoryCache or QuantizationStateCache is instantiated or rebuilt and can be shut down with either class's close method. Relevant logic for cleanup was added to some testing base classes.
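
A minimal sketch of the class described above, assuming a single-threaded scheduler and a start/close pair; apart from cleanUp and close, the method names are illustrative rather than the PR's actual code:

import com.google.common.cache.Cache;

import java.io.Closeable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Periodically triggers pending maintenance (e.g. evicting expired entries) on a
// Guava cache, instead of relying on cache accesses to do it.
public class CacheMaintainer<K, V> implements Closeable {
    private final Cache<K, V> cache;
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

    public CacheMaintainer(Cache<K, V> cache) {
        this.cache = cache;
    }

    // Intended to be called when the owning cache (e.g. NativeMemoryCache or
    // QuantizationStateCache) is instantiated or rebuilt.
    public void startMaintenance() {
        executor.scheduleAtFixedRate(cache::cleanUp, 1, 1, TimeUnit.MINUTES);
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}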

Related Issues

Resolves #2239 (Perform cache maintenance in a separate thread)

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@kotwanikunal
Member

@owenhalpert The changes look good in general. What I'd be interested in is testing under load on a resource-constrained system - can you verify whether this adds to the latency or impacts performance in any way?

We did implement a force evict before writes with #2015. Can you also enable this feature flag and run the above tests to ensure it behaves well?

 * for more details. Thus, to perform any pending maintenance, the cleanUp method will be
 * called periodically from a CacheMaintainer instance.
 */
public class CacheMaintainer<K, V> implements Closeable {
    private final Cache<K, V> cache;
@kotwanikunal
Member

You can also avoid maintaining the Cache object reference here by using a functional interface. That would also get rid of the generification of this class.

Simply pass and store the runnable reference instead of the cache as new CacheMaintainer(() -> cache.cleanUp());

Possibly also move this class to the util package and call it ScheduledExecutor, with a Runnable reference and an interval as parameters.

public class ScheduledExecutor implements Closeable {

...

  public ScheduledExecutor(Runnable reference, long scheduleMillis) {
  ...
  }

  ...
}
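
A fuller, runnable version of that sketch might look like the following; the single-threaded executor, fixed-rate scheduling, and shutdown-on-close behaviour are assumptions filled in here rather than part of the original suggestion:

import java.io.Closeable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ScheduledExecutor implements Closeable {
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

    // Runs the given task repeatedly at the supplied interval,
    // e.g. new ScheduledExecutor(cache::cleanUp, 60_000) for once-a-minute maintenance.
    public ScheduledExecutor(Runnable task, long scheduleMillis) {
        executor.scheduleAtFixedRate(task, scheduleMillis, scheduleMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}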

@owenhalpert
Author

@kotwanikunal what do you think about creating the executor and calling scheduleAtFixedRate within each cache class instead of creating a new ScheduledExecutor class?
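
For comparison, that in-place alternative might look roughly like the hypothetical sketch below, with each cache class owning its own scheduler; the class and field names here are illustrative assumptions, not code from the PR:

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

import java.io.Closeable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a cache class such as QuantizationStateCache that schedules
// its own cleanUp calls instead of delegating to a separate CacheMaintainer/ScheduledExecutor.
public class SelfMaintainingCache implements Closeable {
    private final Cache<String, byte[]> cache =
            CacheBuilder.newBuilder().expireAfterWrite(10, TimeUnit.MINUTES).build();
    private final ScheduledExecutorService maintenanceExecutor =
            Executors.newSingleThreadScheduledExecutor();

    public SelfMaintainingCache() {
        // cleanUp performs any pending maintenance, e.g. evicting expired entries.
        maintenanceExecutor.scheduleAtFixedRate(cache::cleanUp, 1, 1, TimeUnit.MINUTES);
    }

    @Override
    public void close() {
        maintenanceExecutor.shutdown();
    }
}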

…an accept a Runnable and an interval

Signed-off-by: owenhalpert <[email protected]>
…an accept a Runnable and an interval

Signed-off-by: owenhalpert <[email protected]>
@owenhalpert
Author

@kotwanikunal I've completed the benchmarking on a single-node cluster limited to 3GB of memory.

Before benchmarking my code, I validated that the CacheMaintainer was actually running by inspecting the OpenSearch process status in the Docker container: after about a minute, the RssAnon value would decrease by about 0.5GB without any interference on my part, signaling that the maintainer had successfully cleaned up the expired entries. This is aligned with what I saw from the standard maintenance on the clean code (which I triggered manually by accessing the cache).

Results below:

Performance Summary for Resource-Constrained Testing (3GB Memory Limit)

Run 1: Clean 2.18 code

  • Indexing:
    • p50 latency: 16.96 ms
    • p90 latency: 32.47 ms
  • Search:
    • p50 latency: 330.48 ms
    • p90 latency: 437.78 ms

Run 2: Clean 2.18 code, Force evict ON

  • Indexing:
    • p50 latency: 17.27 ms
    • p90 latency: 33.82 ms
  • Search:
    • p50 latency: 337.43 ms
    • p90 latency: 428.70 ms

Run 3: PR changes added, Force evict ON

  • Indexing:
    • p50 latency: 16.89 ms
    • p90 latency: 32.24 ms
  • Search:
    • p50 latency: 341.38 ms
    • p90 latency: 414.06 ms

Run 4: PR changes added, Force evict OFF

  • Indexing:
    • p50 latency: 17.43 ms
    • p90 latency: 34.63 ms
  • Search:
    • p50 latency: 346.83 ms
    • p90 latency: 430.04 ms

This suggests there is no significant impact on latency with my code changes. I've included the full results of these test runs here:

https://gist.github.com/owenhalpert/05ad4f5ae9577f717f2c59f2039d52e4
