
feat: add memory cache #12

Merged: lanlou1554 merged 20 commits into main from ll/memory_cache on Apr 3, 2024

Conversation

lanlou1554 (Collaborator) commented Mar 29, 2024

1. Remove `DataStore`, add `DataStoreCache`, including `DataStore` + LRU/LRU-K...
2. Rename LRU/LRU-K from "cache" to "replacer"
3. Add memory cache

This resolves #14


(Disk cache and memory cache should accept `Bytes` as the input parameter for `write_data` in this case.)

For me, the first design is very clear, while the second design saves the cost of requesting S3 to get the size. But I think that cost is very low. Maybe the major difference is the API for `write_data`?
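To make the difference concrete, here is a rough sketch of what design 2's `write_data` could look like (type and field names are made up, assuming the `bytes` crate; this is not the PR's actual API). The point is that the entry size comes with the payload, so no extra S3 request for the size is needed:

```rust
use bytes::Bytes;
use std::collections::HashMap;

/// Hypothetical in-memory store illustrating design 2: the size of the
/// entry is derived from the payload itself.
#[derive(Default)]
struct MemStoreSketch {
    entries: HashMap<String, Bytes>,
}

impl MemStoreSketch {
    /// Returns the number of bytes written; the caller never has to
    /// look up the object size from S3 beforehand.
    fn write_data(&mut self, key: String, data: Bytes) -> usize {
        let size = data.len(); // size comes "for free" with the payload
        self.entries.insert(key, data);
        size
    }
}
```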
Member

If we're going to send an additional HTTP request for every S3 request just to get the file size, then the cost may not be negligible. But I do vote for the first design because it's more straightforward and less prone to bugs...

Collaborator Author

Due to API constraints, I have currently adopted method 2.
As expected, the code has become very ugly :( Let me think about how to refine it...

2. Solve the read-write and write-write conflicts for disk operations. Say we have two requests with the same cache key: the first misses the cache and goes to S3 to fetch the data. The second request arrives before the first has pulled all the data from S3 and stored it on disk, so the disk file exists but its content is incomplete. Here is one possible way: on a cache miss, record the key in our cache structure with status `uncompleted`, and update it to `completed` once the data is fully pulled and stored. When another request with the same key tries to `get` the disk file from the cache, the cache finds the key but sees status `uncompleted`, so it should wait until the status turns to `completed`. (The async design that keeps the cache from simply busy-waiting is worth discussing; see the sketch after this list.)
3. In the async approach, we should support reading and outputting data at the same time, both S3 read + disk write (or network write) and disk read + network write. To support this, we cannot use one single fixed buffer, which is the current implementation of disk_manager. Here are two ways to support this:
1. For cache policies like `lru.rs` and `lru_k.rs`, we can use fine-grained locks instead of one big lock. If we use fine-grained locks (at least after getting a value from the cache, we don't need to keep holding its lock), we should add unpin/pin in `lru.rs` and `lru_k.rs`.
2. Discuss the key for the cache. Currently it is the raw S3 request string, but maybe we can make it `bucket + one key`: if one key represents one file, then keys may overlap across different requests, and the current implementation would waste disk space.
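A rough sketch of the `uncompleted`/`completed` idea from item 2 (made-up names, using tokio's `watch` channel; not what the code currently does). The first request that misses the cache owns the S3 fetch; concurrent requests for the same key suspend instead of busy-waiting or reading a half-written file:

```rust
use std::collections::HashMap;
use tokio::sync::{watch, Mutex};

/// Tracks in-flight fetches keyed like the cache.
#[derive(Default)]
struct StatusTable {
    // key -> sender whose value flips to `true` once the file is complete
    entries: Mutex<HashMap<String, watch::Sender<bool>>>,
}

impl StatusTable {
    /// Returns `true` if the caller owns the fetch for `key`;
    /// otherwise suspends until the owning task marks it complete.
    async fn acquire_or_wait(&self, key: &str) -> bool {
        let mut rx = {
            let mut map = self.entries.lock().await;
            match map.get(key) {
                Some(tx) => tx.subscribe(),
                None => {
                    let (tx, _) = watch::channel(false);
                    map.insert(key.to_string(), tx);
                    return true; // first requester: go fetch from S3
                }
            }
        };
        // `watch` keeps the latest value, so a completion that happens
        // between releasing the map lock and awaiting here is not lost.
        while !*rx.borrow_and_update() {
            if rx.changed().await.is_err() {
                break; // sender dropped; stop waiting
            }
        }
        false
    }

    /// Called by the owning task once the data is fully written to disk.
    async fn mark_completed(&self, key: &str) {
        if let Some(tx) = self.entries.lock().await.get(key) {
            tx.send_replace(true); // updates even if nobody is waiting yet
        }
    }
}
```

Whether this status lives in the replacer itself or in a separate table is part of the open design question above.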
Member

Will the raw S3 request string be different for the same file? Isn't it something like `aws s3 cp s3://bucket-name/path/to/file.txt /local/path/file.txt`? In other words, if keys overlap, won't the requests also overlap? (Not sure, maybe wrong)

Collaborator Author

I left this comment based on the current s3.rs test, so I am not sure what the raw S3 request string is... But if it is the same as in the s3.rs test (one S3 request string maps to one bucket + many keys, and each key represents one file), we should consider the cache key more carefully in the future. I added this TODO just to remind us to double-check the format of the raw S3 request string.
Feel free to correct me, since I know almost nothing about S3 requests. I think it depends on whether one raw S3 request string contains only one file or multiple files.

Member

> Will the raw S3 request string be different for the same file?

Yes. When we make a request to an S3 object, we must specify its bucket AND its key. So we need separate requests if a table spans multiple keys in a bucket.
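For illustration (made-up bucket and key names, assuming the `aws-sdk-s3` crate), fetching a table that spans several keys means one `GetObject` call per `(bucket, key)` pair:

```rust
use aws_sdk_s3::Client;

// Made-up bucket and keys, just to show one GetObject per (bucket, key).
async fn fetch_table(client: &Client) -> Result<(), aws_sdk_s3::Error> {
    let bucket = "example-bucket";
    let keys = ["table/part-0.parquet", "table/part-1.parquet"];
    for key in keys {
        let resp = client.get_object().bucket(bucket).key(key).send().await?;
        // Each response body is an independent byte stream.
        let data = resp.body.collect().await.expect("failed to read body");
        println!("{key}: {} bytes", data.into_bytes().len());
    }
    Ok(())
}
```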

Collaborator Author

Let me correct myself: the raw S3 request string means the input parameter to the storage manager.
So a quick question: can the raw S3 request string in the storage manager be transformed into one bucket and multiple keys?

Member

Maybe we have to talk to the catalog team to finalize the data schema.

codecov-commenter commented Mar 30, 2024

Codecov Report

Attention: Patch coverage is 92.82297%, with 45 lines in your changes missing coverage. Please review.

Project coverage is 84.68%. Comparing base (724bdd0) to head (2529358).

Files Patch % Lines
...rc/cache/data_store_cache/memdisk/cache_manager.rs 87.36% 11 Missing and 23 partials ⚠️
storage-node/src/storage_manager.rs 96.29% 3 Missing and 2 partials ⚠️
storage-node/src/disk/disk_manager.rs 96.29% 1 Missing and 2 partials ⚠️
.../cache/data_store_cache/memdisk/data_store/disk.rs 93.75% 1 Missing ⚠️
...ache/data_store_cache/memdisk/data_store/memory.rs 98.80% 0 Missing and 1 partial ⚠️
storage-node/src/cache/policy/lru_k.rs 95.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #12      +/-   ##
==========================================
+ Coverage   81.70%   84.68%   +2.98%     
==========================================
  Files          15       17       +2     
  Lines        1536     2005     +469     
  Branches     1536     2005     +469     
==========================================
+ Hits         1255     1698     +443     
- Misses        210      215       +5     
- Partials       71       92      +21     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines 85 to 86
// TODO(lanlou): Now the S3 stream returns a small amount of data each time, so the current
// implementation would be expensive: one disk I/O per S3 stream `next` call (too many disk I/Os!)
Member

Compared to chunk_size = 1024, the current S3 stream returns significantly more data (approx. 10^5 bytes per poll).
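That said, if small chunks ever become a problem again, one option (just a sketch of an idea, not what this PR does) is to put a buffer between the stream and the disk so that many small polls coalesce into fewer disk writes:

```rust
use bytes::Bytes;
use futures::{Stream, StreamExt};
use tokio::io::{AsyncWriteExt, BufWriter};

/// Writes an S3-like byte stream to disk through a 1 MiB buffer, so many
/// small `poll_next` chunks become a handful of large disk writes.
async fn write_stream_to_disk(
    mut stream: impl Stream<Item = Bytes> + Unpin,
    path: &str,
) -> std::io::Result<u64> {
    let file = tokio::fs::File::create(path).await?;
    let mut writer = BufWriter::with_capacity(1 << 20, file);
    let mut total = 0u64;
    while let Some(chunk) = stream.next().await {
        writer.write_all(&chunk).await?;
        total += chunk.len() as u64;
    }
    writer.flush().await?; // push out whatever is left in the buffer
    Ok(total)
}
```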

unw9527 (Member) commented Mar 31, 2024

I would suggest moving that TODO.md to a GitHub issue rather than keeping it as a local file.

lanlou1554 (Collaborator Author)

> I would suggest moving that TODO.md to a GitHub issue rather than keeping it as a local file.

Sure, I will do it after I finish this PR.

@lanlou1554 lanlou1554 marked this pull request as ready for review March 31, 2024 21:49
lanlou1554 (Collaborator Author)

@unw9527 @xx01cyx
PTAL. Sorry that it is a large PR; I forgot to split the refactoring from the feature. I think you guys can start reading from `storage-node/src/cache/data_store_cache/memdisk/cache_manager.rs`.

storage-node/src/cache/policy/lru.rs (outdated, resolved)
storage-node/src/cache/policy/mod.rs (outdated, resolved)
storage-node/src/disk/disk_manager.rs (resolved)
@@ -50,13 +50,19 @@ impl DataStoreCacheValue for ParpulseDataStoreCacheValue {
 ///
 /// There are different cache policies for the data store cache, such as LRU, LRU-K, etc. See
 /// other files in this module for more details.
-pub trait DataStoreCache {
+pub trait DataStoreReplacer: Send + Sync {
Member

I'm also wondering if there is any use case for the rest of the methods in this trait...

Collaborator Author

We can refine it in the future!
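As a strawman for that future refinement, a replacer-style trait might look something like this (method names are illustrative assumptions, not the actual `DataStoreReplacer` API): the replacer tracks only keys, sizes, and eviction order, while the data store owns the bytes, and pin/unpin ties back to the fine-grained-locking idea discussed above.

```rust
/// Hypothetical sketch of a replacer-style interface.
pub trait ReplacerSketch: Send + Sync {
    /// Record an access; returns the entry size if the key is resident.
    fn get(&mut self, key: &str) -> Option<usize>;
    /// Admit a key of `size` bytes; returns any keys evicted to make room.
    fn put(&mut self, key: String, size: usize) -> Vec<String>;
    /// Pin an entry so it cannot be evicted while it is being streamed out.
    fn pin(&mut self, key: &str);
    /// Unpin once the reader is done with the entry.
    fn unpin(&mut self, key: &str);
    /// Current total size and capacity, for admission decisions.
    fn size(&self) -> usize;
    fn capacity(&self) -> usize;
}
```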

storage-node/src/cache/data_store_cache/mod.rs (outdated, resolved)
storage-node/src/storage_manager.rs (outdated, resolved)
xx01cyx (Member) left a comment

Generally LGTM!!

if mem_max_file_size > replacer_max_capacity {
    // TODO: better log.
    println!("The maximum file size > replacer's max capacity, so we set maximum file size = 1/5 of the maximum capacity.");
    // By default in this case, replacer can at least 5 files.
Member

Suggested change:
- // By default in this case, replacer can at least 5 files.
+ // By default in this case, replacer can store at least 5 files.

Collaborator Author

Thx a lot yx!!! I'm moved to tears.

@lanlou1554 lanlou1554 merged commit d3721ef into main Apr 3, 2024
1 check passed
@lanlou1554 lanlou1554 deleted the ll/memory_cache branch April 3, 2024 20:13