embedded cache refactoring #10989
Conversation
…tring to []byte. Also, the new implementation allows passing the callback that will be called for the removed entry. Signed-off-by: Vladyslav Diachenko <[email protected]>
Very nice work @vlad-diachenko! Left a couple thoughts
```go
c.entries = make(map[string]*list.Element)
if c.onEntryRemoved != nil {
	for _, entry := range c.entries {
		castedEntry := entry.Value.(*cacheEntry[K, V])
```
Please check that this cast succeeds, otherwise the next line will panic
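For illustration, the checked version could look roughly like this (a minimal sketch based on the diff above; the field names passed to the callback are assumptions):

```go
// Sketch: check the type assertion before using the entry, so an element of
// an unexpected type is skipped instead of causing a panic.
for _, entry := range c.entries {
	castedEntry, ok := entry.Value.(*cacheEntry[K, V])
	if !ok {
		continue
	}
	// key/value field names are hypothetical here
	c.onEntryRemoved(castedEntry.key, castedEntry.value)
}
```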
Hhmm, do we really want to call this function for every item once Stop()
is called? That could be quite slow. Is it necessary?
Yes, if the value must be properly closed, we need to call the onEntryRemoved callback. In our case, we will use this function to remove the data from the disk.
It's not always necessary if you run your pod against ephemeral storage, but if you run Loki on a local machine, then we need to clean up the data from the disk.
If it becomes unnecessary for some cases in the future, we can add an additional toggle to disable calling this function during the stop.
wdyt?
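For context, the callback we have in mind would look roughly like the sketch below (the package, helper names, and directory layout are illustrative assumptions, not the actual Loki code):

```go
package cache

import (
	"log"
	"os"
	"path/filepath"
)

// newRemoveFromDiskCallback is a hypothetical helper: it returns an
// onEntryRemoved-style callback that deletes the evicted entry's data
// (e.g. a bloom-block directory) from disk.
func newRemoveFromDiskCallback(blocksDir string) func(key string, value any) {
	return func(key string, _ any) {
		dir := filepath.Join(blocksDir, key) // key-to-path mapping is an assumption
		if err := os.RemoveAll(dir); err != nil {
			log.Printf("failed to remove block directory %s: %v", dir, err)
		}
	}
}
```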
What are you storing on disk?
One way you could solve this efficiently is to generate a unique ID, and store all your objects on disk under a directory with that ID. When the cache is Stop()-ed, you can delete the directory.
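Roughly, that approach could look like this sketch (the type and constructor names are made up for illustration, not the actual implementation):

```go
package cache

import "os"

// Sketch: keep everything the cache writes to disk under one unique directory,
// so Stop() can remove it in a single call instead of invoking a callback per entry.
type diskBackedCache struct {
	dir string // unique per-cache directory
}

func newDiskBackedCache(baseDir string) (*diskBackedCache, error) {
	// os.MkdirTemp generates the unique ID as part of the directory name.
	dir, err := os.MkdirTemp(baseDir, "embedded-cache-")
	if err != nil {
		return nil, err
	}
	return &diskBackedCache{dir: dir}, nil
}

func (c *diskBackedCache) Stop() error {
	// delete the whole directory with everything stored under it
	return os.RemoveAll(c.dir)
}
```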
What are you storing on disk?
We will store extracted bloom-blocks on the disk, and we need to be able to delete the directory with this block if an entry is evicted from the cache.
One way you could solve this efficiently is to generate a unique ID, and store all your objects on disk under a directory with that ID. When the cache is Stop()-ed, you can delete the directory.
Yes, we could do it, but it will require having different callbacks: one for onEntryRemoved and one more for onStop... Anyway, for now, such a cache with an onEntryRemoved callback will be used only for bloom-blocks, so it does not affect the rest of the implementation because the callback will be nil there...
Also, even if there are a thousand directories to delete, it should not take a lot of time.
wdyt?
Why not just use memcached from the beginning? memcached is an LRU...
I am not sure which will be better: to store the blocks on the disk or to download them from Memcached every time. We expect these blocks to be big enough, so we want to store them on a local disk because it might just be faster...
Let's say the query touches 100GB of data; then it might be necessary to download about 1GB of blocks, which might take some time...
If you want, we can jump on a call tomorrow or later today to discuss.
yep, let's discuss it tomorrow
removed execution of onEntryRemoved callback during the cache stop (a6f7a43)
```diff
 c.lock.Lock()
 defer c.lock.Unlock()

 for k, v := range c.entries {
-	entry := v.Value.(*cacheEntry)
+	entry := v.Value.(*cacheEntry[K, V])
```
Even though the cache value should not be of a different type, we should still check the cast result:
```diff
-	entry := v.Value.(*cacheEntry[K, V])
+	entry, ok := v.Value.(*cacheEntry[K, V])
+	if ok && time.Since(entry.updated) > ttl {
+		c.remove(k, v, expiredReason)
+	}
```
asked question below ;)
If the type is not expected, a panic seems appropriate
hey @chaudum, @dannykopping
@chaudum @dannykopping what do you think about the PR? Can we merge it?
Signed-off-by: Vladyslav Diachenko <[email protected]>
What this PR does / why we need it:
Refactored the embedded cache to allow using it for any types, not only for `string -> []byte`. Also, the new implementation allows passing a callback that will be called for each removed entry.
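As a rough illustration, the refactored shape could look like the sketch below (field and type names follow the snippets quoted in the review where possible; everything else is an assumption, not the exact Loki API):

```go
package cache

import (
	"container/list"
	"time"
)

// Sketch of a generics-based embedded cache: any comparable key type, any
// value type, plus an optional callback invoked for every removed entry.
type cacheEntry[K comparable, V any] struct {
	key     K
	value   V
	updated time.Time
}

// EntryRemovedCallback is called for each entry removed from the cache
// (e.g. evicted or expired).
type EntryRemovedCallback[K comparable, V any] func(key K, value V)

type TypedEmbeddedCache[K comparable, V any] struct {
	entries        map[K]*list.Element
	lru            *list.List
	onEntryRemoved EntryRemovedCallback[K, V]
}
```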