GH-2799 avoid OOM through HashMap resizing by setting initial size #2872
Conversation
There's a bit of a false assumption in this change I just realized, due to poor variable/constant naming: although the […]

An alternative is that we move the "10% of the heap is still free" requirement up a bit, to, say, 15%?
I think we should start with a test, or a benchmark if that's easier. Essentially limit the memory available and load in a big file. I'm also curious if this scales as it should, e.g. does loading a bigger file with more memory available trigger the overflow as consistently as loading a smaller file with less memory available?
Not at my desk right now, but I believe there's an existing compliance test for the overflow that we could extend if necessary. Update: the test I was thinking of is […]
I'm not aware of a simple way to control the available heap space for a single JUnit test, so we may want to "abuse" a JMH benchmark for this purpose. I'm currently out of spare time, so I won't continue on this immediately. @hmottestad, if you have an idea and the time/will to set up such a test, feel free to add it to this branch.
After I commented here I set out to google whether JUnit 5 could fork the JVM for a test in order to configure the amount of memory. I could not find anything, so I guess abusing JMH might unfortunately be the only way.
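For what it's worth, a minimal sketch of how a JMH fork could be abused to cap the heap for such a load test is below. The class name, heap size and data file are illustrative and not the actual benchmarks added in this branch; the GC could be swapped via the same `jvmArgs`.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.util.concurrent.TimeUnit;

import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.repository.sail.SailRepositoryConnection;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.sail.nativerdf.NativeStore;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.SECONDS)
// JMH runs each fork in a fresh JVM, so the heap can be capped per benchmark.
// Add e.g. "-XX:+UseParallelGC" here to reproduce GC-specific behaviour.
@Fork(value = 1, jvmArgs = { "-Xmx256m" })
public class OverflowBenchmark {

	@Benchmark
	public void loadLargeFile() throws Exception {
		// Fresh NativeStore in a temp dir; the transaction changeset is what overflows.
		SailRepository repository = new SailRepository(
				new NativeStore(Files.createTempDirectory("overflow-benchmark").toFile()));
		try (SailRepositoryConnection connection = repository.getConnection();
				// Hypothetical test resource; any sufficiently large RDF file would do.
				InputStream data = getClass().getResourceAsStream("/benchmarkFiles/large-file.ttl")) {
			connection.begin();
			connection.add(data, "", RDFFormat.TURTLE);
			connection.commit();
		} finally {
			repository.shutDown();
		}
	}
}
```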
To be fair, even with benchmarking in place there's only so much we can do to make this work better: at the end of the day we are relying on an estimate of available heap space and predicted usage to trigger disk overflow. In real-life situations that estimate could be way off because of circumstances beyond our control (e.g. some different Java object suddenly consuming a lot of memory, or something as basic as the data currently being uploaded suddenly containing a few massive literal values). We make a best effort, but we can't guarantee the process won't run out of memory.
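To make the kind of estimate concrete, it boils down to something like the following. This is a simplified illustration using the standard Runtime API, not the actual MemoryOverflowModel code, and the 10% threshold is just an example figure.

```java
public class HeapEstimateSketch {

	public static void main(String[] args) {
		Runtime runtime = Runtime.getRuntime();
		long maxMemory = runtime.maxMemory();                     // the -Xmx ceiling
		long used = runtime.totalMemory() - runtime.freeMemory(); // currently allocated and in use
		long estimatedFree = maxMemory - used;                    // what we *think* is still available

		// Hypothetical threshold: overflow once less than ~10% of the heap appears free.
		// Any allocation between measuring and acting makes this estimate stale.
		boolean shouldOverflowToDisk = estimatedFree < maxMemory / 10;
		System.out.println("estimatedFree=" + (estimatedFree / 1024 / 1024) + "MB, overflow=" + shouldOverflowToDisk);
	}
}
```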
By the way, a reason why disk syncing on memory overflow is giving us such a performance hit was given by Arjohn Kampman on the mailing list:
That is not directly related to this fix, but if we can address some of that, we can have more confidence that setting a lower overflow threshold won't cause a massive performance penalty.
Do you mind, @jeenbroekstra, if I force-push this branch? I would like to add a commit with a benchmark as the first commit in this branch so that we can easily test before and after.
Go for it
Force-pushed e850bb4 to 590dad6
OK, I've just force-pushed, but I will need to force-push at least one more time before I'm done.
Force-pushed 590dad6 to b76dde0
I've added two benchmarks, one synthetic and one with a real-world file. The synthetic one I've got failing with very low memory; the real-world file I've got failing with higher memory. Both are failing at different places. And to make things even more complicated, the real-world file fails only within a certain memory range (500-700 MB); outside of that range it either overflows correctly or doesn't need to overflow.
The synthetic benchmark only fails on Java 8 for me, probably due to using G1GC. By the way, I tried using the parallel GC, and it failed differently:
EDIT: Yet another different failure point (Java 8 + G1GC):
Related: #2998
Not going to dig any further now. I've spent something like 3 hours on this because IntelliJ was acting up, and running the benchmarks takes a long time.
I'm becoming more and more convinced that we should invest in replacing the current MemoryOverflowModel with a completely different implementation, something that just relies on standard Java object serialization or a simple abstraction like MapDB for disk syncing. I will try and spike something, see what we get.
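Purely as a sketch of the direction (MapDB is not currently a dependency, and the map layout below is entirely made up), a disk-backed model could let the library handle the spill-to-disk part:

```java
import java.io.File;
import java.io.IOException;

import org.mapdb.DB;
import org.mapdb.DBMaker;
import org.mapdb.HTreeMap;
import org.mapdb.Serializer;

public class MapDbOverflowSketch {

	public static void main(String[] args) throws IOException {
		// A transaction-scoped, file-backed map that MapDB spills to disk for us,
		// instead of the model managing overflow and syncing by hand.
		File overflowFile = File.createTempFile("rdf4j-overflow", ".db");
		DB db = DBMaker.fileDB(overflowFile)
				.fileMmapEnableIfSupported() // memory-mapped file access where the platform supports it
				.fileDeleteAfterClose()
				.make();

		// Invented layout: statements serialized to byte[] and keyed by an insertion index.
		HTreeMap<Long, byte[]> statements = db
				.hashMap("statements", Serializer.LONG, Serializer.BYTE_ARRAY)
				.createOrOpen();

		statements.put(0L, new byte[0]);
		db.close();
	}
}
```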
Maybe for the time being just try to adjust the point where the model overflows to disk. Perhaps you can add some logging to show what it currently thinks is going on memory-wise, so you can see what numbers it's comparing. Also maybe try without -Xms in the benchmark. The real-world benchmark is the best one to start with, I think.
Force-pushed 1e55523 to 1716d06
Signed-off-by: Håvard Ottestad <[email protected]>
…flowing, regardless of block size Signed-off-by: Håvard Ottestad <[email protected]>
Force-pushed 1716d06 to 508e9ee
I did some logging and found out that sometimes the algorithm would decide that the max block size was something like 10 MB. This is when it would overflow. So I've just added a hard limit of 32 MB of required available memory. This works fine for both my tests (benchmarks) and I feel it's a reasonable tradeoff. How many users give their NativeStore 64 MB or 128 MB and then load large files?
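In other words, the decision now looks roughly like this. This is a simplified sketch with hypothetical names and estimates, not the exact code in the branch:

```java
public class OverflowCheckSketch {

	// Hypothetical constant: never let the computed block size push the
	// required free-heap headroom below 32 MB.
	private static final long MIN_AVAILABLE_MEM_BEFORE_OVERFLOWING = 32 * 1024 * 1024;

	// estimatedFreeHeap and estimatedBlockSize stand in for the model's internal estimates.
	boolean shouldOverflowToDisk(long estimatedFreeHeap, long estimatedBlockSize) {
		long required = Math.max(estimatedBlockSize, MIN_AVAILABLE_MEM_BEFORE_OVERFLOWING);
		return estimatedFreeHeap < required;
	}
}
```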
This seems reasonable to me as well, nice workaround.
LGTM, happy to have this merged. I'll leave it to you.
GitHub issue resolved: #2799
Briefly describe the changes proposed in this PR:
The initial size is set to size * 2 for the statement set to compensate for the load factor, so using the exact block size should be fine for our purposes.
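To illustrate the sizing trick described above (the variable names and numbers are made up): HashMap's default load factor of 0.75 means a map created with the exact element count still resizes, and that resize copies the whole table right when memory is already tight.

```java
import java.util.HashMap;
import java.util.Map;

public class InitialCapacitySketch {

	public static void main(String[] args) {
		int blockSize = 100_000; // stand-in for the known overflow block size

		// A map created with capacity == blockSize would still resize: with the default
		// load factor of 0.75 the table doubles once ~75k entries are in.
		// Passing blockSize * 2 keeps blockSize entries below the 0.75 threshold,
		// so the block can be filled without a single resize.
		Map<Integer, String> statements = new HashMap<>(blockSize * 2);
		for (int i = 0; i < blockSize; i++) {
			statements.put(i, "statement-" + i);
		}
		System.out.println("filled " + statements.size() + " entries without resizing");
	}
}
```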
PR Author Checklist (see the contributor guidelines for more details):
- Code is formatted (mvn process-resources to format from the command line)