
OpenSearch Data Nodes memory exhaustion after upgrade from 2.9 to 2.12 (JDK 21 upgrade) #12454

Closed
rlevytskyi opened this issue Feb 26, 2024 · 45 comments

@rlevytskyi

Describe the bug

Hello OpenSearch Team,
We’ve just updated our OpenSearch cluster from version 2.9.0 to 2.12.0.
Among other issues, we’ve noticed that OpenSearch is now consuming way more memory than the previous version, i.e. it became unusable with the same configuration, even after providing it with 15% more RAM. To make it responsive again, we had to close many indices.

Related component

Other

To Reproduce

  1. Have a 2.9 cluster of 4 data nodes with 112GB of Xmx RAM and 13.6 TB of storage
  2. Fill it with 5500 indices (mostly small with 1 shard, but several big ones with 4 shards) up to 75% of capacity
  3. Update 2.9 to 2.12 and add RAM to make it 128GB
  4. See many GC messages in the logs and an almost inoperable cluster
  5. Close 2000 indices to make it work again

Expected behavior

We didn't expect a significant memory usage increase from the version upgrade.

Additional Details

Plugins
Security plugin for SAML authn and authz

Screenshots
Please note almost horizontal Heap usage before upgrade, increase after upgrade, and horizontal again after closing some indices.
[screenshot: heap usage over time, before and after the upgrade]

Host/Environment (please complete the following information):

  • OS: Oracle Linux 8.9
  • Docker image: opensearchproject/opensearch:2.12.0
@rlevytskyi rlevytskyi added bug and untriaged labels Feb 26, 2024
@github-actions github-actions bot added the Other label Feb 26, 2024
@rlevytskyi rlevytskyi changed the title from "OpenSearch Data Nodes memory exhaustion after upgrade from 2.9 to 2.12[BUG] <title>" to "OpenSearch Data Nodes memory exhaustion after upgrade from 2.9 to 2.12" Feb 26, 2024
@shwetathareja
Member

shwetathareja commented Feb 27, 2024

Thanks @rlevytskyi for reporting the issue. Did you try taking a heap dump? It would help us debug further here. (You can try with a smaller heap; the issue might reproduce faster in that case.)

Couple of questions:

  1. Are you running a cluster without dedicated cluster manager nodes?
  2. What is the cluster state size? You can check via the _cluster/state API output.
  3. How many shards are there overall?
  4. When you observed the JVM heap spiking, was it only during the upgrade from 2.9 to 2.12, or was it consistently high post-upgrade as well?
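
For reference, here is a minimal sketch of how a smaller-heap repro node could be started and how the numbers for questions 2 and 3 could be pulled; the container settings and heap sizes are placeholders, not taken from this cluster:

    # single test node with a deliberately small heap so a heap problem shows up sooner
    docker run -d --name opensearch-repro -p 9200:9200 \
      -e discovery.type=single-node \
      -e DISABLE_SECURITY_PLUGIN=true \
      -e OPENSEARCH_JAVA_OPTS="-Xms4g -Xmx4g" \
      opensearchproject/opensearch:2.12.0

    # cluster state size in bytes and total shard count
    curl -s localhost:9200/_cluster/state | wc -c
    curl -s localhost:9200/_cat/shards | wc -l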

@rlevytskyi
Author

rlevytskyi commented Feb 27, 2024

Thank you @shwetathareja for your reply!
Here are clarifications:

  1. Yes, we are running a non-dedicated cluster manager setup: we have four nodes running both data and master-eligible roles, plus two coordinating nodes:
    % curl logs:9200/_cat/nodes\?s=name
    d data - v480-data.company.com
    m master - v480-master.company.com
    d data - v481-data.company.com
    m master * v481-master.company.com
    d data - v482-data.company.com
    m master - v482-master.company.com
    d data - v483-data.company.com
    m master - v483-master.company.com
    - - - v484-coordinator.company.com
    - - - v485-coordinator.company.com
  2. Quite a lot of output:
    % curl logs:9200/_cluster/state | wc
    0 4989 193053405
  3. 26696 reported by _cat/shards
  4. It was hitting the top during upgrade and also post upgrade.

@rlevytskyi
Author

Re heap dump, where should we collect it and when?
Right now, we see nothing unusual on the data nodes.
The coordinating nodes sometimes log something like
[INFO ][o.o.i.b.HierarchyCircuitBreakerService] [v484-coordinator.company.com] attempting to trigger G1GC due to high heap usage [8204216264]
[INFO ][o.o.i.b.HierarchyCircuitBreakerService] [v484-coordinator.company.com] GC did bring memory usage down, before [8204216264], after [3248648136], allocations [71], duration [62]
but would a heap dump from them be useful?

@reta
Collaborator

reta commented Feb 27, 2024

@rlevytskyi one of the major changes in 2.12 is that it is bundled with JDK 21 by default. Any chance you could downgrade the JDK to 17 for your deployment (it may need altering the Docker image) to eliminate the JDK version change as a suspect? Thank you.

@rlevytskyi
Author

Thank you Andriy for your reply.
I've searched https://github.com/opensearch-project/OpenSearch and was unable to find the appropriate Dockerfile.
Could you please point me to the right one?

@reta
Collaborator

reta commented Feb 27, 2024

I think you need these: https://github.com/opensearch-project/opensearch-build/tree/main/docker/release/dockerfiles, but maybe a simpler way is to "inherit" from the 2.12 image and install/replace the JDK version to run with.

@peternied
Member

[Triage - attendees 1 2 3 4 5]
@rlevytskyi Thanks for filing - we will keep this issue untriaged for 1 week and if it does not have a root cause we will close the issue.

The following were some recent investigations in the security plugin for your consideration.

@rlevytskyi
Author

rlevytskyi commented Feb 29, 2024

I am unable to build the OpenSearch image yet.
Moreover, the Dockerfile ( https://github.com/opensearch-project/opensearch-build/blob/main/docker/release/dockerfiles/opensearch.al2.dockerfile ) says:

This dockerfile generates an AmazonLinux-based image containing an OpenSearch installation (1.x Only).
Dockerfile for building an OpenSearch image.
It assumes that the working directory contains these files: an OpenSearch tarball (opensearch.tgz), log4j2.properties, opensearch.yml, opensearch-docker-entrypoint.sh, opensearch-onetime-setup.sh.

First of all, it says "1.x Only".
Second, it says that I have to put some files there, but I see no way to make sure I use exactly the same files you use.

So my question is: is there a way to build the image exactly as you do, to make sure we have the same configuration?

@peternied
Member

peternied commented Mar 1, 2024

@rlevytskyi I believe the new file is right next to that dockerfile. Take a look at the readme.md; maybe that will help if you are looking to construct a docker image from a custom configuration.

Note: following the suggestion to "inherit" from the 2.12 image and install/replace the JDK version to run with seems easier IMO.

@peternied
Member

peternied commented Mar 1, 2024

@rlevytskyi I'm not sure if you've managed to capture and investigate a heap dump of the OpenSearch process; see this guide to capture that information in a docker environment [1]. This will steer the investigation towards what is causing memory to be consumed. The dumps can also be used to compare 2.9 vs 2.12 and spot the difference.
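
Not from the linked guide, but roughly, capturing a dump from a dockerized node might look like this; the container name and the bundled-JDK path are assumptions:

    CONTAINER=opensearch-node1                  # placeholder container name
    JDK=/usr/share/opensearch/jdk/bin           # bundled JDK inside the official image (assumption)
    PID=$(docker exec "$CONTAINER" "$JDK/jps" | awk '/OpenSearch/ {print $1}')
    docker exec "$CONTAINER" "$JDK/jmap" -dump:live,format=b,file=/tmp/heap.hprof "$PID"
    docker cp "$CONTAINER:/tmp/heap.hprof" ./opensearch-heap.hprof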

@rlevytskyi
Author

Thank you Peter,
However, I am neither a Java programmer nor a Docker enthusiast, and the suggestion to "inherit" from the 2.12 image and install/replace the JDK version to run with isn't entirely clear to me.
As far as I understand, it could be achieved by changing "ENTRYPOINT" to "/bin/bash", starting a container, installing the new Java inside, setting JAVA_HOME, and running OpenSearch.
However, you need to rebuild the image to change ENTRYPOINT, so we end up in recursion...

@rlevytskyi
Author

Re heap dump, I managed to capture and even sanitize one using PayPal's tool https://github.com/paypal/heap-dump-tool .
However, it's not much use to capture one right now, because the cluster is currently running smoothly.

@rlevytskyi
Author

Thank you again @peternied for pointing out https://github.com/opensearch-project/opensearch-build/blob/main/docker/release/README.md
I managed to build the 2.12 image with the JDK 17 from 2.11.1.
Have a nice weekend!

@rlevytskyi
Author

rlevytskyi commented Mar 5, 2024

I managed to create an image based on 2.12 using the following Dockerfile:
FROM opensearchproject/opensearch:2.12.0
USER root
RUN dnf install -y java-17-amazon-corretto
USER opensearch
ENV JAVA_HOME=/usr
Running it on the test installation doesn't reveal any memory usage difference.
Looking forward to running the big (prod) installation with it.
Do you think it is safe?
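
For completeness, a rough way to build and sanity-check that image (the tag name is arbitrary, and this assumes the startup script honors JAVA_HOME and that the _cat/nodes jdk column is available):

    docker build -t opensearch-2.12.0-jdk17 .
    # the JDK the startup script should now pick up via JAVA_HOME=/usr
    docker run --rm --entrypoint java opensearch-2.12.0-jdk17 -version
    # once a node is up, confirm the JVM version each node is actually running
    curl -s "localhost:9200/_cat/nodes?h=name,version,jdk"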

@peternied
Member

[Triage - attendees 1 2 3 4 5]

Do you think it is safe?

@rlevytskyi Without a root cause and bugfix it is hard to say what next steps to take. I would recommend doing testing and having a mitigation plan in case something happens, but your mileage may vary.

Thanks for filing - we will keep this issue untriaged for 1 week and if it does not have a root cause we will close the issue.

Since it has been a week and there is no root cause, we are closing out this issue. Feel free to open a new issue if you find a proximal cause from a heap analysis or a way to reproduce the leak.

@tophercullen

tophercullen commented May 5, 2024

Want to chime in and say we were running into something similar after upgrading to 2.12. Suddenly all sorts of previously normal operations were causing the parent circuit breaker to trip, and there were significantly more GC logs emitted by OpenSearch overall. This problem was most exacerbated by the snapshot and reindex APIs.

I applied the image changes from @rlevytskyi to use JDK17 and it has completely solved the issues and symptoms we were seeing. Average heap dropped considerably and is much more stable.

@dblock
Member

dblock commented May 5, 2024

Sounds like upgrading to JDK 21 is the change that caused this. Seems like a real problem. I am going to reopen this and edit the title to say something to this effect. @tophercullen do you think you can help us debug what's going on? There are a few suggestions above to take some heap dumps and compare.

@dblock dblock reopened this May 5, 2024
@dblock dblock changed the title from "OpenSearch Data Nodes memory exhaustion after upgrade from 2.9 to 2.12" to "OpenSearch Data Nodes memory exhaustion after upgrade from 2.9 to 2.12 (JDK 21 upgrade)" May 5, 2024
@tophercullen

tophercullen commented May 6, 2024

Using the above paypal tool that sanitizes them, I've generated heap dumps from all nodes in a new standalone cluster (nothing else using it) while taking a full cluster snapshot at 1x JDK17 and 2x JDK21. This is 24 files and ~5GB compressed. I'm unsure what I'm supposed to be comparing between them.

From the stdout logging for the cluster, there were no GC logs with JDK17, and a bunch with JDK21. So it seems to be repeatable in an otherwise idle cluster, assuming that is not just a red herring.

Might also consider the reproducer in #12694. That seems fairly similar to our real use case and the operations where we were seeing circuit breakers tripped. Snapshots never directly tripped breakers and/or failed, and were seemingly just exacerbating the problem.

@dblock
Member

dblock commented May 6, 2024

Maybe @backslasht has some ideas about what to do with this next?

@reta
Collaborator

reta commented May 6, 2024

Using the above paypal tool that sanitizes them, I've generated heap dumps from all nodes in a new standalone cluster (nothing else using it) while taking a full cluster snapshot at 1x JDK17 and 2x JDK21. This is 24 files and ~5GB compressed. I'm unsure what I'm supposed to be comparing between them.

Maybe sharing the class histogram first could help (even as a screenshot), thanks @tophercullen
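
For a live node, a quick way to pull that histogram might be the following; the container name and JDK path are placeholders, and for the already-captured .hprof files a tool like Eclipse MAT gives the same view:

    CONTAINER=opensearch-node1
    JDK=/usr/share/opensearch/jdk/bin
    PID=$(docker exec "$CONTAINER" "$JDK/jps" | awk '/OpenSearch/ {print $1}')
    docker exec "$CONTAINER" "$JDK/jmap" -histo:live "$PID" | head -n 40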

@dblock
Member

dblock commented May 6, 2024

#12694 could be related

@ansjcy
Member

ansjcy commented May 6, 2024

This might be related to this issue in JDK: https://bugs.openjdk.org/browse/JDK-8297639
The G1UsePreventiveGC flag was introduced and set to true by default in JDK 17 (introduced in this commit, renamed in this commit). The related issue is https://bugs.openjdk.org/browse/JDK-8257774. It was introduced to solve

...bursts of short lived humongous object allocations. These bursts quickly consume all of the G1ReservePercent regions and then the rest of the free regions

In JDK 20, this flag was set to false by default and in JDK 21 it was completely removed in https://bugs.openjdk.org/browse/JDK-8293861.

Summarizing the observations and reproduction efforts by the community around this JDK issue: removing this flag might have caused the memory increase when sending and receiving documents with chunks > 2MB. In JDK 20 we can add the G1UsePreventiveGC flag back to bypass this issue, but in JDK 21 it is not an option anymore :( We either need to go back to JDK 20 with that flag enabled, or we need to explore other possible ways to fix this.
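
For anyone who wants to test that theory on an image that still ships a pre-21 JDK, the flag can be passed through the usual JVM options environment variable; a rough sketch, where the image tag and heap sizes are placeholders (JDK 21 will refuse to start with this flag, since it was removed):

    docker run -d --name opensearch-preventive-gc \
      -e OPENSEARCH_JAVA_OPTS="-Xms8g -Xmx8g -XX:+G1UsePreventiveGC" \
      opensearchproject/opensearch:2.11.1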

@reta
Collaborator

reta commented May 6, 2024

@ansjcy that was suggested before (I think on the forum) but we did not use -XX:+G1UsePreventiveGC (AFAIK)

@dblock
Member

dblock commented May 7, 2024

@rlevytskyi @tophercullen Do you still have your repro? Can you try with JDK 21 and -XX:+G1UsePreventiveGC, please?

@tophercullen

tophercullen commented May 7, 2024

@dblock I can do what I did before: create a new cluster and populate it with data, run snapshots.

However, based on what @ansjcy provided, that option is no longer available in JDK 21. The OpenJDK issue tracker links to a similar issue with Elasticsearch in this regard, which also has no solution on JDK 21.

@dblock
Member

dblock commented May 7, 2024

However based on what @ansjcy provided, that option is no longer available in JDK21.

Yes, my bad for not reading carefully enough.

@ansjcy
Member

ansjcy commented May 8, 2024

but we did not use -XX:+G1UsePreventiveGC

No, but if I'm understanding correctly, this flag was enabled by default in g1_globals.hpp for G1GC in JDK 17.

Also, today I did some more experiments using https://github.com/kroepke/opensearch-jdk21-memory (Thanks, @kroepke!). I ran bulk indexing (20MB workload per request, ~5MB per document) with a docker-based setup, each for 1 hour, in the following scenarios:

  • 2.11 with JDK 17, G1UsePreventiveGC flag enabled [1].
  • 2.11 with JDK 17, G1UsePreventiveGC flag disabled [2].
  • 2.11 with JDK 21 [3].

I captured the JVM usage results over the 1-hour runs:
[screenshot: JVM heap usage over time for the three scenarios]

  • for [1], the average jvm usage is 191707377 bytes
  • for [2], the average jvm usage is 196708634 bytes
  • for [3], the average jvm usage is 201973645 bytes

The results show a measurable but not significant impact from disabling the G1UsePreventiveGC flag in JDK 17, but there might be some unknown factors impacting the JVM usage in JDK 21 as well. We need to run even longer and heavier benchmark tests to understand this better.
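
For anyone repeating this, the per-node heap numbers can be sampled from the nodes stats API; a rough sketch, where the jq filter is an assumption about the response shape:

    # sample per-node heap usage (bytes) every 10 seconds during a benchmark run
    while true; do
      curl -s localhost:9200/_nodes/stats/jvm |
        jq -r '.nodes[] | "\(.name) \(.jvm.mem.heap_used_in_bytes)"'
      sleep 10
    done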

@backslasht
Contributor

@ansjcy - Do you think G1UsePreventiveGC is the root cause, or is it something else?

@tophercullen - Can you please share the heap dumps?

@dblock - Is there a common share location where these heap dumps can be uploaded?

@dblock
Member

dblock commented May 20, 2024

@dblock - Is there a common share location where these heap dumps can be uploaded?

AFAIK no, we don't have a place to host outputs from individual runs - I would just make an S3 bucket and give access to the folks in this thread offline if they don't have a place to put these

@zakisaad

zakisaad commented Jun 18, 2024

We're seeing background memory use climb over time (pointing to some kind of GC/memory leak as described) on our AWS managed OpenSearch clusters since the 2.13 upgrade. We went from 2.11 (where the issue was not manifesting) to 2.13. We've had to bump all our nodes from 8GB memory instances to 32GB memory instances just to keep the cluster from falling over every night.

Apart from the version upgrade, there have been no other changes.

The attached chart shows climbing min/avg/max JVM memory pressure over the last week (we've been on 2.13 for >1 week; some adverse cluster events can be seen on this chart too).

[screenshot: JVM memory pressure over the last week]

Anything we can pull from our managed clusters to help resolve this? We're sorely over-provisioned now, so we're willing to put in some legwork to solve this.

@tophercullen

tophercullen commented Jun 18, 2024

@zakisaad Since downgrading the JVM version over a month ago, we haven't had any more issues. I would check the JVM version AWS is using. If it's 21, you'll likely need to get in touch with AWS support to escalate this issue, because it has not been tested thoroughly enough for actual production use, from what we've found first-hand (see also opensearch-project/performance-analyzer-rca#545 (comment)).

If AWS is unable or unwilling to escalate this, I think your only option is to (somehow) revert to a previous version of the hosted service.

@zakisaad

We'll be reaching out to AWS support to get this resolved, as it's essentially unusable in its current state (we're rolling out a cluster reboot cron to mask the GC issues until this is resolved). Upgrades to managed OS are one-way only, so downgrading our cluster will require a restore from a snapshot -- we may attempt this if AWS can't provide a remediation timeline.

Thanks for confirming the JVM downgrade sorted this out for you; if we were self-hosting, I'd jump on it. One day, we'll have the bandwidth to internally manage our OS cluster 🙇‍♂️

@tophercullen

@zakisaad Yeah, there are pros and cons to the hosted service. My advice: don't hold your breath for AWS and/or this issue to be resolved. Create a new (older) cluster and determine if a snapshot restore is even possible, and plan an alternative data migration accordingly.

@reta
Collaborator

reta commented Jun 19, 2024

@tophercullen sadly JDK 21 provides no workaround for this issue (#12454 (comment)); downgrading is the best option, as suggested by @zakisaad

@hogesako
Contributor

The issue seemed to have been alleviated in Elasticsearch by stopping unnecessary copying of byte arrays.

elastic/elasticsearch#99592
elastic/elasticsearch#104692
elastic/elasticsearch#105712

@dblock
Member

dblock commented Jun 25, 2024

@hogesako Appreciate any fixes you can make to OpenSearch. Please make sure not to look at / copy non-APLv2 open-source code.

@zakisaad

Hi @dblock

This issue is adversely affecting our clusters in production -- as I understand it, AWS maintains OpenSearch (and provides a managed OpenSearch service to monetise the product). As it stands, the default configuration of a fully up-to-date managed OS cluster on AWS exhibits memory-leak-like behaviour. There are hacky fixes such as scheduled cluster reboots ~once a week (with over-provisioned nodes to accommodate the leaking memory...), but this is for sure a short-term fix with various shortcomings.

Our clusters aren't even that large, so I can bet other clients are seeing this issue for sure.

As Amazon has forked ES specifically to be able to continue monetising the product via their managed service, I assume AWS is expected to fix this issue, or at least treat it as important.

We haven't bothered considering self-managed clusters yet as we assumed AWS would fix an issue of this magnitude, but if AWS won't prioritise it we'll be moving off the managed service for sure. If we were self-managed, we'd be able to downgrade the JVM and avoid this issue entirely, for instance.

@42wim

42wim commented Jul 1, 2024

We're seeing the same issue here. I've rebuilt an image with JDK 17 as specified above, but this didn't solve it for us; also tested on 2.14.0.

Even with JDK 17 and after increasing memory by 400%, we need to restart our cluster every few days because of the memory issues.

So it's not just the JVM; maybe there's a memory leak in addition to a GC problem.

@kroepke

kroepke commented Jul 1, 2024

@zakisaad @42wim Are you aware of the other memory issue affecting many projects, as described in #13927?

If you use 2.14 with Java 17 and still see memory leaks, you might be looking at that instead. If so, give 2.15 a shot while keeping Java 17 if possible.

@zakisaad

zakisaad commented Jul 1, 2024

@kroepke unfortunately, as far as I know, Amazon managed OpenSearch doesn't allow us to specify/pin JDKs. We're at the mercy of whatever Amazon's development team has rolled out as part of the managed service.

@shwetathareja
Member

shwetathareja commented Jul 4, 2024

@zakisaad please reach out to AWS support to follow up on the fix for your clusters.

@42wim

42wim commented Jul 4, 2024

I've updated the 2.14.0 image with jackson-core 2.17.1, keeping the default Java (openjdk version "21.0.3" 2024-04-16 LTS) of that image.

I've downscaled the cluster back to the original 100% and it has now been running for 72h without issues.

Next week we'll upgrade to 2.15.0
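
For anyone wanting to try the same thing before moving to 2.15, a rough sketch of such a rebuild; the jar version being removed, the Maven URL, and the lib path inside the image are assumptions, not confirmed in this thread:

    FROM opensearchproject/opensearch:2.14.0
    USER root
    # fetch the patched jackson-core and drop the bundled one (versions/paths are assumptions)
    RUN curl -fsSL -o /usr/share/opensearch/lib/jackson-core-2.17.1.jar \
          https://repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-core/2.17.1/jackson-core-2.17.1.jar && \
        rm -f /usr/share/opensearch/lib/jackson-core-2.17.0.jar
    USER opensearch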

@reta
Collaborator

reta commented Jul 4, 2024

Next week we'll upgrade to 2.15.0

Thanks for the update @42wim, please share the outcomes; it would help us to pinpoint whether the issue is still there (JDK related) or gone (Jackson related).

@42wim

42wim commented Jul 12, 2024

Running 2.15.0 now for >72h; the issues are gone, so it seems Jackson related.

@dhwanilpatel
Contributor

dhwanilpatel commented Sep 16, 2024

[Indexing Triage 09/16]

Thanks @42wim for confirmation. Closing the issue now.
