
OpenSearch Dashboards failures after upgrade 2.9 to 2.12 #5939

Open
rlevytskyi opened this issue Feb 23, 2024 · 5 comments
Labels: bug (Something isn't working), needs research

Comments

rlevytskyi commented Feb 23, 2024

Dashboards Suddenly Dies
Hello OpenSearch Team,
We’ve just updated our OpenSearch cluster from version 2.9.0 to 2.12.0.
Among other issues, we’ve noticed that the OpenSearch Dashboards container sometimes gets stopped unexpectedly. There is no error message in its log, but there are several entries like these in the system log (slightly reduced):

vm85 dockerd[1011]: msg="ignoring event" container=e490 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
vm85 containerd[902]: msg="shim disconnected" id=e490
vm85 containerd[902]: msg="cleaning up after shim disconnected" id=e490 namespace=moby
vm85 containerd[902]: msg="cleaning up dead shim"
vm85 containerd[902]: msg="cleanup warnings time=\"2024-02-23T14:27:19Z\" level=info msg=\"starting signal loop\" namespace=moby pid=11722 runtime=io.containerd.runc.v2\n"
vm85 dockerd[1011]: msg="ShouldRestart failed, container will not be restarted" container=e490 daemonShuttingDown=false error="restart canceled" execDuration=10m7.524639324s exitStatus="{0 2024-02-23 14:27:18.998984252 +0000 UTC}" hasBeenManuallyStopped=true restartCount=4
vm85 containerd[902]: msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
vm85 containerd[902]: msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
vm85 containerd[902]: msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
vm85 containerd[902]: msg="starting signal loop" namespace=moby path=/run/containerd/io.containerd.runtime.v2.task/moby/e490 pid=11753 runtime=io.containerd.runc.v2

I managed to fix this by uncommenting and changing the following line in the node.options configuration file:
--max-old-space-size=6100

My questions are:

  • What is the default value for the memory limit?
  • Are there any recommended values?
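For context, when `--max-old-space-size` is not set, Node.js falls back to its built-in old-space heap limit, which is commonly around 2 GB on 64-bit builds (recent Node versions may scale it with available system memory). A sketch of the relevant node.options fragment is below; the comment text is mine, not verbatim from the distribution:

```
# config/node.options (sketch, not verbatim from the distribution)
# Old-space heap limit for the Dashboards Node.js process, in MB.
# Left commented out, Node.js uses its built-in default
# (roughly 2 GB on typical 64-bit builds).
--max-old-space-size=6100
```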

To Reproduce
Steps to reproduce the behavior:

  1. Open any complex dashboard consisting of multiple items.

Expected behavior
In 2.9, our dashboards were rendering properly.

OpenSearch Version
2.12 using Docker image opensearchproject/opensearch:2.12.0

Dashboards Version
2.12 using Docker image opensearchproject/opensearch-dashboards:2.12.0

Plugins
Default list that came with the distribution.

Screenshots
Not applicable.

Host/Environment (please complete the following information):

  • Oracle Linux Server release 8.9
  • Ubuntu 23.10
  • Google Chrome Version 121.0.6167.160 (Official Build) (64-bit)

Additional context
No additional context yet.

@rlevytskyi added labels bug (Something isn't working) and untriaged on Feb 23, 2024
abbyhu2000 (Member) commented:

Spike task: look into the performance issue from 2.9 to 2.11. @kavilla @manasvinibs

wbeckler commented:

@rlevytskyi would you be willing to share any more details about your settings/plugins/indexes to help us reproduce and diagnose?

rlevytskyi (Author) commented Feb 27, 2024

Thank you @wbeckler for your reply!
We are running a cluster without dedicated manager nodes: four nodes act as both data and master-eligible nodes, plus two coordinating nodes.

  • 4 data nodes, each with a 112 GB JVM heap (Xmx) and 13.6 TB of storage
  • 5500 indices (mostly small with 1 shard each, but several big ones with 4 shards), at up to 75% of storage capacity
  • 26600 shards
  • upgraded from 2.9 to 2.12 and raised the Xmx heap to 128 GB
  • had to close 2000 indices to make the cluster operable again

Some details are in the neighboring topic opensearch-project/OpenSearch#12454.
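As a rough back-of-the-envelope check of those numbers (the ~25-shards-per-GB-of-heap figure is general OpenSearch sizing guidance, not something stated in this thread):

```shell
# Rough shard-density check for the cluster described above.
shards=26600
data_nodes=4
heap_gb=112   # Xmx per data node, as reported

per_node=$((shards / data_nodes))
budget=$((heap_gb * 25))   # rule of thumb: ~25 shards per GB of heap

echo "shards per data node: $per_node (rule-of-thumb budget: $budget)"
```

By this measure each data node carries far more shards than the rule of thumb suggests, which is consistent with having to close indices to keep the cluster operable. Note also that JVM heaps above roughly 32 GB give up compressed object pointers, so 112 to 128 GB heaps are themselves unusually large.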


wbeckler commented Mar 1, 2024

Is it possible that the memory issue for your data nodes is starving resources from your dashboard containers?

rlevytskyi (Author) commented:

@wbeckler Absolutely not; the VM running Dashboards has 32 GB of RAM, and the coordinating node on it runs with a 12 GB heap.
There are no OOM kills or similar entries in the system logs.
Adding some memory to Dashboards (Kibana) helped.
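For anyone hitting the same thing under Docker Compose, an alternative to editing node.options inside the image is passing the flag via the NODE_OPTIONS environment variable, which Node.js itself honors for --max-old-space-size. A minimal sketch (the service name, memory values, and the mem_limit key are assumptions for illustration, not from this thread):

```yaml
services:
  dashboards:
    image: opensearchproject/opensearch-dashboards:2.12.0
    environment:
      # Raise the Node.js old-space limit (MB); equivalent to the
      # node.options change described earlier in this issue.
      - NODE_OPTIONS=--max-old-space-size=6100
    mem_limit: 8g   # Compose v2-style container memory cap (assumed value)
```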
