Expected behavior
The newly added unit should start up without error.
Actual behavior
```
$ juju status --storage
Model  Controller  Cloud/Region         Version  SLA          Timestamp
dev    opensearch  localhost/localhost  3.1.8    unsupported  06:52:18Z

App                       Version  Status  Scale  Charm                     Channel  Rev  Exposed  Message
opensearch                         active      1  opensearch                           1  no
self-signed-certificates           active      1  self-signed-certificates  stable    72  no

Unit                         Workload  Agent  Machine  Public address  Ports  Message
opensearch/2*                error     idle   5        10.27.170.244          hook failed: "leader-elected"
self-signed-certificates/0*  active    idle   2        10.27.170.141

Machine  State    Address        Inst id        Base          AZ  Message
2        started  10.27.170.141  juju-622e8b-2  [email protected]      Running
5        started  10.27.170.244  juju-622e8b-5  [email protected]      Running

Storage Unit  Storage ID         Type        Pool                Mountpoint                   Size     Status    Message
              opensearch-data/1  filesystem  opensearch-storage                               1.0 GiB  detached
opensearch/2  opensearch-data/0  filesystem  opensearch-storage  /var/snap/opensearch/common  1.0 GiB  attached
```
```
unit-opensearch-2: 06:53:05 ERROR unit.opensearch/2.juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-opensearch-2/charm/./src/charm.py", line 267, in <module>
    main(OpenSearchOperatorCharm)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/main.py", line 544, in main
    manager.run()
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/main.py", line 520, in run
    self._emit()
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/main.py", line 509, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/main.py", line 143, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/framework.py", line 352, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/framework.py", line 851, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/venv/ops/framework.py", line 941, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/lib/charms/opensearch/v0/opensearch_base_charm.py", line 302, in _on_leader_elected
    self._put_or_update_internal_user_leader(user)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/lib/charms/opensearch/v0/opensearch_base_charm.py", line 1244, in _put_or_update_internal_user_leader
    self.user_manager.update_user_password(user, hashed_pwd)
  File "/var/lib/juju/agents/unit-opensearch-2/charm/lib/charms/opensearch/v0/opensearch_users.py", line 268, in update_user_password
    resp = self.opensearch.request(
  File "/var/lib/juju/agents/unit-opensearch-2/charm/lib/charms/opensearch/v0/opensearch_distro.py", line 266, in request
    raise OpenSearchHttpError(
charms.opensearch.v0.opensearch_exceptions.OpenSearchHttpError: HTTP error self.response_code=None
self.response_text='Host 10.27.170.244:9200 and alternative_hosts: [] not reachable.'
unit-opensearch-4: 06:53:06 ERROR juju.worker.uniter.operation hook "leader-elected" (via hook dispatching script: dispatch) failed: exit status 1
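The traceback shows the `leader-elected` hook dying on an `OpenSearchHttpError` while the node at 10.27.170.244:9200 is still unreachable. As a rough illustration only (this is not the charm's actual code; the function name and the security-API path below are assumptions), such a call could be guarded so the caller can defer the event and retry instead of failing the hook:

```python
# Hypothetical sketch, NOT the charm's actual code: guard the password-update
# request so an unreachable node leads to a retry instead of a failed hook.

class OpenSearchHttpError(Exception):
    """Stand-in for charms.opensearch.v0.opensearch_exceptions.OpenSearchHttpError."""


def update_user_password_guarded(request, user, hashed_pwd):
    """Attempt the security-API call via `request` (a stand-in for
    OpenSearchDistribution.request from the log).

    Returns True on success, False if the node is not reachable yet;
    on False the caller could defer the event and retry on redispatch.
    """
    try:
        request(
            "PATCH",
            f"/_plugins/_security/api/internalusers/{user}",
            [{"op": "replace", "path": "/hash", "value": hashed_pwd}],
        )
        return True
    except OpenSearchHttpError:
        return False
```

Whether deferring is the right fix here depends on why the node is unreachable in the first place, which is what the rest of this issue is about.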
## Issue
When attaching existing storage to a new unit, two issues occur:
- The snap install fails because of permissions/ownership of directories.
- `snap_common` gets completely deleted.
## Solution
- Bump the snap version to the fixed revision (revision 47; this is already outdated, as a newer version of the snap has been merged to main prior to this PR).
- Enhance test coverage for integration tests.
## Integration Testing
Tests for attaching existing storage can be found in `integration/ha/test_storage.py`. There are now three test cases:
1. `test_storage_reuse_after_scale_down`: remove one unit from the deployment, then add a new one re-using the storage from the removed unit. Check that the continuous writes are ok and that a test file created initially is still there.
2. `test_storage_reuse_after_scale_to_zero`: remove both units from the deployment, keep the application, then add two new units using the storage again. Check the continuous writes.
3. `test_storage_reuse_in_new_cluster_after_app_removal`: from a cluster of three units, remove all of them and remove the application. Deploy a new application (with one unit) to the same model, attach the storage, then add two more units with the other storage volumes. Check the continuous writes.
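All three cases end with a continuous-writes check. As a minimal illustration of what such a check verifies (a hypothetical helper, not the actual test code), storage re-attachment must leave an unbroken, duplicate-free sequence of write ids:

```python
# Hypothetical helper, NOT the actual test code: after storage is re-attached,
# the ids produced by the continuous-writes process should form an unbroken
# sequence 1..N, i.e. no gaps (lost writes) and no duplicates.

def writes_are_continuous(write_ids):
    """Return True if write_ids is exactly 1..N for some N, in any order."""
    return sorted(write_ids) == list(range(1, len(write_ids) + 1))
```

A gap (e.g. id 3 missing) or a duplicate id would both make this return False, which is the kind of data-loss signal these tests are meant to catch.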
## Other Issues
- As part of this PR, another issue is addressed: #306. It is resolved with commit 19f843c.
- Furthermore, this PR works around problems with acquiring the OpenSearch lock, especially when the shards of the locking index within OpenSearch are not assigned to a new primary after the former primary is removed. This was also reported in #243 and will be investigated further in #327.
Steps to reproduce
Versions
Operating system: Ubuntu 24.04 LTS, Ubuntu 22.04 LTS
Juju CLI: 3.1.8-genericlinux-amd64
Juju agent: 3.1.8
Charm revision: 47
LXD: 5.21.1 LTS
Log output
Additional context
I assume the issue is with `security_index_initialised`; it is no longer in the peer data. This is where an adjustment might be necessary: https://github.com/canonical/opensearch-operator/blob/main/lib/charms/opensearch/v0/opensearch_base_charm.py#L271
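To illustrate the suspicion (a simplified, hypothetical sketch, not the charm's actual code at the linked line): if the flag is read from the peer application databag, a key that has dropped out of peer data reads as "not initialised", so the new leader would try to set up the security index again against a node that is not up yet.

```python
# Hypothetical sketch of a peer-databag flag check, NOT the charm's actual code.

def security_index_initialised(peer_app_data):
    """`peer_app_data` stands in for the peer relation application databag,
    where values are stored as strings. A missing key reads as False."""
    return peer_app_data.get("security_index_initialised") == "True"
```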