From 7b94ce834471d0445ff53cee4339c6169901e237 Mon Sep 17 00:00:00 2001
From: Gernot Feichter
Date: Tue, 6 Aug 2024 11:21:40 +0200
Subject: [PATCH] feat: autorecover from stuck situations

Signed-off-by: Gernot Feichter
---
 hips/hip-9999.md | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/hips/hip-9999.md b/hips/hip-9999.md
index 47e8bcf2..fe34b3f9 100644
--- a/hips/hip-9999.md
+++ b/hips/hip-9999.md
@@ -44,10 +44,14 @@ After implementation, the --timout parameter will be stored in the helm release
 have an indirect impact on possible parallel processes.
 
 `helm ls -a` shows two new columns, regular `helm ls` does NOT show those:
-- LOCKED TILL
-  calculated by the helm client: k8s server time + timeout parameter value
-- SESSION ID
-  Unique, random session id generated by the client
+- `LOCKED TILL`
+  datetime calculated by the helm client: current time + timeout parameter value
+  Originally k8s server time was intended as "current time", but since helm exclusively uses the
+  client time everywhere else, we do not change that via this HIP, such a refactoring would need
+  to be performed via a separate HIP against the entire codebase.
+- `SESSION ID`
+
+  Unique, random session id generated by the client.
 
 Furthermore, if the helm client process gets killed (SIGTERM), it tries to clear the LOCKED TILL value, SESSION ID and sets the release into a failed state before terminating in order to free the lock.
 
@@ -88,7 +92,16 @@ None
 - [ ] HIP status `accepted'
 - [x] Reference implementation
 - [x] Test for concurrent upgrade (valid lock should still block concurrent upgrade attempts)
-- [ ] Test for kill scenario (forever stuck in pending)
-- [ ] Backwards compatibility check (looking good already)
+- [x] Test for upgrading from pending state
+- [x] Test for upgrading from failed state
+- [ ] Decision: Helm ls -> which flag should show the new fields `LOCKED TILL` and `SESSION ID`?
+- [ ] Decision: k8s Lease object vs helm release secret for storing the `LOCKED TILL` and `SESSION ID`
+- [x] Backwards compatibility check (part of acceptance tests repo, looking good already, even when storing the state in the release object)
+
+## References
+
+https://github.com/helm/helm/issues/7476
+ https://github.com/rancher/rancher/issues/44530
+ https://github.com/helm/helm/issues/11863
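
Review note: the lock semantics this patch describes (`LOCKED TILL` = client time + timeout, a random `SESSION ID`, and the lock being cleared when the client is killed) can be sketched roughly as below. This is an illustrative Python model only, not Helm's Go implementation. The names `acquire_lock`, `is_locked`, and `release_lock`, the dict layout, and the rule that an expired lock no longer blocks are assumptions made for the sketch.

```python
import secrets
from datetime import datetime, timedelta, timezone


def acquire_lock(timeout_seconds, now=None):
    """Build the two lock fields (hypothetical layout, not Helm's storage format)."""
    if now is None:
        now = datetime.now(timezone.utc)  # client time, as the HIP text states
    return {
        "locked_till": now + timedelta(seconds=timeout_seconds),
        "session_id": secrets.token_hex(16),  # unique, random session id
    }


def is_locked(lock, now=None):
    """A lock blocks other clients only while it exists and has not expired."""
    if now is None:
        now = datetime.now(timezone.utc)
    return lock is not None and now < lock["locked_till"]


def release_lock(release):
    """Model of the SIGTERM path: clear both fields and mark the release failed."""
    release["lock"] = None
    release["status"] = "failed"
    return release
```

Under these assumptions, a release left in a pending state by a crashed client becomes upgradable again once `locked_till` has passed, while a lock that is still valid keeps blocking concurrent upgrade attempts.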