
RAG DA does not uninstall sometimes #165

Open
gmendel opened this issue Aug 7, 2024 · 7 comments

@gmendel

gmendel commented Aug 7, 2024

There are many reasons why an Uninstall will not work:

  • Secrets Manager free plan lapses
  • Resources are out of sync
  • Config is out of sync
  • ....

It does not matter what the reason is; the bar/expectation is that the DA can ALWAYS uninstall and clean up.
This implies potentially running pre-uninstall script(s) to sync up the state.
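One possible shape for such a pre-uninstall sync step, as a sketch only (this is not the DA's actual tooling, and the resource addresses are hypothetical examples modeled on the errors reported later in this thread):

```shell
#!/bin/sh
# Sketch of a pre-uninstall "state sync" step: drop state entries whose
# backing service instance no longer exists, so terraform destroy can
# proceed without failing during refresh.
set -e

# Hypothetical example: secrets that lived in a Secrets Manager instance
# that was already deleted out-of-band.
STALE_ADDRESSES="
module.secrets_manager_secret_ibm_iam[0].ibm_sm_arbitrary_secret.arbitrary_secret[0]
module.secrets_manager_secret_signing_key[0].ibm_sm_arbitrary_secret.arbitrary_secret[0]
"

for addr in $STALE_ADDRESSES; do
  # state rm only edits the state file; it never touches real resources.
  terraform state rm "$addr" || true
done

# With the stale entries gone, a refresh-free destroy has a chance to succeed.
terraform destroy -refresh=false -auto-approve
```

The hard part is detecting which addresses are stale; the sketch hard-codes them, whereas a real pre-uninstall script would have to probe each backing service first.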

@ocofaigh
Member

@gmendel FYI, there is a step here that says:

  1. Delete Resources Created by the CI toolchain
    Those resources are not destroyed automatically as part of undeploying the stack in Project:
  • Code Engine Project: Delete the code engine project created for the sample application.
  • Container Registry Namespace: Delete the container registry namespace created by the CI toolchain.

And actions are being taken on addressing those (such as using a standalone Code Engine DA so the code engine project is in the terraform state and can be destroyed).
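For reference, bringing the project under Terraform management (so a destroy cleans it up) would look roughly like the following; this is a minimal sketch using the IBM provider's `ibm_code_engine_project` resource, with placeholder names:

```hcl
# Minimal sketch: once the project is a Terraform-managed resource,
# "terraform destroy" removes it along with the rest of the stack.
resource "ibm_code_engine_project" "sample_app" {
  name              = "sample-app-project" # placeholder name
  resource_group_id = var.resource_group_id
}
```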

But interesting point on the Secrets Manager free plan lapsing. We may need to figure out how to handle that use case.

@vburckhardt
Member

Example of the error when undeploying the RAG DA if the Secrets Manager instance has been destroyed before running the undeploy:

 2024/09/26 09:54:19 Terraform refresh | 
 2024/09/26 09:54:19 Terraform refresh | Error: GetSecretWithContext failed Get "https://5c724fcf-5e0c-47a1-b8a7-71d6301657f4.private.eu-de.secrets-manager.appdomain.cloud/api/v2/secrets/cddf891b-7f05-190b-728f-e8e401d153bd": dial tcp: lookup 5c724fcf-5e0c-47a1-b8a7-71d6301657f4.private.eu-de.secrets-manager.appdomain.cloud on 172.21.0.10:53: no such host
 2024/09/26 09:54:19 Terraform refresh | null
 2024/09/26 09:54:19 Terraform refresh | 
 2024/09/26 09:54:19 Terraform refresh | 
 2024/09/26 09:54:19 Terraform refresh |   with module.secrets_manager_secret_ibm_iam[0].ibm_sm_arbitrary_secret.arbitrary_secret[0],
 2024/09/26 09:54:19 Terraform refresh |   on .terraform/modules/secrets_manager_secret_ibm_iam/main.tf line 37, in resource "ibm_sm_arbitrary_secret" "arbitrary_secret":
 2024/09/26 09:54:19 Terraform refresh |   37: resource "ibm_sm_arbitrary_secret" "arbitrary_secret" {
 2024/09/26 09:54:19 Terraform refresh | 
 2024/09/26 09:54:19 Terraform refresh | 
 2024/09/26 09:54:19 Terraform refresh | Error: GetSecretWithContext failed Get "https://5c724fcf-5e0c-47a1-b8a7-71d6301657f4.private.eu-de.secrets-manager.appdomain.cloud/api/v2/secrets/4a630cd7-ccfc-75e6-a0ed-1f7eb652e28e": dial tcp: lookup 5c724fcf-5e0c-47a1-b8a7-71d6301657f4.private.eu-de.secrets-manager.appdomain.cloud on 172.21.0.10:53: no such host
 2024/09/26 09:54:19 Terraform refresh | null
 2024/09/26 09:54:19 Terraform refresh | 
 2024/09/26 09:54:19 Terraform refresh | 
 2024/09/26 09:54:19 Terraform refresh |   with module.secrets_manager_secret_signing_key[0].ibm_sm_arbitrary_secret.arbitrary_secret[0],
 2024/09/26 09:54:19 Terraform refresh |   on .terraform/modules/secrets_manager_secret_signing_key/main.tf line 37, in resource "ibm_sm_arbitrary_secret" "arbitrary_secret":
 2024/09/26 09:54:19 Terraform refresh |   37: resource "ibm_sm_arbitrary_secret" "arbitrary_secret" {
 2024/09/26 09:54:19 Terraform refresh | 
 2024/09/26 09:54:19 Terraform REFRESH error: Terraform REFRESH errorexit status 1
 2024/09/26 09:54:19 Could not execute job: Error : Terraform REFRESH errorexit status 1

@ocofaigh
Member

ocofaigh commented Sep 26, 2024

@vburckhardt what can we do to solve this? I guess the instance would remain in reclamation for a period of time, and the user would have to request that it be recovered and, at the same time, purchase a standard plan?
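If the instance is still within its reclamation window, the recovery side of this could be sketched with the IBM Cloud CLI (the commands below are from the standard `ibmcloud resource` command group; the reclamation ID is a placeholder the user would have to look up):

```shell
# List pending reclamations to find the deleted Secrets Manager instance.
ibmcloud resource reclamations

# Restore it by reclamation ID (placeholder value shown).
ibmcloud resource reclamation-restore <RECLAMATION_ID>
```

After the restore, the user would still need to upgrade off the free plan before the DA's undeploy is retried.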

@hmagph

hmagph commented Oct 10, 2024

Bit of a similar error for me, when trying to redeploy the SM DA 1.18.7 after its SM instance got discarded (accidentally).
In my case, it was coming from the DevSecOps ALM stack.
https://cloud.ibm.com/projects/71ebf194-18f9-45c2-bc22-9d7c62e4ef54/configurations/49710c65-3a60-4ddf-8a6c-7b226ddec42f/edit

 2024/10/09 15:28:21 Terraform plan | Changes to Outputs:
 2024/10/09 15:28:21 Terraform plan |   ~ secrets_manager_crn    = "crn:v1:bluemix:public:secrets-manager:eu-de:a/ab0571c606236c08ccd5471e264911a2:d1e1f9b9-b359-4bf4-8964-92eac0cca836::" -> (known after apply)
 2024/10/09 15:28:21 Terraform plan |   ~ secrets_manager_guid   = "d1e1f9b9-b359-4bf4-8964-92eac0cca836" -> (known after apply)
 2024/10/09 15:28:21 Terraform plan |   ~ secrets_manager_id     = "crn:v1:bluemix:public:secrets-manager:eu-de:a/ab0571c606236c08ccd5471e264911a2:d1e1f9b9-b359-4bf4-8964-92eac0cca836::" -> (known after apply)
 2024/10/09 15:28:21 Terraform plan | 
 2024/10/09 15:28:21 Terraform plan | Warning: Argument is deprecated
 2024/10/09 15:28:21 Terraform plan | 
 2024/10/09 15:28:21 Terraform plan |   with module.kms[0].module.kms_key_rings["devsecops-sm-cos-key-ring"].ibm_kms_key_rings.key_ring,
 2024/10/09 15:28:21 Terraform plan |   on .terraform/modules/kms.kms_key_rings/main.tf line 9, in resource "ibm_kms_key_rings" "key_ring":
 2024/10/09 15:28:21 Terraform plan |    9:   force_delete  = var.force_delete
 2024/10/09 15:28:21 Terraform plan | 
 2024/10/09 15:28:21 Terraform plan | force_delete is now deprecated. Please remove all references to this field.
 2024/10/09 15:28:21 Terraform plan | 
 2024/10/09 15:28:21 Terraform plan | Error: GetNotificationsRegistrationWithContext failed Get "https://d1e1f9b9-b359-4bf4-8964-92eac0cca836.private.eu-de.secrets-manager.appdomain.cloud/api/v2/notifications/registration": dial tcp: lookup d1e1f9b9-b359-4bf4-8964-92eac0cca836.private.eu-de.secrets-manager.appdomain.cloud on 172.21.0.10:53: no such host
 2024/10/09 15:28:21 Terraform plan | null
 2024/10/09 15:28:21 Terraform plan | 
 2024/10/09 15:28:21 Terraform plan | 
 2024/10/09 15:28:21 Terraform plan |   with module.secrets_manager.ibm_sm_en_registration.sm_en_registration[0],
 2024/10/09 15:28:21 Terraform plan |   on ../../main.tf line 139, in resource "ibm_sm_en_registration" "sm_en_registration":
 2024/10/09 15:28:21 Terraform plan |  139: resource "ibm_sm_en_registration" "sm_en_registration" {
 2024/10/09 15:28:21 Terraform plan | 

@hiltol

hiltol commented Oct 29, 2024

I am also encountering errors when trying to undeploy the stack.
[Screenshot 2024-10-29 at 4:13:54 PM]

[Screenshot 2024-10-29 at 4:18:51 PM]

Workspace Logs:
workspace-logs.txt

@ocofaigh
Member

ocofaigh commented Nov 8, 2024

@hiltol Your errors seem to be related to permissions: Error: DeleteTektonPipelinePropertyWithContext failed Forbidden (cc @padraic-edwards @huayuenh)

@huayuenh

huayuenh commented Nov 8, 2024

@ocofaigh a Continuous Delivery service is a hard requirement for the ALM. It must have been in place when the ALM was stood up, but deleted before attempting to remove the ALM. Was the CD service stood up using the ALM?


7 participants