Release Manager: issue when the Release Manager pipeline tries to retrieve an existing pod #201

Open
jssnull opened this issue Sep 1, 2020 · 2 comments
Labels
bug Something isn't working

Comments


jssnull commented Sep 1, 2020

When deploying an RShiny app to QA, the deployment succeeds in OpenShift: a ReplicationController with 1 replica is created, and its Pod comes up and runs. However, the Jenkins pipeline is marked as failed. The reason is a step that tries to find the newly created Pod and cannot find it. We suspect a timing issue: when the pipeline looks for the Pod, it has not yet been created in the cluster.

This is the error reported by Jenkins:
image

We can see that the pod was created successfully but was not recognized in the last log:
image

We can see the final status of the pipeline graphically when it fails:
image

This issue happens when we try to deploy to QA and PROD.
If we launch the pipeline again, it succeeds, but it does not create a new pod.
The issue could be caused by a timing problem.

Maybe a sleep function could be added here to solve the issue:
https://github.com/opendevstack/ods-mro-jenkins-shared-library/blob/2.x/src/org/ods/service/OpenShiftService.groovy#L280-L310

ODS 3.x already implements a similar sleep function, which could be helpful:
https://github.com/opendevstack/ods-jenkins-shared-library/blob/3.x/src/org/ods/services/OpenShiftService.groovy#L433-L459
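
A minimal sketch of the retry idea discussed above, written as shell rather than the Groovy of the linked library. The helper names `retry_until` and `pod_exists`, the attempt count, and the delay are all assumptions for illustration; they are not taken from either ODS code base.

```shell
# retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry_until() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1  # give up: the command never succeeded
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# Hypothetical check: succeeds once at least one pod matches the label selector.
pod_exists() {
  [ -n "$(oc get pods -l "$1" -o name 2>/dev/null)" ]
}
```

For example, `retry_until 5 10 pod_exists app=hello` would check up to 5 times, sleeping 10 seconds between attempts, and return non-zero only if no matching pod ever appears, so the pipeline step fails only after the retries are exhausted.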

@michaelsauter michaelsauter added the bug Something isn't working label Sep 1, 2020

segator commented Sep 1, 2020

I think we should update the code to wait for pod readiness before trying to extract the information. We can use labels for that, for example, or a deployment reference. We could even use the pod name, but if our deployment object has multiple replicas, it is best to wait for all pods to be running.
Something like:

while [[ $(oc get pods -l app=hello -o 'jsonpath={..status.conditions[?(@.type=="Ready")].status}') != "True" ]]; do echo "waiting for pod" && sleep 1; done
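
With several replicas, the jsonpath query typically prints one status per pod (for example `True True`), so comparing the whole output to `"True"` is not enough. The following is a small sketch of a check that requires every listed status to be `True`. The function names `all_ready` and `pods_ready` are hypothetical, and the label selector is whatever the deployment actually uses.

```shell
# all_ready STATUSES
# Succeeds only if STATUSES holds at least one whitespace-separated entry
# and every entry is exactly "True".
all_ready() {
  seen=0
  for status in $1; do            # unquoted on purpose: split on whitespace
    [ "$status" = "True" ] || return 1
    seen=1
  done
  [ "$seen" -eq 1 ]               # an empty list (no pods yet) is "not ready"
}

# pods_ready SELECTOR
# Assumption: pods are selected by label, as in the command above.
pods_ready() {
  all_ready "$(oc get pods -l "$1" -o 'jsonpath={..status.conditions[?(@.type=="Ready")].status}' 2>/dev/null)"
}
```

A loop such as `until pods_ready app=hello; do sleep 1; done` would then wait for every matching pod, ideally with an upper bound on the number of attempts.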

@clemensutschig
Member

@jssnull - can you try the same with ODS 3.x and see if that solves it?
