cloudrun Failed Creation #2775

Open
jjhuff opened this issue Dec 20, 2024 · 3 comments
Labels
awaiting-feedback Blocked on input from the author kind/bug Some behavior is incorrect or out of spec

Comments

@jjhuff
jjhuff commented Dec 20, 2024

Describe what happened

I'm using Pulumi to deploy Cloud Run services for preview builds. This generally works great until someone opens a PR with a container that doesn't start up:

gcp:cloudrunv2:Service (backend):
      error:   sdk-v2/provider2.go:520: sdk.helper_schema: Error waiting to create Service: Error waiting for Creating Service: Error code 13, message: Revision 'backend-gh-69-00001-pmg' is not ready and cannot serve traffic. The user-provided container failed the configured startup probe checks. Logs for this revision might contain more information.

The resource failed to be created in Pulumi -- but the service was still created on Google's side. That means that once the bug in the container is fixed, subsequent deployments fail like:

gcp:cloudrunv2:Service (backend):
  error: 1 error occurred:
	* Error creating Service: googleapi: Error 409: Resource 'backend-gh-69' already exists.
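For anyone scripting around this in CI, the two failures above can be told apart from the error text alone. The following is only a minimal sketch: `classifyDeployError` and the `DeployFailure` type are hypothetical names, not part of Pulumi or the GCP provider, and the patterns are based solely on the two messages quoted above, so they may not match other provider versions.

```typescript
// Hypothetical helper (not a Pulumi API): classify the two error messages
// quoted in this issue so a CI job can react differently to each.
type DeployFailure =
  | { kind: "startup-probe-failed"; revision: string }
  | { kind: "already-exists"; service: string }
  | { kind: "unknown" };

function classifyDeployError(message: string): DeployFailure {
  // First failure: the revision never became ready, so the create call errored.
  const probe = message.match(
    /Revision '([^']+)' is not ready and cannot serve traffic/
  );
  if (probe) {
    return { kind: "startup-probe-failed", revision: probe[1] };
  }

  // Follow-up failure: the service exists in GCP but not in the Pulumi stack.
  const conflict = message.match(
    /Error 409: Resource '([^']+)' already exists/
  );
  if (conflict) {
    return { kind: "already-exists", service: conflict[1] };
  }

  return { kind: "unknown" };
}
```

An `already-exists` result right after a `startup-probe-failed` run is the signature of the orphaned service described here; the extracted service name can then be used for manual cleanup.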

Sample program

n/a

Log output

No response

Affected Resource(s)

No response

Output of pulumi about

$ pulumi about
CLI          
Version      3.138.0
Go Version   go1.23.2
Go Compiler  gc

Plugins
KIND      NAME    VERSION
language  nodejs  unknown

Host     
OS       ubuntu
Version  24.10
Arch     x86_64

This project is written in nodejs: executable='/nix/store/hnkyz55vndmvwhg6nzpliv86gh6sxg7h-nodejs-22.10.0/bin/node' version='v22.10.0'

Backend        
Name           pulumi.com
URL            https://app.pulumi.com/jjhuff
User           jjhuff
Organizations  jjhuff, playful, restraint_social
Token type     personal

pulumi about doesn't seem to support pnpm, so here's my deps:

{
  "name": "ops",
  "private": true,
  "devDependencies": {
    "@types/node": "^22.10.2",
    "typescript": "5.7.2"
  },
  "dependencies": {
    "@pulumi/command": "1.0.1",
    "@pulumi/docker": "4.5.8",
    "@pulumi/gcp": "8.8.0",
    "@pulumi/google-native": "0.32.0",
    "@pulumi/pulumi": "3.143.0"
  },
  "packageManager": "[email protected]",
  "engines": {
    "node": ">=18"
  }
}

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

@jjhuff jjhuff added kind/bug Some behavior is incorrect or out of spec needs-triage Needs attention from the triage team labels Dec 20, 2024
@guineveresaenger guineveresaenger removed the needs-triage Needs attention from the triage team label Dec 20, 2024
@guineveresaenger
Contributor

Hi @jjhuff - thank you for filing this issue! Having inconsistencies between your stack state and the actual cloud resource state is not ideal.

Disclaimer that most of the team will be out for the remainder of the year so responses may be a little slower than usual - thank you in advance for your patience 🙏 🌟

In the meantime, I have a couple suggestions and questions.

First off - can you tell us more about the interactions between the container and the Cloudrunv2.Service? It sounds like the wait for Service creation timed out on the provider side but not on the GCP side, but where does the container come into play?

Next - this is obviously not something you want to do routinely, but on the subsequent run that fails with "resource already exists", is running pulumi refresh an option for you? See documentation here - it should reconcile your stack with the current state of the cloud resources.

Have you taken a look at Pulumi custom timeouts? This may not be helpful to you, as you mention a container dependency, but thought I'd throw it out there.

Please let us know if these suggestions are useful to you. If not - in order for us to help you best and fastest, could you provide us with a minimal fully runnable repro with step by step instructions? Thank you so much!

@guineveresaenger guineveresaenger added the awaiting-feedback Blocked on input from the author label Dec 21, 2024
@jjhuff
Author

jjhuff commented Dec 21, 2024

> Hi @jjhuff - thank you for filing this issue! Having inconsistencies between your stack state and the actual cloud resource state is not ideal.
>
> Disclaimer that most of the team will be out for the remainder of the year so responses may be a little slower than usual - thank you in advance for your patience 🙏 🌟

No worries! I'm going to be out too :)

> First off - can you tell us more about the interactions between the container and the Cloudrunv2.Service? It sounds like the wait for Service creation timed out on the provider side but not on the GCP side, but where does the container come into play?

Yes, Service creation timed out because it was waiting for the container to start on the GCP side. If the container never successfully starts (failing a startup probe, for example), the provider will time out. Separately, it'd be nice to have an option to skip that wait.

When it times out, the Service resource is never created in the Pulumi stack.

> Next - this is obviously not something you want to do routinely, but on the subsequent run that fails with "resource already exists", is running pulumi refresh an option for you? See documentation here - it should reconcile your stack with the current state of the cloud resources.

I do a refresh each time, but that doesn't do anything for the problem above since the Service resource was never created in the stack -- there's nothing to refresh.
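Since refresh only reconciles resources already recorded in the stack, one possible workaround (not suggested by the maintainers in this thread) is to adopt the orphaned service with `pulumi import`. The sketch below only assembles the CLI arguments. `buildImportArgs` and `OrphanedService` are hypothetical names, the project and location values in the usage note are placeholders, and the type token and ID format are assumptions taken from the provider's documented import syntax, so they should be checked against the installed `@pulumi/gcp` version.

```typescript
// Hypothetical helper (not a Pulumi API): build the arguments for
// `pulumi import <type> <name> <id>` to adopt a service that exists in GCP
// but is missing from the stack.
interface OrphanedService {
  project: string;  // GCP project ID (placeholder in the usage note)
  location: string; // Cloud Run region (placeholder in the usage note)
  name: string;     // service name, e.g. taken from the 409 error message
}

function buildImportArgs(logicalName: string, svc: OrphanedService): string[] {
  // Assumed import ID format: projects/{project}/locations/{location}/services/{name}
  const id = `projects/${svc.project}/locations/${svc.location}/services/${svc.name}`;
  // Assumed type token for the cloudrunv2 Service resource.
  return ["import", "gcp:cloudrunv2/service:Service", logicalName, id];
}
```

With the names from this issue and placeholder project and region, `buildImportArgs("backend", { project: "my-project", location: "us-central1", name: "backend-gh-69" })` yields the arguments for `pulumi import gcp:cloudrunv2/service:Service backend projects/my-project/locations/us-central1/services/backend-gh-69`. Deleting the orphaned service before the next deployment would be an alternative for throwaway preview environments.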

@pulumi-bot pulumi-bot added needs-triage Needs attention from the triage team and removed awaiting-feedback Blocked on input from the author labels Dec 21, 2024
@VenelinMartinov
Contributor

@jjhuff thanks for reporting. Can you please post a short program which reproduces the problem?

Also can you please try on the latest GCP provider? We had a very similar issue which was recently fixed - I suspect this one might be too.

@VenelinMartinov VenelinMartinov removed the needs-triage Needs attention from the triage team label Dec 23, 2024
@VenelinMartinov VenelinMartinov self-assigned this Dec 23, 2024
@VenelinMartinov VenelinMartinov added the awaiting-feedback Blocked on input from the author label Jan 14, 2025
@VenelinMartinov VenelinMartinov removed their assignment Jan 14, 2025
4 participants