Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error messages around AWS config #3310

Merged
merged 28 commits into from
Jan 31, 2024

Conversation

VenelinMartinov
Copy link
Contributor

Should fully address #2285 after pulumi/pulumi-terraform-bridge#1640

This makes the error messages when the user has no credentials or no region configured better and more actionable:

Before, no credentials configured:

error: pulumi:providers:aws resource 'default_6_18_2' has a problem: could not validate provider configuration: unable to validate AWS credentials.
    Details: No valid credential sources found. Please see https://www.pulumi.com/registry/packages/aws/installation-configuration/
    for more information about providing credentials.

    Error: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, request canceled, context deadline exceeded

    Make sure you have set your AWS region, e.g. `pulumi config set aws:region us-west-2`.

The line about the region is irrelevant here.

After, no credentials configured:

Diagnostics:
  pulumi:providers:aws (default_6_18_2):
    error: pulumi:providers:aws resource 'default_6_18_2' has a problem: could not validate provider configuration: unable to validate AWS credentials.
    Details: No valid credential sources found. Please see https://www.pulumi.com/registry/packages/aws/installation-configuration/
    for more information about providing credentials.

    Error: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, request canceled, context deadline exceeded

Before, no region configured:

Diagnostics:
  pulumi:providers:aws (default_6_18_2):
    error: pulumi:providers:aws resource 'default_6_18_2' has a problem: could not validate provider configuration: unable to validate AWS credentials.
    Details: validating provider credentials: retrieving caller identity from STS: operation error STS: GetCallerIdentity, https response error StatusCode: 0, RequestID: , request send failed, Post "https://sts..amazonaws.com/": dial tcp: lookup sts..amazonaws.com: no such host
    Make sure you have set your AWS region, e.g. `pulumi config set aws:region us-west-2`.

Here, it is not at all clear that it is the region at fault, since the note about setting the region shows up every time.

After, no region configured:

Diagnostics:
  pulumi:providers:aws (default_6_18_2):
    error: pulumi:providers:aws resource 'default_6_18_2' has a problem: could not validate provider configuration: missing region information
    Make sure you have set your AWS region, e.g. `pulumi config set aws:region us-west-2`.
    Details: validating provider credentials: retrieving caller identity from STS: operation error STS: GetCallerIdentity, https response error StatusCode: 0, RequestID: , request send failed, Post "https://sts/..amazonaws.com/": dial tcp: lookup sts..amazonaws.com: no such host

The note about config set aws:region only shows up in this error case, so clearly actionable.

For comparison, upstream, no credentials configured:

│ Error: configuring Terraform AWS Provider: no valid credential sources for Terraform AWS Provider found.
│
│ Please see https://registry.terraform.io/providers/hashicorp/aws
│ for more information about providing credentials.
│
│ AWS Error: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, http response error StatusCode: 404, request to EC2 IMDS failed
│
│
│   with provider["registry.terraform.io/hashicorp/aws"],
│   on main.tf line 12, in provider "aws":
│   12: provider "aws" {
│
╵

Copy link

Does the PR have any schema changes?

Does the PR have any schema changes?

Looking good! No breaking changes found.
No new resources/functions.

Maintainer note: consult the runbook for dealing with any breaking changes.

Copy link
Member

@iwahbe iwahbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

While I would love to have a test, I don't see an easy way to do that. I think this is good to go as is.

@t0yv0
Copy link
Member

t0yv0 commented Jan 24, 2024

Diagnostics:
  pulumi:providers:aws (default_6_18_2):
    error: pulumi:providers:aws resource 'default_6_18_2' has a problem: could not validate provider configuration: unable to validate AWS credentials.
    Details: No valid credential sources found. Please see https://www.pulumi.com/registry/packages/aws/installation-configuration/
    for more information about providing credentials.

    Error: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, request canceled, context deadline exceeded

This is still very verbose for this simple condition... why should I know about EC2 IMDS as a user.

Sorry for being difficult but I think it's worth to spend an hour or two to get this to a very nice place for our users and pin the behavior with a test since we keep kind of changing how this presents. Ideally there's something that's easy for our friends from the docs team to help edit. Appreciate it!

I realize the ask would put us ahead of what Terraform does :)

@t0yv0
Copy link
Member

t0yv0 commented Jan 24, 2024

I don't see an easy way to do that

ProgramTest w/o credentials that expects to fail and than asserts against stdout/stderr might work.

@VenelinMartinov
Copy link
Contributor Author

VenelinMartinov commented Jan 25, 2024

Updated errors:

Error with no credentials:

Diagnostics:
  pulumi:providers:aws (default_6_18_2):
    error: pulumi:providers:aws resource 'default_6_18_2' has a problem: No valid credential sources found.
    Please see https://www.pulumi.com/registry/packages/aws/installation-configuration/for more information about providing credentials.
    NEW: You can use Pulumi ESC to set up dynamic credentials with AWS OIDC to ensure the correct and valid credentials are used.
    Learn more: https://www.pulumi.com/registry/packages/aws/installation-configuration/#dynamically-generate-credentials

Error with no region:

Diagnostics:
  pulumi:providers:aws (default_6_18_2):
    error: pulumi:providers:aws resource 'default_6_18_2' has a problem: Missing region information
    Make sure you have set your AWS region, e.g. `pulumi config set aws:region us-west-2`.

Error with region but invalid credentials:

Diagnostics:
  pulumi:providers:aws (default_6_18_2):
    error: pulumi:providers:aws resource 'default_6_18_2' has a problem: Invalid credentials configured.
    Please see https://www.pulumi.com/docs/intro/cloud-providers/aws/setup/ for more information about providing credentials.
    NEW: You can use Pulumi ESC to set up dynamic credentials with AWS OIDC to ensure the correct and valid credentials are used.
    Learn more: https://www.pulumi.com/registry/packages/aws/installation-configuration/#dynamically-generate-credentials

Expired SSO credentials:

Diagnostics:
  pulumi:providers:aws (default_6_18_2):
    error: pulumi:providers:aws resource 'default_6_18_2' has a problem: Failed to refresh cached SSO credentials.
    Please refresh SSO login.
    NEW: You can use Pulumi ESC to set up dynamic credentials with AWS OIDC to ensure the correct and valid credentials are used.
    Learn more: https://www.pulumi.com/registry/packages/aws/installation-configuration/#dynamically-generate-credentials

Copy link
Member

@t0yv0 t0yv0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢 🚀

@VenelinMartinov VenelinMartinov enabled auto-merge (squash) January 25, 2024 16:36
@t0yv0
Copy link
Member

t0yv0 commented Jan 25, 2024

error: pulumi:providers:aws resource 'default_6_18_2' has a problem: Invalid credentials configured. may be considered slightly redundant but I actually like it in cases where there is more than one provider in play.

@VenelinMartinov VenelinMartinov enabled auto-merge (squash) January 27, 2024 12:56
@VenelinMartinov VenelinMartinov force-pushed the vvm/improve_beginner_error_messages branch from 240efbf to 50bcc1a Compare January 29, 2024 11:51
Comment on lines 142 to 144
// We need to reset the credentialsValidationRun flag to false, since the
// previous test might have set it to true.
credentialsValidationRun.Store(false)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behaviour is unfortunate and not very obvious. It took me a while to realise that the flag was the reason only the first test in a test run would succeed.

LMK if anyone has a better idea how to fix this - the issue is that the credential validation code is run only once because of

if credentialsValidationRun.CompareAndSwap(false, true) {

This is normally the behaviour we want, except that during tests we run the same "provider" binary multiple times and the flag needs to be reset between runs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The easiest way to do it would be to make preConfigureCallback a factory that takes a reference to atomic.Bool and returns func(vars resource.PropertyMap, config shim.ResourceConfig) error. Then you could put credentialsValidationRun inside of the Provider() function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would require a third interface for preConfigureCallback. If we are doing that why not implement a preConfigureOnce and then have the logic in the bridge around making sure we only call it once?

Not sure if it's worth doing for the sake of the tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is fine-ish this is what we have to do to test global state. To get rid of global state we'd need to move this to a parameter of provider somehow so tests can set something else in there

// We should only run the validation once to avoid duplicating the reported errors.
var credentialsValidationRun atomic.Bool

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would require a third interface for preConfigureCallback.

No it wouldn't. I'm not suggesting making tfbridge.ProviderInfo.PreConfigureCallback a factory. I'm suggesting that we make the local preConfigureCallback a factory to capture a reference to a atomic.Bool on Provider()'s stack:

 provider/resources.go | 52 ++++++++++++++++++++++++++-------------------------
 1 file changed, 27 insertions(+), 25 deletions(-)

diff --git a/provider/resources.go b/provider/resources.go
index a0705e4db1..8d598f6b73 100644
--- a/provider/resources.go
+++ b/provider/resources.go
@@ -731,36 +731,35 @@ func validateCredentials(vars resource.PropertyMap, c shim.ResourceConfig) error
 	return nil
 }
 
-// We should only run the validation once to avoid duplicating the reported errors.
-var credentialsValidationRun atomic.Bool
-
 // preConfigureCallback validates that AWS credentials can be successfully discovered. This emulates the credentials
 // configuration subset of `github.com/terraform-providers/terraform-provider-aws/aws.providerConfigure`.  We do this
 // before passing control to the TF provider to ensure we can report actionable errors.
-func preConfigureCallback(vars resource.PropertyMap, c shim.ResourceConfig) error {
-	skipCredentialsValidation := boolValue(vars, "skipCredentialsValidation",
-		[]string{"AWS_SKIP_CREDENTIALS_VALIDATION"})
-
-	// if we skipCredentialsValidation then we don't need to do anything in
-	// preConfigureCallback as this is an explicit operation
-	if skipCredentialsValidation {
-		log.Printf("[INFO] pulumi-aws: skip credentials validation")
-		return nil
-	}
+func preConfigureCallback(alreadyRun *atomic.Bool) func(resource.PropertyMap, shim.ResourceConfig) error {
+	return func(vars resource.PropertyMap, c shim.ResourceConfig) error {
+		skipCredentialsValidation := boolValue(vars, "skipCredentialsValidation",
+			[]string{"AWS_SKIP_CREDENTIALS_VALIDATION"})
+
+		// if we skipCredentialsValidation then we don't need to do anything in
+		// preConfigureCallback as this is an explicit operation
+		if skipCredentialsValidation {
+			log.Printf("[INFO] pulumi-aws: skip credentials validation")
+			return nil
+		}
 
-	var err error
-	if credentialsValidationRun.CompareAndSwap(false, true) {
-		log.Printf("[INFO] pulumi-aws: starting to validate credentials. " +
-			"Disable this by AWS_SKIP_CREDENTIALS_VALIDATION or " +
-			"skipCredentialsValidation option")
-		err = validateCredentials(vars, c)
-		if err == nil {
-			log.Printf("[INFO] pulumi-aws: credentials are valid")
-		} else {
-			log.Printf("[INFO] pulumi-aws: error validating credentials: %v", err)
+		var err error
+		if alreadyRun.CompareAndSwap(false, true) {
+			log.Printf("[INFO] pulumi-aws: starting to validate credentials. " +
+				"Disable this by AWS_SKIP_CREDENTIALS_VALIDATION or " +
+				"skipCredentialsValidation option")
+			err = validateCredentials(vars, c)
+			if err == nil {
+				log.Printf("[INFO] pulumi-aws: credentials are valid")
+			} else {
+				log.Printf("[INFO] pulumi-aws: error validating credentials: %v", err)
+			}
 		}
+		return err
 	}
-	return err
 }
 
 // managedByPulumi is a default used for some managed resources, in the absence of something more meaningful.
@@ -785,6 +784,9 @@ func ProviderFromMeta(metaInfo *tfbridge.MetadataInfo) *tfbridge.ProviderInfo {
 	ctx := context.Background()
 	upstreamProvider := newUpstreamProvider(ctx)
 
+	// We should only run the validation once to avoid duplicating the reported errors.
+	var credentialsValidationRun atomic.Bool
+
 	p := pftfbridge.MuxShimWithDisjointgPF(ctx, shimv2.NewProvider(upstreamProvider.SDKV2Provider, shimv2.WithDiffStrategy(shimv2.PlanState)), upstreamProvider.PluginFrameworkProvider)
 
 	prov := tfbridge.ProviderInfo{
@@ -840,7 +842,7 @@ func ProviderFromMeta(metaInfo *tfbridge.MetadataInfo) *tfbridge.ProviderInfo {
 				},
 			},
 		},
-		PreConfigureCallback: preConfigureCallback,
+		PreConfigureCallback: preConfigureCallback(&credentialsValidationRun),
 		Resources: map[string]*tfbridge.ResourceInfo{
 			// AWS Certificate Manager
 			"aws_acm_certificate_validation": {

@VenelinMartinov
Copy link
Contributor Author

This needs a release with pulumi/pulumi-terraform-bridge#1654

@VenelinMartinov
Copy link
Contributor Author

VenelinMartinov commented Jan 31, 2024

I think we are hitting issues around PF and SDKV2 bridge applying config defaults - I believe that when the sdkv2 CheckConfig runs, we get the defaults and the TestOtherFailureErrorMessage test succeeds. When the PF one wins, it doesn't apply the defaults before failing and the test fails.

I don't believe this makes a difference for correctness here but I don't think we can express "match string or missing" in our json matching library. I'll change the test with explicit inputs for the config, hopefully that fixes the inconsistency.

Related: pulumi/pulumi-terraform-bridge#1603

@VenelinMartinov VenelinMartinov enabled auto-merge (squash) January 31, 2024 19:00
@VenelinMartinov VenelinMartinov merged commit be20793 into master Jan 31, 2024
18 checks passed
@VenelinMartinov VenelinMartinov deleted the vvm/improve_beginner_error_messages branch January 31, 2024 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants