Don't persist the provider's assumeRole attribute within the state #3149

Open
oboukili opened this issue Dec 16, 2023 · 9 comments
Labels
area/credentials Authenticating the provider awaiting/core Blocked on a missing bug or feature in pulumi/pulumi (except codegen) kind/bug Some behavior is incorrect or out of spec

Comments

@oboukili

oboukili commented Dec 16, 2023

This breaks refreshes: the assume role attribute value cached in the state is used instead of the currently configured value.

A typical broken example, in my case, would be a two-step preview (PR) / release (push) CI pipeline where the PR workflows assume a read-only IAM role, while the release workflows assume a read-write IAM role.

I'm new to Pulumi, but more generally, I can't see why any of the provider attributes' values would be persisted in the state at all, or favored over the current values during refreshes, so this may be a (not easily modifiable) Pulumi-wide design issue rather than one scoped to this provider.

Disabling refreshes bypasses the issue, but does not solve it.

Thanks for any insights you could provide.

@oboukili oboukili added kind/enhancement Improvements or new features needs-triage Needs attention from the triage team labels Dec 16, 2023
@t0yv0
Member

t0yv0 commented Dec 20, 2023

Hi @oboukili thank you for reporting this and sorry that Pulumi is not doing what you need here.

It sounds like you are running Pulumi against the same stack in two different contexts with different assumed IAM roles, and at some point Pulumi ignores the IAM role you have provided and instead picks up the IAM role from the statefile for the stack, which breaks your intent. There are some scenarios in Pulumi that benefit from saving provider config in the state, such as managing deletion of existing resources by the version of the provider that provisioned them, and there may be more, but sounds like this is surprising in the context of assumed IAM roles.

It would help my team a lot if we had a solid repro here to narrow your use case to a concrete sequence of steps. We can then try to find solutions - whether there is something that can be fixed locally in the provider or taken to a broader conversation.

Could you help us out with - (1) a minimal Pulumi program that uses the provider with the assume role, in particular whether you use explicit providers or pulumi config. (2) exact sequence of pulumi invocations leading up to the issue; (3) expected/actual results.

I would also find it very helpful if you could elaborate on "Disabling refreshes bypasses the issue, but does not solve it." Are you asking Pulumi to do refreshes explicitly, and how do you disable them?

Thanks for your patience!

@t0yv0 t0yv0 added awaiting-feedback Blocked on input from the author and removed kind/enhancement Improvements or new features needs-triage Needs attention from the triage team labels Dec 20, 2023
@oboukili
Author

Hi @t0yv0, thanks for your reply.

There are some scenarios in Pulumi that benefit from saving provider config in the state, such as managing deletion of existing resources by the version of the provider that provisioned them, and there may be more, but sounds like this is surprising in the context of assumed IAM roles.

Thanks for clarifying. Persisting the provider version would indeed be a valid use case for resource deletion; however, I would rather treat that data as informational, to be used only should an issue arise (similar to the Kubernetes "last-applied" annotation), but I digress.

Could you help us out with - (1) a minimal Pulumi program that uses the provider with the assume role, in particular whether you use explicit providers or pulumi config. (2) exact sequence of pulumi invocations leading up to the issue; (3) expected/actual results.

(1) I'm afraid I can't share the program I am using, but here's a minimal example (apologies for the automatic tab indent). Note that I explicitly disable all default providers within Pulumi.yaml through pulumi:disable-default-providers: ["*"].

package main

import (
	servicecatalogtypes "github.com/aws/aws-sdk-go-v2/service/servicecatalog/types"
	"github.com/pulumi/pulumi-aws/sdk/v6/go/aws"
	"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/servicecatalog"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi/config"
)

const configAssumeRole = "assumeRole"

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		cfg := config.New(ctx, "myconfig")
		// Require panics if myconfig:assumeRole is unset, so read the value once.
		roleArn := cfg.Require(configAssumeRole)
		p, err := aws.NewProvider(ctx, "explicitProvider", &aws.ProviderArgs{
			AssumeRole: &aws.ProviderAssumeRoleArgs{
				Duration:    pulumi.StringPtr("900s"),
				RoleArn:     pulumi.StringPtr(roleArn),
				SessionName: pulumi.StringPtr("minimal-test"),
			},
			Region: pulumi.StringPtr("eu-west-3"),
		})
		if err != nil {
			return err
		}
		_, err = servicecatalog.NewProduct(ctx, "test",
			&servicecatalog.ProductArgs{
				Distributor: pulumi.StringPtr("test"),
				Name:        pulumi.StringPtr("test"),
				Owner:       pulumi.String("test"),
				SupportUrl:  pulumi.StringPtr("test"),
				Type:        pulumi.String(servicecatalogtypes.ProductTypeCloudFormationTemplate),
				ProvisioningArtifactParameters: &servicecatalog.ProductProvisioningArtifactParametersArgs{
					Name:        pulumi.StringPtr("v0"),
					TemplateUrl: pulumi.StringPtr("https://s3-us-gov-west-1.amazonaws.com/cloudformation-templates-us-gov-west-1/IAM_Users_Groups_and_Policies.template"),
					Type:        pulumi.StringPtr(string(servicecatalogtypes.ProductTypeCloudFormationTemplate))},
			},
			pulumi.Provider(p),
		)
		return err
	})
}

Pulumi.stackname.yaml

config:
  myconfig:assumeRole: "arn:aws:iam::1234567890:role/pr"

(2) Given two to-be-assumed IAM roles, arn:aws:iam::1234567890:role/pr and arn:aws:iam::1234567890:role/release, and assuming the following:

  • the current stack has previously been deployed with pulumi up using the configuration myconfig:assumeRole: "arn:aws:iam::1234567890:role/release", so that provider configuration is persisted in the stack state.
  • the current stack configuration sets myconfig:assumeRole: "arn:aws:iam::1234567890:role/pr"
  • the currently loaded AWS credentials within the shell only allow assuming arn:aws:iam::1234567890:role/pr
pulumi refresh

(3)
Expected result
the explicit pulumi-aws provider assumes the role set in the configuration: arn:aws:iam::1234567890:role/pr, and proceeds successfully.

Actual result
the explicit pulumi-aws provider tries assuming the role set in the state arn:aws:iam::1234567890:role/release, and fails as the current context credentials don't allow it to.

I would also find it very helpful if you could elaborate "Disabling refreshes bypasses the issue, but does not solve it.", are you asking pulumi to do refreshes explicitly, and how do you disable it?

I was a bit too concise here. I meant no longer refreshing systematically on every Pulumi action (update or preview), which is what the following flag in Pulumi.yaml enables:

options:
  refresh: always

Digging further, I now realize there has already been a long design debate in pulumi/pulumi#2247 over the importance of the state (it seems Pulumi differs heavily from, say, Terraform here, as the state is not treated merely as resource-tracking data and a resource cache) and thus over the non-trivial impact of refreshes. That debate is unrelated to the current issue.

@iwahbe iwahbe added kind/bug Some behavior is incorrect or out of spec needs-triage Needs attention from the triage team and removed awaiting-feedback Blocked on input from the author labels Dec 20, 2023
@iwahbe iwahbe removed the needs-triage Needs attention from the triage team label Dec 28, 2023
@iwahbe
Member

iwahbe commented Dec 28, 2023

Hi @oboukili. I think this is effectively a special case of pulumi/pulumi#13860. I'm not sure what a workaround would be for this scenario, beyond state surgery to change the IAM role.
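
For readers wondering what that state surgery could look like: one option is to run `pulumi stack export --file state.json`, rewrite the role ARN in the exported file, and load it back with `pulumi stack import --file state.json`. The Go sketch below is a minimal illustration under assumptions, not an official procedure. It assumes the export is JSON with a `deployment.resources` array, that AWS provider resources have the type `pulumi:providers:aws`, and that the ARN appears somewhere in their `inputs` or `outputs` (possibly inside a JSON-encoded string). `rewriteProviderRole` and `replaceInValue` are hypothetical helper names. Back up the state before trying anything like this.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// replaceInValue walks a decoded JSON value and swaps oldArn for newArn in
// every string it contains, including strings that hold encoded JSON.
func replaceInValue(v any, oldArn, newArn string) any {
	switch t := v.(type) {
	case string:
		return strings.ReplaceAll(t, oldArn, newArn)
	case map[string]any:
		for k, child := range t {
			t[k] = replaceInValue(child, oldArn, newArn)
		}
	case []any:
		for i, child := range t {
			t[i] = replaceInValue(child, oldArn, newArn)
		}
	}
	return v
}

// rewriteProviderRole edits only resources of type pulumi:providers:aws
// (an assumption about the export layout) and reports how many it visited.
func rewriteProviderRole(state []byte, oldArn, newArn string) ([]byte, int, error) {
	dec := json.NewDecoder(bytes.NewReader(state))
	dec.UseNumber() // keep numbers verbatim instead of converting to float64
	var doc map[string]any
	if err := dec.Decode(&doc); err != nil {
		return nil, 0, fmt.Errorf("decode state: %w", err)
	}
	deployment, _ := doc["deployment"].(map[string]any)
	resources, _ := deployment["resources"].([]any)
	visited := 0
	for _, r := range resources {
		res, ok := r.(map[string]any)
		if !ok || res["type"] != "pulumi:providers:aws" {
			continue
		}
		for _, key := range []string{"inputs", "outputs"} {
			if props, found := res[key]; found {
				res[key] = replaceInValue(props, oldArn, newArn)
			}
		}
		visited++
	}
	out, err := json.MarshalIndent(doc, "", "  ")
	return out, visited, err
}

func main() {
	// Tiny stand-in for the output of `pulumi stack export`.
	sample := `{"version":3,"deployment":{"resources":[{"type":"pulumi:providers:aws","inputs":{"assumeRole":"{\"roleArn\":\"arn:aws:iam::1234567890:role/release\"}"}}]}}`
	out, n, err := rewriteProviderRole([]byte(sample),
		"arn:aws:iam::1234567890:role/release", "arn:aws:iam::1234567890:role/pr")
	if err != nil {
		panic(err)
	}
	fmt.Printf("visited %d provider resource(s)\n%s\n", n, out)
}
```

Restricting the rewrite to provider resources is a deliberate choice in this sketch: other resources may legitimately reference the same ARN, and those references should not change.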

@fitz-vivodyne

I'm assuming no one has found a workaround for this yet?
We've got a central account we run Pulumi in that assumes roles in other child accounts via explicitly configured providers.

I was planning on having separate roles for preview/up phases, but that plan is currently blocked due to it always trying to use the up role from the state.

@t0yv0 t0yv0 added the area/credentials Authenticating the provider label Apr 16, 2024
@ryanpodonnell1

This is biting me right now, as I was told we need to have separate roles for plan vs apply. Basically, I will have to run the refresh only in the apply step to keep the state up to date:

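// Imports needed by this snippet. The colorAlways* values are custom option
// types defined elsewhere in the poster's code (not shown here).
import (
	"context"
	"fmt"
	"os"

	"github.com/pulumi/pulumi/sdk/v3/go/auto"
	"github.com/pulumi/pulumi/sdk/v3/go/auto/optdestroy"
	"github.com/pulumi/pulumi/sdk/v3/go/auto/optpreview"
	"github.com/pulumi/pulumi/sdk/v3/go/auto/optup"
)

func handleDeployment(ctx context.Context, stack auto.Stack, action string) error {
	switch action {
	case "plan":
		// Preview without a refresh; see the issue linked in the "apply" case.
		_, err := stack.Preview(ctx, optpreview.ProgressStreams(os.Stdout), colorAlwaysPreview{}, optpreview.Diff())
		if err != nil {
			return err
		}

	case "apply":
		// Refresh only on apply due to https://github.com/pulumi/pulumi-aws/issues/3149
		_, err := stack.Up(ctx, optup.ProgressStreams(os.Stdout), colorAlwaysUp{}, optup.ErrorProgressStreams(os.Stderr), optup.Diff(), optup.Refresh())
		if err != nil {
			return err
		}

	case "destroy":
		_, err := stack.Destroy(ctx, optdestroy.ProgressStreams(os.Stdout), colorAlwaysDestroy{})
		if err != nil {
			// Return the error like the other cases instead of exiting the process.
			return err
		}

	default:
		return fmt.Errorf("unknown action %q", action)
	}

	return nil
}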

For the plan step I'm able to swap out the role, but obviously the preview is less meaningful because no refresh happens. I'm just hoping that nothing has changed in the environment compared to the state.

@fitz-vivodyne

fitz-vivodyne commented May 14, 2024

FYI, I was able to come up with a super janky workaround for this using transitive session tags.

If you set a transitive session tag (say, pulumi-up=<true>) outside of Pulumi it propagates and isn't stored in the state.

With that capability, we created a single IAM role for both preview and up and gated all mutating permissions behind a condition to check that the session tag was set.
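
One possible shape for such a policy, sketched in Go with only the standard library. The assumptions here are mine, not the commenter's: the session tag key is `pulumi-up` with value `true`, the condition uses the `aws:PrincipalTag/<key>` condition key, and the Service Catalog actions are placeholders chosen for illustration. `buildPolicy` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Statement models the subset of an IAM policy statement used in this sketch.
type Statement struct {
	Sid       string                       `json:"Sid"`
	Effect    string                       `json:"Effect"`
	Action    []string                     `json:"Action"`
	Resource  string                       `json:"Resource"`
	Condition map[string]map[string]string `json:"Condition,omitempty"`
}

// Policy is a minimal IAM policy document.
type Policy struct {
	Version   string      `json:"Version"`
	Statement []Statement `json:"Statement"`
}

// buildPolicy allows read actions unconditionally and mutating actions only
// when the caller's session carries the given tag. Action lists are examples.
func buildPolicy(tagKey, tagValue string) Policy {
	return Policy{
		Version: "2012-10-17",
		Statement: []Statement{
			{
				Sid:      "ReadAlways",
				Effect:   "Allow",
				Action:   []string{"servicecatalog:Describe*", "servicecatalog:List*"},
				Resource: "*",
			},
			{
				Sid:    "MutateOnlyWhenTagged",
				Effect: "Allow",
				Action: []string{
					"servicecatalog:CreateProduct",
					"servicecatalog:UpdateProduct",
					"servicecatalog:DeleteProduct",
				},
				Resource: "*",
				Condition: map[string]map[string]string{
					"StringEquals": {"aws:PrincipalTag/" + tagKey: tagValue},
				},
			},
		},
	}
}

func main() {
	doc, err := json.MarshalIndent(buildPolicy("pulumi-up", "true"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(doc))
}
```

For the tag to reach this role at all, the role's trust policy would also need to allow `sts:TagSession`, and the tag has to be marked transitive when the first session is created outside of Pulumi.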

@oboukili
Author

Very smart @fitz-vivodyne thanks ! ❤️

@corymhall
Contributor

related to pulumi/pulumi#4981

@gunzy83

gunzy83 commented Aug 8, 2024

We manage preview and up deployments using OIDC in GitHub Actions, with a profile file for each action and one for deployments from engineering workstations (AWS IAM Identity Centre), so I think this could be adapted to assumed roles as well.

We use AWS_CONFIG_FILE to point the SDK to the correct file (in the context of the run) containing profiles for each of our accounts configured for a Pulumi project.

For previews in Github Actions on our PRs we point to a ./.aws/github-preview file in the project which has read only preview roles that are assumed via web identity (OIDC). These roles can be run without a Github Repo Environment.

For up operations in Github Actions we point to a ./.aws/github-deploy file in the project which has the real roles, also assumed via OIDC. These roles require a Github Repo Environment and therefore can be subject to approval.

Engineer workstations use Taskfile.dev, where a .env file sets AWS_CONFIG_FILE to ./.aws/profiles, which contains AWS IAM Identity Centre (SSO) permission sets.

Our stacks specify the account/profile name and region which allows us to deploy anywhere the role is valid for the principal running the operation. When a provider is created the profile name and region are stored. By always setting AWS_CONFIG_FILE at the runner (Github Actions shared workflows, Taskfile.dev for engineer workstations), we can do preview, up, refresh and delete operations with no issues on any valid runner. Hope that helps.
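
A rough Go sketch of the runner-side part of that setup, under assumptions: the three file paths come from the comment above, but the context names (`preview`, `deploy`, `workstation`) and the helper names `configFileFor` and `envWithConfigFile` are made up for illustration. It only computes the environment a `pulumi` child process would receive; it does not invoke Pulumi.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// configFileFor maps a (hypothetical) run context name to the AWS config
// file that holds the profiles for that context.
func configFileFor(runContext string) (string, error) {
	switch runContext {
	case "preview":
		return "./.aws/github-preview", nil
	case "deploy":
		return "./.aws/github-deploy", nil
	case "workstation":
		return "./.aws/profiles", nil
	default:
		return "", fmt.Errorf("unknown run context %q", runContext)
	}
}

// envWithConfigFile returns a copy of base with AWS_CONFIG_FILE set to path,
// dropping any AWS_CONFIG_FILE entry that was already present.
func envWithConfigFile(base []string, path string) []string {
	env := make([]string, 0, len(base)+1)
	for _, kv := range base {
		if !strings.HasPrefix(kv, "AWS_CONFIG_FILE=") {
			env = append(env, kv)
		}
	}
	return append(env, "AWS_CONFIG_FILE="+path)
}

func main() {
	path, err := configFileFor("preview")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	env := envWithConfigFile(os.Environ(), path)
	// A runner could pass env as exec.Cmd.Env when launching `pulumi preview`.
	fmt.Println(env[len(env)-1])
}
```

Because only the profile name and region end up in the provider configuration, the role actually used is decided by whichever file the runner points to.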

@t0yv0 t0yv0 added the awaiting/core Blocked on a missing bug or feature in pulumi/pulumi (except codegen) label Sep 13, 2024

7 participants