(packer-images): Add Garbage Collector for AWS #4355

Closed
3 tasks
jayfranco999 opened this issue Oct 14, 2024 · 10 comments

@jayfranco999
Collaborator

Due to the complexity of this PR – jenkins-infra/packer-images#1430

We are splitting the tasks into 4 stages. This issue covers the garbage collector scripts for old AWS AMIs.

We are reintroducing the amazon login that we were using with our previous AWS jenkins account – jenkins-infra/packer-images#734

  • Cleanup AWS images – ensure staging images older than 7 days are removed
  • Cleanup AWS images – collect EC2 snapshots older than 1 year on us-east-2
  • Cleanup AWS images – add aws.sh as daily cleanup task
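The age thresholds in the task list (7 days for staging images, 1 year for snapshots) come down to a cutoff comparison on the timestamps AWS returns. Below is a minimal sketch, assuming records shaped like the output of `aws ec2 describe-images` and `aws ec2 describe-snapshots`. The helper name `older_than`, the sample IDs and the fixed `now` are illustrative assumptions, not the contents of aws.sh.

```python
from datetime import datetime, timedelta, timezone


def older_than(records, date_key, max_age, now):
    """Return the records whose timestamp is strictly older than max_age."""
    cutoff = now - max_age
    selected = []
    for record in records:
        # AWS emits ISO 8601 timestamps such as "2024-11-01T00:00:00.000Z";
        # replace the trailing "Z" so older Python versions can parse it.
        created = datetime.fromisoformat(record[date_key].replace("Z", "+00:00"))
        if created < cutoff:
            selected.append(record)
    return selected


# Fixed reference time so the example is reproducible (assumption).
now = datetime(2024, 11, 20, tzinfo=timezone.utc)

# Hypothetical sample data, not real resources.
images = [
    {"ImageId": "ami-old", "CreationDate": "2024-11-01T00:00:00.000Z"},
    {"ImageId": "ami-new", "CreationDate": "2024-11-18T00:00:00.000Z"},
]
snapshots = [
    {"SnapshotId": "snap-old", "StartTime": "2023-06-01T00:00:00.000Z"},
    {"SnapshotId": "snap-new", "StartTime": "2024-06-01T00:00:00.000Z"},
]

stale_images = older_than(images, "CreationDate", timedelta(days=7), now)
stale_snapshots = older_than(snapshots, "StartTime", timedelta(days=365), now)

print([i["ImageId"] for i in stale_images])        # ['ami-old']
print([s["SnapshotId"] for s in stale_snapshots])  # ['snap-old']
```

Passing `now` explicitly keeps the selection testable and makes a dry run deterministic.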

Originally posted by @jayfranco999 in #4316 (comment)

@dduportal
Contributor

Update:

@jayfranco999
Collaborator Author

Update:

jenkins-infra/kubernetes-management#5892 and jenkins-infra/packer-images#1527 add the AWS credentials and the aws-gc stage to the Jenkinsfile_gc pipeline.

Once merged, we can merge jenkins-infra/packer-images#1528, which makes the garbage-collector a distinct job and speeds up the main build on packer-images.

@jayfranco999
Collaborator Author

jayfranco999 commented Nov 20, 2024

Update:

Task list to follow:

This first version works well!

Next steps:

The garbage collector now exists as a distinct job, Jenkinsfile_gc, and runs on a pod agent with dry-run on PRs (but not on tags). It no longer sends GitHub checks.

@jayfranco999
Collaborator Author

jayfranco999 commented Nov 20, 2024

    • Add AWS garbage collection

While adding the AWS garbage collection, we noticed that the aws.sh script exited with an error:

An error occurred (UnauthorizedOperation) when calling the DescribeNetworkInterfaces operation: You are not authorized to perform this operation. User: arn:aws:iam::326712726440:user/terraform-packer-user is not authorized to perform: ec2:DescribeNetworkInterfaces because no identity-based policy allows the ec2:DescribeNetworkInterfaces action

The ec2:DescribeNetworkInterfaces permission needs to be added to the IAM user terraform-packer-user.
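For illustration, the missing permission corresponds to an identity-based policy statement like the one built below. This is only a sketch: the `Sid` value is a made-up label, and how the policy is actually attached to terraform-packer-user (Terraform or another tool) is not shown in this thread. EC2 `Describe*` actions do not support resource-level restrictions, hence `"Resource": "*"`.

```python
import json

# Hypothetical policy document granting only the action named in the error.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowDescribeNetworkInterfaces",  # illustrative label
            "Effect": "Allow",
            "Action": ["ec2:DescribeNetworkInterfaces"],
            # Describe* calls cannot be scoped to specific resources.
            "Resource": "*",
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```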

Even though the script failed, the logs reported the build step as successful, which is unexpected behaviour on infra.ci (https://infra.ci.jenkins.io/job/infra-tools/job/packer-images-gc/job/main/65/pipeline-console/?selected-node=19)

image
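The thread does not say why the step was reported as successful. One common way this happens (an assumption, not a confirmed diagnosis for aws.sh) is that a failing command's exit status is masked, either by a later command that succeeds or by a pipeline. The sketch below runs three small POSIX shell snippets from Python to show the effect.

```python
import subprocess


def exit_status(script):
    """Run a POSIX shell snippet and return its exit status."""
    return subprocess.run(["sh", "-c", script], capture_output=True).returncode


# The failure of `false` is hidden because the last command succeeds.
masked = exit_status("false; echo 'cleanup done'")

# With `set -e` the shell stops at the first failing command.
strict = exit_status("set -e; false; echo 'cleanup done'")

# In a pipeline, only the status of the last command is reported by default.
piped = exit_status("false | cat")

print(masked, strict, piped)  # 0 1 0
```

If the script or the pipeline step wraps the AWS call in one of these patterns, the CI step will stay green even though the call failed.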

The aws_snapshots.sh script also behaved unexpectedly: it could not delete snapshots, failing with the error "The snapshot snap-0999857b5cac1fd3a is currently in use by ami-0cfe784c66xxxx".

image
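EC2 refuses to delete a snapshot while a registered AMI still references it, which is what the "currently in use" error reports. A minimal sketch of how a cleanup could separate such snapshots before attempting deletion, assuming records shaped like `aws ec2 describe-images` and `aws ec2 describe-snapshots` output. The function name and the second snapshot ID are hypothetical, and this is not the logic of aws_snapshots.sh.

```python
def split_snapshots(snapshots, images):
    """Separate snapshots still backing an AMI from those free to delete."""
    in_use = set()
    for image in images:
        for mapping in image.get("BlockDeviceMappings", []):
            # Ephemeral (instance store) mappings carry no "Ebs" entry.
            snapshot_id = mapping.get("Ebs", {}).get("SnapshotId")
            if snapshot_id:
                in_use.add(snapshot_id)
    deletable = [s["SnapshotId"] for s in snapshots if s["SnapshotId"] not in in_use]
    blocked = [s["SnapshotId"] for s in snapshots if s["SnapshotId"] in in_use]
    return deletable, blocked


# The first snapshot and the AMI ID are taken from the error message above;
# "snap-unreferenced" is a made-up ID for the example.
images = [
    {
        "ImageId": "ami-0cfe784c66xxxx",
        "BlockDeviceMappings": [
            {"DeviceName": "/dev/sda1", "Ebs": {"SnapshotId": "snap-0999857b5cac1fd3a"}},
            {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
        ],
    }
]
snapshots = [
    {"SnapshotId": "snap-0999857b5cac1fd3a"},
    {"SnapshotId": "snap-unreferenced"},
]

deletable, blocked = split_snapshots(snapshots, images)
print(deletable)  # ['snap-unreferenced']
print(blocked)    # ['snap-0999857b5cac1fd3a']
```

Blocked snapshots only become deletable once the AMI that references them has been deregistered.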

Further steps will involve fixing these errors and unexpected behaviour of infra.ci

@jayfranco999
Collaborator Author

  • Disable the GitHub checks report from infra.ci to the repository

jenkins-infra/kubernetes-management#5895 partially works: the configuration was applied to the garbage-collector for packer images folder on infra.ci.

image

While tag discovery was successfully disabled, the GitHub checks report of the garbage collector is still published on the packer-images repository. We are working to fix this issue.

image

@dduportal
Contributor

  • Disable the GitHub checks report from infra.ci to the repository

Explanation: your changes were good and did disable the usual "Status checks" (e.g. one status check per stage by default, pipeline reporting and custom status checks).
But what we see in your screenshot above are "Notifications". These include the "Status checks" (if any), but not only those: the GitHub Branch Source plugin also sends a notification as part of the pipeline report.

If you look at the configuration below, you see the checkbox "Skip publishing Status check" is selected, which was not the case before your changes 👍

Screenshot 2024-11-20 at 14 01 23

However, we now also need the "Suppress progress updates in job check" and "Skip GitHub Branch Source notifications" checkboxes selected to be sure that:

  • No notifications are sent at all when the pipeline finishes
  • No "build in progress" notification is sent either (when the pipeline starts)

We need to allow these to be configured in the "Jenkins Jobs" helm chart in https://github.com/jenkins-infra/helm-charts/blob/575396eb4f6c813f59a36610594c12446222935c/charts/jenkins-jobs/templates/_jobDSL_jobs_multibranch.tpl#L38-L40 => let me do this right now.

@dduportal
Contributor

Update on the job configuration:

  • Added a new feature on the jenkins-jobs helm chart to totally disable notifications.
    • NOTE: this feature also includes a bugfix which syncs the "progress updates checks" with the other checks => they are now disabled or enabled together (which partially fixes our problem here).
    • This new feature is not enabled (yet?) on the job though (non-regression)

@dduportal
Contributor

Update:

@dduportal
Contributor

dduportal commented Nov 22, 2024

  • We checked whether aws.sh was effective. For (dangling) instances, it does not find any with the command aws ec2 describe-instances --filters 'Name=tag:Name,Values=*Packer*' --query 'Reservations[].Instances[?LaunchTime<=2024-11-21][].InstanceId'. However, we see ~15 dangling instances (terminated).

The root cause comes from a recent change in the Packer Amazon EBS build:

The builder no longer adds a "Name": "Packer Builder" entry to the tags.

=> we need to add a tag Name with a value matching the GC filter in https://github.com/jenkins-infra/packer-images/blob/2a473a06f70ada1bc48cd985fb836119bc09ee31/sources.pkr.hcl#L25-L32

Ref. https://developer.hashicorp.com/packer/integrations/hashicorp/amazon/latest/components/builder/ebs#tags
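The mismatch described above can be reproduced offline: a `tag:Name` filter only ever matches instances that carry a `Name` tag, so untagged builder instances are invisible to the GC query. The sketch below approximates the `*Packer*` wildcard with `fnmatchcase` (an approximation of the EC2 filter semantics, which are case-sensitive and support `*` and `?`). The instance IDs and the non-`Name` tag are invented for the example.

```python
from fnmatch import fnmatchcase


def matches_name_filter(instance, pattern="*Packer*"):
    """Approximate the EC2 filter Name=tag:Name,Values=<pattern>."""
    tags = {tag["Key"]: tag["Value"] for tag in instance.get("Tags", [])}
    name = tags.get("Name")
    # An instance without a Name tag can never match a tag:Name filter.
    return name is not None and fnmatchcase(name, pattern)


# Hypothetical instances: only the first one carries the tag the filter expects.
instances = [
    {"InstanceId": "i-tagged", "Tags": [{"Key": "Name", "Value": "Packer Builder"}]},
    {"InstanceId": "i-other-tags", "Tags": [{"Key": "team", "Value": "infra"}]},
    {"InstanceId": "i-no-tags"},
]

found = [i["InstanceId"] for i in instances if matches_name_filter(i)]
missed = [i["InstanceId"] for i in instances if not matches_name_filter(i)]

print(found)   # ['i-tagged']
print(missed)  # ['i-other-tags', 'i-no-tags']
```

Any `Name` value containing "Packer" satisfies the existing filter, so the tag added to the Packer sources only needs to match that pattern.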

@dduportal
Contributor

Last tasks have been done (see #4355)

@jayfranco999 reported we don't have dangling instances any more \o/
