Merge pull request #1343 from puppetlabs/cat1777_add_workflow_restarter
Add github `workflow-restarter` and `workflow-restarter-test`
jordanbreen28 authored Apr 23, 2024
2 parents 15e65d7 + dac56cb commit 0797a04
Showing 11 changed files with 367 additions and 0 deletions.
88 changes: 88 additions & 0 deletions .github/actions/workflow-restarter-proxy/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
# workflow-restarter

## Description

Although GitHub provides built-in programmatic mechanisms for retrying individual steps within a workflow, it doesn't provide one for retrying entire workflows. One possible reason for this limitation may be to prevent accidental infinite retry loops around failing workflows. Any workflow that fails, however, can be manually re-started from the failed workflow on the `Actions` tab of the repository. For more information on restarting GitHub workflows see [Re-running workflows and jobs](https://docs.github.com/en/actions/managing-workflow-runs/re-running-workflows-and-jobs).

Nevertheless, it is possible to programmatically restart a workflow after it fails, and the section below shows how to restart a failing workflow 3 times using the `workflow-restarter` re-usable workflow.
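
The retry limit can be pictured as a simple check. The sketch below is not the workflow itself. It assumes (as one reading of `workflow-restarter.yml` in this commit) that a restart is allowed only while attempt number `retries` does not yet exist for the run. The function name `should_retry` and its arguments are hypothetical and exist only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decide whether a failed run may be restarted.
#   $1 = highest attempt number that already exists for the run (1 = first run)
#   $2 = retry limit (stands in for the workflow's `retries` input, default 3)
should_retry() {
  local latest_attempt="$1" retries="${2:-3}"
  if (( latest_attempt < retries )); then
    echo "true"
  else
    echo "false"
  fi
}

# Walk through the attempts of a run that keeps failing.
for attempt in 1 2 3; do
  echo "attempt ${attempt}: should_retry=$(should_retry "$attempt" 3)"
done
```

The real workflow does not count attempts itself; it asks `gh run view` for a specific attempt and treats a non-zero exit code as "that attempt does not exist yet".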

## Usage

If setting up the `workflow-restarter` for the first time, initialize it first and then configure another workflow to programmatically restart on failure.

### Initialize the `Workflow Restarter`

First, configure the `workflow-restarter-proxy` custom action by copying this `workflow-restarter-proxy` directory beneath the `.github/actions` directory in your repository.

Second, configure the `workflow-restarter` re-usable workflow (and its test workflow `workflow-restarter-test`):

```bash
# cd to the `workflow-restarter-proxy` custom action directory
cd .github/actions/workflow-restarter-proxy
# copy the `workflow-restarter` re-usable workflow to the appropriate directory
cp workflow-restarter-test.yml.sample ../../workflows/workflow-restarter-test.yml
cp workflow-restarter.yml.using_gh ../../workflows/workflow-restarter.yml
```

Third, commit the above to the `main` branch of your repository. I've called this "priming" the workflows because if you don't commit these workflows to the `main` branch initially, then they won't appear on the GitHub "Actions" tab of your repository.

Committing these new workflows should not interfere with any existing GitHub workflow.

Finally, verify that the `workflow-restarter.yml` performs as expected: kick off the `workflow-restarter-test` and it should fail and be re-started 3 times. For example output, see the [appendix below](#verify-workflow-restarter-with-workflow-restarter-test).
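
The test workflow can be kicked off from the "Actions" tab, or from a terminal with the GitHub CLI. The following is a minimal sketch, assuming `gh` is installed and authenticated for the repository. The helper name `kickoff_restarter_test` and the `DRY_RUN` variable are hypothetical; the `fail` input is the one declared by `workflow-restarter-test.yml`.

```bash
#!/usr/bin/env bash
# Build (and optionally run) the command that starts the test workflow.
# DRY_RUN defaults to 1, so by default the command is printed, not executed.
kickoff_restarter_test() {
  local fail="${1:-true}"
  local cmd=(gh workflow run workflow-restarter-test.yml --ref main -f "fail=${fail}")
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Print the command for a run in which the `acceptance` job fails.
kickoff_restarter_test true
```

Setting `DRY_RUN=0` sends the dispatch for real; the resulting run can then be followed on the "Actions" tab or with `gh run watch`.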

### Configure an existing workflow to use `on-failure-workflow-restarter`

Now add something like the following `yaml` job at the end of your workflow, changing only the `needs` section (and the matching `if` condition) to suit.

For example, the following will trigger a restart if either the `acceptance` or the `unit` job preceding it fails. A restart of the failing jobs will be attempted 3 times; if they still fail after that, the workflow will be marked as failed. If, however, `acceptance` and `unit` both pass at any point, then the restarted workflow will be marked as successful.

```yaml
  on-failure-workflow-restarter-proxy:
    # (1) run this job after the "acceptance" and "unit" jobs and...
    needs: [acceptance, unit]
    # (2) continue ONLY IF "acceptance" or "unit" fails
    if: always() && (needs.acceptance.result == 'failure' || needs.unit.result == 'failure')
    runs-on: ubuntu-latest
    steps:
      # (3) checkout this repository in order to "see" the following custom action
      - name: Checkout repository
        uses: actions/checkout@v2
      # (4) "use" the custom action to retrigger the failed jobs above
      # NOTE: pass the SOURCE_GITHUB_TOKEN to the custom action because (a) it must have
      # this to trigger the reusable workflow that restarts the failed job; and
      # (b) custom actions do not have access to the calling workflow's secrets
      - name: Trigger reusable workflow
        uses: ./.github/actions/workflow-restarter-proxy
        env:
          SOURCE_GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          repository: ${{ github.repository }}
          run_id: ${{ github.run_id }}
```
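
Behind the scenes, the proxy action sends a `workflow_dispatch` request for `workflow-restarter.yml`. The sketch below only prints the endpoint and JSON body of that request, mirroring the values that `action.yml` passes to Octokit. The function names `dispatch_endpoint` and `dispatch_body` and the example repository and run id are hypothetical.

```bash
#!/usr/bin/env bash
# Print the REST endpoint used to dispatch the restarter workflow.
dispatch_endpoint() {
  local repository="$1"
  echo "/repos/${repository}/actions/workflows/workflow-restarter.yml/dispatches"
}

# Print the JSON body: the branch to run on plus the two workflow inputs.
dispatch_body() {
  local repository="$1" run_id="$2"
  printf '{"ref":"main","inputs":{"repo":"%s","run_id":"%s"}}\n' "$repository" "$run_id"
}

# Example values only; the repository and run id are made up.
dispatch_endpoint "octo-org/octo-repo"
dispatch_body "octo-org/octo-repo" "1234567890"
```

As an alternative to the Ruby step, and only as an assumption about how one might do it by hand, the same body could be piped into `gh api --method POST "$(dispatch_endpoint ...)" --input -`.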

## Appendix

### Verify `Workflow Restarter` with `Workflow Restarter TEST`

The following shows 3 `Workflow Restarter` runs occurring after the `Workflow Restarter TEST`, which is set to fail continuously.

![alt text](image.png)

Looking closer at the `Workflow Restarter TEST` reveals:

* that the workflow includes 2 jobs `unit` and `acceptance`; and
* that the workflow has been re-run 3 times, e.g.,

![alt text](image-1.png)

Further, the following sequence of screenshots shows that only failed jobs are re-run.

* The `on-failure-workflow-restarter` job **(1)** is triggered by the failure of the `unit` job and **(2)** successfully calls the `workflow-restarter` workflow.
* The `workflow-restarter` in turn triggers a re-run of the `unit` job **(3)**, and the `Workflow Restarter TEST` shows this as an incremented attempt count at **(4)**.

![alt text](image-2.png)

![alt text](image-3.png)

![alt text](image-4.png)
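
The re-run of only the failed jobs shown above comes from the `gh run rerun --failed` call in `workflow-restarter.yml`. The following is a minimal sketch of that call for use outside the workflow, assuming an authenticated `gh`. The helper name `rerun_failed_jobs`, the `DRY_RUN` variable, and the example values are hypothetical.

```bash
#!/usr/bin/env bash
# Build (and optionally run) the command that re-runs only the failed jobs.
# DRY_RUN defaults to 1, so by default the command is printed, not executed.
rerun_failed_jobs() {
  local repo="$1" run_id="$2"
  local cmd=(gh run rerun --failed "$run_id" --repo "$repo")
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Example values only; the repository and run id are made up.
rerun_failed_jobs "octo-org/octo-repo" "1234567890"
```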
55 changes: 55 additions & 0 deletions .github/actions/workflow-restarter-proxy/action.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
---
name: 'Workflow Restarter Proxy'
description: |
  This custom action acts as a proxy to trigger the reusable workflow that restarts a failed job.
  NOTE: This action cannot itself do the re-start because in effect its composite steps get folded
  into the source workflow, the one that "uses" this custom action. Since GitHub does not allow a workflow
  to retrigger itself, the source workflow must be triggered not by this but by another workflow.
  Therefore, this custom action triggers that other workflow.
inputs:
  repository:
    description: 'Should be set to github.repository via the calling workflow'
    required: true
  run_id:
    description: 'Should be set to github.run_id via the calling workflow'
    required: true
runs:
  using: 'composite'
  steps:
    # ABORT if the SOURCE_GITHUB_TOKEN environment variable is not set
    - name: Check for presence of SOURCE_GITHUB_TOKEN environment variable
      shell: bash
      run: |
        if [[ -z "${{ env.SOURCE_GITHUB_TOKEN }}" ]]; then
          echo "ERROR: \$SOURCE_GITHUB_TOKEN must be set by the calling workflow" 1>&2 && exit 1
        fi
    # checkout the repository because I want bundler to have access to my Gemfile
    - name: Checkout repository
      uses: actions/checkout@v2

    # setup ruby including a bundle install of my Gemfile
    - name: Set up Ruby and install Octokit
      uses: ruby/setup-ruby@v1
      with:
        ruby-version: '3'
        bundler-cache: true # 'bundle install' will be run and gems cached for faster workflow runs

    # Trigger the reusable workflow
    - name: Trigger reusable workflow
      shell: bash
      run: |
        bundle exec ruby -e "
          require 'octokit'
          client = Octokit::Client.new(:access_token => '${{ env.SOURCE_GITHUB_TOKEN }}')
          client.post(
            '/repos/${{ inputs.repository }}/actions/workflows/workflow-restarter.yml/dispatches',
            {
              ref: 'main',
              inputs: {
                repo: '${{ inputs.repository }}',
                run_id: '${{ inputs.run_id }}'
              }
            }
          )
        "
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
name: Workflow Restarter TEST

on:
  workflow_dispatch:
    inputs:
      fail:
        description: >
          For (acceptance, unit) jobs:
          'true' = (fail, succeed) and
          'false' = (succeed, fail)
        required: true
        default: 'true'

jobs:
  unit:
    runs-on: ubuntu-latest
    steps:
      - name: Check outcome
        run: |
          if [ "${{ github.event.inputs.fail }}" = "true" ]; then
            echo "'unit' job succeeded"
            exit 0
          else
            echo "'unit' job failed"
            exit 1
          fi
  acceptance:
    runs-on: ubuntu-latest
    steps:
      - name: Check outcome
        run: |
          if [ "${{ github.event.inputs.fail }}" = "true" ]; then
            echo "'acceptance' job failed"
            exit 1
          else
            echo "'acceptance' job succeeded"
            exit 0
          fi

  on-failure-workflow-restarter-proxy:
    # (1) run this job after the "acceptance" and "unit" jobs and...
    needs: [acceptance, unit]
    # (2) continue ONLY IF "acceptance" or "unit" fails
    if: always() && (needs.acceptance.result == 'failure' || needs.unit.result == 'failure')
    runs-on: ubuntu-latest
    steps:
      # (3) checkout this repository in order to "see" the following custom action
      - name: Checkout repository
        uses: actions/checkout@v2

      # (4) "use" the custom action to retrigger the failed jobs above
      # NOTE: pass the SOURCE_GITHUB_TOKEN to the custom action because (a) it must have
      # this to trigger the reusable workflow that restarts the failed job; and
      # (b) custom actions do not have access to the calling workflow's secrets
      - name: Trigger reusable workflow
        uses: ./.github/actions/workflow-restarter-proxy
        env:
          SOURCE_GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          repository: ${{ github.repository }}
          run_id: ${{ github.run_id }}
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
---
name: Workflow Restarter
on:
  workflow_dispatch:
    inputs:
      repo:
        description: "GitHub repository name."
        required: true
        type: string
      run_id:
        description: "The ID of the workflow run to rerun."
        required: true
        type: string
      retries:
        description: "The number of times to retry the workflow run."
        required: false
        type: number
        default: 3
    secrets:
      GITHUB_TOKEN:
        required: true

jobs:
  rerun:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Check retry count
        id: check-retry
        run: |
          # IF `--attempt` returns a non-zero exit code, then keep retrying
          status_code=$(gh run view ${{ inputs.run_id }} --repo ${{ inputs.repo }} --attempt ${{ inputs.retries }} --json status) || {
            echo "Retry count is within limit"
            echo "should_retry=true" >> "$GITHUB_OUTPUT"
            exit 0
          }

          # ELSE `--attempt` returns a zero exit code, so stop retrying
          echo "Retry count has reached the limit"
          echo "should_retry=false" >> "$GITHUB_OUTPUT"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Re-run failed jobs
        if: ${{ steps.check-retry.outputs.should_retry == 'true' }}
        run: gh run rerun --failed ${{ inputs.run_id }} --repo ${{ inputs.repo }}
        continue-on-error: true
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
61 changes: 61 additions & 0 deletions .github/workflows/workflow-restarter-test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
name: Workflow Restarter TEST

on:
  workflow_dispatch:
    inputs:
      fail:
        description: >
          For (acceptance, unit) jobs:
          'true' = (fail, succeed) and
          'false' = (succeed, fail)
        required: true
        default: 'true'

jobs:
  unit:
    runs-on: ubuntu-latest
    steps:
      - name: Check outcome
        run: |
          if [ "${{ github.event.inputs.fail }}" = "true" ]; then
            echo "'unit' job succeeded"
            exit 0
          else
            echo "'unit' job failed"
            exit 1
          fi
  acceptance:
    runs-on: ubuntu-latest
    steps:
      - name: Check outcome
        run: |
          if [ "${{ github.event.inputs.fail }}" = "true" ]; then
            echo "'acceptance' job failed"
            exit 1
          else
            echo "'acceptance' job succeeded"
            exit 0
          fi
  on-failure-workflow-restarter-proxy:
    # (1) run this job after the "acceptance" and "unit" jobs and...
    needs: [acceptance, unit]
    # (2) continue ONLY IF "acceptance" or "unit" fails
    if: always() && (needs.acceptance.result == 'failure' || needs.unit.result == 'failure')
    runs-on: ubuntu-latest
    steps:
      # (3) checkout this repository in order to "see" the following custom action
      - name: Checkout repository
        uses: actions/checkout@v2

      # (4) "use" the custom action to retrigger the failed jobs above
      # NOTE: pass the SOURCE_GITHUB_TOKEN to the custom action because (a) it must have
      # this to trigger the reusable workflow that restarts the failed job; and
      # (b) custom actions do not have access to the calling workflow's secrets
      - name: Trigger reusable workflow
        uses: ./.github/actions/workflow-restarter-proxy
        env:
          SOURCE_GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          repository: ${{ github.repository }}
          run_id: ${{ github.run_id }}
51 changes: 51 additions & 0 deletions .github/workflows/workflow-restarter.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
---
name: Workflow Restarter
on:
  workflow_dispatch:
    inputs:
      repo:
        description: "GitHub repository name."
        required: true
        type: string
      run_id:
        description: "The ID of the workflow run to rerun."
        required: true
        type: string
      retries:
        description: "The number of times to retry the workflow run."
        required: false
        type: number
        default: 3
    secrets:
      GITHUB_TOKEN:
        required: true

jobs:
  rerun:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Check retry count
        id: check-retry
        run: |
          # IF `--attempt` returns a non-zero exit code, then keep retrying
          status_code=$(gh run view ${{ inputs.run_id }} --repo ${{ inputs.repo }} --attempt ${{ inputs.retries }} --json status) || {
            echo "Retry count is within limit"
            echo "should_retry=true" >> "$GITHUB_OUTPUT"
            exit 0
          }
          # ELSE `--attempt` returns a zero exit code, so stop retrying
          echo "Retry count has reached the limit"
          echo "should_retry=false" >> "$GITHUB_OUTPUT"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Re-run failed jobs
        if: ${{ steps.check-retry.outputs.should_retry == 'true' }}
        run: gh run rerun --failed ${{ inputs.run_id }} --repo ${{ inputs.repo }}
        continue-on-error: true
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
