datadog-actions-metrics

This is an action to send metrics of GitHub Actions to Datadog on an event.

Forked from https://github.com/int128/datadog-actions-metrics

Purpose

Improve the reliability and experience of CI/CD pipeline

Use the workflow below to collect metrics whenever a workflow run completes. It sends metrics for all of your workflows:

---
# ================================================================================== #
# DataDog Metrics
# This workflow collects GitHub Actions metrics and sends them to Datadog.
# The metrics are sent using the Datadog API key.
# ================================================================================== #

name: CI - GHA DataDog Metrics
on:
  workflow_run:
    workflows: 
      - '**'
    branches:
      - '**'
    types:
      - completed

permissions:
  actions: read
  checks: read
  contents: read
  pull-requests: read

jobs:
  send:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - name: Send GHA metrics to DataDog
        uses: Bandwidth/datadog-actions-metrics@v1
        with:
          datadog-api-key: ${{ secrets.DATADOG_API_KEY_PRODUCTION }}
          collect-job-metrics: true
          collect-step-metrics: true
...

Here is an example screenshot from Datadog:

[image: Datadog screenshot]

For developer experience, you can analyze the following metrics:

  • Time to test an application
  • Time to deploy an application

For reliability, you can monitor the following metrics:

  • Success rate of the main branch
  • Rate limit of built-in GITHUB_TOKEN

Improve the reliability and experience of self-hosted runners

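For self-hosted runners, you can monitor the following metrics for reliability and experience:

  • Count of "lost communication with the server" errors
  • Count of "The runner has received a shutdown signal" errors
  • Time a workflow run waits before its first job starts
  • Number of queued workflow runs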

Improve your team development process

You can analyze your development activity, such as the number of merged pull requests. This supports continuous process improvement in your team.

To collect the metrics when a pull request is opened or closed:

---
# ================================================================================== #
# DataDog Metrics
# This workflow collects pull request metrics and sends them to Datadog.
# The metrics are sent using the Datadog API key.
# ================================================================================== #

name: CI - PR DataDog Metrics
on:
  pull_request:
    types:
      - opened
      - closed

permissions:
  actions: read
  checks: read
  contents: read
  pull-requests: read

jobs:
  send:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - name: Send PR count to DataDog
        uses: Bandwidth/datadog-actions-metrics@v1
        with:
          datadog-api-key: ${{ secrets.DATADOG_API_KEY_PRODUCTION }}
          collect-job-metrics: true
          collect-step-metrics: true
...

Overview

This action handles the following events:

  • workflow_run event
  • pull_request event
  • push event
  • schedule event

It ignores other events.

Metrics for workflow_run event

Workflow run

This action sends the following metrics.

  • github.actions.workflow_run.total
  • github.actions.workflow_run.conclusion.{CONCLUSION}_total
    • e.g. github.actions.workflow_run.conclusion.success_total
    • e.g. github.actions.workflow_run.conclusion.failure_total
    • See the official GitHub documentation for the possible values of the CONCLUSION field
  • github.actions.workflow_run.duration_second
    • Time from when a workflow run starts until it is last updated
  • github.actions.workflow_run.queued_duration_second
    • Time from when a workflow run starts until its first job starts
    • This metric is suitable for monitoring only if the first job is guaranteed to run on a self-hosted runner

It has the following tags:

  • repository_owner
  • repository_name
  • workflow_name
  • workflow_id
  • run_attempt
    • Attempt number of the run, 1 for first attempt and higher if the workflow was re-run
  • event
  • sender
  • sender_type = either Bot, User or Organization
  • branch
  • default_branch = true or false
  • pull_request_number
    • Pull request(s) which triggered the workflow
  • conclusion

Job

This action sends the following metrics if collect-job-metrics is enabled.

  • github.actions.job.total
  • github.actions.job.conclusion.{CONCLUSION}_total
    • e.g. github.actions.job.conclusion.success_total
    • e.g. github.actions.job.conclusion.failure_total
  • github.actions.job.duration_second
    • Time from when a job starts until it completes
  • github.actions.job.lost_communication_with_server_error_total
    • Count of "lost communication with the server" errors on self-hosted runners. See issue #444 for details
  • github.actions.job.received_shutdown_signal_error_total
    • Count of "The runner has received a shutdown signal" errors on self-hosted runners

It has the following tags:

  • repository_owner
  • repository_name
  • workflow_name
  • workflow_id
  • event
  • sender
  • sender_type = either Bot, User or Organization
  • branch
  • default_branch = true or false
  • pull_request_number
    • Pull request(s) which triggered the workflow
  • job_name
  • job_id
  • conclusion
  • status
  • runs_on
    • Runner label inferred from the workflow file if available
    • e.g. ubuntu-latest

Step

This action sends the following metrics if collect-step-metrics is enabled.

  • github.actions.step.total
  • github.actions.step.conclusion.{CONCLUSION}_total
    • e.g. github.actions.step.conclusion.success_total
    • e.g. github.actions.step.conclusion.failure_total
  • github.actions.step.duration_second

It has the following tags:

  • repository_owner
  • repository_name
  • workflow_name
  • workflow_id
  • event
  • sender
  • sender_type = either Bot, User or Organization
  • branch
  • default_branch = true or false
  • pull_request_number
    • Pull request(s) which triggered the workflow
  • job_name
  • job_id
  • step_name
  • step_number = 1, 2, ...
  • conclusion
  • status
  • runs_on
    • Runner label inferred from the workflow file if available
    • e.g. ubuntu-latest

Enable job or step metrics

To send the metrics of jobs and steps:

    steps:
      - uses: Bandwidth/datadog-actions-metrics@v1
        with:
          datadog-api-key: ${{ secrets.DATADOG_API_KEY }}
          collect-job-metrics: true
          collect-step-metrics: true

To send the metrics of jobs and steps on the default branch only:

    steps:
      - uses: Bandwidth/datadog-actions-metrics@v1
        with:
          datadog-api-key: ${{ secrets.DATADOG_API_KEY }}
          collect-job-metrics: ${{ github.event.workflow_run.head_branch == github.event.repository.default_branch }}
          collect-step-metrics: ${{ github.event.workflow_run.head_branch == github.event.repository.default_branch }}

This action calls the GitHub GraphQL API to get the jobs and steps of the current workflow run. Note that this may exceed the API rate limit if too many workflows run.

If job or step metrics are enabled, this action requires the following permissions:

    permissions:
      actions: read
      checks: read
      contents: read

Metrics for pull_request event

Pull request (opened)

This action sends the following metrics for the opened activity type.

  • github.actions.pull_request_opened.total
  • github.actions.pull_request_opened.commits
  • github.actions.pull_request_opened.changed_files
  • github.actions.pull_request_opened.additions
  • github.actions.pull_request_opened.deletions

It has the following tags:

  • repository_owner
  • repository_name
  • sender
  • sender_type = either Bot, User or Organization
  • user
  • pull_request_number
  • draft = true or false
  • base_ref
  • head_ref

Pull request (closed)

This action sends the following metrics for the closed activity type.

  • github.actions.pull_request_closed.total
  • github.actions.pull_request_closed.since_opened_seconds
    • Time from when a pull request is opened until it is closed
  • github.actions.pull_request_closed.since_first_authored_seconds
    • Time from the authored time of the first commit until the pull request is closed
  • github.actions.pull_request_closed.since_first_committed_seconds
    • Time from the committed time of the first commit until the pull request is closed
  • github.actions.pull_request_closed.commits
  • github.actions.pull_request_closed.changed_files
  • github.actions.pull_request_closed.additions
  • github.actions.pull_request_closed.deletions

It has the following tags:

  • repository_owner
  • repository_name
  • sender
  • sender_type = either Bot, User or Organization
  • user
  • pull_request_number
  • draft = true or false
  • base_ref
  • head_ref
  • merged = true or false
  • requested_team
    • Team(s) of requested reviewer(s)
  • label
    • Label(s) of a pull request
    • Available if send-pull-request-labels is set
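
As a minimal sketch of how the label tag could be enabled, the workflow below turns on send-pull-request-labels for closed pull requests. The workflow name is hypothetical, and the @v1 ref and DATADOG_API_KEY secret name follow the other examples in this document; adjust them to your setup.

```yaml
# Hypothetical workflow name, for illustration only
name: pr-metrics-with-labels
on:
  pull_request:
    types:
      - closed

permissions:
  pull-requests: read

jobs:
  send:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: Bandwidth/datadog-actions-metrics@v1
        with:
          datadog-api-key: ${{ secrets.DATADOG_API_KEY }}
          # Opt in to the label tag on pull request metrics (default is false)
          send-pull-request-labels: true
```

Labels are off by default, presumably because each distinct label becomes a tag value in Datadog, so enabling them is worth a deliberate choice.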

Permissions

For the pull_request event, this action requires the following permissions:

    permissions:
      pull-requests: read

Metrics for push event

This action sends the following metrics.

  • github.actions.push.total

It has the following tags:

  • repository_owner
  • repository_name
  • sender
  • sender_type = either Bot, User or Organization
  • ref
  • created = true or false
  • deleted = true or false
  • forced = true or false
  • default_branch = true or false
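
A workflow that sends the push metric might look like the sketch below. The workflow name is hypothetical, and no permissions block is declared because this document lists no permission requirements for the push event; treat that as an assumption and restrict the token as your organization requires.

```yaml
# Hypothetical workflow name, for illustration only
name: push-metrics
on:
  push:

jobs:
  send:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: Bandwidth/datadog-actions-metrics@v1
        with:
          datadog-api-key: ${{ secrets.DATADOG_API_KEY }}
```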

Metrics for schedule event

Workflow run

This action sends the following metrics:

  • github.actions.schedule.queued_workflow_run.total (gauge)

It has the following tags:

  • repository_owner
  • repository_name

It is useful for monitoring self-hosted runners.

Permissions

For the schedule event, this action requires the following permissions:

    permissions:
      actions: read
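
Putting the trigger and the permissions together, a scheduled workflow could be sketched as follows. The workflow name and the five-minute cron interval are assumptions chosen for illustration; pick an interval that matches how closely you want to track the queue.

```yaml
# Hypothetical workflow name, for illustration only
name: queued-workflow-run-metrics
on:
  schedule:
    # Assumed interval: every 5 minutes
    - cron: '*/5 * * * *'

permissions:
  actions: read

jobs:
  send:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: Bandwidth/datadog-actions-metrics@v1
        with:
          datadog-api-key: ${{ secrets.DATADOG_API_KEY }}
```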

Metrics for all supported events

Rate limit

This action always sends the following metrics of the built-in GITHUB_TOKEN rate limit.

  • github.actions.api_rate_limit.remaining
  • github.actions.api_rate_limit.limit

It has the following tags:

  • repository_owner
  • repository_name
  • resource = core, search and graphql

This does not count against the GitHub API rate limit because the action only calls the /rate_limit endpoint.
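
The github-token-rate-limit-metrics input listed under Specification suggests that the rate limit of a different token can be reported instead of the built-in one. The fragment below is a sketch under that assumption; RATE_LIMIT_TOKEN is a hypothetical secret name.

```yaml
steps:
  - uses: Bandwidth/datadog-actions-metrics@v1
    with:
      datadog-api-key: ${{ secrets.DATADOG_API_KEY }}
      # RATE_LIMIT_TOKEN is a hypothetical secret holding the token to observe
      github-token-rate-limit-metrics: ${{ secrets.RATE_LIMIT_TOKEN }}
```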

Specification

You can set the following inputs:

Name                             Default       Description
github-token                     github.token  GitHub token to get jobs and steps if needed
github-token-rate-limit-metrics  github.token  GitHub token for rate limit metrics
datadog-api-key                  -             Datadog API key. If not set, this action does not actually send metrics
datadog-site                     -             Datadog server name such as datadoghq.eu, ddog-gov.com, us3.datadoghq.com
send-pull-request-labels         false         Send pull request labels as Datadog tags
collect-job-metrics              false         Collect job metrics
collect-step-metrics             false         Collect step metrics
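
For instance, an account hosted outside the default Datadog site would need datadog-site set. The fragment below is a sketch that assumes an EU account; datadoghq.eu is one of the example values listed above.

```yaml
steps:
  - uses: Bandwidth/datadog-actions-metrics@v1
    with:
      datadog-api-key: ${{ secrets.DATADOG_API_KEY }}
      # Assumed site for illustration; use the site your Datadog account lives on
      datadog-site: datadoghq.eu
```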

Proxy

To connect to the Datadog API via an HTTPS proxy, set the https_proxy environment variable. For example:

    steps:
      - uses: Bandwidth/datadog-actions-metrics@v1
        with:
          datadog-api-key: ${{ secrets.DATADOG_API_KEY }}
        env:
          https_proxy: http://proxy.example.com:8080

Breaking changes

collect-step-metrics must now be enabled explicitly to send step metrics.

collect-job-metrics-for-only-default-branch is no longer supported. Use collect-job-metrics instead.
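
A migration might look like the following sketch. The removed input is shown commented out, and its former value of true is an assumption. The replacement expression is the one used in the default-branch example under "Enable job or step metrics".

```yaml
steps:
  - uses: Bandwidth/datadog-actions-metrics@v1
    with:
      datadog-api-key: ${{ secrets.DATADOG_API_KEY }}
      # Before (no longer supported; the value shown is assumed):
      # collect-job-metrics-for-only-default-branch: true
      # After: collect job metrics only for runs on the default branch
      collect-job-metrics: ${{ github.event.workflow_run.head_branch == github.event.repository.default_branch }}
      # Step metrics are no longer implied and must be enabled explicitly
      collect-step-metrics: ${{ github.event.workflow_run.head_branch == github.event.repository.default_branch }}
```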

...