Notes on monitoring
Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
alexellis committed Nov 6, 2024
1 parent 7d5b30b commit 977aec9
Showing 3 changed files with 12 additions and 5 deletions.
14 changes: 9 additions & 5 deletions docs/tasks/monitoring.md
@@ -1,10 +1,14 @@
# Task: Monitoring Actuated
# Task: Monitoring Your Actuated Usage

!!! info "Our team monitors actuated around the clock, on your behalf"
    The actuated team proactively monitors your servers and build queue for issues. We remediate them on your behalf, and for anything that cannot be fixed remotely, we'll be in touch via Slack or email.

## Monitoring with the CLI

The [actuated CLI](/tasks/cli) should be used for support, to query the agent's logs, or the logs of individual VMs.

## Monitoring with Grafana

If you would also like to do your own monitoring, you can purchase a monitoring add-on, which will expose metrics for your own Prometheus instance. You can then set up a Grafana dashboard to view the metrics.

The monitoring add-on provides:
@@ -14,13 +18,13 @@ The monitoring add-on provides:

To opt-in, follow the instructions in the dashboard.

## Scrape the metrics
### Scrape the metrics

Metrics are currently made available through [Prometheus](https://prometheus.io/) federation. Prometheus can be run with Docker, as a Kubernetes deployment, or as a standalone binary.

You can add a scrape target to your own Prometheus instance, or you can use the Grafana Agent to do that and ship off the metrics to Grafana Cloud.

Here is a sample scrape config for Prometheus:
Here is an example scrape config for Prometheus:

```yaml
scrape_configs:
@@ -34,14 +38,14 @@ scrape_configs:
    scrape_interval: 60s
    scrape_timeout: 5s
    static_configs:
      - targets: ["tbc:443"]
      - targets: ["actuated-controlplane.example.com:443"]
```
The `bearer_token` is a secret, and unique per customer. Only a bcrypt hash is stored in the control-plane, along with a mapping between GitHub organisations and the token.

The `scrape_interval` must be `60s` or higher to avoid rate-limiting.
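
As a rough illustration of the 60-second rule, the Python sketch below parses a Prometheus-style duration string and checks that it is not below the minimum. The helper names `parse_duration` and `interval_ok` are hypothetical and are not part of actuated or Prometheus. It only handles simple durations such as `60s`, `2m` or `1m30s`.

```python
import re

# Seconds per unit, for the subset of Prometheus duration units handled here.
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}

# One "<number><unit>" component, e.g. "30s" or "1m".
_PART = re.compile(r"(\d+)(ms|s|m|h|d)")


def parse_duration(text: str) -> float:
    """Convert a duration such as '60s' or '1m30s' to seconds."""
    pos = 0
    total = 0.0
    for match in _PART.finditer(text):
        # Reject any characters between two components.
        if match.start() != pos:
            raise ValueError(f"invalid duration: {text!r}")
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    # Reject empty input and trailing characters.
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total


def interval_ok(text: str, minimum_seconds: float = 60.0) -> bool:
    """Return True if the interval is at least the minimum (60s by default)."""
    return parse_duration(text) >= minimum_seconds
```

A check like this could run in CI against your Prometheus configuration before it is deployed, so that an interval below `60s` is caught before it triggers rate-limiting.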

Contact the support team on Slack for the value for the `targets` field.
The value `actuated-controlplane.example.com` is a placeholder; you can request the real endpoint from the actuated team.
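
To check a token and endpoint before wiring up Prometheus, a request can be built by hand. The sketch below is only an assumption about how that might look: it uses the standard Prometheus federation path `/federate` with a `match[]` selector, and this commit does not confirm that the actuated control-plane serves that exact path. The function name `build_federate_request`, the token `xyz` and the catch-all selector are all hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_federate_request(host: str, token: str, matchers: list[str]) -> Request:
    """Build (but do not send) a GET request for a Prometheus federation endpoint."""
    # Prometheus federation accepts one or more match[] selectors.
    query = urlencode([("match[]", m) for m in matchers])
    # Assumption: the endpoint uses the standard /federate path over HTTPS.
    url = f"https://{host}/federate?{query}"
    # The bearer token is sent in the Authorization header.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder host and token; replace both with the values issued to you.
req = build_federate_request(
    "actuated-controlplane.example.com:443",
    "xyz",
    ['{__name__=~".+"}'],
)
```

Passing `req` to `urllib.request.urlopen` would send it. Keep manual checks at least 60 seconds apart so that they stay within the rate limit described above.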

### Control-plane metrics

2 changes: 2 additions & 0 deletions docs/tasks/right-sizing-vm.md
@@ -10,6 +10,8 @@ In the blog post [Right sizing VMs for GitHub Actions](https://actuated.com/blog

A range of metrics is collected. In addition to the standard ones like CPU & RAM consumption, vmmeter also shows contention on I/O, whether a job is running out of disk space, and how many open files are in use. Non-obvious metrics like entropy and I/O contention are also collected, and these can be linked to degraded performance.

See also: [Custom VM sizes](/custom-vm-size/)

## Try it out

Add the following to the top of your workflow:
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -98,6 +98,7 @@ nav:
- Install the Agent: install-agent.md
- Run a test build: test-build.md
- Tasks:
- Right Size the VM: tasks/right-size-vm.md
- Setup a Registry Mirror: tasks/registry-mirror.md
- Debug a job with SSH: tasks/debug-ssh.md
- Set-up the CLI: tasks/cli.md