Notes on monitoring

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
self-actuated · Nov 6, 2024 · 977aec9
1 parent 7d5b30b
commit 977aec9

Showing 3 changed files with 12 additions and 5 deletions.
diff --git a/docs/tasks/monitoring.md b/docs/tasks/monitoring.md
@@ -1,10 +1,14 @@
-# Task: Monitoring Actuated
+# Task: Monitoring Your Actuated Usage
 
 !!! info "Our team monitors actuated around the clock, on your behalf"
     The actuated team proactively monitors your servers and build queue for issues. We remediate them on your behalf, and for anything that cannot be fixed remotely, we'll be in touch via Slack or email.
 
+## Monitoring with the CLI
+
 The [actuated CLI](/tasks/cli) should be used for support, to query the agent's logs, or the logs of individual VMs.
 
+## Monitoring with Grafana
+
 If you would also like to do your own monitoring, you can purchase a monitoring add-on, which will expose metrics for your own Prometheus instance. You can then set up a Grafana dashboard to view the metrics.
 
 The monitoring add-on provides:
@@ -14,13 +18,13 @@ The monitoring add-on provides:
 
 To opt-in, follow the instructions in the dashboard.
 
-## Scrape the metrics
+### Scrape the metrics
 
 Metrics are currently made available through [Prometheus](https://prometheus.io/) federation. Prometheus can be run with Docker, as a Kubernetes deployment, or as a standalone binary.
 
 You can add a scrape target to your own Prometheus instance, or you can use the Grafana Agent to do that and ship off the metrics to Grafana Cloud.
 
-Here is a sample scrape config for Prometheus:
+Here is an example scrape config for Prometheus:
 
 ```yaml
 scrape_configs:
@@ -34,14 +38,14 @@ scrape_configs:
     scrape_interval: 60s
     scrape_timeout: 5s
     static_configs:
-    - targets: ["tbc:443"]
+    - targets: ["actuated-controlplane.example.com:443"]
 ```
 
 The `bearer_token` is a secret, and unique per customer. Only a bcrypt hash is stored in the control-plane, along with a mapping between GitHub organisations and the token.
 
 The `scrape_interval` must be `60s` or higher to avoid rate-limiting.
 
-Contact the support team on Slack for the value for the `targets` field.
+The value `actuated-controlplane.example.com` is a placeholder; you can request the real endpoint from the actuated team.
 
 ### Control-plane metrics
 

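Note on the `scrape_interval` rule in the hunk above: the docs require `60s` or higher, so a pre-flight check on the value can catch a misconfiguration before it triggers rate-limiting. The following is a minimal sketch in Python, not part of this commit. The names `duration_to_seconds` and `interval_is_allowed` are hypothetical, and it assumes the interval is written in Prometheus' usual duration notation (for example `60s` or `1m30s`).

```python
import re

# Minimum interval stated in the docs; anything lower risks rate-limiting.
MIN_SCRAPE_INTERVAL_SECONDS = 60

# Units used by Prometheus-style durations (a year is counted as 365 days).
_UNIT_SECONDS = {
    "ms": 0.001,
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 604800,
    "y": 31536000,
}
_PART = re.compile(r"(\d+)(ms|[smhdwy])")


def duration_to_seconds(text: str) -> float:
    """Convert a duration such as '60s' or '1m30s' to seconds."""
    if not text:
        raise ValueError("empty duration")
    pos = 0
    total = 0.0
    while pos < len(text):
        match = _PART.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    return total


def interval_is_allowed(text: str) -> bool:
    """Return True when the interval meets the documented 60s minimum."""
    return duration_to_seconds(text) >= MIN_SCRAPE_INTERVAL_SECONDS


if __name__ == "__main__":
    for candidate in ("60s", "30s", "1m30s"):
        print(candidate, interval_is_allowed(candidate))
```

A check like this could run in CI against the `scrape_interval` value before the Prometheus configuration is deployed.
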
diff --git a/docs/tasks/right-sizing-vm.md b/docs/tasks/right-sizing-vm.md
@@ -10,6 +10,8 @@ In the blog post [Right sizing VMs for GitHub Actions](https://actuated.com/blog
 
 A range of metrics is collected. In addition to the standard ones like CPU & RAM consumption, vmmeter also shows contention on I/O, whether a job is running out of disk space, and how many open files are in use. Non-obvious metrics like entropy and I/O contention are also collected, which can be linked to degraded performance.
 
+See also: [Custom VM sizes](/custom-vm-size/)
+
 ## Try it out
 
 Add the following to the top of your workflow:

diff --git a/mkdocs.yml b/mkdocs.yml
@@ -98,6 +98,7 @@ nav:
       - Install the Agent: install-agent.md
       - Run a test build: test-build.md
   - Tasks:
+    - Right Size the VM: tasks/right-sizing-vm.md
     - Setup a Registry Mirror: tasks/registry-mirror.md
     - Debug a job with SSH: tasks/debug-ssh.md
     - Set-up the CLI: tasks/cli.md