From 528c6f9a3a602f4255c237f34cdfd09055952ee4 Mon Sep 17 00:00:00 2001 From: andersh Date: Thu, 18 Jul 2024 12:01:49 +0200 Subject: [PATCH 1/3] docs: rename all naming of the aci-exporter to its correct name. Minor grammar fixes --- README.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/README.md b/README.md index fe910e6..68d0330 100644 --- a/README.md +++ b/README.md @@ -17,7 +17,7 @@ The ACI-Monitoring-Stack integrates the following key components: - [Syslog-ng](https://github.com/syslog-ng/syslog-ng): is an open-source implementation of the Syslog protocol, its role in this stack is to translate syslog messages from RFC 3164 to 5424. This is needed because Promtail only support Syslog RFC 5424 over TCP and this capability is only available in ACI 6.1 and above. -- [ACI-Exporter](https://github.com/opsdis/aci-exporter): A custom-built exporter that serves as the bridge between your Cisco ACI environment and the Prometheus monitoring ecosystem. The ACI-Exporter translates ACI-specific metrics into a format that Prometheus can ingest, ensuring that all crucial data points are captured and monitored effectively. +- [aci-exporter](https://github.com/opsdis/aci-exporter): A Prometheus exporter that serves as the bridge between your Cisco ACI environment and the Prometheus monitoring ecosystem. The aci-exporter translates ACI-specific metrics into a format that Prometheus can ingest, ensuring that all crucial data points are captured and monitored effectively. - Pre-configured ACI data collections queries, alerts, and dashboards (Work In Progress): The ACI-Monitoring-Stack provides a solid foundation for monitoring an ACI fabric with its pre-defined queries, dashboards, and alerts. While these tools are crafted based on best practices to offer immediate insights into network performance, they are not exhaustive. The strength of the ACI-Monitoring-Stack lies in its community-driven approach. 
Users are invited to contribute their expertise by providing feedback, sharing custom solutions, and helping enhance the stack. Your input helps to refine and expand the stack's capabilities, ensuring it remains a relevant and powerful tool for network monitoring. @@ -33,7 +33,7 @@ flowchart-elk PT["Promtail"] SL["Syslog-ng"] AM["Alertmanager"] - A["ACI Exporter"] + A["aci-exporter"] G--"PromQL"-->P G--"LogQL"-->L P-->AM @@ -66,7 +66,7 @@ If you want to contribute to this project star from [Here](docs/development.md) ## Pre Requisites - Familiarity with Kubernetes: This installation guide is intended to assist with the setup of the ACI Monitoring stack and assumes prior familiarity with Kubernetes; it is not designed to provide instruction on Kubernetes itself. - A Kubernetes Cluster: Currently the stack has been tested on `Upstream Kubernetes 1.30.x` and `Minikube`. - - Persistent Volumes: 10G should be plenty for a small/demo environment. Many Storage provisioner support Volume expansion so should be easy to increase this post installation. + - Persistent Volumes: 10G should be plenty for a small/demo environment. Many storage provisioners support volume expansion, so it should be easy to increase this after installation. - Ability to expose services for: - Access to the Grafana/Prometheus and Alert Manager dashboards: This will be ideally achieved via an `Ingress Controller` - (Optional) Wildcard DNS Entries for the ingress controller domain. @@ -84,19 +84,19 @@ If you are installing on Minikube please follow the [Minikube Preparation Steps] ## Config Preparation -The ACI Monitoring Stack is a combination of several [Charts](charts/aci-monitoring-stack/charts), if you are familiar with Helm you are aware of the struggle to propagate dynamic values to sub-charts. For example it is not possible to pass to a sub-chart the name of a service in a dynamic way. 
+The ACI Monitoring Stack is a combination of several [Charts](charts/aci-monitoring-stack/charts). If you are familiar with Helm, you are aware of the struggle to propagate dynamic values to sub-charts. For example, it is not possible to pass to a sub-chart the name of a service in a dynamic way. In order to simplify the user experience the `chart` comes with a few pre-configured parameters that are populated in the configurations of the various sub-charts. -For example the ACI Exporter Service Name is pre configured as `aci-exporter-svc` and this value is then passed to Prometheus as service Discovery URL. +For example, the aci-exporter Service Name is pre-configured as `aci-exporter-svc` and this value is then passed to Prometheus as the service discovery URL. All these values can be customized and if you need to you can refer to the [Values](charts/aci-monitoring-stack/values.yaml) file. -*Note:* This is the first HELM char `camrossi` created and he is sure it can be improved. If you have suggestions they are extremely welcome! :) +*Note:* This is the first HELM chart `camrossi` created, and he is sure it can be improved. If you have suggestions they are extremely welcome! :) -### ACI Exporter +### The aci-exporter -ACI Exporter is the bridge between your Cisco ACI environment and the Prometheus monitoring ecosystem, for it to works it needs to know: +The aci-exporter is the bridge between your Cisco ACI environment and the Prometheus monitoring ecosystem. For it to work, it needs to know: - `fabrics`: A list of fabrics and how to connect to the APICs. - Requires a **ReadOnly** **Admin** User - `service_discovery`: Configure if devices are reachable via Out Of Band (`oobMgmtAddr`) or InBand (`inbMgmtAddr`). @@ -189,7 +189,7 @@ Grafana is installed via its [own Chart](https://github.com/grafana/helm-charts/ - The `ingress` config: External URL which can access Grafana. 
- Persistent Volume Capacity - (Optional) `adminPassword`: If not set will be auto generated and can be found in the `grafana` secret -- (Optional) `viewers_can_edit`: This allows users with a `view only` role to modify the dashboards and access `Explorer` to execute queries against `Pormetheus` and `Loki`. However the user will not be able to save any changes. +- (Optional) `viewers_can_edit`: This allows users with a `view only` role to modify the dashboards and access `Explorer` to execute queries against `Prometheus` and `Loki`. However, the user will not be able to save any changes. - (Optional) `deploymentStrategy`: if Grafana `Persistent Volume` is of type `ReadWriteOnce` rolling updates will get stuck as the new pod cannot start before the old one releases the PVC. Setting `deploymentStrategy.type` to `Recreate` destroy the original pod before starting the new one. Below an example: @@ -213,7 +213,7 @@ grafana: ``` ### Syslog config -The syslog config is the most complicated part as it relies on 3 components (`promtail`, `loki` and `syslog-ng`) with their own individual configs. Furthermore there are two issues we need to overcome: +The syslog config is the most complicated part as it relies on 3 components (`promtail`, `loki` and `syslog-ng`) with their own individual configs. Furthermore, there are two issues we need to overcome: - The Syslog messages don't contain the ACI Fabric name: to be able to distinguish the messaged from one fabric to another the only solution is to use dedicated `external services` with unique `IP:Port` pair per Fabric. - Until ACI 6.1 we need `syslog-ng` between `ACI` and `Promtail` to convert from RFC 3164 to 5424 From 793f2eb3991899d811eda287bc7de3b125151672 Mon Sep 17 00:00:00 2001 From: andersh Date: Thu, 18 Jul 2024 12:08:10 +0200 Subject: [PATCH 2/3] docs: rename all naming of the aci-exporter to its correct name. 
Minor grammar fixes --- docs/development.md | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/development.md b/docs/development.md index ab19b34..18a1932 100644 --- a/docs/development.md +++ b/docs/development.md @@ -16,9 +16,9 @@ Data is currently collected in two ways: - Syslog Ingestion: The ACI Side Config "decides" what to send and assuming the correct logging level is selected you can then build the dashboards in grafana using Loki as a data source. You can take a look at the `Contract Drops Logs` dashboard for inspiration. -- [ACI Exporter](https://github.com/opsdis/aci-exporter) Queries: which queries and how the data is collected is highly customizable. +- [aci-exporter](https://github.com/opsdis/aci-exporter) Queries: which queries are run and how the data is collected is highly customizable. -### ACI Exporter and Prometheus +### aci-exporter and Prometheus The general idea is to use aci-exporter to convert ACI Rest API Calls in the Prometheus exposition format. @@ -26,9 +26,9 @@ The exporter also have the capability to directly scrape individual switches usi **Note:** In the context of this HELM Chart a query **MUST** be executed against a switch if possible. Any code submission that does not adhere to this convention will not be accepted. -#### ACI Exporter Quick Start +#### aci-exporter Quick Start -Before working on aci-exporter, Prometheus and Grafana at the same time I strongly suggest to take a look at the [ACI Exporter](https://github.com/opsdis/aci-exporter) git repo and understand how it works and how is configured. +Before working on aci-exporter, Prometheus and Grafana at the same time, I strongly suggest taking a look at the [aci-exporter](https://github.com/opsdis/aci-exporter) git repo to understand how it works and how it is configured. 
Here a complete example to get you started (you need to [install go](https://go.dev/doc/install)) @@ -48,7 +48,7 @@ fabrics: - https://apic1 - https://apic2 ``` -- ACI Exporter will, by default, load the queries it can execute from the `config.d` directory. For now we don't want that so we can start the exporter with this command that will just load the bare minimum config to access the fabric. +- The aci-exporter will, by default, load the queries it can execute from the `config.d` directory. For now, we don't want that, so we can start the exporter with this command that will just load the bare minimum config to access the fabric. ```bash ./build/aci-exporter -config fab1.yaml -config_dir /dev/null @@ -57,7 +57,7 @@ fabrics: {"config_file":"/home/cisco/aci-exporter/fab1.yaml","level":"info","msg":"aci-exporter starting","port":9643,"read_timeout":0,"time":"2024-07-18T14:17:59+10:00","version":"undefined","write_timeout":0} ``` -- Now ACI Exporter is running on our host on port 9643, let's try a Service Discovery just run a HTTP request against the `/sd` URL. +- Now that aci-exporter is running on our host on port 9643, let's try a Service Discovery: just run an HTTP request against the `/sd` URL. ``` bash curl http://aci-exporter-ip:9643/sd @@ -95,7 +95,7 @@ This should return a list with all the Controllers and Switches in your fabric a Now let's try to build a query to check the `interface operation state and speed`. - The ACI Class we can use for this query is `ethpmPhysIf` -- This class is available both on the APIC as well as from the Switches: we will run this query **against the switchers** because it is the core principle for this HELM chart and it scales better. +- This class is available both on the APIC and on the Switches: we will run this query **against the switches** because it is the core principle for this HELM chart, and it scales better. - *Tip:* If you use Visual Studio Code you can install the `Thunder Client` to test API Calls. 
Every switch will return one `ethpmPhysIf` object for every interface. An example is provided below: @@ -171,14 +171,14 @@ Of all the various properties of `ethpmPhysIf` we need only 3: - `interface_type`: Physical, Port-Channel etc... - `interface`: The interface name, i.e. Eth1/1 -With these infos we can create 2 metrics that I am gonna call: +With this information we can create 2 metrics that I am going to call: - `interface_oper_speed` - `interface_oper_state` -Both metrics will be labeled with the`interface_type` and `interface` (name). However we are faced with an issue... Promethesu can only ingest numbers so we can't just pass `40G` or `up` as a valid metric. +Both metrics will be labeled with the `interface_type` and `interface` (name). However, we are faced with an issue... Prometheus can only ingest numbers, so we can't just pass `40G` or `up` as a valid metric. -Thankfully one of the many ACI Exporter capabilities is to perform `value_transform` so we can write something like this: +Thankfully, one of the many aci-exporter capabilities is to perform `value_transform`, so we can write something like this: ```yaml value_transform: @@ -199,7 +199,7 @@ value_transform: ``` To convert text to numbers and allow Prometheus to ingest this data. -Lastly we need to also extract the `labels` from the `dn`. The format for this specific class is always something similar to `"sys/phys-[eth1/34]/phys"` to do this ACI Exporter employs RegEx, below an example: +Lastly, we also need to extract the `labels` from the `dn`. The format for this specific class is always something similar to `"sys/phys-[eth1/34]/phys"`. To do this, aci-exporter employs RegEx; below is an example: ```yaml labels: @@ -252,7 +252,7 @@ class_queries: regex: "^sys/(?P<interface_type>[a-z]+)-\\[(?P<interface>[^\\]]+)\\]/" ``` -Now Copy Paste this into the config file. +Now Copy/Paste this into the config file. 
Based on the service discovery we executed before we have all the required infos to run a query against a switch, the aci-exporter URL has the following format: @@ -295,7 +295,7 @@ Selection between APIC or Switches is done by using different re-labeling config To add a new query follow these steps: -- Develop a new ACI-Exporter query and test is with `curl` to ensure it returns the expected data +- Develop a new aci-exporter query and test it with `curl` to ensure it returns the expected data - Add the query to one of the files in the [config.d](../charts/aci-monitoring-stack/config.d) folder or create a new file if your query dosen't belong to any of the existing categoris. - add the query name in the `queries` list of the APIC or Switches inside the [ScrapeConfigs](../charts/aci-monitoring-stack/templates/prometheus/configmap-config.yaml). From dc4b4b4672403bdbacf430ea8852ab5ecb594571 Mon Sep 17 00:00:00 2001 From: andersh Date: Thu, 18 Jul 2024 12:08:21 +0200 Subject: [PATCH 3/3] docs: Minor grammar fixes --- docs/minikube.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/minikube.md b/docs/minikube.md index da40b7c..acb7810 100644 --- a/docs/minikube.md +++ b/docs/minikube.md @@ -2,7 +2,7 @@ This can be used to run aci-monitoring-stack locally (say on your laptop). -By default minikube only provide access locally and this is an issue for logs ingestion however for a lab you can configure HAProxy to expose you Minikube instance over the Host IP Address. This implies that you should configure all your External Services as `NodePort` and configure HAProxy to send the traffic to the correct `NodePort` +By default, minikube only provides access locally, and this is an issue for log ingestion; however, for a lab you can configure HAProxy to expose your Minikube instance over the Host IP Address. 
This implies that you should configure all your External Services as `NodePort` and configure HAProxy to send the traffic to the correct `NodePort` I have configured minikube with 4GB or RAM and 4 CPU and that was plenty to monitor a small 10 switch ACI Fabric. @@ -61,7 +61,7 @@ While installing Minikube I hit the following issues: ## minikube/podman wrong CNI Version -If minikube dosen't start and complains about the wrong CNI version for bridge open /etc/cni/net.d/11-crio-ipv4-bridge.conflist and set "cniVersion": "0.4.0" from 1.0.0 +If minikube doesn't start and complains about the wrong CNI version for bridge open /etc/cni/net.d/11-crio-ipv4-bridge.conflist and set "cniVersion": "0.4.0" from 1.0.0 ## Prometheus does not install under minikube/podman