Commit
closes #22 and labs
camrossi committed Oct 10, 2024
1 parent 5c1e646 commit f8b415d
Showing 21 changed files with 273 additions and 38 deletions.
2 changes: 1 addition & 1 deletion charts/aci-monitoring-stack/Chart.yaml
Original file line number Diff line number Diff line change
@@ -15,7 +15,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.4
version: 0.1.5
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
{{- if $.Values.syslog.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
@@ -30,4 +31,5 @@ data:
scl.conf: |
@module appmodel
@include 'scl/*/*.conf'
@define java-module-dir "`module-install-dir`/java-modules"
@define java-module-dir "`module-install-dir`/java-modules"
{{- end }}
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
{{- if $.Values.syslog.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
@@ -32,4 +33,5 @@ spec:
volumes:
- name: {{ $.Release.Name }}-syslog-ng-config
configMap:
name: {{ $.Release.Name }}-syslog-ng-config
name: {{ $.Release.Name }}-syslog-ng-config
{{- end }}
2 changes: 2 additions & 0 deletions charts/aci-monitoring-stack/templates/syslog-ng/service.yaml
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
{{- if $.Values.syslog.enabled }}
{{- range $key, $values := .Values.syslog.services }}
---
apiVersion: v1
@@ -43,4 +44,5 @@ spec:
{{- end }}
selector:
app.kubernetes.io/component: {{ $.Release.Name }}-syslog-ng
{{- end }}
{{- end }}
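
The three syslog-ng templates above are now wrapped in `{{- if $.Values.syslog.enabled }}`, so they should render only when that value is true. A minimal sketch for checking this locally is shown here. `count_syslog_lines` is a hypothetical helper, the release name `demo` and the local chart path are assumptions, and the check assumes no other template in the chart mentions `-syslog-ng`.

```shell
# Hypothetical helper: count lines on stdin that mention the syslog-ng component.
count_syslog_lines() {
  # grep -c exits with status 1 when the count is 0, so mask that status.
  grep -c -e '-syslog-ng' || true
}

# Self-contained demo on a hand-written manifest fragment (one matching line).
printf 'kind: ConfigMap\nmetadata:\n  name: demo-syslog-ng-config\n' | count_syslog_lines

# Assumed usage against the chart (needs helm and the chart dependencies present):
#   helm template demo charts/aci-monitoring-stack --set syslog.enabled=false | count_syslog_lines
```

If the guard works as intended, the count with `syslog.enabled=false` should drop to zero, while a render with syslog enabled should report a non-zero count.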
3 changes: 3 additions & 0 deletions charts/aci-monitoring-stack/values.yaml
Original file line number Diff line number Diff line change
@@ -138,6 +138,8 @@ grafana:
allowUiUpdates: true
alerts:
enabled: false
rbac:
create: false

promtail:
enabled: true
@@ -229,6 +231,7 @@ loki:
replication_factor: 1
limits_config:
discover_log_levels: false
discover_service_name: []
schemaConfig:
configs:
- from: 2024-04-01
24 changes: 0 additions & 24 deletions docs/4-fabric-example.yaml
Original file line number Diff line number Diff line change
@@ -41,9 +41,6 @@ prometheus:
persistentVolume:
accessModes: ["ReadWriteOnce"]
size: 5Gi
# Run on high performance nodes (For context here high performance means not on my Raspberry Pis... ;) )
nodeSelector:
type: highperf

alertmanager:
baseURL: "http://aci-exporter-alertmanager.apps.c1.cam.ciscolabs.com"
@@ -55,8 +52,6 @@ prometheus:
paths:
- path: /
pathType: ImplementationSpecific
nodeSelector:
type: highperf
config:
route:
group_by: ['alertname']
@@ -79,8 +74,6 @@ grafana:
users:
viewers_can_edit: "True"
adminPassword: <adminPassword>
nodeSelector:
type: highperf
deploymentStrategy:
type: Recreate
ingress:
@@ -97,24 +90,15 @@ loki:
rulerConfig:
external_url: http://aci-exporter-grafana.apps.c1.cam.ciscolabs.com

minio:
nodeSelector:
type: highperf
backend:
replicas: 3
nodeSelector:
type: highperf
persistence:
enableStatefulSetAutoDeletePVC: true
size: 2Gi
read:
replicas: 3
nodeSelector:
type: highperf
write:
replicas: 3
nodeSelector:
type: highperf
persistence:
enableStatefulSetAutoDeletePVC: true
size: 2Gi
@@ -123,8 +107,6 @@ syslog:
services:
nsd-backbone:
name: nsd-backbone
labels:
type: bgp-ingress
containerPort: 1516
protocol: UDP
service:
@@ -135,26 +117,20 @@ promtail:
extraPorts:
Fab1:
name: fab1
labels:
type: bgp-ingress
containerPort: 1513
protocol: TCP
service:
type: LoadBalancer
port: 1514
Fab2:
name: fab2
labels:
type: bgp-ingress
containerPort: 1514
protocol: TCP
service:
type: LoadBalancer
port: 1514
Steve-UK:
name: steve-uk
labels:
type: bgp-ingress
containerPort: 1515
protocol: TCP
service:
28 changes: 25 additions & 3 deletions docs/LABDCN-2620/README.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,27 @@
# LABDCN-2620: Open Source Monitoring for Cisco ACI - Cisco Live APJC 2024
# LABDCN-2620: Open Source Monitoring for Cisco ACI

This section contains specific instruction on how to run the LABDCN-2620 Walk In Lab.
This lab runs on a pre-existing Kubernetes cluster and can support up to 30 concurrent students.
This section contains specific instructions on how to run the *LABDCN-2620* Walk In Lab for Cisco Live APJC 2024.
All tasks except the last one can be run without VPN access to the DMZ.
DMZ credentials will be available from the eXpo portal.


## Task 1 - Getting Familiar with the ACI Monitoring Stack

If this is your first time learning about the ACI monitoring stack, start with the [README](../../README.md), which provides an overview of the Stack Architecture.
You do not need to dive deep into the details unless you want to, but it is good to have a general understanding of the components used in the Stack.

Next, head over to the [Demo Environment](../demo-environment.md) documentation. As you read it, explore the dashboards that are available in the Demo Environment.

## Task 2 - Create a Dashboard

[Lab1](../labs/lab1.md): In this lab we are going to rebuild the ACI Fault Dashboard.

## Task 3 - Explore The Logs

[Lab2](../labs/lab2.md): In this lab we are going to use `Explore` to visualize the logs received from our ACI fabrics.

## Task 4 - Deploy the Monitoring Stack (Requires VPN Access)

The ACI Monitoring Stack can be deployed on any Kubernetes cluster by following the [Deployment](../deployment.md) instructions. Before proceeding, familiarize yourself with the Deployment guide. For this lab, however, I am providing a pre-configured environment where no major configuration is required.

[DMZ Deployment](dmz-deploy.md) instructions
145 changes: 145 additions & 0 deletions docs/LABDCN-2620/dmz-deploy.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,145 @@
# Overview

In this lab we are going to deploy the ACI Monitoring Stack in the DMZ environment. The DMZ environment is already pre-configured with a K8s cluster that provides:

- An ingress controller to expose services via HTTPS
- Persistent Storage

## Connect to the Kubernetes Cluster

The first step is to SSH into the Linux server where we are going to deploy the stack.
The VPN and SSH details are available in the eXpo portal.

## Review the Config

After you have connected to the Linux server, move to the `aci-mon-stack-values` directory.

```shell
~ cd aci-mon-stack-values
➜ aci-mon-stack-values
```

Depending on your `<PODID>`, inspect the `aci-mon-stack-values-pod-<PODID>.yaml` file. For example, the file for `pod1` contains:

```yaml
aci_exporter:
# Defines 3 ACI fabrics to probe with their credentials
fabrics:
site1:
apic:
- https://<IP>
password: <PASS>
service_discovery: oobMgmtAddr
username: aci-exporter
site2:
apic:
- https://<IP>
password: <PASS>
service_discovery: oobMgmtAddr
username: aci-exporter
site3:
apic:
- https://<IP>
password: <PASS>
service_discovery: oobMgmtAddr
username: aci-exporter
# Enable Grafana
grafana:
# Enable Grafana Ingress controller over the grafana.pod1.apps.minikube.dmz URL
ingress:
enabled: true
hosts:
- grafana.pod1.apps.minikube.dmz
adminPassword: <PASS>
defaultDashboardsEnabled: false
deploymentStrategy:
type: Recreate
enable: true
# Allocate 200Mi for grafana storage
persistence:
enabled: true
size: 200Mi
service:
enabled: true
type: ClusterIP
prometheus:
# Enable prometheus Ingress controller over the prom.pod1.apps.minikube.dmz URL
server:
ingress:
enabled: true
hosts:
- prom.pod1.apps.minikube.dmz
baseURL: "http://prom.pod1.apps.minikube.dmz"

# Allocate 200Mi for prometheus storage
persistentVolume:
accessModes:
- ReadWriteOnce
size: 200Mi
service:
retentionSize: 200Mi
alertmanager:
# Allocate 200Mi for alertmanager storage
persistence:
size: 200Mi
baseURL: "http://alertmanager.pod1.apps.minikube.dmz"

# Enable alertmanager Ingress controller over the alertmanager.pod1.apps.minikube.dmz URL
ingress:
enabled: true
hosts:
- host: alertmanager.pod1.apps.minikube.dmz
paths:
- path: /
pathType: ImplementationSpecific

# For this lab I am not enabling Syslog collection
loki:
enabled: false
promtail:
enabled: false
syslog:
enabled: false
```
To deploy your stack, we first need to have the Helm repository configured. To do so, execute the following:
```shell
helm repo add aci-monitoring-stack https://datacenter.github.io/aci-monitoring-stack
helm repo update
```

If you get a message stating `"aci-monitoring-stack" already exists with the same configuration, skipping` it simply means you are not the first student of the day.

Next, we can deploy the stack with this single command. Please **be careful** to replace `<PODID>` with your PODID:

```
helm -n pod-<PODID>-aci-monitoring-stack upgrade --install --create-namespace pod-<PODID>-aci-monitoring-stack aci-monitoring-stack/aci-monitoring-stack -f aci-mon-stack-values-pod-<PODID>.yaml
```
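
Because `<PODID>` appears three times in the command above, substituting it by hand is easy to get wrong. The following is a minimal sketch that builds the command from a single value and only prints it for review. The function name `build_deploy_cmd` is hypothetical, and the sketch assumes pod IDs are plain numbers such as `1` or `2`.

```shell
# Hypothetical helper: print the helm command for one pod ID without running it.
build_deploy_cmd() {
  local pod_id="$1"
  # Assumption: pod IDs are numeric; reject anything else (including an empty value).
  case "$pod_id" in
    ''|*[!0-9]*) echo "usage: build_deploy_cmd <numeric PODID>" >&2; return 1 ;;
  esac
  # The namespace and the release share the same name in the lab instructions.
  local release="pod-${pod_id}-aci-monitoring-stack"
  printf 'helm -n %s upgrade --install --create-namespace %s aci-monitoring-stack/aci-monitoring-stack -f aci-mon-stack-values-pod-%s.yaml\n' \
    "$release" "$release" "$pod_id"
}

# Print the command for pod 2, check it, then paste it into the terminal.
build_deploy_cmd 2
```

Printing rather than executing keeps the step reviewable, which matters on a shared cluster where a wrong namespace would touch another student's deployment.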

Now you can check with `kubectl` whether your Pods are deployed. In the example below, `<PODID> == 2`:

```
➜ aci-mon-stack-values kubectl -n pod-2-aci-monitoring-stack get pod
NAME                                                           READY   STATUS    RESTARTS   AGE
pod-2-aci-monitoring-stack-aci-exporter-f7dfdc997-bswv2        1/1     Running   0          10m
pod-2-aci-monitoring-stack-alertmanager-0                      1/1     Running   0          10m
pod-2-aci-monitoring-stack-grafana-7f766c95cd-khxnc            3/3     Running   0          10m
pod-2-aci-monitoring-stack-prometheus-server-5867fb886-jxs66   2/2     Running   0          10m
```
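
Instead of reading the `READY` and `STATUS` columns by eye, the listing can be piped through a small filter. This is a sketch only: `all_pods_ready` is a hypothetical helper that expects the default `kubectl get pod` column layout shown above, and it treats any status other than `Running` (including `Completed`) as not ready.

```shell
# Hypothetical helper: read `kubectl get pod` output on stdin and succeed only
# if at least one pod is listed and every pod is Running with all containers ready.
all_pods_ready() {
  awk 'NR > 1 { split($2, ready, "/"); if (ready[1] != ready[2] || $3 != "Running") bad = 1 }
       END { if (bad || NR < 2) exit 1 }'
}

# Self-contained demo using two rows from the sample output.
printf '%s\n' \
  'NAME READY STATUS RESTARTS AGE' \
  'pod-2-aci-monitoring-stack-alertmanager-0 1/1 Running 0 10m' \
  'pod-2-aci-monitoring-stack-grafana-7f766c95cd-khxnc 3/3 Running 0 10m' \
  | all_pods_ready && echo "all pods ready"

# Assumed usage on the lab server:
#   kubectl -n pod-2-aci-monitoring-stack get pod | all_pods_ready && echo "all pods ready"
```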

This should also have created the required `Ingress` routes

```
➜ aci-mon-stack-values kubectl -n pod-2-aci-monitoring-stack get ingress
NAME                                           CLASS     HOSTS                                 ADDRESS        PORTS   AGE
pod-2-aci-monitoring-stack-alertmanager        traefik   alertmanager.pod2.apps.minikube.dmz   172.16.0.210   80      11m
pod-2-aci-monitoring-stack-grafana             traefik   grafana.pod2.apps.minikube.dmz        172.16.0.210   80      11m
pod-2-aci-monitoring-stack-prometheus-server   traefik   prom.pod2.apps.minikube.dmz           172.16.0.210   80      11m
```

You should now be able to access the Grafana UI from your browser. You **MUST use HTTPS** as the connections are terminated on a reverse proxy. All the URLs will be in the format of
`https://grafana.pod<PODID>.apps.minikube.dmz`
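
The same pattern applies to the other two ingress hosts defined in the values file. As a small sketch, the helper below prints all three HTTPS URLs for a given pod ID. The name `lab_urls` is hypothetical, and the host prefixes `grafana`, `prom` and `alertmanager` are taken from the ingress listing above.

```shell
# Hypothetical helper: print the HTTPS URLs exposed for one pod ID.
lab_urls() {
  local pod_id="$1"
  local svc
  for svc in grafana prom alertmanager; do
    printf 'https://%s.pod%s.apps.minikube.dmz\n' "$svc" "$pod_id"
  done
}

# List the URLs for pod 2.
lab_urls 2
```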


2 changes: 1 addition & 1 deletion docs/demo-environment.md
Original file line number Diff line number Diff line change
@@ -24,7 +24,7 @@ All the Dashboards are located in the `ACI` Folder in the `Dashboards` section o
These dashboards are using `Prometheus` as data source meaning the data we are visualizing came from an ACI Managed Object and was translated by the `aci-exporter`

### ACI Faults
This dashboard is a 1:1 copy of the faults that are present inside ACI. The main advantages copmpared to looking at the faults in the ACI UI are:
This dashboard is a 1:1 copy of the faults that are present inside ACI. The main advantages compared to looking at the faults in the ACI UI are:
- the ability to aggregating Faults from Multiple Fabrics in a single table
- allowing advanced sorting and filtering

5 changes: 3 additions & 2 deletions docs/deployment.md
Original file line number Diff line number Diff line change
@@ -266,7 +266,7 @@ syslog:
If you need a reminder on how to configure ACI Syslog take a look [Here](syslog.md)

## Example Config for 4 Fabrics
Here you can see an [Example Config for 4 Fabrics](docs/4-fabric-example.yaml)
Here you can see an [Example Config for 4 Fabrics](4-fabric-example.yaml)

# Chart Deployment

@@ -275,4 +275,5 @@ Once the configuration file is generated i.e.: `aci-mon-stack-config.yaml` Helm
```shell
helm repo add aci-monitoring-stack https://datacenter.github.io/aci-monitoring-stack
helm repo update
helm -n aci-mon-stack upgrade --install --create-namespace aci-mon-stack aci-monitoring-stack/aci-monitoring-stack -f aci-mon-stack-config.yaml
helm -n aci-mon-stack upgrade --install --create-namespace aci-mon-stack aci-monitoring-stack/aci-monitoring-stack -f aci-mon-stack-config.yaml
```
2 changes: 1 addition & 1 deletion docs/development.md
Original file line number Diff line number Diff line change
@@ -296,7 +296,7 @@ Selection between APIC or Switches is done by using different re-labeling config
To add a new query follow these steps:

- Develop a new aci-exporter query and test is with `curl` to ensure it returns the expected data
- Add the query to one of the files in the [config.d](../charts/aci-monitoring-stack/config.d) folder or create a new file if your query dosen't belong to any of the existing categoris.
- Add the query to one of the files in the [config.d](../charts/aci-monitoring-stack/config.d) folder or create a new file if your query doesn't belong to any of the existing categories.
- add the query name in the `queries` list of the APIC or Switches inside the [ScrapeConfigs](../charts/aci-monitoring-stack/templates/prometheus/configmap-config.yaml).

Below a scrape config example:
Binary file added docs/labs/images/lab2/explore.png
Binary file added docs/labs/images/lab2/log-details.png
Binary file added docs/labs/images/lab2/logs-1.png
Binary file added docs/labs/images/lab2/loki-builder.png
Binary file added docs/labs/images/lab2/multi-fabric-logs.png
Binary file added docs/labs/images/lab2/select-loki.png
Binary file added docs/labs/images/lab2/ui-filter-result.png
Binary file added docs/labs/images/lab2/ui-filter.png