Improve CPU stressor tool #8

Open · wants to merge 1 commit into master
6 changes: 5 additions & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
-FROM golang:1.19-alpine AS build
+FROM golang:1.19.2-alpine AS build

WORKDIR /app

@@ -10,4 +10,8 @@ FROM alpine:latest

COPY --from=build /app/cpu-stress /usr/local/bin/cpu-stress

LABEL maintainer="narmidm"
LABEL version="1.0.0"
LABEL description="A tool to simulate CPU stress on Kubernetes pods."

ENTRYPOINT ["cpu-stress"]
151 changes: 151 additions & 0 deletions README.md
@@ -166,6 +166,157 @@ spec:

This manifest runs the `k8s-pod-cpu-stressor` as a Kubernetes Job, which will execute the stress test once for 5 minutes and then stop. The `backoffLimit` specifies the number of retries if the job fails.

## Detailed Usage Examples

Here are some detailed usage examples to help you better understand how to use the `k8s-pod-cpu-stressor`:

### Example 1: Run CPU stress for 30 seconds with 50% CPU usage

```shell
docker run --rm k8s-pod-cpu-stressor -cpu=0.5 -duration=30s
```

### Example 2: Run CPU stress indefinitely with 80% CPU usage

```shell
docker run --rm k8s-pod-cpu-stressor -cpu=0.8 -forever
```

### Example 3: Run CPU stress for 1 minute with 10% CPU usage

```shell
docker run --rm k8s-pod-cpu-stressor -cpu=0.1 -duration=1m
```

## Step-by-Step Guide for Building and Running the Docker Image

Follow these steps to build and run the Docker image for `k8s-pod-cpu-stressor`:

1. Clone the repository:

```shell
git clone https://github.com/narmidm/k8s-pod-cpu-stressor.git
cd k8s-pod-cpu-stressor
```

2. Build the Docker image:

```shell
docker build -t k8s-pod-cpu-stressor .
```

3. Run the Docker container with desired parameters:

```shell
docker run --rm k8s-pod-cpu-stressor -cpu=0.2 -duration=10s
```

## Using the Tool in a Kubernetes Environment

To use the `k8s-pod-cpu-stressor` in a Kubernetes environment, you can create a deployment or a job using the provided sample manifests.

### Sample Deployment Manifest

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-stressor-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cpu-stressor
  template:
    metadata:
      labels:
        app: cpu-stressor
    spec:
      containers:
        - name: cpu-stressor
          image: narmidm/k8s-pod-cpu-stressor:latest
          args:
            - "-cpu=0.2"
            - "-duration=10s"
            - "-forever"
          resources:
            limits:
              cpu: "200m"
            requests:
              cpu: "100m"
```
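
To try the deployment, save the manifest to a file (the file name below is only an example), apply it, and confirm the pod is running:

```shell
kubectl apply -f cpu-stressor-deployment.yaml
kubectl get pods -l app=cpu-stressor
```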

### Sample Job Manifest

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cpu-stressor-job
spec:
  template:
    metadata:
      labels:
        app: cpu-stressor
    spec:
      containers:
        - name: cpu-stressor
          image: narmidm/k8s-pod-cpu-stressor:latest
          args:
            - "-cpu=0.5"
            - "-duration=5m"
          resources:
            limits:
              cpu: "500m"
            requests:
              cpu: "250m"
      restartPolicy: Never
  backoffLimit: 3
```
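
Likewise, the Job manifest can be saved to a file (the name below is only an example), applied, and followed to completion:

```shell
kubectl apply -f cpu-stressor-job.yaml
kubectl wait --for=condition=complete job/cpu-stressor-job --timeout=10m
kubectl logs job/cpu-stressor-job
```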

## Troubleshooting and Common Issues

### Issue 1: High CPU Usage

If you experience unexpectedly high CPU usage, ensure that the `-cpu` parameter is set correctly. For example, `-cpu=0.2` represents 20% CPU usage.
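
To see what a locally running container is actually consuming, a quick snapshot of Docker's own statistics can help:

```shell
# One-shot CPU and memory usage for all running containers
docker stats --no-stream
```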

### Issue 2: Container Fails to Start

If the container fails to start, check the Docker logs for error messages. Ensure that the `-duration` parameter is a valid Go duration string, such as `30s`, `1m`, or `2h`.
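
For example, you can start the container with an explicit name (the name here is illustrative, and `--rm` is omitted so the logs survive the exit) and then read its output:

```shell
docker run --name cpu-stressor-debug k8s-pod-cpu-stressor -cpu=0.2 -duration=10s
docker logs cpu-stressor-debug
docker rm cpu-stressor-debug
```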

### Issue 3: Kubernetes Pod Restarting

If the Kubernetes pod keeps restarting, ensure that the resource requests and limits are set appropriately in the manifest. Adjust the values based on your cluster's capacity.
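
One way to investigate, assuming the pod carries the `app: cpu-stressor` label used in the sample manifests:

```shell
# Show restart reasons, recent events, and the effective resource settings
kubectl describe pod -l app=cpu-stressor
kubectl get pod -l app=cpu-stressor -o jsonpath='{.items[*].spec.containers[*].resources}'
```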

## Advanced Usage Scenarios

### Scenario 1: Using Horizontal Pod Autoscaler (HPA)

To automatically scale the number of pod replicas based on CPU usage, you can use a Horizontal Pod Autoscaler (HPA). Here is an example HPA manifest:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: cpu-stressor-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cpu-stressor-deployment
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
```
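
Note that the HPA relies on the cluster's metrics pipeline (for example the metrics-server addon). Assuming the manifest is saved as `hpa.yaml`, as in this repository, you can apply it and watch the replica count react:

```shell
kubectl apply -f hpa.yaml
kubectl get hpa cpu-stressor-hpa --watch
```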

### Scenario 2: Integrating with CI/CD Pipelines

You can integrate the `k8s-pod-cpu-stressor` with your CI/CD pipelines for automated testing and monitoring. For example, you can use GitHub Actions to build and push the Docker image, and then deploy it to your Kubernetes cluster for stress testing.
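
As a minimal sketch (the file path, job names, and trigger are illustrative and not part of this repository), a GitHub Actions workflow could build the image and run a short stress test as a smoke check:

```yaml
# .github/workflows/stress-test.yaml (illustrative)
name: cpu-stress-smoke-test
on:
  push:
    branches: [master]
jobs:
  build-and-stress:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build the image
        run: docker build -t k8s-pod-cpu-stressor:ci .
      - name: Run a short stress test
        run: docker run --rm k8s-pod-cpu-stressor:ci -cpu=0.2 -duration=10s
```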

### Scenario 3: Monitoring with Prometheus and Grafana

To monitor the resource usage of the `k8s-pod-cpu-stressor`, you can use Prometheus and Grafana. Set up Prometheus to scrape metrics from your Kubernetes cluster, and use Grafana to visualize the metrics. This helps identify bottlenecks and optimize resource allocation.
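
As a quick check before dashboards are in place, and assuming the metrics-server addon is installed in the cluster, you can sample the stressor pods' CPU usage directly:

```shell
kubectl top pod -l app=cpu-stressor
```

See `monitoring-tools.md` in this repository for a fuller Prometheus and Grafana walkthrough.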

## Contributing

Contributions are welcome! If you find a bug or have a suggestion, please open an issue or submit a pull request. For major changes, please discuss them first in the issue tracker.
12 changes: 12 additions & 0 deletions hpa.yaml
@@ -0,0 +1,12 @@
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: cpu-stressor-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cpu-stressor-deployment
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
25 changes: 20 additions & 5 deletions main.go
@@ -9,14 +9,27 @@ import (
"runtime"
"sync/atomic"
"time"

"github.com/sirupsen/logrus"
)

var log = logrus.New()

func main() {
	cpuUsagePtr := flag.Float64("cpu", 0.2, "CPU usage as a fraction (e.g., 0.2 for 20% CPU usage)")
	durationPtr := flag.Duration("duration", 10*time.Second, "Duration for the CPU stress (e.g., 10s)")
	runForeverPtr := flag.Bool("forever", false, "Run CPU stress indefinitely")
	flag.Parse()

	// Validate input parameters
	if *cpuUsagePtr <= 0 || *cpuUsagePtr > 1 {
		log.Fatalf("Invalid CPU usage: %f. It must be between 0 and 1.", *cpuUsagePtr)
	}

	if *durationPtr <= 0 {
		log.Fatalf("Invalid duration: %s. It must be greater than 0.", *durationPtr)
	}

	numCPU := runtime.NumCPU()
	runtime.GOMAXPROCS(numCPU)

@@ -26,13 +39,15 @@ func main() {
		numGoroutines = 1
	}

-	fmt.Printf("Starting CPU stress with %d goroutines targeting %.2f CPU usage...\n", numGoroutines, *cpuUsagePtr)
+	log.Infof("Starting CPU stress with %d goroutines targeting %.2f CPU usage...", numGoroutines, *cpuUsagePtr)

	done := make(chan struct{})

	// Capture termination signals.
	// Note: signal.Notify does not return an error, so there is no error value to check.
	quit := make(chan os.Signal, 1)
	signal.Notify(quit, os.Interrupt, os.Kill)

	var stopFlag int32

@@ -63,21 +78,21 @@ func main() {
	go func() {
		// Wait for termination signal
		<-quit
-		fmt.Println("\nTermination signal received. Stopping CPU stress...")
+		log.Println("\nTermination signal received. Stopping CPU stress...")
		atomic.StoreInt32(&stopFlag, 1)
		close(done)
	}()

	if !*runForeverPtr {
		time.Sleep(*durationPtr)
-		fmt.Println("\nCPU stress completed.")
+		log.Println("\nCPU stress completed.")
		atomic.StoreInt32(&stopFlag, 1)
		close(done)
		// Keep the process running to prevent the pod from restarting
		select {}
	}

	// Run stress indefinitely
-	fmt.Println("CPU stress will run indefinitely. Press Ctrl+C to stop.")
+	log.Println("CPU stress will run indefinitely. Press Ctrl+C to stop.")
	<-done
}
78 changes: 78 additions & 0 deletions monitoring-tools.md
@@ -0,0 +1,78 @@
# Monitoring Tools for Kubernetes

To effectively monitor and optimize the resource usage of your Kubernetes cluster, you can use monitoring tools like Prometheus and Grafana. These tools help collect and visualize resource usage metrics, allowing you to identify bottlenecks and make informed decisions about resource allocation.

## Prometheus

Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. It collects metrics from various sources and stores them in a time-series database. Prometheus can be used to monitor the resource usage of your Kubernetes cluster, including CPU and memory usage.

### Installing Prometheus

To install Prometheus in your Kubernetes cluster, you can use the Prometheus Operator, which simplifies the deployment and management of Prometheus instances. Follow these steps to install Prometheus using the Prometheus Operator:

1. Add the Prometheus Operator Helm repository:

```shell
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```

2. Install the Prometheus Operator:

```shell
helm install prometheus-operator prometheus-community/kube-prometheus-stack
```

3. Verify the installation:

```shell
kubectl get pods -n default -l "release=prometheus-operator"
```

### Configuring Prometheus

Once Prometheus is installed, you need to configure it to scrape metrics from your Kubernetes cluster. The Prometheus Operator automatically configures Prometheus to scrape metrics from various Kubernetes components, including the kubelet, API server, and cAdvisor.

To customize the Prometheus configuration, you can edit the `values.yaml` file used during the Helm installation. For example, you can add custom scrape configurations to collect metrics from additional endpoints.
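
For example, many installations hook extra targets in through `prometheus.prometheusSpec.additionalScrapeConfigs` in the chart's `values.yaml` (check your chart version's default values for the exact key). The snippet below is only an illustrative template; the stressor itself does not expose a metrics endpoint, so the target shown is hypothetical:

```yaml
# values.yaml (excerpt) - illustrative extra scrape job
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: custom-endpoint
        static_configs:
          - targets: ["my-service.default.svc:8080"]  # hypothetical target
```

Apply the change with `helm upgrade prometheus-operator prometheus-community/kube-prometheus-stack -f values.yaml`.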

## Grafana

Grafana is an open-source analytics and monitoring platform that integrates with Prometheus to visualize metrics. It provides a rich set of features for creating and sharing dashboards, setting up alerts, and exploring metrics data.

### Installing Grafana

Grafana is included in the Prometheus Operator installation, so you don't need to install it separately. To access the Grafana dashboard, follow these steps:

1. Get the Grafana admin password:

```shell
kubectl get secret prometheus-operator-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
```

2. Forward the Grafana service port to your local machine:

```shell
kubectl port-forward svc/prometheus-operator-grafana 3000:80
```

3. Open your web browser and navigate to `http://localhost:3000`. Log in with the username `admin` and the password obtained in step 1.

### Creating Dashboards

Grafana provides a wide range of pre-built dashboards for Kubernetes monitoring. You can import these dashboards from the Grafana dashboard library or create custom dashboards to visualize the metrics collected by Prometheus.

To import a pre-built dashboard, follow these steps:

1. In the Grafana UI, click on the "+" icon in the left sidebar and select "Import".
2. Enter the dashboard ID or URL from the Grafana dashboard library and click "Load".
3. Select the Prometheus data source and click "Import".

## Analyzing Metrics

With Prometheus and Grafana set up, you can start analyzing the collected metrics to optimize resource allocation in your Kubernetes cluster. Here are some tips for analyzing metrics:

- **Identify Bottlenecks**: Look for high CPU or memory usage in your pods and nodes. Identify the components that are consuming the most resources and investigate the root cause.
- **Adjust Resource Requests and Limits**: Based on the observed resource usage, adjust the resource requests and limits in your Kubernetes manifests to ensure optimal resource allocation.
- **Set Up Alerts**: Use Prometheus alerting rules to set up alerts for critical resource usage thresholds. Configure Grafana to send notifications when alerts are triggered.
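
For the alerting point above, a minimal `PrometheusRule` sketch is shown below. The threshold, pod name pattern, and the `release` label (used so the operator discovers the rule) are assumptions to adapt to your cluster:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cpu-stressor-alerts
  labels:
    release: prometheus-operator  # assumed Helm release name
spec:
  groups:
    - name: cpu-stressor
      rules:
        - alert: HighPodCpuUsage
          expr: sum(rate(container_cpu_usage_seconds_total{pod=~"cpu-stressor.*"}[5m])) > 0.5
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "cpu-stressor pods have used more than 0.5 CPU cores for 10 minutes"
```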

By using Prometheus and Grafana, you can gain valuable insights into the resource usage of your Kubernetes cluster and make informed decisions to optimize performance and resource allocation.
8 changes: 8 additions & 0 deletions resource-quotas.yaml
@@ -0,0 +1,8 @@
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cpu-stressor-quota
spec:
  hard:
    requests.cpu: "1"
    limits.cpu: "2"