Commit

Add services parameter to 3 more checks and separate K8s cluster cert check (#1010)

Co-authored-by: abhishek-unskript <[email protected]>
1 parent 5f8f6c4 commit 6b0afa9
Showing 14 changed files with 205 additions and 116 deletions.
6 changes: 3 additions & 3 deletions Kubernetes/legos/k8s_check_service_pvc_utilization/README.md
@@ -6,15 +6,15 @@
This check fetches the PVC associated with a given service, determines its utilized size, and then compares it to its total capacity. If the used percentage exceeds the provided threshold, it triggers an alert.

## Lego Details
k8s_check_service_pvc_utilization(handle, service_name: str = "", namespace: str = "", threshold: int = 80)
k8s_check_service_pvc_utilization(handle, core_services: list, namespace: str = "", threshold: int = 80)
handle: Object of type unSkript K8S Connector.
service_name: The name of the service.
core_services: List of services to check PVC utilization
threshold: Percentage threshold for utilized PVC disk size. E.g., a 80% threshold checks if the utilized space exceeds 80% of the total PVC capacity.
namespace: The namespace in which the service resides.


## Lego Input
This Lego takes inputs handle, service_name, namespace, threshold.
This Lego takes inputs handle, core_services, namespace, threshold.

## Lego Output
Here is a sample output.
@@ -9,10 +9,10 @@
from kubernetes.client.rest import ApiException

class InputSchema(BaseModel):
namespace: Optional[str] = Field(..., description='The namespace in which the service resides.', title='Namespace')
service_name: Optional[str] = Field(
namespace: str = Field(..., description='The namespace in which the service resides.', title='Namespace')
core_services: list = Field(
...,
description='The name of the service for which the used PVC size needs to be checked.',
description='List of services for which the used PVC size needs to be checked.',
title='K8s Service name',
)
threshold: Optional[int] = Field(
@@ -34,7 +34,7 @@ def k8s_check_service_pvc_utilization_printer(output):
print(f"PVC: {pvc['pvc_name']} - Utilized: {pvc['used']} of {pvc['capacity']}")
print("-" * 40)

def k8s_check_service_pvc_utilization(handle, service_name: str = "", namespace: str = "", threshold: int = 80) -> Tuple:
def k8s_check_service_pvc_utilization(handle, core_services: list, namespace:str, threshold: int = 80) -> Tuple:
"""
k8s_check_service_pvc_utilization checks the utilized disk size of a service's PVC against a given threshold.
@@ -57,37 +57,11 @@ def k8s_check_service_pvc_utilization(handle, service_name: str = "", namespace:
:return: Status and dictionary with PVC name and its size information if the PVC's disk size is below the threshold.
"""
# Fetch namespace based on service name
if service_name and not namespace:
get_service_namespace_command = f"kubectl get service {service_name} -o=jsonpath='{{.metadata.namespace}}'"
response = handle.run_native_cmd(get_service_namespace_command)
if not response or response.stderr:
raise ApiException(f"Error fetching namespace for service {service_name}: {response.stderr if response else 'empty response'}")
namespace = response.stdout.strip()
print(f"Service {service_name} belongs to namespace: {namespace}")

# Get current context's namespace if not provided
if not namespace:
get_ns_command = "kubectl config view --minify --output 'jsonpath={..namespace}'"
response = handle.run_native_cmd(get_ns_command)
if not response or response.stderr:
raise ApiException(f"Error fetching current namespace: {response.stderr if response else 'empty response'}")
namespace = response.stdout.strip() or "default"
print(f"Operating in the current namespace: {namespace}")

# Get all services in the namespace if service_name is not specified
services_to_check = [service_name] if service_name else []
if not service_name:
get_all_services_command = f"kubectl get svc -n {namespace} -o=jsonpath='{{.items[*].metadata.name}}'"
response = handle.run_native_cmd(get_all_services_command)
if not response or response.stderr:
raise ApiException(f"Error fetching services in namespace {namespace}: {response.stderr if response else 'empty response'}")
services_to_check = response.stdout.strip().split()

alert_pvcs_all_services = []
services_without_pvcs = []

for svc in services_to_check:
for svc in core_services:
# Get label associated with the service
get_service_labels_command = f"kubectl get services {svc} -n {namespace} -o=jsonpath='{{.spec.selector}}'"
response = handle.run_native_cmd(get_service_labels_command)
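
Reviewer note: the README for this check describes comparing a PVC's used size against its total capacity and alerting when the used percentage exceeds `threshold`. The hunk above does not show that comparison, so the following is only a minimal sketch of it. The helper name `pvc_over_threshold` and the assumption that both sizes are already available as byte counts are hypothetical and not taken from the commit.

```python
def pvc_over_threshold(used_bytes: int, capacity_bytes: int, threshold: int = 80) -> bool:
    """Return True when used space exceeds `threshold` percent of capacity.

    Hypothetical helper: the real check derives sizes from kubectl output,
    which is not shown in this diff.
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    # Integer comparison avoids floating-point rounding at the boundary.
    return used_bytes * 100 > threshold * capacity_bytes


if __name__ == "__main__":
    print(pvc_over_threshold(85, 100))  # above an 80% threshold
    print(pvc_over_threshold(80, 100))  # exactly at the threshold
```

With this formulation a PVC sitting exactly at the threshold does not alert, which matches the README wording "exceeds 80% of the total PVC capacity".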
7 changes: 4 additions & 3 deletions Kubernetes/legos/k8s_detect_service_crashes/README.md
@@ -9,14 +9,15 @@ This action detects service crashes by checking the logs of each pod for specifi

## Lego Details

k8s_detect_service_crashes(handle, namespace: str = '', tail_lines: int = 100)
k8s_detect_service_crashes(handle, namespace: str, core_services: list, tail_lines: int = 100)

handle: Object of type unSkript K8S Connector
namespace: Kubernetes namespace (Optional)
namespace: Kubernetes namespace
core_services: List of services to detect service crashes
tail_lines: Number of log lines to fetch from each container. Defaults to 100.

## Lego Input
This Lego take four input handle, namespace, tail_lines.
This Lego takes 4 inputs: handle, namespace, core_services, tail_lines.

## Lego Output
Here is a sample output.
@@ -10,8 +10,7 @@
from tabulate import tabulate

class InputSchema(BaseModel):
namespace: Optional[str] = Field(
'',
namespace: str = Field(
description='K8S Namespace',
title='K8S Namespace'
)
@@ -20,6 +19,9 @@ class InputSchema(BaseModel):
description='Number of log lines to fetch from each container. Defaults to 100.',
title='No. of lines (Default: 100)'
)
core_services: list = Field(
description='List of services to detect service crashes on.'
)

def k8s_detect_service_crashes_printer(output):
status, data = output
@@ -33,7 +35,7 @@ def k8s_detect_service_crashes_printer(output):



def k8s_detect_service_crashes(handle, namespace: str = '', tail_lines: int = 100) -> Tuple:
def k8s_detect_service_crashes(handle, namespace: str, core_services:list, tail_lines: int = 100) -> Tuple:
"""
k8s_detect_service_crashes detects service crashes by checking the logs of each pod for specific error messages.
@@ -53,42 +55,48 @@ def k8s_detect_service_crashes(handle, namespace: str = '', tail_lines: int = 10
"Exception"
# Add more error patterns here as necessary
]
ERROR_PATTERNS = ["Worker exiting", "Exception"] # Add more error patterns as necessary
crash_logs = []

# Retrieve all services and pods in the namespace just once
kubectl_cmd = f"kubectl -n {namespace} get services,pods -o json"
try:
kubectl_cmd = "kubectl "
if namespace:
kubectl_cmd += f"-n {namespace} "
kubectl_cmd += "get services,pods -o json"
response = handle.run_native_cmd(kubectl_cmd)
services_and_pods = {}
services_and_pods = json.loads(response.stdout.strip())["items"]
except json.JSONDecodeError as json_err:
print(f"Error parsing JSON response: {str(json_err)}")
return (True, None) # Return early if we can't parse the JSON at all
except Exception as e:
print(f"Unexpected error while fetching services and pods: {str(e)}")
return (True, None)

if response:
response = response.stdout.strip()
services_and_pods = json.loads(response)["items"]
for service_name_to_check in core_services:
service_found = False
for item in services_and_pods:
if item.get("kind") == "Service":
pass
service_name = item.get("metadata", {}).get("name", None)
if item.get("kind") == "Service" and item.get("metadata", {}).get("name") == service_name_to_check:
service_found = True
pod_labels = item.get('spec', {}).get("selector", None)
if pod_labels and service_name:
if pod_labels:
pod_selector = ",".join([f"{key}={value}" for key, value in pod_labels.items()])
try:
kubectl_logs_cmd = "kubectl "
if namespace:
kubectl_logs_cmd += f"-n {namespace}"
kubectl_logs_cmd += f"logs --selector {pod_selector} --tail={tail_lines}"
pod_logs = handle.run_native_cmd(kubectl_logs_cmd)
pod_logs = pod_logs.stdout.strip()
crash_logs = [{
"pod": item.get('metadata', {}).get('name', 'N/A'),
"namespace": item.get('metadata', {}).get('namespace', 'N/A'),
"error": error_pattern,
"timestamp": re.findall(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", pod_logs)[-1] if re.search(error_pattern, pod_logs) else "Unknown Time"
} for error_pattern in ERROR_PATTERNS if re.search(error_pattern, pod_logs)]
kubectl_logs_cmd = f"kubectl -n {namespace} logs --selector {pod_selector} --tail={tail_lines}"
pod_logs = handle.run_native_cmd(kubectl_logs_cmd).stdout.strip()

for error_pattern in ERROR_PATTERNS:
if re.search(error_pattern, pod_logs):
crash_logs.append({
"service": service_name_to_check,
"pod": item.get('metadata', {}).get('name', 'N/A'),
"namespace": item.get('metadata', {}).get('namespace', 'N/A'),
"error": error_pattern,
"timestamp": re.findall(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", pod_logs)[-1] if re.search(error_pattern, pod_logs) else "Unknown Time"
})
except Exception as e:
raise e

except Exception as e:
raise e

# Log the error but don't stop execution
print(f"Error fetching logs for service {service_name_to_check}: {str(e)}")
pass

if not service_found:
print(f"Service {service_name_to_check} not found in namespace {namespace}. Continuing with next service.")

return (False, crash_logs) if crash_logs else (True, None)
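
Reviewer note: the log-scanning step in the new loop can be exercised without a cluster by pulling it into a pure function. The sketch below reuses the `ERROR_PATTERNS` list and the timestamp regex from the diff. The function name `find_crashes` is hypothetical, and it differs from the committed code in one respect: it falls back to "Unknown Time" when the logs contain no timestamp, rather than indexing into an empty `re.findall` result.

```python
import re
from typing import Dict, List

ERROR_PATTERNS = ["Worker exiting", "Exception"]
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def find_crashes(service: str, pod_logs: str,
                 patterns: List[str] = ERROR_PATTERNS) -> List[Dict[str, str]]:
    """Return one record per error pattern that appears in pod_logs."""
    timestamps = TIMESTAMP_RE.findall(pod_logs)
    # Use the last timestamp seen in the logs, if any.
    last_seen = timestamps[-1] if timestamps else "Unknown Time"
    return [
        {"service": service, "error": pattern, "timestamp": last_seen}
        for pattern in patterns
        if re.search(pattern, pod_logs)
    ]


if __name__ == "__main__":
    sample = "2023-10-01T12:00:00 starting\n2023-10-01T12:05:00 Worker exiting (pid: 7)"
    print(find_crashes("web", sample))
```

As in the diff, the timestamp reported is the last one in the fetched log tail, not necessarily the one on the matching line.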
27 changes: 27 additions & 0 deletions Kubernetes/legos/k8s_get_expiring_cluster_certificate/README.md
@@ -0,0 +1,27 @@
[<img align="left" src="https://unskript.com/assets/favicon.png" width="100" height="100" style="padding-right: 5px">](https://unskript.com/assets/favicon.png)
<h1>Check the validity of the K8s certificate for a cluster. </h1>

## Description
This action checks if the certificate is expiring for a K8s cluster.


## Lego Details

k8s_get_expiring_cluster_certificate(handle, expiring_threshold: int = 7)

handle: Object of type unSkript K8S Connector
expiring_threshold (int): The threshold (in days) for considering a certificate as expiring soon.

## Lego Input

This Lego takes two inputs: handle and expiring_threshold.


## Lego Output
Here is a sample output.
<img src="./1.png">


## See it in Action

You can see this Lego in action following this link [unSkript Live](https://us.app.unskript.io)
@@ -1,8 +1,8 @@
{
"action_title": "Get expiring K8s certificates",
"action_description": "Get the expiring certificates for a K8s cluster.",
"action_title": "Check expiry of K8s cluster certificate",
"action_description": "Check expiry of K8s cluster certificate",
"action_type": "LEGO_TYPE_K8S",
"action_entry_function": "k8s_get_expiring_certificates",
"action_entry_function": "k8s_get_expiring_cluster_certificate",
"action_is_check": true,
"action_needs_credential": true,
"action_supports_poll": true,
@@ -0,0 +1,73 @@
##
# Copyright (c) 2023 unSkript, Inc
# All rights reserved.
##
from pydantic import BaseModel, Field
from typing import Optional, Tuple
import base64
import datetime
from cryptography import x509
from cryptography.hazmat.backends import default_backend


class InputSchema(BaseModel):
expiring_threshold: Optional[int] = Field(
default=7,
title='Expiration Threshold (in days)',
description='Expiration Threshold of certificates (in days). Default: 7 days')

def k8s_get_expiring_cluster_certificate_printer(output):
if output is None:
return
success, data = output
if not success:
print(data)
else:
print("K8s certificate is valid.")

def get_expiry_date(pem_data: str) -> datetime.datetime:
cert = x509.load_pem_x509_certificate(pem_data.encode(), default_backend())
return cert.not_valid_after

def k8s_get_expiring_cluster_certificate(handle, expiring_threshold:int=7) -> Tuple:
"""
Check the validity for a K8s cluster certificate.
Args:
handle: Object of type unSkript K8S Connector
expiring_threshold (int): The threshold (in days) for considering a certificate as expiring soon.
Returns:
tuple: Status, details of the certificate.
"""
result = []
try:
# Fetch cluster CA certificate
ca_cert = handle.run_native_cmd("kubectl get secret -o jsonpath=\"{.items[?(@.type=='kubernetes.io/service-account-token')].data['ca\\.crt']}\" --all-namespaces")
if ca_cert.stderr:
raise Exception(f"Error occurred while fetching cluster CA certificate: {ca_cert.stderr}")

# Decode and check expiry date of the cluster's CA certificate
ca_cert_decoded = base64.b64decode(ca_cert.stdout.strip()).decode("utf-8")
ca_cert_exp = get_expiry_date(ca_cert_decoded)
days_remaining = (ca_cert_exp - datetime.datetime.now()).days
if days_remaining < 0:
# Certificate has already expired
result.append({
"certificate": "Kubeconfig Cluster certificate",
"days_remaining": days_remaining,
"status": "Expired"
})
elif ca_cert_exp < datetime.datetime.now() + datetime.timedelta(days=expiring_threshold):
result.append({
"certificate": "Kubeconfig Cluster certificate",
"days_remaining": days_remaining,
"status": "Expiring Soon"
})
except Exception as e:
print(f"Error occurred while checking cluster CA certificate: {e}")
raise e

if len(result) != 0:
return (False, result)
return (True, None)
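
Reviewer note: the expired / expiring-soon branching in the new file can be isolated from the kubectl and x509 handling so it is testable on its own. The sketch below mirrors the two comparisons in the diff. The function name `classify_expiry` and the explicit `now` parameter are hypothetical, added so the result does not depend on the wall clock; both datetimes are assumed to be naive values in the same time zone.

```python
import datetime
from typing import Optional


def classify_expiry(not_after: datetime.datetime,
                    now: datetime.datetime,
                    expiring_threshold: int = 7) -> Optional[dict]:
    """Return a status record for an expired or soon-to-expire certificate.

    Returns None when the certificate is valid beyond the threshold.
    """
    days_remaining = (not_after - now).days
    if days_remaining < 0:
        status = "Expired"
    elif not_after < now + datetime.timedelta(days=expiring_threshold):
        status = "Expiring Soon"
    else:
        return None
    return {"days_remaining": days_remaining, "status": status}


if __name__ == "__main__":
    now = datetime.datetime(2023, 1, 1)
    print(classify_expiry(now + datetime.timedelta(days=3), now))
```

A certificate that expires exactly `expiring_threshold` days from `now` is treated as still valid, since the comparison in the diff is a strict less-than.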
@@ -1,21 +1,21 @@
[<img align="left" src="https://unskript.com/assets/favicon.png" width="100" height="100" style="padding-right: 5px">](https://unskript.com/assets/favicon.png)
<h1>Get All Deployment Status From Namespace </h1>
<h1>Get the expiring TLS secret certificates for a K8s cluster. </h1>

## Description
This action gets the expiring certificates for a K8s cluster.


## Lego Details

k8s_get_expiring_certificates(handle, expiring_threshold: int = 90, namespace: str = "")
k8s_get_expiring_tls_secret_certificates(handle, namespace:str='', expiring_threshold:int=7)

handle: Object of type unSkript K8S Connector
namespace (str) : Optional - k8s namespace.
expiration_threshold (int): The threshold (in seconds) for considering a certificate as expiring soon.
expiring_threshold (int): The threshold (in days) for considering a certificate as expiring soon.

## Lego Input

This Lego take three inputs handle, deployment and expiration_threshold.
This Lego takes three inputs: handle, namespace and expiring_threshold.


## Lego Output
Empty file.
@@ -0,0 +1,14 @@
{
"action_title": "Get expiring secret certificates",
"action_description": "Get the expiring secret certificates for a K8s cluster.",
"action_type": "LEGO_TYPE_K8S",
"action_entry_function": "k8s_get_expiring_tls_secret_certificates",
"action_is_check": true,
"action_needs_credential": true,
"action_supports_poll": true,
"action_supports_iteration": true,
"action_output_type": "ACTION_OUTPUT_TYPE_LIST",
"action_categories": [ "CATEGORY_TYPE_CLOUDOPS", "CATEGORY_TYPE_DEVOPS", "CATEGORY_TYPE_SRE" ,"CATEGORY_TYPE_K8S"],
"action_next_hop": [""],
"action_next_hop_parameter_mapping": {}
}