feat(new): Added Azure.AKS.MaintenanceWindow (Azure#2997)
* feat(new): Added Azure.AKS.MaintenanceWindow

* fix: Fixed doc

* Update docs/en/rules/Azure.AKS.MaintenanceWindow.md

Co-authored-by: Bernie White <[email protected]>

* Update docs/en/rules/Azure.AKS.MaintenanceWindow.md

Co-authored-by: Bernie White <[email protected]>

* Update docs/en/rules/Azure.AKS.MaintenanceWindow.md

Co-authored-by: Bernie White <[email protected]>

* Update docs/en/rules/Azure.AKS.MaintenanceWindow.md

Co-authored-by: Bernie White <[email protected]>

* Update docs/en/rules/Azure.AKS.MaintenanceWindow.md

Co-authored-by: Bernie White <[email protected]>

* Update docs/en/rules/Azure.AKS.MaintenanceWindow.md

Co-authored-by: Bernie White <[email protected]>

* Update docs/en/rules/Azure.AKS.MaintenanceWindow.md

Co-authored-by: Bernie White <[email protected]>

* Update docs/en/rules/Azure.AKS.MaintenanceWindow.md

Co-authored-by: Bernie White <[email protected]>

* Update docs/en/rules/Azure.AKS.MaintenanceWindow.md

Co-authored-by: Bernie White <[email protected]>

* Update src/PSRule.Rules.Azure/rules/Azure.AKS.Rule.ps1

Co-authored-by: Bernie White <[email protected]>

* feat: Update test and doc

* Fix typo

* Adding in in-flight export

---------

Co-authored-by: Bernie White <[email protected]>
BenjaminEngeset and BernieWhite authored Jul 25, 2024
1 parent df2d5ef commit 92539d9
Showing 7 changed files with 443 additions and 0 deletions.
5 changes: 5 additions & 0 deletions docs/CHANGELOG-v1.md
@@ -29,6 +29,11 @@ See [upgrade notes][1] for helpful information when upgrading from previous vers

## Unreleased

- New rules:
  - Azure Kubernetes Service:
    - Verify that clusters have the customer-controlled maintenance windows 'aksManagedAutoUpgradeSchedule' and 'aksManagedNodeOSUpgradeSchedule' configured by @BenjaminEngeset.
      [#2444](https://github.com/Azure/PSRule.Rules.Azure/issues/2444)

What's changed since pre-release v1.39.0-B0009:

- New rules:
167 changes: 167 additions & 0 deletions docs/en/rules/Azure.AKS.MaintenanceWindow.md
@@ -0,0 +1,167 @@
---
severity: Important
pillar: Reliability
category: RE:04 Target metrics
resource: Azure Kubernetes Service
online version: https://azure.github.io/PSRule.Rules.Azure/en/rules/Azure.AKS.MaintenanceWindow/
---

# Customer-controlled maintenance window configuration

## SYNOPSIS

Configure customer-controlled maintenance windows for AKS clusters.

## DESCRIPTION

AKS clusters undergo periodic maintenance automatically to ensure your applications remain secure, stable, and up-to-date.
This maintenance includes applying security updates, system upgrades, and software patches.

During peak load times, AKS clusters or workloads may already be scaled to their configured maximums or under stress.
As a result, rescheduling pods or upgrading a node may take longer than normal.

Maintenance configurations provide a best-effort option that allows you to schedule planned maintenance operations to a predefined window.
This provides greater predictability over cluster operations so that maintenance during peak load times can be avoided when possible.
Note that some critical or urgent maintenance operations may be performed outside the configured maintenance window.

AKS provides three (3) schedule configuration types for customer-controlled maintenance:

- `default` is a basic configuration for controlling AKS weekly releases.
- `aksManagedAutoUpgradeSchedule` controls when to schedule AKS Kubernetes version upgrades.
  This configuration affects the schedule for when cluster auto-upgrades are applied based on your configured channel.
- `aksManagedNodeOSUpgradeSchedule` controls when to schedule AKS node OS upgrades.
  This configuration affects the schedule for when AKS node OS auto-upgrades are applied based on your configured channel.

To read more about automated maintenance operations in AKS, see the reference links below.

## RECOMMENDATION

Consider using planned maintenance windows for AKS and node OS upgrades to avoid periods of high cluster utilization for improved reliability.
Do this by configuring the `aksManagedAutoUpgradeSchedule` and `aksManagedNodeOSUpgradeSchedule` maintenance configurations.

## EXAMPLES

### Configure with Azure template

To deploy AKS clusters that pass this rule:

- Deploy maintenance configurations for cluster version and node OS auto-upgrades.
- For cluster version auto-upgrades, set the `name` property to `aksManagedAutoUpgradeSchedule`.
- For node OS auto-upgrades, set the `name` property to `aksManagedNodeOSUpgradeSchedule`.

For cluster version auto-upgrades see the example:

```json
{
  "type": "Microsoft.ContainerService/managedClusters/maintenanceConfigurations",
  "apiVersion": "2024-03-02-preview",
  "name": "[format('{0}/{1}', parameters('clusterName'), 'aksManagedAutoUpgradeSchedule')]",
  "properties": {
    "maintenanceWindow": {
      "schedule": {
        "weekly": {
          "intervalWeeks": 1,
          "dayOfWeek": "Sunday"
        }
      },
      "durationHours": 4,
      "utcOffset": "+00:00",
      "startDate": "2024-07-15",
      "startTime": "00:00"
    }
  },
  "dependsOn": [
    "[resourceId('Microsoft.ContainerService/managedClusters', parameters('clusterName'))]"
  ]
}
```

For node OS auto-upgrades see the example:

```json
{
  "type": "Microsoft.ContainerService/managedClusters/maintenanceConfigurations",
  "apiVersion": "2024-03-02-preview",
  "name": "[format('{0}/{1}', parameters('clusterName'), 'aksManagedNodeOSUpgradeSchedule')]",
  "properties": {
    "maintenanceWindow": {
      "schedule": {
        "weekly": {
          "intervalWeeks": 1,
          "dayOfWeek": "Sunday"
        }
      },
      "durationHours": 4,
      "utcOffset": "+00:00",
      "startDate": "2024-07-15",
      "startTime": "00:00"
    }
  },
  "dependsOn": [
    "[resourceId('Microsoft.ContainerService/managedClusters', parameters('clusterName'))]"
  ]
}
```

### Configure with Bicep

To deploy AKS clusters that pass this rule:

- Deploy maintenance configurations for cluster version and node OS auto-upgrades.
- For cluster version auto-upgrades, set the `name` property to `aksManagedAutoUpgradeSchedule`.
- For node OS auto-upgrades, set the `name` property to `aksManagedNodeOSUpgradeSchedule`.

For cluster version auto-upgrades see the example:

```bicep
resource aksManagedAutoUpgradeSchedule 'Microsoft.ContainerService/managedClusters/maintenanceConfigurations@2024-03-02-preview' = {
  parent: aks
  name: 'aksManagedAutoUpgradeSchedule'
  properties: {
    maintenanceWindow: {
      schedule: {
        weekly: {
          intervalWeeks: 1
          dayOfWeek: 'Sunday'
        }
      }
      durationHours: 4
      utcOffset: '+00:00'
      startDate: '2024-07-15'
      startTime: '00:00'
    }
  }
}
```

For node OS auto-upgrades see the example:

```bicep
resource aksManagedNodeOSUpgradeSchedule 'Microsoft.ContainerService/managedClusters/maintenanceConfigurations@2024-03-02-preview' = {
  parent: aks
  name: 'aksManagedNodeOSUpgradeSchedule'
  properties: {
    maintenanceWindow: {
      schedule: {
        weekly: {
          intervalWeeks: 1
          dayOfWeek: 'Sunday'
        }
      }
      durationHours: 4
      utcOffset: '+00:00'
      startDate: '2024-07-15'
      startTime: '00:00'
    }
  }
}
```
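
To sanity-check the schedule values used in the examples above before deploying, the following PowerShell sketch previews when a weekly window might open and close. `Get-WeeklyWindowPreview` is a hypothetical helper written for this illustration and is not part of PSRule or Azure PowerShell. It assumes the first window opens on the first matching weekday on or after `startDate`, and it reports times in the zone given by `utcOffset` rather than converting them; actual AKS scheduling may differ.

```powershell
# Hypothetical helper (an assumption for illustration, not an AKS or PSRule API).
# Lists the first few start/end times of a weekly maintenance window.
function Get-WeeklyWindowPreview {
    param (
        [Parameter(Mandatory)] [string]$StartDate,            # for example '2024-07-15'
        [Parameter(Mandatory)] [string]$StartTime,            # for example '00:00'
        [Parameter(Mandatory)] [System.DayOfWeek]$DayOfWeek,  # for example Sunday
        [int]$IntervalWeeks = 1,
        [int]$DurationHours = 4,
        [int]$Count = 3
    )

    $culture = [System.Globalization.CultureInfo]::InvariantCulture
    $first = [datetime]::ParseExact("$StartDate $StartTime", 'yyyy-MM-dd HH:mm', $culture)

    # Days to move forward to reach the requested weekday (0 when already on it).
    $offset = (([int]$DayOfWeek - [int]$first.DayOfWeek) + 7) % 7
    $first = $first.AddDays($offset)

    for ($i = 0; $i -lt $Count; $i++) {
        $start = $first.AddDays(7 * $IntervalWeeks * $i)
        [pscustomobject]@{
            Start = $start
            End   = $start.AddHours($DurationHours)
        }
    }
}

# Values taken from the examples above.
Get-WeeklyWindowPreview -StartDate '2024-07-15' -StartTime '00:00' -DayOfWeek Sunday
```

Under these assumptions, the first previewed window starts on 2024-07-21 at 00:00, the first Sunday on or after the configured `startDate`, and ends four hours later.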

## LINKS

- [RE:04 Target metrics](https://learn.microsoft.com/azure/well-architected/reliability/metrics)
- [Planned maintenance to schedule and control upgrades](https://learn.microsoft.com/azure/aks/planned-maintenance)
- [Automatically upgrade an Azure Kubernetes Service (AKS) cluster](https://learn.microsoft.com/azure/aks/auto-upgrade-cluster)
- [Auto-upgrade node OS images](https://learn.microsoft.com/azure/aks/auto-upgrade-node-os-image)
- [Patch and upgrade guidance](https://learn.microsoft.com/azure/architecture/operator-guides/aks/aks-upgrade-practices)
- [Create a maintenance window](https://learn.microsoft.com/azure/aks/planned-maintenance#create-a-maintenance-window)
- [Upgrade options](https://learn.microsoft.com/azure/aks/upgrade-cluster)
- [Azure deployment reference](https://learn.microsoft.com/azure/templates/microsoft.containerservice/managedclusters/maintenanceconfigurations)
@@ -44,6 +44,7 @@ internal sealed class ResourceExportVisitor
private const string PROPERTY_CONTAINERS = "containers";
private const string PROPERTY_SHARES = "shares";
private const string PROPERTY_TOPICS = "topics";
private const string PROPERTY_MAINTENANCECONFIGURATIONS = "maintenanceConfigurations";

private const string TYPE_CONTAINERSERVICE_MANAGEDCLUSTERS = "Microsoft.ContainerService/managedClusters";
private const string TYPE_CONTAINERREGISTRY_REGISTRIES = "Microsoft.ContainerRegistry/registries";
@@ -114,6 +115,7 @@ internal sealed class ResourceExportVisitor
private const string APIVERSION_2023_06_30 = "2023-06-30";
private const string APIVERSION_2023_09_01 = "2023-09-01";
private const string APIVERSION_2023_12_15_PREVIEW = "2023-12-15-preview";
private const string APIVERSION_2024_03_02_PREVIEW = "2024-03-02-preview";

private readonly ProviderData _ProviderData;

@@ -604,6 +606,9 @@ private static async Task<bool> VisitAKSCluster(ResourceContext context, JObject
}
}

// Get maintenance configurations
AddSubResource(resource, await GetSubResourcesByType(context, resourceId, PROPERTY_MAINTENANCECONFIGURATIONS, APIVERSION_2024_03_02_PREVIEW));

// Get diagnostic settings
await GetDiagnosticSettings(context, resource, resourceId);
return true;
1 change: 1 addition & 0 deletions src/PSRule.Rules.Azure/en/PSRule-rules.psd1
@@ -17,6 +17,7 @@
AKSPlatformLogs = "The diagnostic setting ({0}) logs should enable ({1})."
AKSAuditAdmin = "The diagnostic setting ({0}) should use 'kube-audit-admin' instead of the 'kube-audit' log category."
AKSEphemeralOSDiskNotConfigured = "The OS disk type 'Managed' should be of type 'Ephemeral'."
AKSMaintenanceWindow = "The cluster ({0}) should have the customer-controlled maintenance windows 'aksManagedAutoUpgradeSchedule' and 'aksManagedNodeOSUpgradeSchedule' configured."
SubnetNSGNotConfigured = "The subnet ({0}) has no NSG associated."
ServiceUrlNotHttps = "The service URL for '{0}' is not a HTTPS endpoint."
BackendUrlNotHttps = "The backend URL for '{0}' is not a HTTPS endpoint."
24 changes: 24 additions & 0 deletions src/PSRule.Rules.Azure/rules/Azure.AKS.Rule.ps1
@@ -319,6 +319,30 @@ Rule 'Azure.AKS.AuditAdmin' -Ref 'AZR-000445' -Type 'Microsoft.ContainerService/
}
}

# Synopsis: Configure customer-controlled maintenance windows for AKS clusters.
Rule 'Azure.AKS.MaintenanceWindow' -Ref 'AZR-000446' -Type 'Microsoft.ContainerService/managedClusters' -Tag @{ release = 'GA'; ruleSet = '2024_09'; 'Azure.WAF/pillar' = 'Reliability'; } {
    $maintenanceConfigs = @(GetSubResources -ResourceType 'Microsoft.ContainerService/managedClusters/maintenanceConfigurations')

    $hasAutoUpgrade = $false
    $hasNodeUpgrade = $false

    foreach ($config in $maintenanceConfigs) {
        if ($config.name -match 'aksManagedAutoUpgradeSchedule$') {
            $hasAutoUpgrade = $true
        }
        elseif ($config.name -match 'aksManagedNodeOSUpgradeSchedule$') {
            $hasNodeUpgrade = $true
        }
    }

    if ($hasAutoUpgrade -and $hasNodeUpgrade) {
        return $Assert.Pass()
    }
    else {
        $Assert.Fail().Reason($LocalizedData.AKSMaintenanceWindow, $PSRule.TargetName)
    }
}
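
# The pass/fail decision made by the rule above can be sketched as a standalone
# function that runs without PSRule. This is a minimal illustration, not the
# rule's implementation: Test-AksMaintenanceWindowName and the sub-resource
# names below are hypothetical. The '$' anchor means a name only has to END
# with the schedule name, so both 'aksManagedAutoUpgradeSchedule' and
# '<cluster>/aksManagedAutoUpgradeSchedule' match.

```powershell
# Hypothetical stand-in for the rule body: takes maintenance configuration
# names and returns $true only when both upgrade schedules are present.
function Test-AksMaintenanceWindowName {
    param (
        [string[]]$Name = @()
    )

    # With an array on the left, -match returns the elements that match.
    $hasAutoUpgrade = @($Name -match 'aksManagedAutoUpgradeSchedule$').Count -gt 0
    $hasNodeUpgrade = @($Name -match 'aksManagedNodeOSUpgradeSchedule$').Count -gt 0

    return ($hasAutoUpgrade -and $hasNodeUpgrade)
}

# Both schedules present: returns True.
Test-AksMaintenanceWindowName -Name 'cluster-C/aksManagedAutoUpgradeSchedule', 'cluster-C/aksManagedNodeOSUpgradeSchedule'

# Only the basic 'default' configuration: returns False.
Test-AksMaintenanceWindowName -Name 'cluster-A/default'
```

# A cluster with only one of the two schedules, or only `default`, does not
# satisfy the check, which matches the reason string asserted in the tests.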

#region Helper functions

function global:GetAgentPoolProfiles {
17 changes: 17 additions & 0 deletions tests/PSRule.Rules.Azure.Tests/Azure.AKS.Tests.ps1
@@ -543,6 +543,23 @@ Describe 'Azure.AKS' -Tag AKS {
            $ruleResult.Length | Should -Be 9;
            $ruleResult.TargetName | Should -Be 'cluster-A', 'cluster-B', 'cluster-C', 'cluster-D', 'cluster-F', 'cluster-G', 'cluster-H', 'cluster-K', 'cluster-L';
        }

        It 'Azure.AKS.MaintenanceWindow' {
            $filteredResult = $result | Where-Object { $_.RuleName -eq 'Azure.AKS.MaintenanceWindow' };

            # Fail
            $ruleResult = @($filteredResult | Where-Object { $_.Outcome -eq 'Fail' });
            $ruleResult.Length | Should -Be 9;
            $ruleResult.TargetName | Should -Be 'cluster-A', 'cluster-B', 'cluster-D', 'cluster-G', 'cluster-H', 'cluster-I', 'cluster-J', 'cluster-K', 'cluster-L';

            $ruleResult[0].Reason | Should -Be "The cluster (cluster-A) should have the customer-controlled maintenance windows 'aksManagedAutoUpgradeSchedule' and 'aksManagedNodeOSUpgradeSchedule' configured.";
            $ruleResult[1].Reason | Should -Be "The cluster (cluster-B) should have the customer-controlled maintenance windows 'aksManagedAutoUpgradeSchedule' and 'aksManagedNodeOSUpgradeSchedule' configured.";

            # Pass
            $ruleResult = @($filteredResult | Where-Object { $_.Outcome -eq 'Pass' });
            $ruleResult.Length | Should -Be 2;
            $ruleResult.TargetName | Should -Be 'cluster-C', 'cluster-F';
        }
    }

Context 'Resource name' {