`azurerm_kubernetes_cluster_node_pool`: Adds support for `temporary_name_for_rotation` #27791
base: main
Conversation
Updates the NodePoolUpdate function to rotate the node pool. Removes the ForceNew flag on properties.
Restoring name as ForceNew.
Deleting obsolete methods. Renaming `retrySystemNodePoolCreation` to `retryNodePoolCreation`.
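For reviewers, here is a rough outline of the rotation flow described above. It reuses the agent pools client calls that appear in the tests linked later in this thread; the helper name, signature and exact sequencing are illustrative and assume the imports (`context`, `fmt`, `pointer`, `agentpools`) already used by this resource — this is a sketch, not the PR's actual code.

```go
// Illustrative outline only: rotate a node pool by standing up a temporary pool,
// recreating the original with the new settings, then removing the temporary pool.
func rotateNodePool(ctx context.Context, client *agentpools.AgentPoolsClient, id agentpools.AgentPoolId, tempName string, newProfile agentpools.AgentPool) error {
	tempID := agentpools.NewAgentPoolID(id.SubscriptionId, id.ResourceGroupName, id.ManagedClusterName, tempName)

	// 1. create the temporary node pool (temporary_name_for_rotation) with the new settings
	tempProfile := newProfile
	tempProfile.Name = pointer.To(tempName)
	if err := client.CreateOrUpdateThenPoll(ctx, tempID, tempProfile); err != nil {
		return fmt.Errorf("creating temporary %s: %+v", tempID, err)
	}

	// 2. delete the existing node pool
	if err := client.DeleteThenPoll(ctx, id); err != nil {
		return fmt.Errorf("deleting %s: %+v", id, err)
	}

	// 3. recreate the node pool under its original name with the new settings
	if err := client.CreateOrUpdateThenPoll(ctx, id, newProfile); err != nil {
		return fmt.Errorf("recreating %s: %+v", id, err)
	}

	// 4. remove the temporary node pool
	return client.DeleteThenPoll(ctx, tempID)
}
```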
Test Logs: `make acctests SERVICE='containers' TESTARGS='-run=TestAccKubernetesClusterNodePool_'`

Note on test cases
Thanks for this PR @CorrenSoft!
In addition to the comments and suggestions in-line, we also need to consider the behaviour when rotation of the node pool fails, e.g. what happens if we fail to spin up the node pool with the new configuration and are left with the temporary node pool? It would be good if failures here are recoverable for the user.
The `azurerm_kubernetes_cluster` resource handles this by falling back on the `temporary_name_for_rotation` system node pool when we perform a read.
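For illustration, a minimal sketch of what such a read-time fallback could look like in the node pool resource, assuming the agent pools client, the ID helpers and the `response` package already used in this file; this is not the AKS resource's exact code.

```go
// Sketch: if the node pool under its configured name is gone (e.g. rotation failed
// after the temporary pool was created), fall back to reading the pool named by
// temporary_name_for_rotation so the user can recover with another apply.
resp, err := client.Get(ctx, id)
if err != nil && response.WasNotFound(resp.HttpResponse) {
	if tempName, ok := d.GetOk("temporary_name_for_rotation"); ok {
		tempId := agentpools.NewAgentPoolID(id.SubscriptionId, id.ResourceGroupName, id.ManagedClusterName, tempName.(string))
		resp, err = client.Get(ctx, tempId)
	}
}
if err != nil {
	return fmt.Errorf("retrieving %s: %+v", id, err)
}
```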
Test cases that simulate these failure scenarios should be added for this as well, I've linked two tests we wrote for the AKS resource below that can help you with those:
terraform-provider-azurerm/internal/services/containers/kubernetes_cluster_scaling_resource_test.go
Lines 102 to 164 in eeb928f
func TestAccKubernetesCluster_updateVmSizeAfterFailureWithTempWithoutDefault(t *testing.T) {
    data := acceptance.BuildTestData(t, "azurerm_kubernetes_cluster", "test")
    r := KubernetesClusterResource{}

    data.ResourceTest(t, r, []acceptance.TestStep{
        {
            Config: r.basicWithTempName(data),
            Check: acceptance.ComposeTestCheckFunc(
                check.That(data.ResourceName).ExistsInAzure(r),
                // create the temporary node pool and delete the old default node pool to simulate the case where resizing fails when trying to bring up the new node pool
                data.CheckWithClientForResource(func(ctx context.Context, clients *clients.Client, state *terraform.InstanceState) error {
                    if _, ok := ctx.Deadline(); !ok {
                        var cancel context.CancelFunc
                        ctx, cancel = context.WithTimeout(ctx, 1*time.Hour)
                        defer cancel()
                    }
                    client := clients.Containers.AgentPoolsClient
                    id, err := commonids.ParseKubernetesClusterID(state.Attributes["id"])
                    if err != nil {
                        return err
                    }
                    defaultNodePoolId := agentpools.NewAgentPoolID(id.SubscriptionId, id.ResourceGroupName, id.ManagedClusterName, state.Attributes["default_node_pool.0.name"])
                    resp, err := client.Get(ctx, defaultNodePoolId)
                    if err != nil {
                        return fmt.Errorf("retrieving %s: %+v", defaultNodePoolId, err)
                    }
                    if resp.Model == nil {
                        return fmt.Errorf("retrieving %s: model was nil", defaultNodePoolId)
                    }
                    tempNodePoolName := "temp"
                    profile := resp.Model
                    profile.Name = &tempNodePoolName
                    profile.Properties.VMSize = pointer.To("Standard_DS3_v2")
                    tempNodePoolId := agentpools.NewAgentPoolID(id.SubscriptionId, id.ResourceGroupName, id.ManagedClusterName, tempNodePoolName)
                    if err := client.CreateOrUpdateThenPoll(ctx, tempNodePoolId, *profile); err != nil {
                        return fmt.Errorf("creating %s: %+v", tempNodePoolId, err)
                    }
                    if err := client.DeleteThenPoll(ctx, defaultNodePoolId); err != nil {
                        return fmt.Errorf("deleting default %s: %+v", defaultNodePoolId, err)
                    }
                    return nil
                }, data.ResourceName),
            ),
            // the plan will show that the default node pool name has been set to "temp" and we're trying to set it back to "default"
            ExpectNonEmptyPlan: true,
        },
        {
            Config: r.updateVmSize(data, "Standard_DS3_v2"),
            Check: acceptance.ComposeTestCheckFunc(
                check.That(data.ResourceName).ExistsInAzure(r),
            ),
        },
        data.ImportStep("default_node_pool.0.temporary_name_for_rotation"),
    })
}
terraform-provider-azurerm/internal/services/containers/kubernetes_cluster_scaling_resource_test.go
Lines 44 to 99 in eeb928f
func TestAccKubernetesCluster_updateVmSizeAfterFailureWithTempAndDefault(t *testing.T) {
    data := acceptance.BuildTestData(t, "azurerm_kubernetes_cluster", "test")
    r := KubernetesClusterResource{}

    data.ResourceTest(t, r, []acceptance.TestStep{
        {
            Config: r.basic(data),
            Check: acceptance.ComposeTestCheckFunc(
                check.That(data.ResourceName).ExistsInAzure(r),
                // create the temporary node pool to simulate the case where both old default node pool and temp node pool exist
                data.CheckWithClientForResource(func(ctx context.Context, clients *clients.Client, state *terraform.InstanceState) error {
                    if _, ok := ctx.Deadline(); !ok {
                        var cancel context.CancelFunc
                        ctx, cancel = context.WithTimeout(ctx, 1*time.Hour)
                        defer cancel()
                    }
                    client := clients.Containers.AgentPoolsClient
                    id, err := commonids.ParseKubernetesClusterID(state.Attributes["id"])
                    if err != nil {
                        return err
                    }
                    defaultNodePoolId := agentpools.NewAgentPoolID(id.SubscriptionId, id.ResourceGroupName, id.ManagedClusterName, state.Attributes["default_node_pool.0.name"])
                    resp, err := client.Get(ctx, defaultNodePoolId)
                    if err != nil {
                        return fmt.Errorf("retrieving %s: %+v", defaultNodePoolId, err)
                    }
                    if resp.Model == nil {
                        return fmt.Errorf("retrieving %s: model was nil", defaultNodePoolId)
                    }
                    tempNodePoolName := "temp"
                    profile := resp.Model
                    profile.Name = &tempNodePoolName
                    profile.Properties.VMSize = pointer.To("Standard_DS3_v2")
                    tempNodePoolId := agentpools.NewAgentPoolID(id.SubscriptionId, id.ResourceGroupName, id.ManagedClusterName, tempNodePoolName)
                    if err := client.CreateOrUpdateThenPoll(ctx, tempNodePoolId, *profile); err != nil {
                        return fmt.Errorf("creating %s: %+v", tempNodePoolId, err)
                    }
                    return nil
                }, data.ResourceName),
            ),
        },
        {
            Config: r.updateVmSize(data, "Standard_DS3_v2"),
            Check: acceptance.ComposeTestCheckFunc(
                check.That(data.ResourceName).ExistsInAzure(r),
            ),
        },
        data.ImportStep("default_node_pool.0.temporary_name_for_rotation"),
    })
}
I hope that makes sense, let me know if you have any questions!
}

if d.HasChange("kubelet_config") {
    if kubeletConfig := d.Get("kubelet_config").([]interface{}); len(kubeletConfig) > 0 {
What would happen here if a user had `kubelet_config` defined in their configuration and then removed it completely?
Wouldn't be updated.
Fixed!
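A minimal sketch of the fix being discussed, assuming the `expandAgentPoolKubeletConfig` helper and the `props` agent pool properties from the surrounding diff; the empty-list branch is the illustrative part, and the same shape applies to `linux_os_config` below.

```go
if d.HasChange("kubelet_config") {
	kubeletConfig := d.Get("kubelet_config").([]interface{})
	if len(kubeletConfig) > 0 {
		props.KubeletConfig = expandAgentPoolKubeletConfig(kubeletConfig)
	} else {
		// the block was removed from the configuration, so clear it on the
		// agent pool instead of silently keeping the previous value
		props.KubeletConfig = nil
	}
}
```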
}

if d.HasChange("linux_os_config") {
    if linuxOSConfig := d.Get("linux_os_config").([]interface{}); len(linuxOSConfig) > 0 {
Similarly here, what happens if a user had `linux_os_config` defined in their configuration and then removed it?
Same answer ;)
@@ -723,10 +718,41 @@ func resourceKubernetesClusterNodePoolUpdate(d *pluginsdk.ResourceData, meta int
    props.EnableAutoScaling = utils.Bool(enableAutoScaling)
}

if d.HasChange("fips_enabled") {
    props.EnableFIPS = utils.Bool(d.Get("fips_enabled").(bool))
I know we're using `utils.Bool`/`utils.String` etc. elsewhere in this resource, but these functions are actually deprecated and can all be replaced by `pointer.To`. Could you update your changes to use `pointer.To`?
Suggested change:
- props.EnableFIPS = utils.Bool(d.Get("fips_enabled").(bool))
+ props.EnableFIPS = pointer.To(d.Get("fips_enabled").(bool))
Applied!
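For context, `pointer.To` from `github.com/hashicorp/go-azure-helpers/lang/pointer` is a single generic helper roughly along these lines (a sketch of the idea, not the exact source), which is why it can replace the whole `utils.Bool`/`utils.String` family.

```go
// To returns a pointer to any value, replacing type-specific helpers
// such as utils.Bool, utils.String and utils.Int64.
func To[T any](v T) *T {
	return &v
}

// e.g. props.EnableFIPS = pointer.To(d.Get("fips_enabled").(bool))
```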
var subnetID *commonids.SubnetId
if subnetIDValue, ok := d.GetOk("vnet_subnet_id"); ok {
    subnetID, err = commonids.ParseSubnetID(subnetIDValue.(string))
    if err != nil {
        return err
    }
    if subnetID != nil {
        props.VnetSubnetID = utils.String(subnetID.ID())
    }
}
We can safely assume that `subnetID` will not be nil if it has been parsed correctly, i.e. `err == nil`.
Suggested change:
- var subnetID *commonids.SubnetId
- if subnetIDValue, ok := d.GetOk("vnet_subnet_id"); ok {
-     subnetID, err = commonids.ParseSubnetID(subnetIDValue.(string))
-     if err != nil {
-         return err
-     }
-     if subnetID != nil {
-         props.VnetSubnetID = utils.String(subnetID.ID())
-     }
- }
+ if subnetIDValue, ok := d.GetOk("vnet_subnet_id"); ok {
+     subnetID, err := commonids.ParseSubnetID(subnetIDValue.(string))
+     if err != nil {
+         return err
+     }
+     props.VnetSubnetID = pointer.To(subnetID.ID())
+ }
Updated! Thanks :)
@@ -798,6 +868,13 @@ func resourceKubernetesClusterNodePoolUpdate(d *pluginsdk.ResourceData, meta int
    props.NetworkProfile = expandAgentPoolNetworkProfile(d.Get("node_network_profile").([]interface{}))
}

if d.HasChange("zones") {
    zones := zones.ExpandUntyped(d.Get("zones").(*schema.Set).List())
    if len(zones) > 0 {
If a user chooses to not specify `zones` in their config anymore, then this check would prevent that change from propagating.
Updated!
As a note: for this property (and the similar cases mentioned above) I replicated the implementation from the default_node_pool scenario; I'm not sure whether it has the same problem or the behaviour is different (for whatever reason), so that may be worth verifying.
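For reference, a sketch of how the `zones` update could propagate removal as well, reusing the names from the diff above; the `AvailabilityZones` field name is an assumption based on the agent pool properties used elsewhere, not a confirmed detail of this PR.

```go
if d.HasChange("zones") {
	zonesRaw := d.Get("zones").(*schema.Set).List()
	expandedZones := zones.ExpandUntyped(zonesRaw)
	if len(expandedZones) > 0 {
		props.AvailabilityZones = &expandedZones
	} else {
		// zones removed from the configuration: clear them on the agent pool
		props.AvailabilityZones = nil
	}
}
```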
Hey @CorrenSoft we have some customers requesting this feature. Would you be able to let me know whether you're planning to work through the review feedback that was left? I'm happy to take this over to get it in if you don't find yourself with time/energy at the moment to get back to this, just checking before I step on your toes here 🙂
I see the point. I will do further testing to evaluate how it behaves and what can be done about it.
Community Note
Description
- Adds the `temporary_name_for_rotation` property (optional, not persisted).
- Updates the `TestAccKubernetesClusterNodePool_manualScaleVMSku` and `TestAccKubernetesClusterNodePool_ultraSSD` test cases.
- Removes `schemaNodePoolSysctlConfigForceNew`, `schemaNodePoolKubeletConfigForceNew` and `schemaNodePoolLinuxOSConfigForceNew`, as they are no longer used.
- Renames `retrySystemNodePoolCreation` to `retryNodePoolCreation`, as it is now used for both cases.

PR Checklist
Changes to existing Resource / Data Source
Testing
Change Log
Below please provide what should go into the changelog (if anything) conforming to the Changelog Format documented here.
`azurerm_kubernetes_cluster_node_pool` - adds the `temporary_name_for_rotation` property to allow cycling the node pool instead of recreating it directly [`azurerm_kubernetes_cluster_node_pool`: Adds support for `temporary_name_for_rotation` #27791]

This is a (please select all that apply):
Related Issue(s)
Fixes #22265
Note
If this PR changes meaningfully during the course of review please update the title and description as required.