SCSI IDs changing on machines built with 2.6.0+. #2089

gavinwill · 2023-12-13T09:18:37Z

### Terraform

Terraform v1.3.0

### Terraform Provider

v2.6.0

### VMware vSphere

7.0.3.01700

### Description

Hi

When building a VM from an Ubuntu OVF template, we see the SCSI controller order change and, as a result, the network interface name change. This matters because we use cloud-init and configure the NIC by name.

On a machine deployed with provider 2.5.1, we see the correct ordering:


```
03:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI Controller (rev 02)
04:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI Controller (rev 02)
0b:00.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)
13:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI Controller (rev 02)
1b:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI Controller (rev 02)
```

This gives us `ens192` on the Ubuntu machine: `2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000`

When we deploy a brand-new machine with provider 2.6.0+, we see the SCSI order change.

`lspci` from a VM built with provider 2.6.0 (bad):

```
03:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI Controller (rev 02)
04:00.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)
0b:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI Controller (rev 02)
0c:00.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)
13:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI Controller (rev 02)
1b:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI Controller (rev 02)
```

This change in order means that our NIC interface name has changed, since the name is derived from the device's PCI position. For example, a name such as `enp0s31` is generated as:
- `en`: ethernet
- `p0`: bus number
- `s31`: slot number

`ip link` on this machine shows `2: ens161: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000`

Note: I need to access the machine via the VMware console, since I can't SSH to the host because of the network config mismatch.

If I take a machine built on provider 2.5.1, upgrade the provider to 2.6.0+, and run a plan, it reports that the infrastructure is up to date and no changes are planned. The incorrect order seems to occur only when a new VM is created with the later provider.

```
Upgrading modules...

Initializing provider plugins...
- Finding hashicorp/vsphere versions matching "2.6.0"...
- Installing hashicorp/vsphere v2.6.0...
- Installed hashicorp/vsphere v2.6.0 (signed by HashiCorp)

Plan
No changes. Your infrastructure matches the configuration.
```


### Affected Resources or Data Sources

vsphere_network.network

### Terraform Configuration

Will provide details in update

### Debug Output

Will provide details in update

### Panic Output

_No response_

### Expected Behavior

We would expect no change in the SCSI ordering when using the new provider.

### Actual Behavior

The SCSI ordering is incorrect, causing issues with disks and NICs.

### Steps to Reproduce

Upgrade the provider to 2.6.0+ (from the verified good 2.5.1) and create a new machine.

### Environment Details

_No response_

### Screenshots

_No response_

### References

_No response_

github-actions · 2023-12-13T09:18:52Z

Hello, gavinwill! 🖐

Thank you for submitting an issue for this provider. The issue will now enter into the issue lifecycle.

If you want to contribute to this project, please review the contributing guidelines and information on submitting pull requests.

tenthirtyam · 2023-12-13T11:13:22Z

@vasilsatanasov can you investigate to see if this is related to the SR-IOV introduction?

gavinwill · 2023-12-13T11:32:19Z

I did think it may be SR-IOV related, from a quick look at the diff between 2.5.1 and 2.6.0.

I may also be available to put up a PR for a fix.

vasilsatanasov · 2023-12-13T13:01:50Z

Looking at it. @gavinwill, could you please provide an example HCL config to reproduce the issue, plus the Ubuntu version you are using?

gavinwill · 2023-12-13T15:15:12Z

Hi @vasilsatanasov
Apologies, it's an Ubuntu 20.04 OVF template.

I have just tested this by building the provider at different commits (`go build -o terraform-provider-vsphere`) and using `dev_overrides` in the provider installation config. I can confirm that the last commit that works for me is 6211c3b.

If I taint the machine and rebuild with the provider built at 9c25530, it fails: the SCSI device order is wrong, and hence the NIC comes up as ens161.

We use a slightly customised module. I am just paring it down to minimal standalone code so that you can reproduce the issue.
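
For anyone wanting to repeat the commit-by-commit test described above, a development override in the Terraform CLI configuration file might look like the sketch below. The directory path is a placeholder (an assumption, not taken from this thread); it should point at the folder that contains the `terraform-provider-vsphere` binary produced by `go build`.

```hcl
# Terraform CLI configuration file (for example ~/.terraformrc)
provider_installation {
  # Use the locally built provider instead of the registry release.
  # The path is hypothetical; point it at your own build directory.
  dev_overrides {
    "hashicorp/vsphere" = "/path/to/provider/build/dir"
  }

  # Install every other provider from its normal source.
  direct {}
}
```

While an override is active, Terraform prints a warning that provider development overrides are in effect, which is a useful check that the local build is the one being used.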

vasilsatanasov · 2023-12-14T08:55:01Z

Thank you @gavinwill , waiting for the code for reproduction!

gavinwill · 2023-12-14T11:00:29Z

Hi

I have "converted" our module to a simple tf file with hard-coded values, and I can reproduce the issue with the config below.

If I set the provider to 2.5.1 and apply (after cleaning out the .terraform folder and re-running terraform init, to be sure), the machine boots up fine with the expected SCSI order and the NIC is ens192.

If I clean out the .terraform folder and update the provider to 2.6.0+, the machine boots up but the NIC is ens161 and the SCSI ordering is changed. The only change to the Terraform config is the provider version.

```hcl
terraform {
  required_providers {
    vsphere = {
      source  = "hashicorp/vsphere"
      version = "2.6.1"
    }
  }
  required_version = ">= 1.3.0"
}

provider "vsphere" {
  vsphere_server = "vsphereserver.com"
  user           = "[email protected]"
  password       = "hunter2"
}

resource "vsphere_virtual_machine" "vm" {
  name             = "gt-gavintest-01"
  resource_pool_id = "resgroup-1234"
  folder           = "test"
  extra_config = {
    "guestinfo.metadata"          = "our metadata for cloud init including netplan for ens192"
    "guestinfo.userdata"          = "our base64 userdata"
    "guestinfo.userdata.encoding" = "base64"
  }

  extra_config_reboot_required = false
  firmware                     = "bios"
  efi_secure_boot_enabled      = false
  enable_disk_uuid             = false
  datastore_id                 = "datastore-1234"

  num_cpus                    = 4
  num_cores_per_socket        = 2
  cpu_hot_add_enabled         = true
  cpu_hot_remove_enabled      = true
  memory                      = 8192
  guest_id                    = "ubuntu64Guest"
  scsi_bus_sharing            = "noSharing"
  scsi_type                   = "pvscsi"
  scsi_controller_count       = 4
  wait_for_guest_net_routable = false
  wait_for_guest_ip_timeout   = 0
  wait_for_guest_net_timeout  = 5

  dynamic "network_interface" {
    for_each = local.networks
    content {
      network_id   = "dvportgroup-1234"
      adapter_type = "vmxnet3"
      ovf_mapping  = "nic${network_interface.key}"
    }
  }

  disk {
    label            = "disk0"
    size             = 72
    unit_number      = 0
    thin_provisioned = true
    eagerly_scrub    = false
    datastore_id     = "datastore-1234"
    io_reservation   = 0
    io_share_level   = "normal"
    io_share_count   = 1000
  }

  clone {
    template_uuid = "12345-78910-121213-1415-16171819e"
    linked_clone  = false
    timeout       = 30
  }

  hv_mode                          = "hvAuto"
  ept_rvi_mode                     = "automatic"
  nested_hv_enabled                = false
  enable_logging                   = false
  cpu_performance_counters_enabled = false
  swap_placement_policy            = "inherit"
  latency_sensitivity              = "normal"
  shutdown_wait_timeout            = 3
  force_power_off                  = false
}
```

Our locals contain:

```hcl
locals {
  networks = [
    { "addresses" = ["10.12.13.14/24"] },
    { "addresses" = [] },
  ]
}
```

We use the addresses to populate our cloud-init config, and iterate over the list with a for_each, using the key as shown in the Terraform config above.

Hope this helps

adamhorden · 2023-12-20T21:03:52Z

I faced this same issue today and could not work out why the order was incorrect on new VM builds until I found this issue. VMs would boot, but the network would not come up, so manual intervention via the console was needed.

v2.5.1:

03:00.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)

v2.6.1:

04:00.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)

This causes the network to not come up as ens160 now becomes ens224.
Terraform Plans on v2.6.1 are clean but any new VMs on v2.6.1 have the incorrect order.
For the moment pinning to v2.5.1 works as expected.
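
The pinning workaround mentioned above could be expressed as the following sketch. It reuses the `required_providers` block from the reproduction config earlier in this thread, with only the version constraint changed.

```hcl
terraform {
  required_providers {
    vsphere = {
      source = "hashicorp/vsphere"
      # Exact pin to the last release reported as good in this thread.
      version = "2.5.1"
    }
  }
}
```

If a dependency lock file already records a newer provider version, running `terraform init -upgrade` after changing the constraint lets Terraform re-select the pinned release.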

Adam Horden

tenthirtyam · 2023-12-20T22:45:42Z

@vasilsatanasov - this might be related to the SR-IOV enhancement?

vasilsatanasov · 2023-12-21T08:27:42Z

> @vasilsatanasov - this might be related to the SR-IOV enhancement?

Looks like it is, as per @gavinwill 's report.

After the introduction of the SR-IOV feature, network adapters were being recreated after clone because of an inconsistent check for changes in the `physical_function` attribute: the check always returned `true`. As a result, instead of updating the existing NICs, the relocate task started after clone was always deleting the existing NICs and creating new ones. This was causing the new VM to be disconnected from the network.

Changed the check for changes in the `physical_function` attribute to treat nil and an empty string equally, so a missing `physical_function` attribute in the device, compared to an empty string from the schema, won't be considered as changed.

Testing done: cloned a VM from a template with 2 NICs and verified that there is network connectivity. Also verified the output of the `lspci` command on the template VM and on the clone VM.

Fixes hashicorp#2089

Signed-off-by: Vasil Atanasov <[email protected]>

Ref: #2089 Signed-off-by: Vasil Atanasov <[email protected]>

github-actions · 2024-02-16T02:02:38Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

gavinwill added bug Type: Bug needs-triage Status: Issue Needs Triage labels Dec 13, 2023

tenthirtyam added this to the v2.7.0 milestone Dec 14, 2023

iBrandyJackson assigned iBrandyJackson and vasilsatanasov and unassigned iBrandyJackson Jan 9, 2024

vasilsatanasov mentioned this issue Jan 15, 2024

fix: scsi ids changing on machines built with v2.6.x #2115

Merged

2 tasks

tenthirtyam pushed a commit that referenced this issue Jan 16, 2024

fix: scsi ids changing on machines built with v2.6.x (#2115)

cc7366c

Ref: #2089 Signed-off-by: Vasil Atanasov <[email protected]>

tenthirtyam closed this as completed in #2115 Jan 16, 2024

github-actions bot locked as resolved and limited conversation to collaborators Feb 16, 2024

tenthirtyam removed the needs-triage Status: Issue Needs Triage label Apr 29, 2024
