Skip to content

Commit

Permalink
* disable node registration when UUID is empty (#607)
Browse files Browse the repository at this point in the history
* show vsphere-cloud-controller-manager version in the log correctly

* Remove cache based on UUID when unregister node

* do not use the cached node addresses when NodeAddresses is invoked.

* Return error when VM's info returned from VC doesn't contains IP address

* Update chart

* Fix golang syntax

Co-authored-by: Nicole Han <[email protected]>
  • Loading branch information
lubronzhan and nicolehanjing authored Mar 24, 2022
1 parent 8f0470d commit e365579
Show file tree
Hide file tree
Showing 21 changed files with 79 additions and 143 deletions.
6 changes: 2 additions & 4 deletions .github/workflows/release.yml
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
name: Release Charts

name: Release Helm Charts

on:
Expand All @@ -18,13 +16,13 @@ jobs:
uses: actions/checkout@v2
with:
fetch-depth: 0

# Configure Git for helm release
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "[email protected]"
# Install Helm
- name: Install Helm
uses: azure/setup-helm@v1
Expand Down
3 changes: 2 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -334,4 +334,5 @@ squash:
docker-image:
docker build \
-f cluster/images/controller-manager/Dockerfile \
-t "$(IMAGE):$(BRANCH_NAME)" . \
-t "$(IMAGE):$(BRANCH_NAME)" \
--build-arg "VERSION=${VERSION}" . \
Binary file removed charts/vsphere-cpi-1.0.0.tgz
Binary file not shown.
Binary file added charts/vsphere-cpi-1.21.3.tgz
Binary file not shown.
4 changes: 2 additions & 2 deletions charts/vsphere-cpi/Chart.yaml
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
apiVersion: v2
appVersion: 1.21.2
appVersion: 1.21.3
description: A Helm chart for vSphere Cloud Provider Interface Manager (CPI)
name: vsphere-cpi
version: 1.0.0
version: 1.21.3
keywords:
- vsphere
- vmware
Expand Down
2 changes: 1 addition & 1 deletion charts/vsphere-cpi/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ serviceAccount:
daemonset:
annotations: {}
image: gcr.io/cloud-provider-vsphere/cpi/release/manager
tag: v1.21.2
tag: v1.21.3
pullPolicy: IfNotPresent
dnsPolicy: ClusterFirst
cmdline:
Expand Down
4 changes: 2 additions & 2 deletions cluster/images/controller-manager/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ ARG DISTROLESS_IMAGE=gcr.io/distroless/static@sha256:9b60270ec0991bc4f14bda475e8
FROM ${GOLANG_IMAGE} as builder

# This build arg is the version to embed in the CPI binary
ARG VERSION=unknown
ARG VERSION=1.21.3

# This build arg controls the GOPROXY setting
ARG GOPROXY
Expand All @@ -44,7 +44,7 @@ COPY pkg/ pkg/
COPY cmd/ cmd/
ENV CGO_ENABLED=0
ENV GOPROXY ${GOPROXY:-https://proxy.golang.org}
RUN go build -a -ldflags='-w -s -extldflags=static -X main.version=${VERSION}' -o vsphere-cloud-controller-manager ./cmd/vsphere-cloud-controller-manager
RUN go build -a -ldflags="-w -s -extldflags=static -X main.version=${VERSION}" -o vsphere-cloud-controller-manager ./cmd/vsphere-cloud-controller-manager

################################################################################
## MAIN STAGE ##
Expand Down
2 changes: 1 addition & 1 deletion docs/book/tutorials/deploying-cpi-with-k3s.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ This document is designed to show you how to integrate k3s with cloud provider v

When running with a cloud-controller-manager, it is expected to pass the node provider ID to a CCM as `<provider>://<id>`, in our case, `vsphere://1234567`. However, k3s passes it as `k3s://<hostname>`, which makes vsphere CCM not be able to find the node.

We only support `vsphere` as the provider name that is used for constructing **providerID** for both [vsphere](https://github.com/kubernetes/cloud-provider-vsphere/blob/v1.21.2/pkg/cloudprovider/vsphere/cloud.go#L51) and [vsphere-paravirtual](https://github.com/kubernetes/cloud-provider-vsphere/blob/v1.21.2/pkg/cloudprovider/vsphereparavirtual/cloud.go#L42).
We only support `vsphere` as the provider name that is used for constructing **providerID** for both [vsphere](https://github.com/kubernetes/cloud-provider-vsphere/blob/v1.21.3/pkg/cloudprovider/vsphere/cloud.go#L51) and [vsphere-paravirtual](https://github.com/kubernetes/cloud-provider-vsphere/blob/v1.21.3/pkg/cloudprovider/vsphereparavirtual/cloud.go#L42).

## How to integrate k3s with cloud provider vsphere

Expand Down
101 changes: 7 additions & 94 deletions go.sum

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion hack/check-format.sh
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ goformat_exit_code=0; test -z "$(head -n 1 "${out}")" || goformat_exit_code=1
rm -f "${out}" && touch "${out}"

# Run goimports on all the sources.
go get golang.org/x/tools/cmd/goimports
go install golang.org/x/tools/cmd/goimports
cmd=$(go list -f \{\{\.Target\}\} golang.org/x/tools/cmd/goimports)
flags="-e -w"
[ -z "${PROW_JOB_ID-}" ] || flags="-d ${flags}"
Expand Down
3 changes: 2 additions & 1 deletion hack/check-lint.sh
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,8 @@ set -o pipefail
# script is located.
cd "$(dirname "${BASH_SOURCE[0]}")/.."

go get golang.org/x/lint/golint
go mod download golang.org/x/lint
go install golang.org/x/lint/golint

CMD=$(go list -f \{\{\.Target\}\} golang.org/x/lint/golint)

Expand Down
1 change: 1 addition & 0 deletions hack/check-staticcheck.sh
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ set -o pipefail
cd "$(dirname "${BASH_SOURCE[0]}")/.."

go get honnef.co/go/tools/cmd/staticcheck
go install honnef.co/go/tools/cmd/staticcheck
CMD=$(go list -f \{\{\.Target\}\} honnef.co/go/tools/cmd/staticcheck)

# re-enable SA1019 when we upgrade to Go 1.14
Expand Down
12 changes: 6 additions & 6 deletions index.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,10 @@ apiVersion: v1
entries:
vsphere-cpi:
- apiVersion: v2
appVersion: 1.21.2
created: "2021-07-28T14:36:52.229812-07:00"
appVersion: 1.21.3
created: "2022-03-22T12:05:19.700473-07:00"
description: A Helm chart for vSphere Cloud Provider Interface Manager (CPI)
digest: db24dcbcbdb250809c313912ea7bfc50fbf5b2a99ee0cc3bbd799cdf0db726ac
digest: aee171bf111b38f76a046a203fd9c7e571d4a888d48f965f93991e268f7f36d1
home: https://github.com/kubernetes/cloud-provider-vsphere
icon: https://raw.githubusercontent.com/kubernetes/cloud-provider-vsphere/master/docs/vmware_logo.png
keywords:
Expand All @@ -18,6 +18,6 @@ entries:
sources:
- https://github.com/kubernetes/cloud-provider-vsphere
urls:
- https://kubernetes.github.io/cloud-provider-vsphere/charts/vsphere-cpi-1.0.0.tgz
version: 1.0.0
generated: "2021-07-28T14:36:52.22876-07:00"
- https://kubernetes.github.io/cloud-provider-vsphere/charts/vsphere-cpi-1.21.3.tgz
version: 1.21.3
generated: "2022-03-22T12:05:19.699567-07:00"
15 changes: 2 additions & 13 deletions pkg/cloudprovider/vsphere/instances.go
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ import (
var (
// ErrNotFound is returned by NodeAddresses, NodeAddressesByProviderID,
// and InstanceID when a node cannot be found.
ErrNodeNotFound = errors.New("Node not found")
ErrNodeNotFound = errors.New("node not found")
)

func newInstances(nodeManager *NodeManager) cloudprovider.Instances {
Expand All @@ -52,12 +52,6 @@ var _ cloudprovider.Instances = &instances{}
func (i *instances) NodeAddresses(ctx context.Context, nodeName types.NodeName) ([]v1.NodeAddress, error) {
klog.V(4).Info("instances.NodeAddresses() called with ", string(nodeName))

// Check if node has been discovered already
if node, ok := i.nodeManager.nodeNameMap[string(nodeName)]; ok {
klog.V(2).Info("instances.NodeAddresses() CACHED with ", string(nodeName))
return node.NodeAddresses, nil
}

if err := i.nodeManager.DiscoverNode(string(nodeName), cm.FindVMByName); err == nil {
if i.nodeManager.nodeNameMap[string(nodeName)] == nil {
klog.Errorf("DiscoverNode succeeded, but CACHE missed for node=%s. If this is a Linux VM, hostnames are case sensitive. Make sure they match.", string(nodeName))
Expand All @@ -77,12 +71,7 @@ func (i *instances) NodeAddresses(ctx context.Context, nodeName types.NodeName)
func (i *instances) NodeAddressesByProviderID(ctx context.Context, providerID string) ([]v1.NodeAddress, error) {
klog.V(4).Info("instances.NodeAddressesByProviderID() called with ", providerID)

// Check if node has been discovered already
uid := GetUUIDFromProviderID(providerID)
if node, ok := i.nodeManager.nodeUUIDMap[uid]; ok {
klog.V(2).Info("instances.NodeAddressesByProviderID() CACHED with ", uid)
return node.NodeAddresses, nil
}

if err := i.nodeManager.DiscoverNode(uid, cm.FindVMByUUID); err == nil {
klog.V(2).Info("instances.NodeAddressesByProviderID() FOUND with ", uid)
Expand Down Expand Up @@ -194,7 +183,7 @@ func (i *instances) InstanceShutdownByProviderID(ctx context.Context, providerID
// Check if node has been discovered already
uid := GetUUIDFromProviderID(providerID)
if _, ok := i.nodeManager.nodeUUIDMap[uid]; !ok {
// IF the uuid is not cached, we end up here
// if the uuid is not cached, we end up here
klog.V(2).Info("instances.InstanceShutdownByProviderID() NOT CACHED")
if err := i.nodeManager.DiscoverNode(uid, cm.FindVMByUUID); err != nil {
klog.V(4).Info("instances.InstanceShutdownByProviderID() NOT FOUND with ", uid)
Expand Down
2 changes: 1 addition & 1 deletion pkg/cloudprovider/vsphere/instances_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -80,7 +80,7 @@ func TestInstance(t *testing.T) {
instances := newInstances(&nm.NodeManager)

vm := simulator.Map.Any("VirtualMachine").(*simulator.VirtualMachine)
name := vm.Name
name := strings.ToLower(vm.Name)
vm.Guest.HostName = name
vm.Guest.Net = []vimtypes.GuestNicInfo{
{
Expand Down
37 changes: 35 additions & 2 deletions pkg/cloudprovider/vsphere/nodemanager.go
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ var (

// ErrDatacenterNotFound is returned when the configured datacenter cannot
// be found.
ErrDatacenterNotFound = errors.New("Datacenter not found")
ErrDatacenterNotFound = errors.New("datacenter not found")

// ErrVMNotFound is returned when the specified VM cannot be found.
ErrVMNotFound = errors.New("VM not found")
Expand Down Expand Up @@ -103,6 +103,19 @@ func (nm *NodeManager) removeNode(uuid string, node *v1.Node) {
klog.V(4).Info("removeNode NodeName: ", node.GetName(), ", UID: ", uuid)
delete(nm.nodeRegUUIDMap, uuid)
nm.nodeRegInfoLock.Unlock()

nm.nodeInfoLock.Lock()
klog.V(4).Info("removeNode from UUID and Name cache. NodeName: ", node.GetName(), ", UID: ", uuid)
// in case of a race condition that node with same name create happens before delete event,
// delete the node based on uuid
name := nm.getNodeNameByUUID(uuid)
if name != "" {
delete(nm.nodeNameMap, name)
} else {
klog.V(4).Info("node name: ", node.GetName(), " has a different uuid. Skip deleting this node from cache.")
}
delete(nm.nodeUUIDMap, uuid)
nm.nodeInfoLock.Unlock()
}

func (nm *NodeManager) shakeOutNodeIDLookup(ctx context.Context, nodeID string, searchBy cm.FindVM) (*cm.VMDiscoveryInfo, error) {
Expand Down Expand Up @@ -183,6 +196,10 @@ func (nm *NodeManager) DiscoverNode(nodeID string, searchBy cm.FindVM) error {
return err
}

if vmDI.UUID == "" {
return errors.New("discovered VM UUID is empty")
}

var oVM mo.VirtualMachine
err = vmDI.VM.Properties(ctx, vmDI.VM.Reference(), []string{"guest", "summary"}, &oVM)
if err != nil {
Expand All @@ -199,6 +216,11 @@ func (nm *NodeManager) DiscoverNode(nodeID string, searchBy cm.FindVM) error {
return errors.New("VM Guest hostname is empty")
}

if len(oVM.Guest.Net) == 0 {
klog.V(4).Infof("oVM.Guest.Net is empty, skipping node discovery. This could be cauesd by vmtool not reporting correct IP address")
return errors.New("VM GuestNicInfo is empty")
}

tenantRef := vmDI.VcServer
if vmDI.TenantRef != "" {
tenantRef = vmDI.TenantRef
Expand Down Expand Up @@ -359,13 +381,14 @@ func (nm *NodeManager) DiscoverNode(nodeID string, searchBy cm.FindVM) error {

if len(oVM.Guest.Net) > 0 {
if !foundInternal && !foundExternal {
klog.V(4).Infof("oVM.Guest.Net=%v", oVM.Guest.Net)
return fmt.Errorf("unable to find suitable IP address for node %s with IP family %s", nodeID, ipFamily)
}
}

klog.V(2).Infof("Found node %s as vm=%+v in vc=%s and datacenter=%s",
nodeID, vmDI.VM, vmDI.VcServer, vmDI.DataCenter.Name())
klog.V(2).Info("Hostname: ", oVM.Guest.HostName, " UUID: ", oVM.Summary.Config.Uuid)
klog.V(2).Info("Hostname: ", oVM.Guest.HostName, " UUID: ", vmDI.UUID)

os := "unknown"
if g, ok := GuestOSLookup[oVM.Summary.Config.GuestId]; ok {
Expand Down Expand Up @@ -535,3 +558,13 @@ func (nm *NodeManager) FindNodeInfo(UUID string) (*NodeInfo, error) {
klog.V(4).Infof("FindNodeInfo( %s ) FOUND", UUIDlower)
return nodeInfo, nil
}

func (nm *NodeManager) getNodeNameByUUID(UUID string) string {
for k, v := range nm.nodeNameMap {
if v.UUID == UUID {
return k
}

}
return ""
}
8 changes: 4 additions & 4 deletions pkg/cloudprovider/vsphere/nodemanager_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -79,11 +79,11 @@ func TestRegUnregNode(t *testing.T) {

nm.UnregisterNode(node)

if len(nm.nodeNameMap) != 1 {
t.Errorf("Failed: nodeNameMap should be a length of 1")
if len(nm.nodeNameMap) != 0 {
t.Errorf("Failed: nodeNameMap should be a length of 0")
}
if len(nm.nodeUUIDMap) != 1 {
t.Errorf("Failed: nodeUUIDMap should be a length of 1")
if len(nm.nodeUUIDMap) != 0 {
t.Errorf("Failed: nodeUUIDMap should be a length of 0")
}
if len(nm.nodeRegUUIDMap) != 0 {
t.Errorf("Failed: nodeRegUUIDMap should be a length of 0")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -148,16 +148,16 @@ func TestProcessIPPoolCreateOrUpdate(t *testing.T) {
c, cs := newController()

if _, err := c.ippoolclientset.NsxV1alpha1().IPPools(testClusterNS).Create(context.Background(), &testIPPool, metav1.CreateOptions{}); err != nil {
t.Errorf("failed to create test ippool %s: %w", testIPPool.Name, err)
t.Errorf("failed to create test ippool %s: %v", testIPPool.Name, err)
}

if err := c.processIPPoolCreateOrUpdate(&testIPPool); err != nil {
t.Errorf("failed to processIPPoolCreateOrUpdate %v: %w", testIPPool, err)
t.Errorf("failed to processIPPoolCreateOrUpdate %v: %v", testIPPool, err)
}

for _, n := range tc.nodes {
if _, err := c.kubeclientset.CoreV1().Nodes().Create(context.Background(), &n, metav1.CreateOptions{}); err != nil {
t.Errorf("failed to create test node %s: %w", n.Name, err)
t.Errorf("failed to create test node %s: %v", n.Name, err)
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -131,12 +131,12 @@ func TestProcessNodeCreateOrUpdate(t *testing.T) {
// create nodes and run process processNodeCreateOrUpdate
for _, n := range tc.nodes {
if err := ippc.processNodeCreateOrUpdate(&n); err != nil {
t.Errorf("failed to create test node %s: %w", n.Name, err)
t.Errorf("failed to create test node %s: %v", n.Name, err)
}
}
for _, n := range tc.nodesUpdate {
if err := ippc.processNodeCreateOrUpdate(&n); err != nil {
t.Errorf("failed to create test node %s: %w", n.Name, err)
t.Errorf("failed to create test node %s: %v", n.Name, err)
}
}

Expand Down Expand Up @@ -231,14 +231,14 @@ func TestProcessNodeDelete(t *testing.T) {
if tc.ippoolExist {
// pre create test nodes
if _, err := createIPPool(ippcs, tc.nodes); err != nil {
t.Errorf("failed to create ippool %w", err)
t.Errorf("failed to create ippool %v", err)
}
}

// delete node
for _, n := range tc.nodesToBeDeleted {
if err := ippc.processNodeDelete(n.Name); err != nil {
t.Errorf("failed to create test node %s: %w", n.Name, err)
t.Errorf("failed to create test node %s: %v", n.Name, err)
}
}

Expand Down
2 changes: 1 addition & 1 deletion releases/v1.21/vsphere-cloud-controller-manager.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -234,7 +234,7 @@ spec:
serviceAccountName: cloud-controller-manager
containers:
- name: vsphere-cloud-controller-manager
image: gcr.io/cloud-provider-vsphere/cpi/release/manager:v1.21.2
image: gcr.io/cloud-provider-vsphere/cpi/release/manager:v1.21.3
args:
- --cloud-provider=vsphere
- --v=2
Expand Down
4 changes: 2 additions & 2 deletions test/vcsim/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -22,8 +22,8 @@ ENV LC_ALL=C.UTF-8

# Copy the vCenter simulator and govc binaries.
COPY --from=builder /go/src/github.com/vmware/govmomi/govc/govc \
/go/src/github.com/vmware/govmomi/vcsim/vcsim \
/usr/local/bin/
/go/src/github.com/vmware/govmomi/vcsim/vcsim \
/usr/local/bin/

# Set the working directory.
WORKDIR /
Expand Down

0 comments on commit e365579

Please sign in to comment.