GitopsClusters should only be Ready if ClusterConnected is Ready too #56

Open: sarataha wants to merge 2 commits into base: main from check-cluster-connected

Member

@sarataha sarataha commented Aug 15, 2023

Closes: weaveworks/weave-gitops-enterprise#3200

This PR updates the reconciler so that a GitopsCluster is only Ready when its secret is found and the cluster-connected condition is true.

@sarataha sarataha added the enhancement New feature or request label Aug 15, 2023
@sarataha sarataha force-pushed the check-cluster-connected branch 13 times, most recently from dc8757b to 785e72e Compare August 16, 2023 13:47
@sarataha sarataha marked this pull request as ready for review August 16, 2023 13:50
@sarataha sarataha requested a review from foot August 16, 2023 13:52
@@ -230,6 +229,21 @@ func (r *GitopsClusterReconciler) Reconcile(ctx context.Context, req ctrl.Reques
return ctrl.Result{}, err
}

// Cluster is ready only if it has a secret and is connected

Contributor

I think we can simplify this: check for the Secret and check for connectivity.

If they are both ok, then it's Ready...

Otherwise it is not Ready, with an appropriate message:

  • No Connectivity
  • Secret missing
  • Secret missing and no connectivity

Something like this?
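
The two-check idea above could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `readyState`, `evaluateReady`, and the `Ready` and `SecretMissing` reason strings are hypothetical names, while `ClusterNotConnected` and `SecretMissingAndNoConnectivity` are the reason strings that appear in this PR's diff.

```go
package main

import "fmt"

const (
	reasonReady         = "Ready"         // hypothetical name
	reasonSecretMissing = "SecretMissing" // hypothetical name
	// The next two values are taken from the constants added in this PR.
	reasonClusterNotConnected            = "ClusterNotConnected"
	reasonSecretMissingAndNoConnectivity = "SecretMissingAndNoConnectivity"
)

// readyState is a hypothetical summary of the Ready condition.
type readyState struct {
	Ready  bool
	Reason string
}

// evaluateReady combines the secret check and the connectivity check
// into a single Ready state with a reason for each failure mode.
func evaluateReady(secretFound, connected bool) readyState {
	switch {
	case secretFound && connected:
		return readyState{Ready: true, Reason: reasonReady}
	case secretFound:
		return readyState{Reason: reasonClusterNotConnected}
	case connected:
		return readyState{Reason: reasonSecretMissing}
	default:
		return readyState{Reason: reasonSecretMissingAndNoConnectivity}
	}
}

func main() {
	cases := [][2]bool{{true, true}, {true, false}, {false, true}, {false, false}}
	for _, c := range cases {
		fmt.Printf("secret=%t connected=%t -> %+v\n", c[0], c[1], evaluateReady(c[0], c[1]))
	}
}
```

Treating the two checks as independent inputs keeps the decision in one place, although it assumes connectivity can be determined even when the secret is missing.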

Member Author

@sarataha sarataha Aug 21, 2023

If the secret is missing, I think we return early on line 164, so I wonder how we would return the "secret missing and no connectivity" message in that case? 🤔

https://github.com/weaveworks/cluster-controller/pull/56/files#diff-b7f08baf7274b678218bc46e310600979ae9785930eca852dffa6e75ea98dffbR164

@sarataha sarataha force-pushed the check-cluster-connected branch 11 times, most recently from cf17717 to 499edca Compare August 21, 2023 18:48
@sarataha sarataha force-pushed the check-cluster-connected branch from 499edca to 0efad4b Compare August 21, 2023 18:52
Contributor

@bigkevmcd bigkevmcd left a comment

Looking ok.

	// ClusterNotConnectedReason signals that a given cluster is not connected.
	ClusterNotConnectedReason string = "ClusterNotConnected"
	// SecretMissingAndNoConnectivityReason signals that a given secret is missing and there is no connectivity to the cluster.
	SecretMissingAndNoConnectivityReason string = "SecretMissingAndNoConnectivity"

Contributor

@bigkevmcd bigkevmcd Aug 31, 2023

I'm not sure this state is really valid, because if the secret is missing, you can't know whether or not connectivity is available?
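
One way to picture this point is a check that runs in sequence, where connectivity stays "unknown" until a secret has been read. The following is a hedged sketch only: `connectivity`, `checkCluster`, `getSecret`, and `probe` are hypothetical names invented for illustration and are not part of the controller.

```go
package main

import (
	"errors"
	"fmt"
)

// connectivity is a hypothetical tri-state result.
type connectivity int

const (
	connectivityUnknown connectivity = iota // no secret, so nothing could be probed
	connectivityOK
	connectivityFailed
)

var errSecretNotFound = errors.New("secret not found")

// checkCluster reads the kubeconfig secret first and only probes the
// cluster when a secret was found. getSecret and probe are stand-ins
// for the real lookups.
func checkCluster(getSecret func() ([]byte, error), probe func(kubeconfig []byte) error) (connectivity, error) {
	kubeconfig, err := getSecret()
	if err != nil {
		// Early return: connectivity is unknown, not "failed".
		return connectivityUnknown, fmt.Errorf("secret missing: %w", err)
	}
	if err := probe(kubeconfig); err != nil {
		return connectivityFailed, fmt.Errorf("no connectivity: %w", err)
	}
	return connectivityOK, nil
}

func main() {
	missing := func() ([]byte, error) { return nil, errSecretNotFound }
	mustNotRun := func([]byte) error { panic("probe must not run without a secret") }

	state, err := checkCluster(missing, mustNotRun)
	fmt.Println(state == connectivityUnknown, err)
}
```

Under this ordering a combined "secret missing and no connectivity" result can never be produced, because the probe is skipped whenever the secret lookup fails.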

@@ -41,7 +45,4 @@ const (
	// ClusterProvisionedReason is the reason for the provisioned state being
	// set.
	ClusterProvisionedReason string = "ClusterProvisioned"

	// ClusterConnectivity indicates if the cluster has connectivity
	ClusterConnectivity string = "ClusterConnectivity"

Collaborator

Do we need to remove this one?

I'm trying to think whether, if we evolve this further in the future, a cluster could be "not ready" while connectivity is true.

E.g. parts of the querying system in WGE currently watch this; we'd transition them over to just "Ready".

On the other hand, if you can connect to a cluster then that is a pretty ready cluster.

Contributor

How would we test for connectivity without a secret?

Contributor

The request was to simplify the Ready state, so that it's either "ready" (i.e. can be used) or "not ready" (can't be used).

Collaborator

True, I guess part of the user story was perhaps:

  • The cluster is Ready
  • I can't see any resources in the UI
  • This is confusing

So we want to change the UI querying to just use Ready.

In this case the "final" ready state is now cluster connectivity. If we wanted to model something else (say, tenants haven't been configured yet, so this cluster isn't technically ready), we'd use some other status type.

Contributor

Yes...

I think the original split was correct, to make it easier to differentiate, but the request from CX is to simplify it.

There's nothing to prevent a resource having several different Conditions, and the GitopsCluster does: it can be provisioned or not.

We need to differentiate these, because it could transition through provisioned quickly to another state, and the controller may never see it.
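
As a rough illustration of a resource carrying several independent Conditions, here is a minimal sketch using plain Go types. The `condition` struct and the `setCondition` and `isTrue` helpers are hypothetical simplifications written for this example, and the condition type names used in `main` are assumptions rather than the controller's actual values.

```go
package main

import "fmt"

// condition is a simplified, hypothetical stand-in for a status condition.
type condition struct {
	Type   string
	Status bool
	Reason string
}

// setCondition replaces the condition with the same Type, or appends it,
// so each Type is tracked independently of the others.
func setCondition(conds []condition, c condition) []condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// isTrue reports whether the condition of the given Type exists and is true.
func isTrue(conds []condition, typ string) bool {
	for _, c := range conds {
		if c.Type == typ {
			return c.Status
		}
	}
	return false
}

func main() {
	var conds []condition
	// A cluster can be provisioned while still not being Ready.
	conds = setCondition(conds, condition{Type: "ClusterProvisioned", Status: true, Reason: "ClusterProvisioned"})
	conds = setCondition(conds, condition{Type: "Ready", Status: false, Reason: "ClusterNotConnected"})
	fmt.Println(isTrue(conds, "ClusterProvisioned"), isTrue(conds, "Ready"))
}
```

Because each Type has its own entry, recording that a cluster was provisioned does not get overwritten when Ready later changes.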

@@ -128,105 +127,78 @@ func (r *GitopsClusterReconciler) Reconcile(ctx context.Context, req ctrl.Reques
}

// examine DeletionTimestamp to determine if object is under deletion
if cluster.ObjectMeta.DeletionTimestamp.IsZero() {
	if cluster.Spec.SecretRef != nil || cluster.Spec.CAPIClusterRef != nil {

Collaborator

👍

updated := testGetGitopsCluster(t, r.Client, tt.obj)

if controllerutil.ContainsFinalizer(updated, controllers.GitOpsClusterFinalizer) {
	result, err = r.Reconcile(context.TODO(), reconcile.Request{NamespacedName: tt.obj})

Collaborator

Is TODO a good context to use, or was it a transitional one from when context was initially introduced?

Collaborator

(I was just searching the codebase for TODOs..)

Contributor

Yeah, this should probably be Background() but the differences are trivial in this case.

Contributor

TODO also suggests that the context doesn't really matter :-)
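
For reference, a small standalone sketch of why the difference is trivial here: both `context.TODO()` and `context.Background()` return empty, non-nil contexts with no deadline and no cancellation, so they differ only in the intent they signal. The `ctxInfo`, `inspect`, and `inspectBoth` names are hypothetical helpers written for this illustration.

```go
package main

import (
	"context"
	"fmt"
)

// ctxInfo is a hypothetical summary of the observable properties of a context.
type ctxInfo struct {
	HasDeadline bool
	Cancellable bool
	Err         error
}

// inspect records whether ctx has a deadline, a Done channel, or an error.
func inspect(ctx context.Context) ctxInfo {
	_, hasDeadline := ctx.Deadline()
	return ctxInfo{
		HasDeadline: hasDeadline,
		Cancellable: ctx.Done() != nil,
		Err:         ctx.Err(),
	}
}

// inspectBoth returns the summaries for context.TODO and context.Background.
func inspectBoth() (todo, background ctxInfo) {
	return inspect(context.TODO()), inspect(context.Background())
}

func main() {
	todo, background := inspectBoth()
	fmt.Printf("TODO:       %+v\n", todo)
	fmt.Printf("Background: %+v\n", background)
	fmt.Println("identical behaviour:", todo == background)
}
```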

e := fmt.Errorf("failed to get CAPI cluster %q: %w", name, err)
conditions.MarkFalse(cluster, meta.ReadyCondition, gitopsv1alpha1.WaitingForCAPIClusterReason, e.Error())
var connectivityErr error
// TODO: We should check for connectivity with CAPI clusters

Collaborator

I guess we need to resolve this TODO?

Contributor

We didn't do this before.

I do think we should be checking for the connectivity tho'

Collaborator

It looks like we might have been doing this before?

(and guessing the name of the capi-cluster-secret)

if cluster.Spec.CAPIClusterRef != nil {
	secretRef = fmt.Sprintf("%s-kubeconfig", cluster.Spec.CAPIClusterRef.Name)
}
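
The name-guessing in the snippet above could be isolated into a small helper along these lines. This is a sketch under assumptions: `clusterRefs` and `kubeconfigSecretName` are hypothetical, the spec is flattened to two strings, and letting the CAPI-derived name override an explicit secret reference simply mirrors the snippet rather than confirmed controller behaviour.

```go
package main

import "fmt"

// clusterRefs is a hypothetical, flattened stand-in for the two references
// on a GitopsCluster spec. An empty string means the reference is not set.
type clusterRefs struct {
	SecretRefName      string
	CAPIClusterRefName string
}

// kubeconfigSecretName returns the name of the secret expected to hold the
// kubeconfig, and false when neither reference is set.
func kubeconfigSecretName(refs clusterRefs) (string, bool) {
	name := refs.SecretRefName
	if refs.CAPIClusterRefName != "" {
		// Same derivation as the snippet: "<capi-cluster-name>-kubeconfig".
		name = fmt.Sprintf("%s-kubeconfig", refs.CAPIClusterRefName)
	}
	return name, name != ""
}

func main() {
	name, ok := kubeconfigSecretName(clusterRefs{CAPIClusterRefName: "dev"})
	fmt.Println(name, ok)
}
```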

// avoid checking the cluster if it's under deletion.
if !cluster.ObjectMeta.DeletionTimestamp.IsZero() {
	return nil
}

Collaborator

Interesting... it's probably still okay to update Status while something is being deleted.

Contributor

// examine DeletionTimestamp to determine if object is under deletion
if !cluster.ObjectMeta.DeletionTimestamp.IsZero() {
	return r.finalize(ctx, cluster)
}

It will return the result of reconciling the deletion.

i.e. it will return with a finalizer still in place if the secret hasn't been removed.
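
A simplified picture of that deletion path, assuming the finalizer is only dropped once the referenced secret is gone. Everything here is hypothetical: the `cluster` type, the `finalize` signature, and the `finalizerName` value are invented for illustration, since the real `r.finalize` body and the value of `controllers.GitOpsClusterFinalizer` are not shown in this thread.

```go
package main

import "fmt"

// finalizerName is a placeholder value, not the controller's real finalizer.
const finalizerName = "example.invalid/gitops-cluster"

// cluster is a hypothetical stand-in holding only the finalizer list.
type cluster struct {
	Finalizers []string
}

// finalize removes the finalizer only when the secret no longer exists.
// It reports whether deletion is still blocked by the finalizer.
func finalize(c *cluster, secretExists bool) (blocked bool) {
	if secretExists {
		// Leave the finalizer in place; the object cannot be deleted yet.
		return true
	}
	kept := make([]string, 0, len(c.Finalizers))
	for _, f := range c.Finalizers {
		if f != finalizerName {
			kept = append(kept, f)
		}
	}
	c.Finalizers = kept
	return false
}

func main() {
	c := &cluster{Finalizers: []string{finalizerName}}
	fmt.Println(finalize(c, true), c.Finalizers)
	fmt.Println(finalize(c, false), c.Finalizers)
}
```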

Development

Successfully merging this pull request may close these issues.

GitopsClusters should only be Ready if "ClusterConnected is Ready too"
3 participants