Skip to content
This repository has been archived by the owner on Jun 24, 2020. It is now read-only.

Update to latest manifestival to apply strategic merge patches to its resources. #243

Merged
merged 2 commits into from
Dec 10, 2019

Conversation

jcrossley3
Copy link
Contributor

Fixes #226 #202 #160

Makes things a little more straightforward to implement whatever direction comes from #163

Proposed Changes

  • The latest manifestival performs the same strategic merging used by kubectl apply
  • Except for ConfigMap -- any change to the KnativeServing config entries will overwrite the corresponding ConfigMaps
  • Superfluous updates of Deployment and Image resources should be fixed
  • Diffs are now logged so it's easier to see why a resource is being updated

Release Note

The serving operator will now apply [strategic merging](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/#merge-patch-calculation) of its resources to any changes made that aren't a part of its embedded manifest should not be overwritten.

@googlebot googlebot added the cla: yes Author(s) signed a CLA. label Dec 6, 2019
@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcrossley3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added cla: no and removed cla: yes Author(s) signed a CLA. labels Dec 7, 2019
This employs the k8s 3-way strategic merging for all manifest
resources except ConfigMaps.

This makes manifestival's Apply behave like kubectl apply.
This should remove the noise in the logs from manifestival trying to
update resources right after they're created. When we convert from
unstructured to typed structs and back, we get zeroed defaults added
to the unstructured objects (usually creationTimestamp) which
manifestival interprets as a change worth persisting.
@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added cla: yes Author(s) signed a CLA. and removed cla: no labels Dec 7, 2019
@knative-metrics-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-knative-serving-operator-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/knativeserving/common/certs.go 92.6% 92.9% 0.3
pkg/reconciler/knativeserving/common/images.go 86.7% 87.3% 0.6

@@ -54,6 +55,8 @@ func CustomCertsTransform(instance *servingv1alpha1.KnativeServing, log *zap.Sug
if err := scheme.Scheme.Convert(deployment, u, nil); err != nil {
return err
}
// Avoid superfluous updates from converted zero defaults
u.SetCreationTimestamp(metav1.Time{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if these would now also be fixed due to the three-way-merge semantics? I.e., do we need to manually set them still?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're relying on this logic to actually remove the field. Without it, we attempt to apply the Time's zero-value set by the typed object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The need for this seems like it could be non-obvious for those implementing new transforms. Would it make sense to move logic which ignores creation timestamp (or something similar) to a location that allows it to be re-used (manifestival, or somewhere else in this package)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I view it as a tradeoff for being able to treat any type in the manifest as Unstructured. I would prefer not to special-case any fields at this time, if we can avoid it.

And though I share your concern about the non-obviousness, keep these in mind:

  • It's only a problem if you convert the unstructured to a structured type and back again. Most transforms (currently) don't do this.
  • The only risk to not clearing the zeroed fields is executing unnecessary updates; annoying and slow, but logically harmless.
  • It will be very obvious to anyone looking at the log that unnecessary updates are occurring because we now log the diff showing why we're updating any resource. I would expect this to trigger the transform author, so it may be less non-obvious than we think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Thanks.

Copy link
Contributor

@garron garron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done an initial pass, but wouldn't say I thoroughly understand the code yet. How extensive is the testing that has been done here? I plan on trying out your changes tomorrow.

These changes do add a fair amount of complexity to what is going on in manifestival (I think this was to be expected given what they are replicating). I do think it would be good to get eyes on this from someone with more deep kubernetes knowledge than I have.

@@ -54,6 +55,8 @@ func CustomCertsTransform(instance *servingv1alpha1.KnativeServing, log *zap.Sug
if err := scheme.Scheme.Convert(deployment, u, nil); err != nil {
return err
}
// Avoid superfluous updates from converted zero defaults
u.SetCreationTimestamp(metav1.Time{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The need for this seems like it could be non-obvious for those implementing new transforms. Would it make sense to move logic which ignores creation timestamp (or something similar) to a location that allows it to be re-used (manifestival, or somewhere else in this package)?

@@ -109,6 +113,10 @@ func updateCachingImage(instance *servingv1alpha1.KnativeServing, u *unstructure
if err != nil {
return err
}
// Cleanup zero-value default to prevent superfluous updates
u.SetCreationTimestamp(metav1.Time{})
delete(u.Object, "status")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The need to delete status here again seems like something that would be hard for someone to know to do. Would it make sense to have manifestival ingore status when comparing for equality? Or is this not a generic/universal enough need?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious about this, too, because the core types don't exhibit this behavior. Note that converting to appv1.Deployment did not set a default status. Because Image is a knative type, I expect there may be some critical difference between the core and knative generated functions. Could it be a bug with Image that it's setting that status field at all?

}
obj, err := scheme.Scheme.New(src.GroupVersionKind())
switch {
case src.GetKind() == "ConfigMap":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this seems OK for now since it allows this to be rolled out with no API change, I would be inclined to move this decision into the consuming package. The specific types that should or should not be merged seems like something we might want to maintain in the operator itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't like special-casing ConfigMap, either. But I didn't want our lack of consensus on that matter (#163) to prevent us from fixing HEAD right now.

My goal was to make ApplyAll behave exactly like kubectl apply so I'm not even really keen to expose an API to override that for any type, but I concede there could be valid reasons for doing so. I just don't think this PR should be blocked by that resolution.

if annotations == nil {
return nil
}
return []byte(annotations[v1.LastAppliedConfigAnnotation])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't totally thought this through (and may be misunderstanding), but I was surprised to see this using the same annotation key as kubectl.

If I'm understanding the implications, that means that any time someone runs kubectl apply, they will update our (the only) original, meaning our next reconcile loop will detect that modified and original no loner match and will revert any changes made via kubectl via a patch.

In terms of how we want the operator to behave, this may be correct/desired, however it was still surprising to me that 3-way merge is only with non-kubectl-based changes. I'm still thinking this through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the goal was for ApplyAll to mimic the behavior of kubectl apply. That may be the wrong goal, but I see the "next reconcile loop" running ApplyAll as being no more or less valid than another invocation of kubectl apply on the same resource.

If I saw two annotations on the resource (one for manifestival and one for kubectl), I'm not entirely sure I could look at the remaining fields and explain how they got that way. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, if we go forward with the KnativeServing CR being the single source of truth, then it might make sense to give manfestival its own annotation key. Seeing both in there would then indicate a user error. 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan to make this as close as possible to "kubectl apply" seems good for now. If we want to diverge that, we can do it later.

@jcrossley3
Copy link
Contributor Author

jcrossley3 commented Dec 10, 2019

How extensive is the testing that has been done here?

I feel good about the coverage. Most of the testcases in there (kind: T) are carried over from the previous code in which I was manually calculating the patches. All of those cases ultimately run this line. And that library of course runs its own extensive tests. And even though only ConfigMaps in the k-s manifest actually follow this code path, I still carried all those tests over just to make sure I didn't break anything.

The strategic merge testcases (kind: Deployment) come from this example. Because its kind is a registered GVK (as are all the resources in the k-s manifest), it ultimately runs this line. And that's the same function kubectl runs, and its library also has extensive tests.

@garron
Copy link
Contributor

garron commented Dec 10, 2019

/lgtm

@knative-prow-robot knative-prow-robot merged commit cdd9d70 into knative:master Dec 10, 2019
jcrossley3 added a commit to jcrossley3/serving-operator that referenced this pull request Dec 11, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Serving operator interferes with serving updates to MutatingWebhookConfiguration
7 participants