(feature): add direct image registry client Unpacker implementation #145
Conversation
Codecov Report

@@            Coverage Diff             @@
##             main     #145       +/-   ##
===========================================
- Coverage   87.02%   49.72%   -37.30%
===========================================
  Files           3        6        +3
  Lines         131      366      +235
===========================================
+ Hits          114      182       +68
- Misses         10      163      +153
- Partials        7       21       +14

☔ View full report in Codecov by Sentry.
// TODO: Add garbage collection to remove any unused
// images/SHAs that exist in the cache

// TODO: Make asynchronous
We don't have one (yet), but I think it's useful to approach writing code for controllers as if there were a long-lived deployment you are on the pager for. Every commit that merges to the codebase deploys, and you need to keep in mind that branch cuts for future releases can happen at any point.
The implication here is that if we were to cut a release of catalogd with this PR, the server would cache image layers forever and never do any cleanup. When adding new features to a controller, prefer to handle (minimally) the full lifecycle.
There are a couple of layers of indirection in the codebase, so I don't know exactly how that basic intuition translates to this case, but writing the full lifecycle code in a controller context is fairly straightforward.
Maybe I'm misunderstanding, but I thought that is what feature gates are used for? As far as I understand it, a feature gate signals that a feature is minimally functional (IMO this one is: it does what it needs to, even if it isn't optimized) but is still a work in progress, is bound to change, and may need optimization. This feature gate is disabled by default, so that implication shouldn't apply unless the gate is explicitly enabled. If I am misunderstanding, I am happy to add the garbage collection logic. My intention here was simply to get the basic implementation in and then iterate from there.
I don't think it would be too complex to do a minimal implementation of garbage collection. As I mentioned above, I don't mind adding it if I misunderstood the message sent by putting this functionality behind a feature gate and whether we can land these optimization-type things iteratively.
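(For illustration: a minimal sweep over a digest-keyed cache directory could be as small as the sketch below. The names and the cache layout are assumptions made for the example, not the PR's actual code.)

```go
// Hypothetical sketch of a minimal cache garbage collector.
package main

import (
	"os"
	"path/filepath"
)

// gcCache removes every entry under cacheDir whose name is not in the
// referenced set (e.g. digests still used by an existing Catalog).
func gcCache(cacheDir string, referenced map[string]bool) error {
	entries, err := os.ReadDir(cacheDir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if referenced[e.Name()] {
			continue
		}
		if err := os.RemoveAll(filepath.Join(cacheDir, e.Name())); err != nil {
			return err
		}
	}
	return nil
}
```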
IMO, it is reasonable to deliver half-baked features behind a disabled-by-default feature gate. It would be good to have a design that enumerates the plan for how the feature is delivered and what the proposed maturity process is for the feature gate. Is there a brief or RFC for this? It seems big enough to warrant one.
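As a concrete reference for what "disabled by default" could look like, here is a sketch using Kubernetes' component-base feature gate machinery; the gate name and wiring are assumptions, not necessarily what catalogd does:

```go
package features

import "k8s.io/component-base/featuregate"

const (
	// ImageRegistryClient guards the direct image registry client
	// Unpacker (the gate name is an assumption for this example).
	ImageRegistryClient featuregate.Feature = "ImageRegistryClient"
)

// Disabled by default while alpha: the new Unpacker is only used
// when the gate is explicitly enabled.
var catalogdFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
	ImageRegistryClient: {Default: false, PreRelease: featuregate.Alpha},
}
```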
I'm less interested in Google Docs than in making development easier. Since lifecycle is the bread and butter of what controllers do, it's usually useful to tackle it (minimally) whole-hog so that you don't create more refactoring work for yourself on code you just wrote.
I don't disagree. I'd guess that we may not all agree on our definitions of "minimally", though. That's where a high-level plan could help achieve alignment on expectations.
There is not.
+1 on the RFC.
image-registry: ## Setup in-cluster image registry
	./test/tools/imageregistry/registry.sh
Have you seen https://github.com/tilt-dev/ctlptl?
Since we're already tilt-enabled, perhaps this is a reasonable tool to evaluate for our cluster setup purposes, especially since it seems to support setting up image registries that are simultaneously accessible to the local dev environment and the in-cluster processes.
Can we spend 30 minutes now (but time-boxed) evaluating this to see if it would be a drop-in replacement?
If it is a good candidate, it would be fine to integrate it in a follow-up.
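For reference, the cluster+registry setup ctlptl describes in its docs is roughly the following (a sketch; the names are examples):

```yaml
# Create a local registry and a kind cluster configured to use it.
apiVersion: ctlptl.dev/v1alpha1
kind: Registry
name: ctlptl-registry
---
apiVersion: ctlptl.dev/v1alpha1
kind: Cluster
product: kind
registry: ctlptl-registry
```

Applied with `ctlptl apply -f cluster.yaml`, this is supposed to make the registry reachable from both the host and the cluster.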
I have seen that. I played around with doing it the kind way and wasn't able to get it working well (there seemed to be resolution issues from the direct client, and it couldn't find the registry using either of the names mentioned in the docs). My understanding is that ctlptl pretty much does the steps from the kind docs for you and would, in theory, have the same results, so I didn't give it a shot.
That being said, I'm happy to time-box an evaluation of it. I'll circle back to investigating this after addressing other things, since this would be a follow-up anyway.
I gave this a shot and ran into the same issues I had with the approach documented by kind. It just can't seem to resolve the registry URL. Maybe this is just an issue on my machine, though? I haven't tested it on my Mac, but I've had kind-related troubles in the past on my work laptop running Fedora, so 🤷
Okay, sounds good. Maybe make a tracker for this, even if it's just a way to mentally bookmark this apparent gap in the ecosystem for cluster+registry tooling. It seems like progress is being made, so perhaps we can revisit again in 3-6 months.
test/tools/imageregistry/imgreg.yaml
metadata:
  name: my-selfsigned-ca
  namespace: catalogd-e2e
spec:
  isCA: true
I'm a little confused here. Is this a CA or is this the docker registry's server certificate, or both?
I think what we need is just a self-signed server keypair, and then clients just need the server cert to trust. But maybe I'm missing something.
My understanding is that it is both. The tls.crt and tls.key entries in the generated secret are used by the image registry, and we inject the ca.crt entry into the necessary locations so that clients trust the cert used by the image registry. I tried it a few different ways to no avail, but that could just be due to my lack of experience with cert-manager (and certificate management in general).
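For comparison, a plain self-signed server keypair (no CA) with cert-manager might look like the sketch below; the resource names and DNS name are assumptions:

```yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: selfsigned-issuer
  namespace: catalogd-e2e
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: registry-server-cert
  namespace: catalogd-e2e
spec:
  # Server keypair only; clients then trust this cert directly.
  secretName: registry-server-cert
  dnsNames:
    - docker-registry.catalogd-e2e.svc
  issuerRef:
    name: selfsigned-issuer
    kind: Issuer
```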
Leaving as is for now. This should be addressed by #188
test/tools/imageregistry/imgreg.yaml
spec:
  containers:
  - name: registry
    image: registry:2
This is coming from Docker Hub, which is rate-limited, and can cause flakes. Can we find a registry container in quay (or some other better image registry)?
We could try to use Project Quay as the image registry, but that would require a good bit more effort to investigate how to configure it properly. This image is Apache 2.0 licensed, so if we are concerned about the flakes we could probably pull it from Docker Hub, re-tag it, and push it to a Quay repo?
Another alternative is writing our own simple registry in Go and building+loading that image onto the kind cluster.
I think registry:2 is fine for now. I'm not seeing it obviously mirrored into another registry. Perhaps a follow-up issue and/or a comment in the YAML about this for now?
Sounds good. Just to try it out, I tested what it would take to write a super barebones registry in Go. I was able to do it with:
package main

import (
	"log"
	"net/http"
	"os"

	"github.com/google/go-containerregistry/pkg/registry"
)

func main() {
	certFile, certEnvVarExists := os.LookupEnv("REGISTRY_HTTP_TLS_CERTIFICATE")
	if !certEnvVarExists {
		// No TLS configuration provided; serve plain HTTP.
		// (ListenAndServe blocks; log.Fatal exits on error and
		// prevents falling through to the TLS path below.)
		log.Fatal(http.ListenAndServe(":5000", registry.New()))
	}

	keyFile, keyEnvVarExists := os.LookupEnv("REGISTRY_HTTP_TLS_KEY")
	if !keyEnvVarExists {
		log.Fatal("environment variable REGISTRY_HTTP_TLS_CERTIFICATE specified but REGISTRY_HTTP_TLS_KEY was not")
	}

	// Serve the registry over TLS using the provided keypair.
	log.Fatal(http.ListenAndServeTLS(":5000", certFile, keyFile, registry.New(registry.Logger(log.Default()))))
}
I built and loaded an image with this for e2e, and it passed when run locally. I won't push this up because it was a quick PoC, but I thought it would be worth sharing.
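For anyone curious, the build-and-load step for a PoC like that would be along these lines (the image name and tag are examples):

```sh
# Build the barebones registry image and load it into the kind cluster.
docker build -t poc-registry:e2e .
kind load docker-image poc-registry:e2e
```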
Follow-up issue: #189
@@ -0,0 +1,97 @@
package main
Not sure this really matters, but I'd suggest (for a follow-up) moving the GC function into a separate package so that we can avoid a unit test in the main package.
Left a few more comments about potential follow-ups. But LGTM.
It seems that the selected library does not support mirroring. What is the plan in this regard?
This is a known (and out-of-scope for now) feature that needs to be implemented. The RFC calls this out and says that we'll cover it in a future RFC.
	AuthNamespace string
}

const ConfigDirLabel = "operators.operatorframework.io.index.configs.v1"
Can probably switch to using ConfigsLocationLabel from here without creating a new constant.
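Assuming the linked constant is operator-registry's containertools.ConfigsLocationLabel (the link target isn't preserved in this transcript, so that's an assumption), the switch would look roughly like:

```go
import "github.com/operator-framework/operator-registry/pkg/containertools"

// Reuse the upstream label constant rather than redefining it locally.
dirToUnpack, ok := cfgFile.Config.Labels[containertools.ConfigsLocationLabel]
```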
	return fmt.Errorf("error parsing remote image %q config file: %w", imgRef.Name(), err)
}

dirToUnpack, ok := cfgFile.Config.Labels[ConfigDirLabel]
This comment is applicable here as well.
LGTM
The most important follow-up to capture, for me, would be reassessing the two interfaces, Unpack and Store, and the use of a cache in Unpack in that context.
containers:
- name: manager
  volumeMounts:
  - mountPath: /etc/ssl/certs/
I'd prefer to have this context captured as a comment here somewhere, somehow (I'm not sure what the mechanics are for using comments in kustomization files). Issues aren't available to a reader of the code; comments are, even if the comment just points to the GitHub issue.
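For what it's worth, plain `#` comments are legal YAML in patch files (though kustomize does not carry them into rendered output); e.g., a sketch:

```yaml
containers:
- name: manager
  volumeMounts:
  # Mount the generated registry CA into the system trust store so the
  # manager trusts the in-cluster test registry. Context: see the e2e
  # registry setup discussion on the PR that introduced this mount.
  - mountPath: /etc/ssl/certs/
```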
Description
- […] Unpacker implementations with one that communicates directly with an image registry
- […] Unpacker implementation
- […] Secrets in the catalogd-system namespace. This is so we can pull images that require authentication by using a pull secret

Motivation
- […] UnpackImageRegistryClient feature-gate is enabled #163
- […] Unpack implementation #182