publish arm64 images #893

Open
seankhliao opened this issue Sep 22, 2023 · 15 comments

Comments

@seankhliao

Partly because I'm too lazy to build the images myself: could the project publish multi-platform images for easy deployment onto arm clusters as well?

@seankhliao
Author

When trying to build it myself for linux/arm64, I also saw that a few images (git-sync, gcenode-askpass-sidecar, and resource-group-controller) didn't have their source in this repo

@mikebz
Contributor

mikebz commented Oct 10, 2023

Hi @seankhliao

git-sync is located here: https://github.com/kubernetes/git-sync
gcenode-askpass-sidecar just got moved into this repo https://github.com/GoogleContainerTools/kpt-config-sync/tree/main/cmd/gcenode-askpass-sidecar

@mikebz
Contributor

mikebz commented Oct 10, 2023

@karlkfi and @sdowell can comment on whether or not the linux/arm64

I see:

PLATFORMS := linux_amd64 linux_arm64 darwin_amd64 darwin_arm64 windows_amd64

so it could be that the right platform gets pulled because we are using dockerx

@seankhliao
Author

I did get this working a few days ago
using resource-group-controller built from https://github.com/GoogleContainerTools/kpt-resource-group

The changes I had to make included somehow switching out the amd64 binaries pulled for helm and kustomize, as well as removing the hardcoded GOARCH=amd64 envs in Dockerfiles

@sdowell
Contributor

sdowell commented Oct 10, 2023

> @karlkfi and @sdowell can comment on whether or not the linux/arm64
>
> I see:
>
> PLATFORMS := linux_amd64 linux_arm64 darwin_amd64 darwin_arm64 windows_amd64
>
> so it could be that the right platform gets pulled because we are using dockerx

The PLATFORMS you referenced is used for building the nomos binary. We cross compile the nomos binary, but we currently do not support multi-arch images for the controllers. To support this would require some changes to upstream builds for third party components (opentelemetry, git-sync, resource-group, helm, kustomize), and then we would need to update this repo's build tooling to support multi-arch.
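
To make the cross-compilation point concrete: each PLATFORMS entry is a GOOS/GOARCH pair joined by an underscore. Below is a minimal Python sketch of how such a list maps to Go cross-compile invocations. The output path and package path are hypothetical and are not taken from this repo's Makefile.

```python
# PLATFORMS value quoted earlier in this thread.
PLATFORMS = "linux_amd64 linux_arm64 darwin_amd64 darwin_arm64 windows_amd64"


def build_matrix(platforms: str) -> list:
    """Split each 'os_arch' entry into the GOOS/GOARCH variables the Go toolchain reads."""
    matrix = []
    for entry in platforms.split():
        goos, goarch = entry.split("_", 1)
        matrix.append({"GOOS": goos, "GOARCH": goarch})
    return matrix


def build_command(env: dict, package: str = "./cmd/nomos") -> str:
    """Render one cross-compile command line (output path and package are assumptions)."""
    suffix = ".exe" if env["GOOS"] == "windows" else ""
    out = f"bin/{env['GOOS']}_{env['GOARCH']}/nomos{suffix}"
    return (
        f"GOOS={env['GOOS']} GOARCH={env['GOARCH']} CGO_ENABLED=0 "
        f"go build -o {out} {package}"
    )


if __name__ == "__main__":
    for env in build_matrix(PLATFORMS):
        print(build_command(env))
```

This only covers a standalone binary. The controller images also bundle third-party binaries, which is why the same trick does not carry over to them directly.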

@nresare

nresare commented Nov 22, 2024

I found myself needing an arm64 build of the images, so I had a look at the specifics of what would need to be done. Here is what I found:

  1. There are three different binary artefacts hosted on GCP that are amd64 only: the golang image and the binaries for kustomize and helm. The golang container image might be pretty straightforward, as the upstream Docker Hub version is multi-arch.
  2. The current strategy of downloading a single helm and kustomize binary outside of docker, with a hard-coded target architecture, does not work well. I created new versions of the install shell scripts that are designed to run inside the docker build process and pull binaries with the right architecture for the local environment. This seems to work, but since the GCP-hosted copies don't include an arm64 version, I switched to pulling from upstream.
  3. The target arch is hard coded when building Go binaries. Removing it and building natively for the currently running environment should work well within the current setup, and would also enable multi-arch once that is ready.
  4. As far as I can tell, docker push does not play nicely with sets of images built for multiple architectures with docker buildx. One way to address this is to replace the docker push invocations in the Makefile with docker buildx build --push (building should be quick if a recent build is in the cache).

I'll open pull requests for the above changes. Please view them more as inspiration than as requests to merge.
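
Points 2 and 4 in the list above could look roughly like the following. This is a Python sketch, not the shell scripts or Makefile changes referred to in this comment. The version string and image tag are placeholders, and the URL pattern assumes upstream Helm's get.helm.sh release naming.

```python
import platform

# Map kernel machine names (as reported by `uname -m`) to Go/OCI architecture names.
MACHINE_TO_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def local_arch(machine=None) -> str:
    """Return the Go-style architecture for the current (or given) machine type."""
    name = (machine or platform.machine()).lower()
    if name not in MACHINE_TO_ARCH:
        raise ValueError(f"unsupported machine type: {name}")
    return MACHINE_TO_ARCH[name]


def helm_url(version: str, arch: str, os_name: str = "linux") -> str:
    """Build an upstream Helm download URL (assumed naming; version is a placeholder)."""
    return f"https://get.helm.sh/helm-{version}-{os_name}-{arch}.tar.gz"


def buildx_push_command(tag: str, platforms=("linux/amd64", "linux/arm64"), context: str = ".") -> list:
    """Assemble a `docker buildx build --push` argv that builds and pushes all platforms at once."""
    return [
        "docker", "buildx", "build",
        "--platform", ",".join(platforms),
        "--tag", tag,
        "--push",
        context,
    ]


if __name__ == "__main__":
    print(helm_url("v3.16.3", local_arch()))
    print(" ".join(buildx_push_command("example.com/config-sync/reconciler:dev")))
```

Resolving the architecture inside the build, rather than hard coding it outside, lets the same Dockerfile serve both a native build and a later multi-arch build.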

@karlkfi
Contributor

karlkfi commented Nov 23, 2024

> To support this would require some changes to upstream builds for third party components (opentelemetry, git-sync, resource-group, helm, kustomize), and then we would need to update this repo's build tooling to support multi-arch.

Additionally, our internal forks for CVE patching of these dependencies will need to be updated to build and publish additional artifacts, in order to preserve FedRAMP compliance.

@karlkfi
Contributor

karlkfi commented Nov 23, 2024

> There are three different binary artefacts hosted on gcp that are amd64 only: the golang image and the binaries for kustomize and helm. The golang container image might be pretty straightforward, as the upstream docker hub version is multi-arch.

The Config Sync team does manage the internal helm & kustomize forks, but not the golang image. So we may need to loop in another team, and convince them to publish and maintain arm64 images, or publish our own.

@karlkfi
Contributor

karlkfi commented Nov 23, 2024

I'm not an expert on multi-arch images, but here are some good docs:

One big concern I have is the current requirement that we use checksums when pulling images, to ensure the image contents haven't changed since we updated the image tag to pull. This makes it nearly impossible to use multi-arch images without templating the manifests for each architecture.

This is especially problematic because Config Sync's reconciler-manager manages the reconciler deployments. So templating the manifests used for install isn't good enough. We also need to modify the reconciler-manager to detect or be configured with the architecture in order to select the right images to use for the reconciler and each of its many sidecars.

The whole design of multi-arch images with multiple manifests chosen dynamically has all the same security problems as pulling the latest image all the time, which isn't secure enough for our requirements. AFAICT, the list of manifests isn't validated with a checksum, so there's no way to prove that it hasn't changed since you chose to use that set of manifests. So you have to specify the specific image and its checksum to protect against supply-chain attacks.

@karlkfi
Contributor

karlkfi commented Nov 23, 2024

> This is especially problematic because Config Sync's reconciler-manager manages the reconciler deployments. So templating the manifests used for install isn't good enough. We also need to modify the reconciler-manager to detect or be configured with the architecture in order to select the right images to use for the reconciler and each of its many sidecars.

Actually, since the reconciler template is already in a ConfigMap, we might be able to modify that for each arch at install time. So we may not need to modify reconciler-manager. But we would need to publish install manifests for each architecture, probably built with kustomize.

@seankhliao
Author

seankhliao commented Nov 23, 2024

Re checksums, pinning to the digest of the multiarch image index should be sufficient? The underlying container runtime will handle resolving that to the actual architecture it needs.

Example of a multiarch image (that I happen to be working on atm):

crane manifest ghcr.io/seankhliao/moo-d8d32f1a62b378aeddd6a6715da2ca45@sha256:751ab863cc99e436e6d47b4e0d7183f5db7322ff838d8b767744c65e06a698db | jq -r .

The runtime will resolve it to the matching platform automagically:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 2538,
      "digest": "sha256:aef26d93e01e0d587c5de0ba1813b81e2bd488caeb2bbead90994902bf88317b",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 2538,
      "digest": "sha256:3afa10fecdb8b3482152a2945f1bf149e0ca3204a4f74ea8a9773b4a9274ece5",
      "platform": {
        "architecture": "arm64",
        "os": "linux",
        "variant": "v8"
      }
    }
  ],
  "annotations": {
    "org.opencontainers.image.base.digest": "sha256:51ab1c8dd85f85010207db36dc5a5a3212ff42e482f6b324b7c9884956ffd293",
    "org.opencontainers.image.base.name": "gcr.io/distroless/static-debian12@sha256:3a03fc0826340c7deb82d4755ca391bef5adcedb8892e58412e1a6008199fa91"
  }
}
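
As a rough illustration of the resolution step, here is a Python sketch that picks the per-platform manifest out of an image index. The JSON is trimmed from the output above. The matching logic is a simplification: it ignores the variant field, which real container runtimes may also take into account.

```python
import json

# Trimmed copy of the image index shown above (sizes and annotations omitted).
INDEX = json.loads("""
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:aef26d93e01e0d587c5de0ba1813b81e2bd488caeb2bbead90994902bf88317b",
      "platform": {"architecture": "amd64", "os": "linux"}
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:3afa10fecdb8b3482152a2945f1bf149e0ca3204a4f74ea8a9773b4a9274ece5",
      "platform": {"architecture": "arm64", "os": "linux", "variant": "v8"}
    }
  ]
}
""")


def select_manifest(index: dict, os_name: str, arch: str) -> str:
    """Return the digest of the first manifest matching the requested OS and architecture."""
    for entry in index.get("manifests", []):
        plat = entry.get("platform", {})
        if plat.get("os") == os_name and plat.get("architecture") == arch:
            return entry["digest"]
    raise LookupError(f"no manifest for {os_name}/{arch}")


if __name__ == "__main__":
    print(select_manifest(INDEX, "linux", "arm64"))
```

An arm64 node pulling by the index digest ends up at the arm64 manifest listed in the index, and an amd64 node ends up at the amd64 one, without any change to the image reference.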

@karlkfi
Contributor

karlkfi commented Nov 23, 2024

Ah, ok, cool. TIL the checksum we use to pull with is actually the checksum of the image manifest, not the checksum of the image itself.

That makes things simpler, thanks.
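
Why pinning the index digest is enough can be shown with a small sketch. A sha256 digest is the hash of the exact bytes of the referenced document, and the index embeds each per-platform manifest's digest, so changing any platform's image changes the index digest too. The index below is a made-up minimal example with its own ad hoc serialization, not real registry output.

```python
import hashlib
import json


def digest(raw: bytes) -> str:
    """Content digest in the 'sha256:<hex>' form used for image references."""
    return "sha256:" + hashlib.sha256(raw).hexdigest()


def make_index(manifest_digests: dict) -> bytes:
    """Serialize a minimal, hypothetical image index listing one manifest per architecture."""
    doc = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.index.v1+json",
        "manifests": [
            {"digest": d, "platform": {"architecture": arch, "os": "linux"}}
            for arch, d in sorted(manifest_digests.items())
        ],
    }
    return json.dumps(doc, sort_keys=True).encode()


# Stand-ins for the bytes of two per-platform image manifests.
amd64 = digest(b"amd64 manifest bytes")
arm64 = digest(b"arm64 manifest bytes")

# The digest a deployment would pin.
pinned = digest(make_index({"amd64": amd64, "arm64": arm64}))

# Replacing the arm64 image changes its manifest digest, and therefore the index digest.
tampered = digest(make_index({"amd64": amd64, "arm64": digest(b"tampered arm64 manifest bytes")}))

if __name__ == "__main__":
    print("pinned:  ", pinned)
    print("tampered:", tampered)
    print("index digest changed:", pinned != tampered)
```

The same chaining continues one level down: each image manifest lists the digests of its config and layers.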

@nresare

nresare commented Nov 23, 2024

It turns out that, to bring up reconciler-manager, there is one more referenced image that is only built for amd64: gcr.io/config-management-release/otelcontribcol. It is a bit of a mystery binary, but if I change the command path from /otelcontribcol to /otelcol-contrib I can use the upstream otel/opentelemetry-collector-contrib image and at least get it to start.

Are there any details anywhere about how gcr.io/config-management-release/otelcontribcol is built and from which sources?

Here are the details on what I did to switch to upstream otel: nresare@18d3725
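
The swap described above amounts to changing two fields of the collector container. Here is a hedged Python sketch operating on a Kubernetes-style container dict. The input shape is a simplified stand-in, and the linked commit remains the authoritative record of what was actually changed.

```python
import copy

UPSTREAM_IMAGE = "otel/opentelemetry-collector-contrib"
OLD_BINARY = "/otelcontribcol"
NEW_BINARY = "/otelcol-contrib"


def use_upstream_otel(container: dict) -> dict:
    """Return a copy of the container spec pointing at the upstream image and entrypoint."""
    patched = copy.deepcopy(container)
    # Untagged for brevity; a real deployment would pin a tag or digest.
    patched["image"] = UPSTREAM_IMAGE
    if "command" in patched:
        # Only the binary path changes; any other command elements are kept as-is.
        patched["command"] = [
            NEW_BINARY if part == OLD_BINARY else part for part in patched["command"]
        ]
    return patched


if __name__ == "__main__":
    original = {
        "name": "otel-agent",  # hypothetical container name
        "image": "gcr.io/config-management-release/otelcontribcol",
        "command": ["/otelcontribcol"],
    }
    print(use_upstream_otel(original))
```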

@karlkfi
Contributor

karlkfi commented Nov 23, 2024

The otel-collector image is also built from an internal fork to patch CVEs. But I think it's a few versions behind, and they've changed the binary entrypoint upstream recently.

@nresare

nresare commented Nov 30, 2024

I created a blog post with the details on my little effort: https://noa.re/posts/configsync-on-arm64/

I would very much like to see that page become obsolete by having you support arm64 in official releases. At least now there is some prior art that will simplify figuring out the effort needed to do this.

Ideally I would like to see a piecemeal approach, where you perhaps update the build infrastructure for the forks you maintain to include both architectures. With Go supporting cross compilation, and the binaries you already make available having the architecture encoded in their names, it might not be such a big effort. This does not need to mean that you commit to supporting arm64 officially.
