Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CI on ARM64 #3045

Merged
merged 1 commit into from
Aug 22, 2024
Merged

Support CI on ARM64 #3045

merged 1 commit into from
Aug 22, 2024

Conversation

yankay
Copy link
Contributor

@yankay yankay commented May 31, 2024

To support the ARM64 better, add it to the CI. And to make the CI pass, there are some changes:

  • Because the fluentd image is not a multi-arch image, so using the arm64 image name
  • Because the arm64 system is compatible with armv7, the multi_platform_linux_test.go behavior is different.

@yankay yankay force-pushed the support-ci-arm64 branch 3 times, most recently from 41ce94e to b64eca6 Compare May 31, 2024 12:24
@yankay yankay force-pushed the support-ci-arm64 branch 2 times, most recently from 2435baa to e0f42cf Compare July 3, 2024 02:16
@yankay yankay mentioned this pull request Jul 4, 2024
@yankay yankay force-pushed the support-ci-arm64 branch 5 times, most recently from d01a271 to 6449ca8 Compare July 15, 2024 13:05
@yankay
Copy link
Contributor Author

yankay commented Jul 16, 2024

ref: containerd/containerd#10466

@yankay yankay force-pushed the support-ci-arm64 branch 3 times, most recently from 2fb1b87 to 9bf3fad Compare July 16, 2024 11:24
@yankay yankay changed the title [WIP]support ci arm64 Support CI on ARM64 Jul 16, 2024
@yankay yankay force-pushed the support-ci-arm64 branch from 9bf3fad to c62a700 Compare July 16, 2024 11:55
@yankay yankay requested a review from AkihiroSuda July 16, 2024 11:56
@@ -317,3 +317,24 @@ jobs:
run: sudo vagrant up --provision-with=test-unit
- name: test-integration
run: sudo vagrant up --provision-with=test-integration

test-integration-arm64:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just merge this into test-integration with a matrix for runs-on ?

Dockerfile Outdated
@@ -356,4 +356,8 @@ FROM test-integration AS test-integration-ipv6
CMD ["gotestsum", "--format=testname", "--rerun-fails=2", "--packages=github.com/containerd/nerdctl/v2/cmd/nerdctl/...", \
"--", "-timeout=30m", "-args", "-test.kill-daemon", "-test.ipv6"]

FROM test-integration AS test-integration-arm64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to be a separate stage AFAICS

@@ -398,6 +398,13 @@ func TestRunSigProxy(t *testing.T) {
}
}

func fluentdImage() string {
if runtime.GOARCH == "arm64" {
return "fluent/fluentd:v1.14-arm64-debian"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be defined as a const in testutil

@@ -41,6 +41,7 @@ var (
MariaDBImage = mirrorOf("mariadb:10.5")
DockerAuthImage = mirrorOf("cesanta/docker_auth:1.7")
FluentdImage = mirrorOf("fluent/fluentd:v1.14-1")
FluentdImageArm64 = mirrorOf("fluent/fluentd:v1.14-arm64-debian")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v1.17.0-debian-1.0 is a proper multi-arch image. Do we really need to have multiple images and testing here?

@yankay yankay force-pushed the support-ci-arm64 branch 3 times, most recently from c6f4b5b to 1deba48 Compare August 6, 2024 03:02
@@ -159,6 +166,10 @@ RUN uname -m > /usr/share/nginx/html/index.html
"http://127.0.0.1:8081": "aarch64",
"http://127.0.0.1:8082": "armv7l",
}
// using aarch64 instead of armv7l because the arm64 system is compatible with armv7.
Copy link
Contributor

@apostasie apostasie Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @yankay I do not understand why we should do that / what is it bringing us (actually I do not understand how that can work? - this clearly fails for me locally).

Copy link
Contributor Author

@yankay yankay Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @apostasie

Ref to containerd/containerd#10466. The ctr has the same behavior as nerdctl.
When we run armv7 on x86, the binfmt uses qemu to emulate. So the uname -m commands show armv7 .
But when we run armv7 on aarch64, it does not need to emulate the Linux runs the binary with native fallback execution mode. So the uname -m commands show aarch64 .

Ref to: docker/buildx#198

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My host is on arm64 (with qemu installed, as we do on the CI):

$ uname -m
aarch64
$ nerdctl run --rm --platform arm alpine uname -m
armv7l

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tonis comment (in docker/buildx#198 (comment) ) makes it clear that this is without qemu.

But here we do enable qemu - which is why I am confused at how this could work (especially since it does not, locally)

Copy link
Contributor Author

@yankay yankay Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MY test environment is

root@iZj6c145pjnolrzzpcuvx2Z:~# uname -a
Linux iZj6c145pjnolrzzpcuvx2Z 5.15.0-76-generic #83-Ubuntu SMP Thu Jun 15 19:21:56 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

root@iZj6c145pjnolrzzpcuvx2Z:~# ctr run --rm  --platform=linux/arm/v7 docker.io/library/alpine:3.13 a uname -m 
aarch64 

It's strange that the x86_64 emulates correctly, so the qemu is installed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly the call to binfmt on the CI did not install the arm emulator, while it did on my machine. As to the why, my money is on the fact that the M1 may not have the magic native execution fallback thing.

Anyhow:
a. we should pin toniis image instead of using latest
b. looking at the source of it, it has clearly grown in complexity and trying too hard to do "magic" IMHO - --install all clearly no longer means --install all, it is now "install what it thinks you need". I would suggest we stop using it entirely or at least we stop doing --install-all and we instead explicitly install the architectures we need

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @apostasie . It works. 👍

@yankay yankay force-pushed the support-ci-arm64 branch 3 times, most recently from e95ec45 to c0c0a01 Compare August 16, 2024 02:59
@yankay yankay requested a review from apostasie August 16, 2024 08:52
Signed-off-by: Kay Yan <[email protected]>
@yankay yankay closed this Aug 16, 2024
@yankay yankay reopened this Aug 16, 2024
@yankay yankay requested a review from AkihiroSuda August 16, 2024 09:51
Copy link
Contributor

@apostasie apostasie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @yankay !

Glad we are testing on arm64!

@@ -40,7 +40,7 @@ var (
WordpressIndexHTMLSnippet = "<title>WordPress &rsaquo; Installation</title>"
MariaDBImage = mirrorOf("mariadb:10.5")
DockerAuthImage = mirrorOf("cesanta/docker_auth:1.7")
FluentdImage = mirrorOf("fluent/fluentd:v1.14-1")
FluentdImage = "fluent/fluentd:v1.17.0-debian-1.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we generally avoid Docker Hub because of the dreaded 429? (which I guess is why we mirror stuff on ghcr?)
I also think this approach is problematic (as obviously publishing there is a process we cannot control from this repo).
I would prefer a solution where we cache the images on the CI.
Also, we might want to start pulling by digest for these.

Anyhow, this is fine ^ - and these problems can be addressed later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @yankay

@apostasie
Copy link
Contributor

@AkihiroSuda this here looks fine to me now (no more platform specialization / monkeying with the binfmt force install patch).

Do you think we could merge this?

@apostasie
Copy link
Contributor

@fahedouch @djdongjin

Akihiro seems to be busy / out in the past few days.

Do you think we could get this reviewed?

@djdongjin
Copy link
Member

I can take a look today

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, merging, but please consider using GHCR to avoid rate limit

@AkihiroSuda AkihiroSuda merged commit aa31c02 into containerd:main Aug 22, 2024
39 of 40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ci e.g., CI failure platform/ARM ARM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants