Dockerfile: remove /build/core/.git directory to overwrite with submodule link #442
Conversation
I suppose you are trying to fix CI of #441.
This used to work before. Something else is going on here. I don't think we should resort to workarounds just yet.
I'll try to debug via SSH...
Dockerfile (outdated)

```diff
@@ -89,6 +89,7 @@ RUN rm $VIRTUAL_ENV/bin/pip* && apt-get purge -y python3-pip && python3 -m venv
 # copy (sub)module directories to build
 # (ADD/COPY cannot copy a list of directories,
 # so we must rely on .dockerignore here)
+RUN rm -rf /build/core/.git
```
How can that help at all? The submodules are copied in the follow-up line.
The problem is that the `COPY . .` in the following line fails: the `/build/core/.git` that comes with the base image is a directory, but since `core` is a git submodule in ocrd_all, `/build/core/.git` in the build context is a file pointing to the parent git dir (`gitdir: ../.git/modules/core`). And `COPY` (as well as `cp` and `mv`) will abort when trying to overwrite a directory with a non-directory. I get the same behavior locally, so it is not just in the CD.
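The behavior is easy to reproduce outside of Docker, by the way. A minimal sketch (the paths here are made up for illustration, not the real build context):

```sh
# the base image ships a full clone, so .git is a directory
mkdir -p base/core/.git
# the submodule checkout has .git as a plain file pointing to the superproject
mkdir -p ctx/core
echo "gitdir: ../.git/modules/core" > ctx/core/.git
# copying the context over the clone aborts with something like:
# cp: cannot overwrite directory 'base/core/.git' with non-directory
cp -r ctx/. base/
```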
Oh, right! I forgot about that (the base stage already yields an independent clone).
So this also raises the question (again) of what is supposed to happen if the submodule version of core deviates from the base stage version of core. For now we settled on the approach of determining core's version from the submodule and passing that as the tag for the base stage in a Docker build. Perhaps we should put an assertion into the Dockerfile like so:
```dockerfile
# from-stage already contains a clone clashing with build context
RUN rm -rf /build/core/.git
COPY . .
# verify we got the right version of core
RUN COREV=`git -C /build/core describe --tags --abbrev=0` && [[ "$BASE_IMAGE" =~ :$COREV ]]
```
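For context, the determine-and-pass part boils down to something like the following on the command line (a sketch of the idea, not the literal Makefile code):

```sh
# derive the tag from the core submodule and hand it to the build as BASE_IMAGE
CORE_VERSION=$(git -C core describe --tags --abbrev=0)
docker build --build-arg BASE_IMAGE=ocrd/core:$CORE_VERSION -t ocrd/all .
```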
At any rate, I suggested adding a dedicated comment for the `RUN rm` step and moving it a few lines up, because the existing comment still applies to `COPY` only.
> So this also raises the question (again) of what is supposed to happen if the submodule version of core deviates from the base stage version of core.
Considering that we derive `CORE_VERSION` from `git describe` and derive `BASE_IMAGE` from `CORE_VERSION` in the Makefile, I don't see where this might actually happen. If there is no versioned Docker image matching the version in the `core` submodule (i.e. the CD of core failed and we did not notice), then the build will fail at the `FROM $BASE_IMAGE` line. The only scenario I can think of is that the `core` submodule contains unreleased commits, in which case there could be a mismatch, e.g. it is not at version `vX` but at `vX-commitsha`; i.e. we should not use the `--abbrev=0` option, so only a core submodule at the exact commit of a tag should be acceptable.
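To illustrate (assuming a submodule checkout that is one commit past the `v2.66.1` tag, as in the example below):

```sh
git -C core describe --tags --abbrev=0   # prints v2.66.1, hiding the unreleased commit
git -C core describe --tags              # prints v2.66.1-1-gd960be4e, exposing it
```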
Hence, I've adapted your proposed check to fail the build in this case with a message like
`ocrd/core:v2.66.1 inconsistent with core version v2.66.1-1-gd960be4e`
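Roughly, it amounts to something like this (a sketch of the idea, assuming a POSIX shell behind `RUN`; the committed version may differ in detail):

```dockerfile
ARG BASE_IMAGE
# fail unless the core submodule sits exactly on the tag the base image was built from
RUN COREV=$(git -C /build/core describe --tags) && \
    case "$BASE_IMAGE" in \
        *:"$COREV") true ;; \
        *) echo "$BASE_IMAGE inconsistent with core version $COREV"; exit 1 ;; \
    esac
```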
Considering this is now a prerequisite for #441, I'll merge once CI is happy.
Perfect!
CI failed as expected, which is good, but I'm updating core in this PR as well to make sure it won't fail once everything is set up properly.
Ah, but of course there is a catch: the check won't work if we derive from a non-versioned Docker base image. I still think the check is a good idea and will open an issue for it, but I'm commenting it out for now because we really need a new release urgently.
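One conceivable way to keep it (just a sketch of the idea, not what this PR does) would be to enforce the check only when the base image actually carries a version-like tag:

```dockerfile
ARG BASE_IMAGE
# hypothetical: only enforce the consistency check when the base image has a version tag
RUN case "$BASE_IMAGE" in \
        *:v[0-9]*) COREV=$(git -C /build/core describe --tags) && \
                   case "$BASE_IMAGE" in \
                       *:"$COREV") true ;; \
                       *) echo "$BASE_IMAGE inconsistent with core version $COREV"; exit 1 ;; \
                   esac ;; \
        *) echo "skipping core version check for $BASE_IMAGE" ;; \
    esac
```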
It looks like the build now exceeds the 1h time limit...
That is the correct behavior as of #436. Upping the …
That's odd. We used to be well below 1h with the new multi-stage build (core → minimum → medium → maximum), which we switched to recently in #436; in fact, it was just 47 min. It now times out during layer export. Perhaps in this case the network side merely happened to be slow? Or is …
Ah, I did not notice this was the most recent change. Thanks!
Otherwise the build fails. I am not sure how we introduced this, and this workaround is not pretty (`rm -rf /build/core/.git`), so if somebody has a better solution, I'd be happy to change it.