Slim container builds #1676
Conversation
With commit 6e82261 the container starts and runs all scripts as before, bringing the node service online, and it has generated a block during its leadership slot. I'm going to take a second pass here shortly at reviewing the layers to determine whether there is any observable benefit to splitting the steps further. I'm also open to suggestions to recombine steps if we can maintain the net benefits.
I don't see any other low-hanging fruit for reducing image size or sync times. I'm moving this out of draft to start review and any discussion.
Took one last pass.
The OS-level dependencies are now their own 754 MB layer, and the static binaries are their own 411 MB layer. The end result of a change like this: if the cardano binaries have a new release but the container's OS packages do not change, a single 411 MB layer is part of the pull instead of a 1.39 GB layer, a 70% reduction in the size of the layer to be pulled. Small changes where only CNCLI, Ogmios, etc. change, but the cardano binaries do not, likewise result in only the 411 MB being pulled rather than the 1.39 GB layer. This scenario may not be the most common, but splitting the other layers also means that small OS changes result in a ~50 MB, ~162 MB, or ~750 MB layer being pulled instead of the large single layer.
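As a rough sketch of the split (the base image, package names, and download URL below are illustrative placeholders, not the actual Dockerfile):

```Dockerfile
# Sketch only: base image, packages, and URL are illustrative placeholders.
FROM debian:stable-slim

# OS-level dependencies isolated in their own layer (~754 MB in this build).
# If the package set does not change, this layer is reused across releases.
RUN apt-get update && apt-get install -y --no-install-recommends \
        curl jq tmux \
    && rm -rf /var/lib/apt/lists/*

# Static cardano binaries in their own layer (~411 MB). A new cardano
# release invalidates only this layer, leaving the OS layer above untouched.
RUN curl -sL "https://example.com/cardano-node-linux.tar.gz" \
    | tar -xz -C /usr/local/bin
```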
This overlooked the lack of an Actions cache / buildx usage, so a new image layer would be generated for each layer regardless of changes. It does still provide better parallelization by downloading multiple layers at once, which can reduce transfer times. If we want to optimize further, I can enable the cache in a separate pull request. That would reduce the size of the data being pushed to Docker Hub, and would then actually produce the behavior described above, reducing the number and size of the layers pulled by an SPO using the container.
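For reference, a minimal sketch of what enabling that cache might look like using buildx with the GitHub Actions cache backend (the image tag is a placeholder, and the `type=gha` backend only works inside an Actions runner):

```sh
# Sketch only: image tag is a placeholder; type=gha requires a GitHub Actions runner.
docker buildx build \
  --cache-from type=gha \
  --cache-to type=gha,mode=max \
  -t example/cardano-node:latest \
  --push .
```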
Will test in a couple of days, unless others get to it sooner.
If a build cache is desirable, I will submit another commit that reorders the operations to reduce the steps that cause cache invalidation. Otherwise it is ready for review.
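The general principle would be to order steps from least to most frequently changed, for example (a sketch, not the actual file; only the scripts path comes from this PR):

```Dockerfile
# Sketch only: stable OS packages first, so they are rarely invalidated.
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Frequently edited scripts last, so a script tweak rebuilds only this layer
# instead of invalidating the package layers above it.
ADD scripts/ /opt/cardano/cnode/scripts/
```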
I'm going to double-check that my guildnet pool is running a container build that includes all commits from this branch, which I believe it does. I want to perform some light due diligence before clicking merge since it's been a few weeks.
Following the buildx decision, I restored the position of the script ADDs. Tested the rest on guildnet and the node produces blocks, etc. The total size is now ~1.46 GB, down from ~1.87 GB.
Description
Simply changes the order of operations to facilitate slimming the container by merging some steps.
Separating the "final tools" into their own step moves ~60 MB into a unique layer, which allows adding one or two packages to the final layers with quick rebuilds. The CI does not leverage a build cache, but developers may still benefit from shorter feedback loops when developing add-on features/functionality.
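A sketch of what that isolated step might look like (the package names are placeholders for the actual "final tools"):

```Dockerfile
# Sketch only: package names are placeholders for the "final tools".
# Keeping them in their own ~60 MB layer means adding one or two packages
# here rebuilds only this step rather than the large base layers.
RUN apt-get update \
    && apt-get install -y --no-install-recommends sqlite3 bc \
    && rm -rf /var/lib/apt/lists/*
```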
Where should the reviewer start?
- `apt-get update` is run again after the earlier purge/cleanup.
- `mkdir -p $CNODE_HOME/priv/files` could be moved into an earlier step, allowing the ADD to sit in the Lines 40-43 block of ADD steps; line 90 can then be removed, as line 63 now ensures every script added to `/opt/cardano/cnode/scripts` is marked as executable after the ADD/curl operations.
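A sketch of the suggested reordering (only the paths come from the PR; the surrounding steps, and the assumption that CNODE_HOME is set by an earlier ENV, are illustrative):

```Dockerfile
# Sketch only: assumes CNODE_HOME is set by an earlier ENV instruction.
# Create the directory in an earlier step...
RUN mkdir -p $CNODE_HOME/priv/files

# ...so the ADD can join the existing block of ADD steps, and one chmod
# afterwards marks every added script executable (making line 90 redundant).
ADD scripts/ /opt/cardano/cnode/scripts/
RUN chmod +x /opt/cardano/cnode/scripts/*
```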
Motivation and context
Examining the history of the image, there is a 405 MB layer towards the end that is actually a duplication of data from the 1.39 GB layer.
The current container size is 1.87 GB.
Before: [image]
This change results in the container being 1.51 GB.
After: [image]
The result is a net ~360 MB reduction with no changes to packages.
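The per-layer sizes described above can be inspected with docker history (the image name is a placeholder):

```sh
# Sketch only: image name is a placeholder.
docker history example/cardano-node:latest
```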
Which issue does it fix?
No issue.
How has this been tested?
A standard container build was successful. Leaving this PR in draft for the moment, mostly to open discussion on the topic. The guild network pool will be upgraded today for a basic functionality test.