[WIP] this script allows us to compare two images #669
base: main
Not bad, but the functionality seems different. This script compares packages, whereas the linked tool compares files in the images. I guess it could be added to the script, but then it would require us to have the tool preinstalled.

@caponetto Actually, I tried that tool before I started implementing my own. I didn't like its output much: it's too verbose and shows the complete differences between the images at the file level instead of the package level. I'm not sure whether that is what we want, which is why I went our own way.

Cool! I'm glad you've already considered it.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
#!/bin/bash | ||
# | ||
# This script serves to compare two docker images using skopeo tool. This gives | ||
# a brief information regarding the following image differences: | ||
# - size | ||
# - architecture | ||
# - operating system | ||
# - config | ||
# - default user | ||
# - exposed ports | ||
# - environment variables | ||
# - entrypoint | ||
# - working directory | ||
# - labels | ||
# - Python packages | ||
# - RPM packages | ||
# | ||
# It uses the skopeo TODO downloads images locally... | ||
# | ||
# Local execution: ./ci/compare-images.sh <image-1> <image-2> | ||
# Note: <image-*> is in the format <repository@sha256:SHA> | ||
# | ||
# Example usage: | ||
# ./ci/compare-images.sh quay.io/opendatahub/workbench-images@sha256:e92bf20e127e545bdf56887903dc72ad227082b8bc23f45ff4f0fc67e6430318 ghcr.io/jiridanek/notebooks/workbench-images:base-ubi9-python-3.9-jd_ubi_base_adedd4a943977ecdcb67bc6eb9eda572d10c3ddc | ||
|
||
shopt -s globstar | ||
What about
https://vaneyckt.io/posts/safer_bash_scripts_with_set_euxo_pipefail/ to fail if something needed is missing? I happen to not have skopeo, so
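A minimal sketch of what the suggestion above could look like, assuming the script should stop early when a dependency is absent. The `require_tools` helper is hypothetical and not part of this PR. The linked post also enables `-x` (command tracing), which is left out here to keep the output readable.

```shell
#!/usr/bin/env bash
# Abort on command errors, on unset variables, and on failures inside pipelines.
set -euo pipefail

# Hypothetical helper (not in the PR): report every required tool that is
# missing from PATH and return non-zero if at least one is absent.
require_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "${tool}" > /dev/null 2>&1; then
            echo "Error: required tool '${tool}' is not installed." >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Possible call near the top of the script, listing the tools it relies on:
# require_tools skopeo jq podman diff
```

With `set -e` active, an uncommented `require_tools` call that fails would end the script before any image is inspected.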
|
||
|
||
|
||
function gather_metadata() { | ||
local image="${1}" | ||
local tmp_dir="${2}" | ||
|
||
local ret_code=0 | ||
|
||
echo "Gathering the metadata for the image: '${image}'" | ||
|
||
local image_sha | ||
image_sha=$(echo "${image}" | cut -d ':' -f2) | ||
echo "Image SHA: '${image_sha}'" | ||
|
||
# Get image size | ||
skopeo inspect --raw "docker://${image}" | jq '[ .layers[].size ] | add' > "${tmp_dir}/${image_sha}-size.txt" | ||
|
||
# Get image metadata | ||
skopeo inspect --config "docker://${image}" | jq -r '.architecture,.os,.config' > "${tmp_dir}/${image_sha}-metadata.txt" | ||
|
||
# If we don't want to download the image, then we may consider to utilize the quay.io info: | ||
# e.g.: https://quay.io/repository/opendatahub/workbench-images/manifest/sha256:f5a2c0666b5b03d68e6f9f2317b67f9bc5c3f4bd469bb7073dd144a33892f63a?tab=packages | ||
# Disadvantage is that it takes some time this info is available on the quay | ||
|
||
|
||
# Get image Python packages list | ||
podman run --entrypoint /usr/bin/pip --rm -it "${image}" list > "${tmp_dir}/${image_sha}-global-pip.txt" | ||
podman run --entrypoint /opt/app-root/bin/pip --rm -it "${image}" list > "${tmp_dir}/${image_sha}-local-pip.txt" | ||
Comment on lines +53 to +54
|
||
|
||
# Get image RPM packages list | ||
podman run --entrypoint /usr/bin/rpm --rm -it "${image}" "-qa" > "${tmp_dir}/${image_sha}-rpms.txt" | ||
|
||
echo "Metadata for image '${image}' gathered." | ||
} | ||
|
||
function compare_metadata() { | ||
local tmp_dir="${1}" | ||
|
||
echo "Let's compare the image metadata now:" | ||
|
||
diff -y "${tmp_dir}"/*-size.txt | ||
diff -y "${tmp_dir}"/*-metadata.txt | ||
This is essentially JSON, or it can be printed as JSON, so the diff should use that.
Some people on Stack Overflow suggest
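One way to act on this, sketched under two assumptions: the metadata is saved as a single JSON object (for example with `jq '{architecture, os, config}'` instead of the `jq -r` filter in the diff, which emits bare strings), and `jq` is already available because the script uses it. The `json_diff` helper is a hypothetical name. `jq -S` sorts object keys, so key ordering alone does not show up as a difference.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the PR): diff two JSON files after normalizing
# them with jq. -S sorts object keys and jq pretty-prints one field per line,
# so diff reports changed fields rather than reordered ones.
json_diff() {
    local left="${1}"
    local right="${2}"
    diff -u <(jq -S . "${left}") <(jq -S . "${right}")
}

# Possible call with two metadata files (file names are placeholders):
# json_diff "${tmp_dir}/first-metadata.json" "${tmp_dir}/second-metadata.json"
```

`diff` exits with status 1 when the inputs differ, so under `set -e` a caller would need something like `json_diff a b || true` to continue after a reported difference.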
||
diff -y "${tmp_dir}"/*-global-pip.txt | ||
diff -y "${tmp_dir}"/*-local-pip.txt | ||
diff -y "${tmp_dir}"/*-rpms.txt | ||
Comment on lines +69 to +71
For these, https://man.archlinux.org/man/comm.1.en may work a little bit better?
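A sketch of how `comm` could be applied to the package lists, assuming each list has one package per line. The `package_diff` helper is hypothetical. `comm` expects sorted input, so both files are sorted first, and `LC_ALL=C` is set so that `sort` and `comm` agree on the ordering.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the PR): print the lines unique to each of two
# package lists. comm -23 keeps lines found only in the first file,
# comm -13 keeps lines found only in the second file.
package_diff() {
    local left="${1}"
    local right="${2}"
    echo "Only in ${left}:"
    LC_ALL=C comm -23 <(LC_ALL=C sort "${left}") <(LC_ALL=C sort "${right}")
    echo "Only in ${right}:"
    LC_ALL=C comm -13 <(LC_ALL=C sort "${left}") <(LC_ALL=C sort "${right}")
}

# Possible call with two RPM lists (file names are placeholders):
# package_diff "${tmp_dir}/first-rpms.txt" "${tmp_dir}/second-rpms.txt"
```

Compared with `diff -y`, this prints only the entries that differ, which tends to be easier to scan for long package lists. A version bump appears as one entry in each section, because the lines are compared as whole strings.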
||
} | ||
|
||
function print_results() { | ||
echo "Print results TODO" | ||
} | ||
|
||
# ------------------------------ MAIN SCRIPT --------------------------------- # | ||
|
||
function main() { | ||
local image_1="${1}" | ||
local image_2="${2}" | ||
|
||
local ret_code=0 | ||
|
||
if test $# -ne 2; then | ||
echo "Error: please provide two images for comparison!" | ||
return 1 | ||
fi | ||
|
||
# Create a temporary directory for the gathered metadata | ||
local tmp_dir="" | ||
tmp_dir=$(mktemp -d /tmp/compare-images.XXXXX) | ||
|
||
# Gather the metadata for each image | ||
gather_metadata "${1}" "${tmp_dir}" | ||
gather_metadata "${2}" "${tmp_dir}" | ||
|
||
# Compare the metadata and prepare results | ||
compare_metadata "${tmp_dir}" | ||
|
||
# Print results | ||
print_results | ||
|
||
return "${ret_code}" | ||
} | ||
|
||
main "${@}" | ||
exit $? |
I like this. We should (eventually) have the results of this available every time we update metadata in a release branch of images such as release-2.8, every time we update 2024a and similar, and for every PR that makes changes to images.

Yep, that is exactly my point. I'll create a tracking issue for this so I can plan it for one of the next sprints.
Update: https://issues.redhat.com/browse/RHOAIENG-11254
Also, I realized I can't edit the description of this PR, so once I address your concerns here, I will probably close it and open my own later on 🙂