[WIP] this script allows us to compare two images #669

109 changes: 109 additions & 0 deletions ci/compare-images.sh
Member Author

I like this. We should (eventually) have the results of this available every time we update metadata in a release branch of images (release-2.8 etc.), every time we update 2024a etc., and for every PR that makes changes to images.

Member

@jstourac jstourac Aug 12, 2024

Yep, that is exactly my point. I'll create a tracking issue for this so I can plan it for one of the next sprints.

Update: https://issues.redhat.com/browse/RHOAIENG-11254
Also, I realized I can't edit the description of this PR, so once I address your concerns here, I will probably close it and open my own later on 🙂

Member Author

Have you seen this? Maybe it could be useful.

Not bad, but the functionality seems different. This script compares packages, whereas the linked tool compares files in the images. Guess it could be added to the script, but then it would require us to have the tool preinstalled.

Member

@caponetto Actually, I tried that tool before I started implementing my own thing. I didn't like its output much. It's too verbose and shows the complete differences between the images at the file level instead of the package level. I'm not sure whether this is what we want. That is why I went our own way.

Contributor

Cool! I'm glad you've already considered it.

@@ -0,0 +1,109 @@
#!/bin/bash
#
# This script serves to compare two docker images using skopeo tool. This gives
# a brief information regarding the following image differences:
# - size
# - architecture
# - operating system
# - config
# - default user
# - exposed ports
# - environment variables
# - entrypoint
# - working directory
# - labels
# - Python packages
# - RPM packages
#
# It uses the skopeo TODO downloads images locally...
#
# Local execution: ./ci/compare-images.sh <image-1> <image-2>
# Note: <image-*> is in the format <repository@sha256:SHA>
#
# Example usage:
# ./ci/compare-images.sh quay.io/opendatahub/workbench-images@sha256:e92bf20e127e545bdf56887903dc72ad227082b8bc23f45ff4f0fc67e6430318 ghcr.io/jiridanek/notebooks/workbench-images:base-ubi9-python-3.9-jd_ubi_base_adedd4a943977ecdcb67bc6eb9eda572d10c3ddc

shopt -s globstar
Member Author

What about

set -Eeuxo pipefail

https://vaneyckt.io/posts/safer_bash_scripts_with_set_euxo_pipefail/

to fail if something needed is missing?

I happen to not have skopeo, so

./ci/compare-images.sh quay.io/opendatahub/workbench-images@sha256:e92bf20e127e545bdf56887903dc72ad227082b8bc23f45ff4f0fc67e6430318 ghcr.io/jiridanek/notebooks/workbench-images:base-ubi9-python-3.9-jd_ubi_base_adedd4a943977ecdcb67bc6eb9eda572d10c3ddc
./ci/compare-images.sh: line 26: shopt: globstar: invalid shell option name
Gathering the metadata for the image: 'quay.io/opendatahub/workbench-images@sha256:e92bf20e127e545bdf56887903dc72ad227082b8bc23f45ff4f0fc67e6430318'
Image SHA: 'e92bf20e127e545bdf56887903dc72ad227082b8bc23f45ff4f0fc67e6430318'
./ci/compare-images.sh: line 42: skopeo: command not found
./ci/compare-images.sh: line 45: skopeo: command not found



function gather_metadata() {
    local image="${1}"
    local tmp_dir="${2}"

    local ret_code=0

    echo "Gathering the metadata for the image: '${image}'"

    local image_sha
    image_sha=$(echo "${image}" | cut -d ':' -f2)
    echo "Image SHA: '${image_sha}'"

    # Get image size
    skopeo inspect --raw "docker://${image}" | jq '[ .layers[].size ] | add' > "${tmp_dir}/${image_sha}-size.txt"

    # Get image metadata
    skopeo inspect --config "docker://${image}" | jq -r '.architecture,.os,.config' > "${tmp_dir}/${image_sha}-metadata.txt"

    # If we don't want to download the image, then we may consider to utilize the quay.io info:
    # e.g.: https://quay.io/repository/opendatahub/workbench-images/manifest/sha256:f5a2c0666b5b03d68e6f9f2317b67f9bc5c3f4bd469bb7073dd144a33892f63a?tab=packages
    # Disadvantage is that it takes some time this info is available on the quay


    # Get image Python packages list
    podman run --entrypoint /usr/bin/pip --rm -it "${image}" list > "${tmp_dir}/${image_sha}-global-pip.txt"
    podman run --entrypoint /opt/app-root/bin/pip --rm -it "${image}" list > "${tmp_dir}/${image_sha}-local-pip.txt"
Comment on lines +53 to +54
Member Author

--disable-pip-version-check ?
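A hedged sketch of what that could look like. `list_pip_packages` and the `CONTAINER_ENGINE` override are hypothetical names introduced here for illustration. Two further changes are assumptions rather than anything requested in the review: `-it` is dropped, because a pseudo-terminal is not needed when redirecting to a file and may add carriage returns to the output, and `--format=freeze` is used to get one `name==version` entry per line, which tends to diff more cleanly.

```shell
# Hypothetical helper: list the packages seen by a given pip binary in an image.
# CONTAINER_ENGINE is an assumed override; it falls back to podman.
function list_pip_packages() {
    local image="${1}"
    local pip_path="${2}"

    # --disable-pip-version-check keeps the "new pip release available"
    # notice out of the captured output.
    "${CONTAINER_ENGINE:-podman}" run --rm --entrypoint "${pip_path}" "${image}" \
        list --disable-pip-version-check --format=freeze
}
```

In `gather_metadata` this would be used as `list_pip_packages "${image}" /usr/bin/pip > "${tmp_dir}/${image_sha}-global-pip.txt"`.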


# Get image RPM packages list
    podman run --entrypoint /usr/bin/rpm --rm -it "${image}" "-qa" > "${tmp_dir}/${image_sha}-rpms.txt"

    echo "Metadata for image '${image}' gathered."
}

function compare_metadata() {
    local tmp_dir="${1}"

    echo "Let's compare the image metadata now:"

    diff -y "${tmp_dir}"/*-size.txt
    diff -y "${tmp_dir}"/*-metadata.txt
Member Author

This is essentially JSON, or it can be printed as JSON, so the diff should take advantage of that:

diff <(jq --sort-keys . A.json) <(jq --sort-keys . B.json)

Some people on Stack Overflow suggest jd -set: https://stackoverflow.com/questions/31930041/using-jq-or-alternative-command-line-tools-to-compare-json-files
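A minimal sketch of the key-sorted comparison, wrapped in a hypothetical `diff_json` helper. One assumption to flag: the script currently writes the metadata with `jq -r '.architecture,.os,.config'`, which emits two raw strings followed by an object rather than a single JSON document, so the files would first have to be saved as valid JSON, for example with `jq '{architecture, os, config}'`.

```shell
# Hypothetical helper: compare two JSON files while ignoring key order.
# Requires jq; exits 0 when the documents are equivalent, non-zero otherwise.
function diff_json() {
    local file_a="${1}"
    local file_b="${2}"

    # --sort-keys normalizes object key order so that only real differences
    # in keys or values show up in the diff.
    diff <(jq --sort-keys . "${file_a}") <(jq --sort-keys . "${file_b}")
}
```

Array order is still significant with this approach, so reordered `Env` entries, for instance, would be reported as a difference.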

    diff -y "${tmp_dir}"/*-global-pip.txt
    diff -y "${tmp_dir}"/*-local-pip.txt
    diff -y "${tmp_dir}"/*-rpms.txt
Comment on lines +69 to +71
Member Author

For these, https://man.archlinux.org/man/comm.1.en may work a little bit better?
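A rough sketch of the `comm` idea, assuming one package per line in each list. `only_in_first` is a hypothetical helper name. `comm` expects both inputs to be sorted, so the lists are sorted on the fly, and `LC_ALL=C` is set so that `sort` and `comm` agree on the ordering.

```shell
# Hypothetical helper: print the lines that appear only in the first list.
# -2 suppresses lines unique to the second input, -3 suppresses common lines.
function only_in_first() {
    local list_a="${1}"
    local list_b="${2}"

    LC_ALL=C comm -23 <(LC_ALL=C sort "${list_a}") <(LC_ALL=C sort "${list_b}")
}
```

Calling it once per direction, `only_in_first A B` and then `only_in_first B A`, gives the packages removed and added between the two images without the side-by-side noise of `diff -y`.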

}

function print_results() {
    echo "Print results TODO"
}

# ------------------------------ MAIN SCRIPT --------------------------------- #

function main() {
    local image_1="${1}"
    local image_2="${2}"

    local ret_code=0

    if test $# -ne 2; then
        echo "Error: please provide two images for comparison!"
        return 1
    fi

    # Create a temporary directory for the gathered metadata
    local tmp_dir=""
    tmp_dir=$(mktemp -d /tmp/compare-images.XXXXX)

    # Gather the metadata for each image
    gather_metadata "${1}" "${tmp_dir}"
    gather_metadata "${2}" "${tmp_dir}"

    # Compare the metadata and prepare results
    compare_metadata "${tmp_dir}"

    # Print results
    print_results

    return "${ret_code}"
}

main "${@}"
exit $?