Skip to content

Commit

Permalink
Merge pull request #84 from mdonadoni/apb23-pr
Browse files Browse the repository at this point in the history
Fixes from analysis Preservation Bootcamp 2023 @ Valencia
  • Loading branch information
michmx authored Feb 18, 2024
2 parents e45b9c1 + 39ea07b commit 7095532
Show file tree
Hide file tree
Showing 10 changed files with 160 additions and 61 deletions.
3 changes: 1 addition & 2 deletions _episodes/01-introduction.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,8 +43,7 @@ very lightweight and fast to spin up to run.
>
> Singularity in particular is used widely in HPC, and particularly by CMS, so you may have need to familiarize yourself with it at some point.
>
> The [RECAST FAQ](https://recast-docs.web.cern.ch/recast-docs/faq/#q-how-are-docker-and-singularity-different) includes a brief intro to the important differences between docker and singularity.
> To learn more about singularity, see for example the [HSF Training Singularity Module](https://github.com/hsf-training/hsf-training-singularity-webpage).
> To learn more about singularity, see for example the [HSF Training Singularity Module](https://hsf-training.github.io/hsf-training-singularity-webpage/).
{: .callout}

[docker-tutorial]: https://docs.docker.com/get-started
Expand Down
9 changes: 3 additions & 6 deletions _episodes/02-pulling-images.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,12 +20,9 @@ keypoints:

Much like how GitHub allows for web hosting and searching for code, the [Docker Hub][docker-hub]
image registry allows the same for Docker images.
Hosting and building of images is [free for public repositories][docker-hub-billing] and
Hosting images is [free for public repositories][docker-hub-billing] and
allows for downloading images as they are needed.
Additionally, through integrations with GitHub and Bitbucket, Docker Hub repositories can
be linked against Git repositories so that
[automated builds of Dockerfiles on Docker Hub][docker-hub-builds] will be triggered by
pushes to repositories.
Additionally, you can configure your GitLab or GitHub CI pipelines to automatically build and push docker images to Docker Hub.

# Pulling Images

Expand Down Expand Up @@ -123,7 +120,7 @@ debian buster 00bf7fdd8baf 5 weeks ago
{: .challenge}
[docker-hub]: https://hub.docker.com/
[docker-hub-billing]: https://hub.docker.com/billing-plans/
[docker-hub-billing]: https://www.docker.com/pricing/
[docker-hub-builds]: https://docs.docker.com/docker-hub/builds/
[docker-docs-pull]: https://docs.docker.com/engine/reference/commandline/pull/
[docker-docs-images]: https://docs.docker.com/engine/reference/commandline/images/
Expand Down
7 changes: 4 additions & 3 deletions _episodes/03-running-containers.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,10 +65,11 @@ cat /etc/os-release
{: .source}

~~~
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
Expand Down
8 changes: 7 additions & 1 deletion _episodes/04-file-io.md
Original file line number Diff line number Diff line change
Expand Up @@ -124,7 +124,13 @@ ls *.txt
{: .source}

>## Permission issues
>If you run into a `Permission denied` error, you can use ``sudo`` for a quick fix, if you have superuser access (but this is not recommended).
>If you are using Linux with SELinux enabled, you might run into a `Permission denied` error.
>Note that SELinux is enabled if the output of the command `getenforce status` is `Enforcing`.
>To fix the permission issue, append `:z` (lowercase!) at the end of the mount option, like this:
>```bash
>docker run --rm -it -v $PWD:/home/docker/data:z ...
>```
>If this still does not fix the issue you can disable SELinux by running `sudo setenforce 0`, or you can try using `sudo` to execute docker commands, but both of these methods are not recommended.
{: .callout}
~~~
Expand Down
10 changes: 5 additions & 5 deletions _episodes/05-dockerfiles.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ Docker images are built through the Docker engine by reading the instructions fr
These text based documents provide the instructions through an API similar to the Linux
operating system commands to execute commands during the build.
The [`Dockerfile` for the example image][example-Dockerfile] being used is an example of
some simple extensions of the [official Python 3.6.8 Docker image][python-docker-image].
some simple extensions of the [official Python 3.9 Docker image][python-docker-image] based on Debian Bullseye (`python:3.9-bullseye`).

As a very simple example of extending the example image into a new image create a `Dockerfile`
on your local machine
Expand Down Expand Up @@ -113,8 +113,8 @@ python3 -c "import sklearn as sk; print(sk)"
||----w |
|| ||
scikit-learn 0.21.3
<module 'sklearn' from '/usr/local/lib/python3.6/site-packages/sklearn/__init__.py'>
scikit-learn 1.3.1
<module 'sklearn' from '/usr/local/lib/python3.9/site-packages/sklearn/__init__.py'>
~~~
{: .output}

Expand Down Expand Up @@ -241,9 +241,9 @@ way to bring them into the Docker build.


[docker-docs-builder]: https://docs.docker.com/engine/reference/builder/
[example-Dockerfile]: https://github.com/matthewfeickert/Intro-to-Docker/blob/master/Dockerfile
[example-Dockerfile]: https://github.com/matthewfeickert/intro-to-docker/blob/feat/update-2021-bootcamp/docker/Dockerfile
[python-docker-image]: https://hub.docker.com/_/python
[cowsay]: https://packages.debian.org/jessie/cowsay
[cowsay]: https://packages.debian.org/bullseye/cowsay
[scikit-learn]: https://scikit-learn.org
[docker-docs-build]: https://docs.docker.com/engine/reference/commandline/build/
[docker-docs-ARG]: https://docs.docker.com/engine/reference/builder/#arg
Expand Down
108 changes: 82 additions & 26 deletions _episodes/08-gitlab-ci.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,8 +29,8 @@ The goal of automated environment preservation is to create a docker image that
As we've seen, all these components can be encoded in a Dockerfile. So the first step to set up automated image building is to add a Dockerfile to the repo specifying these components.

> ## The `rootproject/root` docker image
> In this tutorial, we build our analysis environments on top of the `rootproject/root` base image ([link to project area on docker hub](https://hub.docker.com/r/rootproject/root)) with conda. This image comes with root 6.22 and python 3.7 pre-installed. It also comes with XrootD for downloading files from eos.
> The `rootproject/root` is itself built with a [Dockerfile](https://github.com/root-project/root-docker/blob/6.22.06-conda/conda/Dockerfile), which uses conda to install root and python on top of another base image (`continuumio/miniconda3`).
> In this tutorial, we build our analysis environments on top of the `rootproject/root` base image ([link to project area on docker hub](https://hub.docker.com/r/rootproject/root)) with conda. This image comes with root 6.22 and python 3.8 pre-installed. It also comes with XrootD for downloading files from eos.
> The `rootproject/root` is itself built with a [Dockerfile](https://github.com/root-project/root-docker/blob/6.22.06-conda/conda/Dockerfile), which uses conda to install root and python on top of another base image (`condaforge/miniforge3`).
{: .callout}

> ## Exercise (15 min)
Expand Down Expand Up @@ -62,6 +62,7 @@ As we've seen, all these components can be encoded in a Dockerfile. So the first
> ~~~
> {: .source}
>
> Hint: have a look at `skim.sh` if you are unsure about how to complete the last `RUN` statement!
> > ## Solution
> > ~~~yaml
> > # Start from the rootproject/root base image with conda
Expand Down Expand Up @@ -102,39 +103,94 @@ As we've seen, all these components can be encoded in a Dockerfile. So the first
## Automatic image building with github + dockerhub
You can automatically build a docker image every time you push to a repository with github and
dockerhub.
Now, you can proceed with updating your `.gitlab-ci.yml` to actually build the container during the CI/CD pipeline and store it in the gitlab registry. You can later pull it from the gitlab registry just as you would any other container, but in this case using your CERN credentials.
1. Create a clone of the skim and the fitting repository on your private github.
You can use the
[GitHub Importer](https://docs.github.com/en/github/importing-your-projects-to-github/importing-a-repository-with-github-importer)
for this. It's up to you whether you want to make this repository public or private.
2. Create a free account on [dockerhub](http://hub.docker.com/).
3. Once you confirmed your email, head to ``Settings`` > ``Linked Accounts``
and connect your github account.
4. Go back to the home screen (click the dockerhub icon top left) and click ``Create Repository``.
5. Choose a name of your liking, then click on the github icon in the ``Build settings``.
Select your account name as organization and select your repository.
6. Click on the ``+`` next to ``Build rules``. The default one does fine
7. Click ``Create & Build``.
> ## Not from CERN?
> If you do not have a CERN computing account with access to [gitlab.cern.ch](https://gitlab.cern.ch), then everything discussed here is also available on [gitlab.com](https://gitlab.com), which offers CI/CD tools, including the docker builder.
> Furthermore, you can achieve the same with GitHub + Github Container Registry.
> To learn more about these methods, see the next subsections.
{: .callout}
That's it! Back on the home screen your repository should appear. Click on it and select the
``Builds`` tab to watch your image getting build (it probably will take a couple of minutes
before this starts). If something goes wrong check the logs.
Add the following lines at the end of the `.gitlab-ci.yml` file to build the image with Kaniko and save it to the docker registry.
For more details about building docker images on CERN's GitLab, see the [Building docker images](https://gitlab.docs.cern.ch/docs/Build%20your%20application/Packages%20&%20Registries/using-gitlab-container-registry#building-docker-images) docs page.
~~~yaml
build_image:
stage: build
variables:
IMAGE_DESTINATION: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG-$CI_COMMIT_SHORT_SHA
image:
# The kaniko debug image is recommended because it has a shell, and a shell is required for an image to be used with GitLab CI/CD.
name: gcr.io/kaniko-project/executor:debug
entrypoint: [""]
script:
# Prepare Kaniko configuration file
- echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
# Build and push the image from the Dockerfile at the root of the project.
- /kaniko/executor --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/Dockerfile --destination $IMAGE_DESTINATION
# Print the full registry path of the pushed image
- echo "Image pushed successfully to ${IMAGE_DESTINATION}"
~~~
{: .source}
<img src="../fig/dockerhub_build.png" alt="DockerHub" style="width:900px">
<!--Now, remove the line `image: rootproject/root` underneath the stages, since the image-building stage uses its own dedicated image for automated image building. You'll then need to explicitly specify that the other stages use this image by adding the line `image: rootproject/root` to the stages, since it's no longer a global specification.-->
Once the build is completed, you can pull your image in the usual way.
Once this is done, you can commit and push the updated `.gitlab-ci.yml` file to your gitlab repo and check to make sure the pipeline passed. If it passed, the repo image built by the pipeline should now be stored on the docker registry, and be accessible as follows:
~~~bash
# If you made your docker repository private, you first need to login,
# else you can skip the following line
docker login
# Now pull
docker pull <username>/<image name>:<tag>
docker login gitlab-registry.cern.ch
docker pull gitlab-registry.cern.ch/[repo owner's username]/[skimming repo name]:[branch name]-[shortened commit SHA]
~~~
{: .source}
You can also go to the container registry on the gitlab UI to see all the images you've built:
<img src="../fig/ContainerRegistry.png" alt="ContainerRegistry" style="width:900px">
Notice that the script to run is just a dummy 'ignore' command. This is because using the docker-image-build tag, the jobs always land on special runners that are managed by CERN IT which run a custom script in the background. You can safely ignore the details.
> ## Recommended Tag Structure
> You'll notice the environment variable `IMAGE_DESTINATION` in the `.gitlab-ci.yml` script above. This controls the name of the Docker image that is produced in the CI step. Here, the image name will be `<reponame>:<branch or tagname>-<short commit SHA>`. The shortened 8-character commit SHA ensures that each image created from a different commit will be unique, and you can easily go back and find images from previous commits for debugging, etc.
>
> As you'll see tomorrow, it's recommended when using your images as part of a REANA workflow to make a unique image for each gitlab commit, because REANA will only attempt to update an image that it's already pulled if it sees that there's a new tag associated with the image.
>
> If you feel it's overkill for your specific use case to save a unique image for every commit, the `-$CI_COMMIT_SHORT_SHA` can be removed. Then the `$CI_COMMIT_REF_SLUG` will at least ensure that images built from different branches will not overwrite each other, and tagged commits will correspond to tagged images.
{: .callout}
### Alternative: GitLab.com
This training module is rather CERN-centric and assumes you have a CERN computing account with access to [gitlab.cern.ch](https://gitlab.cern.ch). If this is not the case, then as with the [CICD training module](https://hsf-training.github.io/hsf-training-cicd/), everything can be carried out using [gitlab.com](https://gitlab.com) with a few slight modifications.
In particular, you will have to specify that your pipeline job that builds the image is executed on a special type of runner with the appropriate `services`. However, unlike at CERN, you can use the docker commands that you have seen in the previous episodes to build and push the docker images.
Add the following lines at the end of the `.gitlab-ci.yml` file to build the image and save it to the docker registry.
~~~yaml
build_image:
stage: build
image: docker:latest
services:
- docker:dind
variables:
IMAGE_DESTINATION: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG-$CI_COMMIT_SHORT_SHA
script:
- docker build -t $IMAGE_DESTINATION .
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
- docker push $IMAGE_DESTINATION
~~~
{: .source}
In this job, the specific `image: docker:latest`, along with specifying the `services` to contain `docker:dind` are needed to be able to execute docker commands. If you are curious to read about this in detail, refer to the [official gitlab documentation](https://docs.gitlab.com/ee/ci/docker/using_docker_build.html) or [this example](https://gitlab.com/gitlab-examples/docker).
In the `script` of this job there are three components :
- [`docker build`](https://docs.docker.com/engine/reference/commandline/build/) : This is performing the same build of our docker image to the tagged image which we will call `<reponame>:<branch or tagname>-<short commit SHA>`
- [`docker login`](https://docs.docker.com/engine/reference/commandline/login/) : This call is performing [an authentication of the user to the gitlab registry](https://docs.gitlab.com/ee/user/packages/container_registry/#authenticating-to-the-gitlab-container-registry) using a set of [predefined environment variables](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html) that are automatically available in any gitlab repository.
- [`docker push`](https://docs.docker.com/engine/reference/commandline/push/) : This call is pushing the docker image which exists locally on the runner to the gitlab.com registry associated with the repository against which we have performed the authentication in the previous step.
If the job runs successfully, then in the same way as described for [gitlab.cern.ch](https://gitlab.cern.ch) in the previous section, you will be able to find the `Container Registry` on the left hand icon menu of your gitlab.com web browser and navigate to the image that was pushed to the registry. Et voila, c'est fini, exactement comme au CERN!
You can also build Docker images on [github.com](https://github.com) and push them to the GitHub Container Registry ([ghcr.io](https://ghcr.io)) with the help of [GitHub Actions](https://github.com/features/actions).
The bonus episode [Building and deploying a Docker container to Github Packages](/hsf-training-docker/12-bonus/index.html) explains how to do so.
> ## Tag your docker image
> Notice that the command above had a ``<tag>`` specified. A tag uniquely identifies a docker image and is usually used to identify different versions of the same image. The tag name has to be written with ASCII symbols.
Expand Down
Loading

0 comments on commit 7095532

Please sign in to comment.