update f79aa71
BuildTheDocs authored and BuildTheDocs committed Apr 19, 2024
0 parents commit 8ca9fd2
Showing 67 changed files with 8,428 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 06e61cda1f3049a80646cc8001f049be
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added .nojekyll
Empty file.
Binary file added _images/logo.png
136 changes: 136 additions & 0 deletions _sources/context.rst.txt
@@ -0,0 +1,136 @@
.. _qus:context:

Context: containers and QEMU
############################

> *Docker is a computer program that performs operating-system-level virtualization, also known as "containerization".
(...) Containers are (software packages) isolated from each other and bundle their own application, tools, libraries
and configuration files;
(...) All containers are run by a single operating-system kernel and are thus more lightweight than virtual machines.*

> *Docker (...) uses the resource isolation features of the Linux kernel such as cgroups and kernel namespaces, and a
union-capable file system such as OverlayFS and others to allow independent "containers" to run within a single Linux
instance, avoiding the overhead of starting and maintaining virtual machines (VMs).
The Linux kernel's support for namespaces mostly isolates an application's view of the operating environment,
including process trees, network, user IDs and mounted file systems, while the kernel's cgroups provide resource
limiting for memory and CPU.*

-- Wikipedia: Docker_(software) :cite:p:`w:docker`

> *QEMU (short for Quick Emulator) is a free and open-source emulator that performs hardware virtualization.
(...) it emulates the machine's processor through dynamic binary translation and provides a set of different hardware
and device models for the machine, enabling it to run a variety of guest operating systems.
It also can be used with KVM to run virtual machines at near-native speed (by taking advantage of hardware extensions
such as Intel VT-x).
QEMU can also do emulation for user-level processes, allowing applications compiled for one architecture to run on
another.*

-- Wikipedia: QEMU :cite:p:`w:qemu`

QEMU modes
==========

The key point is that QEMU provides multiple operating modes:
`full-system emulation <https://qemu.weilnetz.de/doc/qemu-doc.html#QEMU-System-emulator-for-non-PC-targets>`__,
`user-mode emulation <https://qemu.weilnetz.de/doc/qemu-doc.html#QEMU-User-space-emulator>`__ and
`virtualization <https://wiki.qemu.org/Features/KVM>`__.
Some of them use dynamic binary translation to handle differences in instruction set, endianness and 32/64 bit word size.
In addition, some focus on isolation between the host and the guest, and/or on performance.

Virtualization and full-system emulation (when the guest architecture is the same as the host's) are similar to docker
containers, but each machine runs its own kernel, while containers share the kernel of the host.
This difference has been addressed from different perspectives.
For example, the content of a docker image can be extracted and used in a QEMU machine :cite:p:`rottenkolber15`.
Conversely, a QEMU image in QCOW2 format can be converted to a docker image :cite:p:`golfayi`.
Furthermore, approaches such as Kata Containers :cite:p:`katacontainers` provide alternative runtimes for docker to
seamlessly bring the best of both: execute containers on top of QEMU virtualization.

Unfortunately, virtualization of foreign architectures is not supported [#f1]_, so it is out of the scope of Kata
Containers for now [#f2]_.
As a result, execution of docker containers on a qemu-system VM requires the user to learn how to handle images, launch
options and communication between the host and the VM.
For example, in :cite:p:`taylor16`, an image for RPi is created.
On the one hand, the process requires manual steps (although it can probably be automated as in
`rouault/gdal_coverage: .travis.yml <https://github.com/rouault/gdal_coverage/blob/freebsd9.2/.travis.yml>`__ from
:cite:p:`rouault16`).
On the other hand, the execution command is not friendly for new users:
``qemu-system-arm -kernel raspberry-qemu/kernel-qemu -cpu arm1176 -m 256 -M versatilepb -no-reboot -serial stdio -append "root=/dev/sda2 panic=1 rootfstype=ext4 rw" -net user,hostfwd=tcp::10022-:22 -net nic -display none -hda 2015-11-21-raspbian-jessie-lite.img``.
Moreover, that example does not consider the execution of docker, which is the target of this project.

.. TIP::
   Contributions of example scripts to automatically provision QEMU images for some known SBC
   (such as `PYNQ <http://www.pynq.io/board.html>`__,
   `Raspberry Pi <https://www.raspberrypi.org/>`__, `96boards.org <https://www.96boards.org/>`__,
   `Pine64 <https://www.pine64.org>`__, etc.) allowing running docker images are welcome!
   Please `open a pull request <https://github.com/dbhi/qus/compare>`__.

Alternatively, in user-mode emulation, QEMU runs a program built for another Linux/BSD target on any supported
architecture.
System calls are thunked for endianness and for 32/64 bit mismatches, so that the program is executed like any other
application on the host.

User-mode emulation has three main caveats.
First, user-mode emulation seems to be less polished than full-system emulation, so it might crash if unsupported
features are used :cite:p:`voipio17`.
Second, because the underlying machine is the host, there is no emulated kernel, and hardware resources specific to the
target device/system are not available (unlike in a fully-featured VM).
Third, there is no isolation between the program and the host, so malicious programs can gain privileges.

Nevertheless, within its constraints, it is a very valuable solution for cross-building and executing foreign docker
images.
This is especially so in free/public CI environments, because most providers do not support native architectures other
than x86-64.
Hence, QEMU allows building, for instance, docker images for Raspberry Pi in GitHub Actions.
Specifically, *qus* is used in dbhi/containers :cite:p:`dbhi-containers` to build multiarch images (for ``arm32v7``,
``arm64v8`` and ``amd64``).
Moreover, since 2017, QEMU is installed and enabled by default with Docker Desktop [#f3]_; thus, features equivalent to
a subset of what *qus* provides are available off the shelf on Windows and macOS.
Regarding isolation, the fact that programs are executed inside docker containers does allow them to be partially
restricted [#f4]_.
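As a hedged illustration of that workflow, the sketch below registers the QEMU interpreters and then runs a container
for a foreign architecture.
The registration command reuses the ``aptman/qus -s -- -p`` invocation shown in the FAQ; the ``arm64v8/alpine`` image,
the ``run`` helper and the ``DRY_RUN`` variable are assumptions made for this example only.
By default the commands are just printed, so the script can be inspected without Docker or ``--privileged`` access.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to actually execute it
# (execution needs Docker and permission to use --privileged).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

register_and_check() {
    # Register the QEMU interpreters on the host kernel (invocation taken from the FAQ).
    run docker run --rm --privileged aptman/qus -s -- -p
    # Run a command from a foreign-architecture image (image name is an assumption).
    run docker run --rm arm64v8/alpine uname -m
}

register_and_check
```

In a CI job, the same two steps would typically run once before any ``docker build`` or ``docker run`` that targets a
foreign architecture.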

In summary, this repository is focused on alternatives to configure and use QEMU in user-mode emulation.
Nonetheless, we are open to contributions of examples with system-mode emulation.

Installing QEMU
===============

As explained at `qemu.org/download <https://www.qemu.org/download/>`__, QEMU is packaged by most Linux distributions,
so either ``qemu-user`` or ``qemu-user-static`` can be installed through package managers.
Furthermore, since ``qemu-user-static`` packages contain statically built binaries :cite:p:`w:static-build`, it is
possible to extract them directly.
That is, retrieve a pre-built package, extract the desired binary, and copy it to the development workstation.
Alternatively, QEMU can be built from sources.

Either installation procedure allows executing a binary for a foreign architecture by prepending the
corresponding QEMU executable. E.g.:

.. code-block:: bash

   qemu-<arch>[-static] <binary>

This procedure is straightforward for explicitly executing a few binaries.
However, it is not practical in the context of docker images, because it would require dockerfiles and scripts to be
modified ad-hoc.
Fortunately, the Linux kernel has a capability named ``binfmt_misc`` :cite:p:`w:binfmt_misc` which allows arbitrary
executable file formats to be transparently recognized and passed to certain applications :cite:p:`bottomley16`
:cite:p:`corbet16`.
This is configured either by directly writing special sequences to the ``register`` file in a special purpose file
system interface (usually mounted under part of ``/proc``), or by using a wrapper (as Debian-based distributions do) or
systemd's ``systemd-binfmt.service``.
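To make the recognition step more concrete: ``binfmt_misc`` can identify a format by comparing the leading bytes of an
executable against a registered magic/mask pair.
The following sketch prints those leading bytes for a given file.
The helper name ``header_hex`` and the choice of 20 bytes are assumptions for illustration, as is the expectation that
``/bin/sh`` is an ELF binary on the machine where this runs.

```shell
#!/bin/sh
# Print the first 20 bytes of a file as space-separated hex pairs.
# binfmt_misc can match such leading bytes against a registered magic/mask pair.
header_hex() {
    bytes=$(od -An -tx1 -N20 "$1" | tr '\n' ' ' | tr -s ' ')
    bytes=${bytes# }   # drop the leading space left by od
    bytes=${bytes% }   # drop the trailing space left by the newline conversion
    printf '%s\n' "$bytes"
}

# Example: inspect the system shell (assumed to be an ELF executable).
header_hex /bin/sh
```

For an ELF file, the output starts with ``7f 45 4c 46`` (the bytes ``\x7fELF``), followed by fields that encode the
word size, the endianness and, further on, the target machine.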

Moreover, in version 4.8 of the kernel, a new flag (``F``, *fix binary*) was added to the ``binfmt`` handlers
:cite:p:`kernelnewbies`.
With it, the kernel opens the emulation binary when the format is registered, and later invocations reuse that already
open file instead of looking up the path again.
This is especially useful because it allows working with foreign architecture containers without contaminating the
container image.
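As a rough sketch of what such a registration looks like, the snippet below composes a line in the format accepted by
the kernel's ``register`` file (``:name:type:offset:magic:mask:interpreter:flags``).
The function name, the interpreter path and the format name are assumptions for illustration; the magic and mask
values are meant to describe 64-bit little-endian ARM ELF files and should be checked against ``qemu-binfmt-conf.sh``
before use.
Nothing is written to the kernel here; the line is only printed.

```shell
#!/bin/sh
# Compose a binfmt_misc registration line:
#   :name:type:offset:magic:mask:interpreter:flags
# Type "M" selects matching by magic bytes; the offset is left empty (start of file).
binfmt_register_line() {
    printf ':%s:M::%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Magic and mask for aarch64 ELF binaries (to be verified against qemu-binfmt-conf.sh).
magic='\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00'
mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'

# The "F" flag asks the kernel to open the interpreter at registration time.
# The interpreter path below is an assumption.
binfmt_register_line qemu-aarch64 "$magic" "$mask" /usr/bin/qemu-aarch64-static F

# Actually registering the format requires root and a mounted binfmt_misc file system:
#   binfmt_register_line <args> > /proc/sys/fs/binfmt_misc/register
```

In practice, ``qemu-binfmt-conf.sh`` generates and writes these lines for all supported targets, so composing them by
hand is mostly useful for understanding or debugging a registration.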

.. [#f1]
   See `wiki.qemu.org: Features/KVM <https://wiki.qemu.org/Features/KVM>`__.
.. [#f2]
   See `kata-containers/runtime#1280 <https://github.com/kata-containers/runtime/issues/1280>`__.
.. [#f3]
   See :ref:`qus:related:linuxkit` below.
.. [#f4]
   See `docs.docker.com: Docker security <https://docs.docker.com/engine/security/security/>`__ and
   `mviereck/x11docker: Security <https://github.com/mviereck/x11docker#security>`__.
33 changes: 33 additions & 0 deletions _sources/development.rst.txt
@@ -0,0 +1,33 @@
.. _qus:development:

Development
###########

Continuous Integration/Delivery
===============================

A GitHub Actions workflow is triggered after each push.
Container images and manifests are built and pushed to the *docker.io* registry.
Moreover, :ref:`qus:tests` are executed.
On tagged commits, deb packages are extracted and artifacts are pushed to GitHub Releases.

Roadmap
=======

* This project uses a modified ``qemu-binfmt-conf.sh`` script from `umarcor/qemu <https://github.com/umarcor/qemu/tree/series-qemu-binfmt-conf>`__,
  which includes some enhancements.
  These patches have already been submitted upstream and will hopefully be included in future releases.

* *CLI*: apart from checking whether a new version is available upstream, the Python CLI tool (see :qussrc:`cli`)
  can provide tables showing the available assets/packages.
  It would be interesting to add that info to the web site.
  On the other hand, builds and tests are currently written in :qussrc:`run.sh`.
  Ideally, those would be migrated/merged into the CLI tool.

* *Dropping the kernel dependency*: ``sudo`` privileges, which are required in order to register ``binfmt`` formats, are
  not available in all contexts [#f1]_.
  In :cite:p:`angelatos15`, an alternative to ``binfmt`` is proposed.
  However, that approach has not been implemented in this repo yet.

.. [#f1]
   See, for example, `play-with-docker/play-with-docker#276 <https://github.com/play-with-docker/play-with-docker/issues/276>`__.
67 changes: 67 additions & 0 deletions _sources/faq.rst.txt
@@ -0,0 +1,67 @@
.. _qus:faq:

Frequently Asked Questions (FAQ)
################################

Does *qus* work for building images?
====================================

Yes, once the QEMU binary is configured/loaded, both building images and running containers are supported.

Do I need to install ``qemu-*-static`` on the host, even though it is only needed by the containers?
====================================================================================================

You can use static binaries at any location.
So, you don't need to install all the ``qemu-*-static`` binaries on the host.
You just download the ones you want/need to a temporary folder.
See cases ``c``, ``C``, ``v`` or ``V`` in :ref:`qus:tests`.

It is still a downside that any other process on the host will use these binaries.
However, the advantage is that you don't need to copy anything into the docker images.
You can use them directly.
Furthermore, you can run multiple foreign containers with a single binary, instead of copying it to all the images.

If the container depends on a given minimum QEMU version, do I need to ensure that the host provides this version?
==================================================================================================================

You can put the ``qemu-*-static`` binaries of the version you want in a temporary folder on the host (and only for the
foreign architectures you need).
Then, temporarily use those binaries system-wide, as commented above.
When you are done, just reset the registered formats.
See ``QEMU_BIN_DIR`` in the register script.

How can I use ``qemu-*-static`` binaries without registering them with the *persistent* flag and without installing them in the container image?
================================================================================================================================================

If you don't use ``-p``, you can still share the binaries from the host with the containers.
The advantages of this approach are that you can use a single binary for multiple containers, you can use the version of
QEMU that you want, and other processes on the host can use different versions of QEMU.
The downside is that we don't know how to make it work with ``docker build`` yet.
See cases ``v`` or ``V`` in :ref:`qus:tests`.

Moreover, the ``qemu-*-static`` binaries can be saved in a docker volume.
This avoids saving them on the host and allows running multiple containers with ``--volumes-from``.

How can the scripts in aptman/qus be customized?
================================================

For testing purposes, it is possible to customize the scripts ``register.sh`` or ``qemu-binfmt-conf.sh``, which are the
default entrypoint in image aptman/qus.
In order to do so, get a copy of either or both of them, and modify them locally.
Then:

* Test that you can overwrite the copy inside the container with it:

  .. code-block:: sh

     $ docker run --rm --privileged -itv $(pwd)/qemu-binfmt-conf.sh:/qus/qemu-binfmt-conf.sh --entrypoint=sh aptman/qus
     # cat /qus/qemu-binfmt-conf.sh

* If successful, use it to run *regular* commands:

  .. code-block:: sh

     docker run --rm --privileged -v $(pwd)/qemu-binfmt-conf.sh:/qus/qemu-binfmt-conf.sh aptman/qus -s -- -p

For instance, this is used in :qussharp:`4`, to work around an upstream bug that prevents 32-bit ARM
interpreters from being registered on 64-bit only ARM hosts.
78 changes: 78 additions & 0 deletions _sources/images.rst.txt
@@ -0,0 +1,78 @@
.. _qus:images:

Provided container images
#########################

When QEMU is installed from distribution package managers, it is normally set up along with ``binfmt_misc``.
Nonetheless, in the context of this project we want to configure it with custom options, instead of relying on the
defaults.
A script provided by QEMU, `qemu-binfmt-conf.sh <https://raw.githubusercontent.com/qemu/qemu/master/scripts/qemu-binfmt-conf.sh>`__,
can be used to do so.
Among other options, the flag that tells ``binfmt`` to hold interpreters in memory is supported in ``qemu-binfmt-conf.sh``,
as ``-p``.

This project uses a modified version of ``qemu-binfmt-conf.sh`` [#f1]_, which includes the following enhancements:

* Optionally, the list of QEMU interpreters to be registered on the host can be limited.
* Add option ``--clear``.
* Add option ``--test``.

In fact, the entrypoint to the following docker images is a wrapper [#f2]_ around ``qemu-binfmt-conf.sh`` to provide
some syntactic sugar.

Manifests
=========

Manifests are provided for the following hosts: ``amd64``, ``arm64v8``, ``arm32v7``, ``arm32v6``, ``i386``, ``s390x`` and
``ppc64le``.
That is, any of the target architectures provided by QEMU can be used on any of those hosts.
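When scripting around these manifests, it can help to translate the machine name reported by the kernel into the host
names listed above.
The sketch below is one possible mapping; the function name ``host_arch`` and the mapping itself are assumptions for
illustration and are not taken from the *qus* sources.

```shell
#!/bin/sh
# Map a machine name (as printed by `uname -m`) to the host names used in this section.
# The mapping is an assumption for illustration only.
host_arch() {
    case "$1" in
        x86_64)          echo amd64 ;;
        aarch64|arm64)   echo arm64v8 ;;
        armv7l)          echo arm32v7 ;;
        armv6l)          echo arm32v6 ;;
        i386|i686)       echo i386 ;;
        s390x)           echo s390x ;;
        ppc64le)         echo ppc64le ;;
        *)               echo "unsupported machine: $1" >&2; return 1 ;;
    esac
}

# Example: report the host name for the current machine, if it is in the list.
host_arch "$(uname -m)" || true
```

Docker normally resolves the right image from a manifest automatically, so a mapping like this is only needed when an
architecture-specific tag has to be selected explicitly.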

The procedure to generate each image involves extracting pre-built binaries and packaging them in container images,
along with helper scripts.
Hence, multiple images are generated in the process:

* ``aptman/qus:pkg``:
  all the ``qemu-*-static`` binaries from `packages.debian.org/sid/qemu-user-static <https://packages.debian.org/sid/qemu-user-static>`__
  extracted on a ``scratch`` image.

* ``aptman/qus:register``:
  a ``busybox`` image with :qussrc:`register.sh` and `qemu-binfmt-conf.sh <https://raw.githubusercontent.com/qemu/qemu/master/scripts/qemu-binfmt-conf.sh>`__.
  The entrypoint is set to ``register.sh``.

* ``aptman/qus``:
  union of the two previous images.

.. TIP::
   Find usage instructions in the :qussrc:`README <README.md#usage>`.

Debian
======

For each ``HOST_ARCH``, an image named ``${HOST_ARCH}-d${VERSION}${TAG}`` is published; where ``TAG`` is
``-pkg|-register|""``.
Moreover, three manifests are available:
``aptman/qus:d${VERSION}-pkg``,
``aptman/qus:d${VERSION}-register``
and ``aptman/qus:d${VERSION}``.

.. TIP::
   ``latest``/default versions above correspond to these Debian variants. Therefore, running ``aptman/qus`` on an
   ``amd64`` host is equivalent to running ``aptman/qus:d6.2`` or ``aptman/qus:amd64-d6.2``.

Apart from those, ``aptman/qus:mips-pkg`` and ``aptman/qus:mips64el-pkg`` are also available.

Fedora
======

For each ``HOST_ARCH`` (except ``arm32v6``), an image named ``${HOST_ARCH}-f${VERSION}${TAG}`` is published; where
``TAG`` is ``-pkg|-register|""``.
Moreover, three manifests are available:
``aptman/qus:f${VERSION}-pkg``,
``aptman/qus:f${VERSION}-register``
and ``aptman/qus:f${VERSION}``.
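The naming scheme used for the Debian and Fedora variants can be sketched as two small helpers.
The function names are hypothetical; the patterns follow the ``${HOST_ARCH}-d${VERSION}${TAG}`` and
``d${VERSION}${TAG}`` forms described above, with ``d`` or ``f`` passed as the distribution letter.

```shell
#!/bin/sh
# Compose an architecture-specific image reference.
# $1: HOST_ARCH, $2: distribution letter (d or f), $3: VERSION, $4: TAG ("", -pkg or -register)
qus_image() {
    printf 'aptman/qus:%s-%s%s%s\n' "$1" "$2" "$3" "${4:-}"
}

# Compose a manifest reference.
# $1: distribution letter (d or f), $2: VERSION, $3: TAG ("", -pkg or -register)
qus_manifest() {
    printf 'aptman/qus:%s%s%s\n' "$1" "$2" "${3:-}"
}

qus_image amd64 d 6.2 ""     # prints aptman/qus:amd64-d6.2
qus_manifest d 6.2 -pkg      # prints aptman/qus:d6.2-pkg
```

Whether a given combination is actually published depends on the registry contents; the helpers only build the name.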

.. [#f1]
   See `umarcor/qemu: series-qemu-binfmt-conf/scripts/qemu-binfmt-conf.sh <https://github.com/umarcor/qemu/blob/series-qemu-binfmt-conf/scripts/qemu-binfmt-conf.sh>`__.
.. [#f2]
   See :qussrc:`register.sh <register.sh>`.