[RFC] Add support for encrypted images #2297
Draft: rst0git wants to merge 279 commits into checkpoint-restore:criu-dev from rst0git:encrypted-images
Add a sanity check for THP_DISABLE. This discovered a broken commit in Google's kernel tree. Signed-off-by: Michał Mirosław <[email protected]>
Apparently Skylake uses init-optimization when saving FPU state, and ptrace() returns XSTATE_BV[0] = 0 meaning FPU was not used by a task (in init state). Since CRIU restore uses sigreturn to restore registers, FPU state is always restored. Fill the state with default values on dump to make restore happy. Signed-off-by: Michał Mirosław <[email protected]>
This commit revises the error handling in the fdspy test. Previously, a failure case could have been incorrectly reported as successful because of a specific check `pass != 0`, leading to potential false positives when `check_pipe_ends()` returned `-1` due to a read/write pipe error. To improve this, we've adjusted the error handling to return `0` in case of any error. As such, the final success condition remains unchanged. This approach will help accurately differentiate between successful and failed cases, ensuring the output "All OK" is printed for success, and "Something went WRONG" for any failure. Fixes: 5364ca3 ("compel/test: Fix warn_unused_result") Signed-off-by: Haorong Lu <[email protected]>
Signed-off-by: Michał Mirosław <[email protected]>
Signed-off-by: Michał Mirosław <[email protected]>
Signed-off-by: Michał Mirosław <[email protected]>
Use $TMPDIR for tests_root as the host's /tmp might not have enough features or space. Signed-off-by: Michał Mirosław <[email protected]>
Extend ability to limit time taken to all CRIU invocations. Signed-off-by: Michał Mirosław <[email protected]>
We don't want test framework to change its behaviour on whether we run a single or multiple tests in a run. When we shard the test suite it can result in some shards having a single test to run and unexpectedly change the test output format. Signed-off-by: Michał Mirosław <[email protected]>
Allow to split test suite into predictable sets to parallelize runs on multiple machines or VMs. Signed-off-by: Michał Mirosław <[email protected]>
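A minimal sketch of one way to obtain predictable shards, written in Python because the ZDTM runner is a Python script. The helper name `shard_tests`, the round-robin policy over sorted names, and the sample test names are assumptions for illustration; the commit message does not say how the split is computed.

```python
def shard_tests(tests, shard_index, shard_count):
    """Return the tests belonging to one shard (hypothetical helper).

    Sorting first makes the split depend only on the set of test names,
    so every machine computes the same partition independently.
    """
    if not 0 <= shard_index < shard_count:
        raise ValueError("shard_index must be in [0, shard_count)")
    ordered = sorted(tests)
    # Round-robin: the i-th test in sorted order goes to shard i % shard_count.
    return ordered[shard_index::shard_count]


# Illustrative test names only.
tests = ["static/pipe00", "static/env00", "static/thp_disable", "transition/fork"]
shards = [shard_tests(tests, i, 3) for i in range(3)]
```

With four tests and three shards, two of the shards contain a single test, which is the situation the previous commit guards against by keeping the output format independent of the number of tests in a run.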
Make it clear that the option numbers are indexes not the option identifiers ("names"). Also show the value change that prompted test failure. Signed-off-by: Michał Mirosław <[email protected]>
Make it possible to skip network lock to enable uses that break connections anyway to work without iptables/nftables being present. Signed-off-by: Michał Mirosław <[email protected]>
The fail() macro provides a new line character at the end of the message. This patch fixes the following lint check that currently fails in CI:
    $ git --no-pager grep -E '^\s*\<(pr_perror|fail)\>.*\\n"'
    test/zdtm/static/thp_disable.c: fail("prctl(GET_THP_DISABLE) returned unexpected value: %d != 1\n", ret);
    test/zdtm/static/thp_disable.c: fail("Flags changed %lx -> %lx\n", orig_flags, new_flags);
    test/zdtm/static/thp_disable.c: fail("Madvs changed %lx -> %lx\n", orig_madv, new_madv);
    test/zdtm/static/thp_disable.c: fail("post-migration prctl(GET_THP_DISABLE) returned unexpected value: %d != 1\n", ret);
    test/zdtm/static/thp_disable.c: fail("Flags changed %lx -> %lx\n", orig_flags, new_flags);
    test/zdtm/static/thp_disable.c: fail("Madvs changed %lx -> %lx\n", orig_madv, new_madv);
Fixes: checkpoint-restore#2193
Signed-off-by: Radostin Stoyanov <[email protected]>
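For readers who want to try the lint rule outside of git, the grep pattern quoted in the commit message can be approximated in Python. This is a sketch: `has_redundant_newline` is a hypothetical helper, and `\b` is used in place of GNU grep's `\<` and `\>` word-boundary operators.

```python
import re

# Approximation of: grep -E '^\s*\<(pr_perror|fail)\>.*\\n"'
# It flags pr_perror()/fail() calls whose format string ends in a literal \n.
TRAILING_NEWLINE = re.compile(r'^\s*\b(pr_perror|fail)\b.*\\n"')


def has_redundant_newline(line):
    """Return True if a source line would trip the lint check."""
    return TRAILING_NEWLINE.search(line) is not None


# One of the offending lines from thp_disable.c, before and after the fix.
before = r'    fail("Flags changed %lx -> %lx\n", orig_flags, new_flags);'
after = r'    fail("Flags changed %lx -> %lx", orig_flags, new_flags);'
```

Only the `before` line matches, since the pattern looks for a backslash, an `n`, and the closing quote of the format string.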
During dump, CRIU stores the structs representing sockets in a statically sized hashmap of size 32. We have some (admittedly crazy) tasks that use tens of thousands of sockets, and seem to spend most of the dump time iterating over the linked lists of the map. 16K is chosen arbitrarily, so that it reduces the lengths of the chains to a few elements on average, while not introducing significant memory overhead. From: Radosław Burny <[email protected]> Signed-off-by: Michał Mirosław <[email protected]>
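The effect of the bucket count can be checked with simple arithmetic. The sketch below assumes an evenly spreading hash, a workload of 50,000 sockets (one reading of "tens of thousands"), and one 8-byte pointer per bucket head; none of these figures come from the commit.

```python
def average_chain_length(n_entries, n_buckets):
    """Mean entries per bucket, assuming the hash spreads entries evenly."""
    return n_entries / n_buckets


N_SOCKETS = 50_000  # assumed workload size

old = average_chain_length(N_SOCKETS, 32)         # 1562.5 entries per chain
new = average_chain_length(N_SOCKETS, 16 * 1024)  # about 3.05 entries per chain

# Assumed cost of the larger table: one 8-byte head pointer per bucket.
bucket_array_bytes = 16 * 1024 * 8  # 131072 bytes, i.e. 128 KiB
```

Under these assumptions each lookup walks roughly 500 times fewer entries, at a cost of about 128 KiB for the bucket array.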
From: Piotr Figiel <[email protected]> Signed-off-by: Michał Mirosław <[email protected]>
Try IPv6 if IPv4 sockets are not supported. Signed-off-by: Michał Mirosław <[email protected]>
Signed-off-by: Michał Mirosław <[email protected]>
The test for HAS_MEMFD is empty and not used. Remove it. Fixes: 5ee1ac1 ("criu: remove FEATURE_TEST_MEMFD") Change-Id: I43b8f0cfd50ce9bdf93dafb647377318df1deae8 Signed-off-by: Michał Mirosław <[email protected]>
`make` without `-s` option will normally show the commands executed. In the case of detecting build environment features current makefile will cause detected features to be seen as 'echo #define' commands, but not detected ones will be silent. Change it so that all tried features can be seen (outside of make's silent mode) regardless of detection result. Signed-off-by: Michał Mirosław <[email protected]>
$LDFLAGS can contain `-Ldir`s that are required by '-lib's in $LIBS. Reverse the order so that `-L` options take effect. Signed-off-by: Michał Mirosław <[email protected]>
Make $(AR) used also for libzdtmtst build. Signed-off-by: Michał Mirosław <[email protected]>
When trying to build CRIU with libbsd enabled, the compilation fails due to a duplicate definition of the __aligned macro. Other such definitions are already wrapped with #ifndef. Make the __aligned definition consistent with them, which also makes it easier to use libbsd features in the future if needed. Signed-off-by: Michał Mirosław <[email protected]>
nla_get_s32() was added to libnl 3.2.7 in 2015. Remove CRIU's definition as it breaks build when statically linking the binary. From: Uros Prestor <[email protected]> Signed-off-by: Michał Mirosław <[email protected]>
Container runtimes commonly use CRIU with RPC. However, this prevents the use of action-scripts set in a CRIU configuration file due to the explicit scripts mode introduced with the following commit: ac78f13 actions: Introduce explicit scripts mode This patch enables container checkpoint/restore with action-scripts specified via configuration file. Signed-off-by: Radostin Stoyanov <[email protected]>
Signed-off-by: Radostin Stoyanov <[email protected]>
New 'query-ext-files' action for `criu dump` is sent after freezing the process tree. This allows to defer gathering the external file list when the process tree is in a stable state and avoids race with the process creating and deleting files. Change-Id: Iae32149dc3992dea086f513ada52cf6863beaa1f Signed-off-by: Michał Mirosław <[email protected]>
Google's RPC client process is in a different pidns and has more privileges -- CRIU can't open its /proc/<pid>/fd/<fd>. For images_dir_fd to be useful here it would need to refer to a passed or CRIU's fd. From: Michał Cłapiński <[email protected]> Change-Id: Icbfb5af6844b21939a15f6fbb5b02264c12341b1 Signed-off-by: Michał Mirosław <[email protected]>
If the error is ignored it is not important enough - make it a warning instead. From: Mian Luo <[email protected]> Change-Id: If2641c3d4e0a4d57fdf04e4570c49be55f526535 Signed-off-by: Michał Mirosław <[email protected]>
kerndat_nsid() is not used outside kerndat.c. Make it static. Change-Id: I52e518ecb7c627cc1866e373411b2be3f71a2c9d Signed-off-by: Michał Mirosław <[email protected]>
If not dumping netns nor connections, nsid support is not used. Don't fail the run as if the support is needed, the dumping process will fail later. Change-Id: I39a086756f6d520c73bb6b21eaf6d9fb49a18879 Signed-off-by: Michał Mirosław <[email protected]>
Signed-off-by: Bhavik Sachdev <[email protected]>
Signed-off-by: Adrian Reber <[email protected]>
Move PYTHON_EXTERNALLY_MANAGED and PIP_BREAK_SYSTEM_PACKAGES into Makefile.install to avoid code duplication. In addition, add PIPFLAGS variable to enable specifying pip options during installation. This is particularly useful for packaging, where it is common for `pip install` to run in an environment with pre-installed dependencies and without internet access. In such environment, we need to specify the following options: --no-build-isolation --no-index --no-deps Signed-off-by: Radostin Stoyanov <[email protected]>
The current link opens a page with the following text: The MediaWiki FAQ can be found at: https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:FAQ Signed-off-by: Radostin Stoyanov <[email protected]>
…and run the plugin finalizer later in the dump sequence. Restore rseq_cs state before calling RESUME_DEVICES_LATE, as the CUDA plugin will temporarily unfreeze a thread during the plugin hook to assist with device restore. Run the plugin finalizer later in the dump sequence, since the finalizer is used by the CUDA plugin to handle some process cleanup. Signed-off-by: Jesus Ramos <[email protected]>
…DEVICES to be used during pstree collection. PAUSE_DEVICES is called before a process is frozen and is used by the CUDA plugin to place the process in a state that's ready to be checkpointed and quiesce any pending work. CHECKPOINT_DEVICES is called after all processes in the tree have been frozen and PAUSE'd and performs the actual checkpointing operation for CUDA applications. Signed-off-by: Jesus Ramos <[email protected]>
Add support for the NVIDIA cuda-checkpoint utility. This requires an r555 or higher driver along with the cuda-checkpoint binary. Signed-off-by: Jesus Ramos <[email protected]>
Commit fc683cb ("compel: shstk: save CET state when CPU supports it") started using PTRACE_ARCH_PRCTL to query shadow stack status. While PTRACE_ARCH_PRCTL has existed in the kernel for a long time, it was only added to glibc in version 2.27. Amazon Linux 2 (AL2) has glibc 2.26, which does not have this definition. As a result, build on AL2 fails with the below error:
    compel/arch/x86/src/lib/infect.c: In function ‘get_task_xsave’:
    compel/arch/x86/src/lib/infect.c:276:14: error: ‘PTRACE_ARCH_PRCTL’ undeclared (first use in this function)
      276 | if (ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long)&features, ARCH_SHSTK_STATUS)) {
          |            ^~~~~~~~~~~~~~~~~
While the definition is present on the system via the kernel headers (in asm/ptrace-abi.h) which can be reached by including linux/ptrace.h, the comment in compel/include/uapi/ptrace.h says:
    We'd want to include both sys/ptrace.h and linux/ptrace.h, hoping that most definitions come from either one or another. Alas, on Alpine/musl both files declare struct ptrace_peeksiginfo_args, so there is no way they can be used together. Let's rely on libc one.
Since including linux/ptrace.h is not an option, define PTRACE_ARCH_PRCTL if it doesn't already exist.
An interesting point to note is that in sys/ptrace.h, PTRACE_ARCH_PRCTL is an enum value so the preprocessor doesn't know about it. PT_ARCH_PRCTL is the preprocessor symbol that matches the value of PTRACE_ARCH_PRCTL. So look for PT_ARCH_PRCTL to decide if PTRACE_ARCH_PRCTL is available or not.
Another interesting point to note is that AL2 ships with GCC 7 by default, which does not support the -mshstk option, causing other build failures. Luckily, it also ships GCC 10 which does have the option. Using GCC 10 lets the build succeed.
Fixes: fc683cb ("compel: shstk: save CET state when CPU supports it")
Signed-off-by: Pratyush Yadav <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
Duplicate string in irmap_scan_path_add, otherwise it will free before parsing next configuration input. [ avagin: handle errors of xstrdup ] Signed-off-by: Liu Hua <[email protected]> Signed-off-by: Andrei Vagin <[email protected]>
Sometimes due to sigblockmask inheritance cgroupd can inherit SIGTERM blocked. That will lead cgroupd ignoring SIGTERM from stop_cgroupd() and CRIU will get stuck due to waiting for never-stopping cgroupd. I see this happening in lxc-checkpoint, also saw this in OpenVZ jenkins on cgroup_inotify00 test. Signed-off-by: Pavel Tikhomirov <[email protected]>
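The inherited-mask problem can be reproduced in a few lines. The sketch below shows one way a daemon could guard against it by explicitly unblocking SIGTERM at startup; the commit message does not state that this is the fix applied, and `unblock_sigterm` is a hypothetical helper.

```python
import signal


def unblock_sigterm():
    """Unblock SIGTERM in the calling thread; return True if it was blocked."""
    previous = signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGTERM})
    return signal.SIGTERM in previous


# Simulate the inherited state: the signal mask survives fork() and exec(),
# so a child started by a parent that blocked SIGTERM starts with it blocked.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGTERM})

was_blocked = unblock_sigterm()
# Blocking an empty set changes nothing and returns the current mask.
still_blocked = signal.SIGTERM in signal.pthread_sigmask(signal.SIG_BLOCK, set())
```

A blocked SIGTERM stays pending instead of being delivered, which matches the hang described above: the sender waits for a process that never sees the signal.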
Before this fix, it could return MAP_FAILED which is ((void *) -1). Signed-off-by: Andrei Vagin <[email protected]>
It was added in v5.3-rc1~211^2~4^2~10. Fixes checkpoint-restore#2390 Signed-off-by: Andrei Vagin <[email protected]>
The CI tests with CentOS 7 have been disabled and removed [1,2]. This patch removes the obsolete Makefile targets for these tests. [1] checkpoint-restore@24bc083 [2] checkpoint-restore@f8466ca Signed-off-by: Radostin Stoyanov <[email protected]>
This patch extends CRIU dump with support for encryption of images using ChaCha20-Poly1305 authenticated encryption in combination with X.509 certificates. The '--encrypt' option can be used with the dump/pre-dump commands to enable this functionality. When this option has been specified during dump, the GnuTLS library will be used to load a public key from an X.509 certificate, and to generate a 256-bit random `token`. The token's value is then encrypted with the public key and the corresponding ciphertext is saved in `cipher.img`. During restore, if cipher.img exists in the images directory, the GnuTLS library will be used to load a private key from a corresponding PEM file to decrypt the token value. The token value is used with ChaCha20-Poly1305 to encrypt/decrypt all other CRIU images. The 256-bit token is used in combination with a 96-bit `nonce` and a 128-bit `tag` to protect data confidentiality and provide message authentication for each data entry. Example: criu dump --encrypt ... criu restore ... Signed-off-by: Radostin Stoyanov <[email protected]>
This patch extends ZDTM to run `criu dump` with the `--encrypt` option to test the encryption functionality of CRIU images. Signed-off-by: Radostin Stoyanov <[email protected]>
'opts' is defined in cr_options.h. This header will be included in a subsequent patch. We rename the local variable 'opts' to 'bpfmap_opts' to avoid variable shadowing. Signed-off-by: Radostin Stoyanov <[email protected]>
We calculate the total memory size needed for both keys and values and allocate a single contiguous memory region using a single mmap call. In a subsequent patch, this change would enable encrypting the combined memory region using a single pair of ChaCha20-Poly1305 tag and nonce. Signed-off-by: Radostin Stoyanov <[email protected]>
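The size calculation can be sketched as follows. The helper name, the parameter names, and the layout (all keys first, then all values) are assumptions for illustration; the commit message only states that one contiguous region is allocated with a single mmap call.

```python
import mmap


def alloc_keys_and_values(max_entries, key_size, value_size):
    """Map one anonymous region and return it with (start, end) offsets.

    Hypothetical helper: the keys-then-values layout is an assumption.
    """
    keys_len = max_entries * key_size
    values_len = max_entries * value_size
    region = mmap.mmap(-1, keys_len + values_len)  # single anonymous mapping
    keys_span = (0, keys_len)
    values_span = (keys_len, keys_len + values_len)
    return region, keys_span, values_span


# Example: 4 entries with 8-byte keys and 16-byte values.
region, keys_span, values_span = alloc_keys_and_values(4, 8, 16)
total = len(region)
region.close()
```

Because keys and values share one buffer, a later step can pass the whole region to a cipher in one call and store a single tag and nonce for it.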
This patch extends dump_one_bpfmap_data() with support for encryption. Signed-off-by: Radostin Stoyanov <[email protected]>
During checkpoint, the contents of ghost images and pipe data are splice()-ed between file descriptors. To enable encryption for this data we introduce `tls_encrypt_file_data()` and `tls_decrypt_file_data()`. These functions read data from the input file descriptor, perform encryption/decryption of the data, and write it to the corresponding output file descriptor. Signed-off-by: Radostin Stoyanov <[email protected]>
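Since splice() moves data inside the kernel without exposing it to user space, encrypting it requires an explicit read, transform, write loop. The sketch below shows the shape of such a loop. `transform_fd`, the 4096-byte chunk size, and `placeholder_cipher` are hypothetical, and the XOR stand-in is not encryption: it only keeps the example runnable with the standard library, where the patch uses GnuTLS.

```python
import os

CHUNK = 4096  # assumed buffer size


def transform_fd(fd_in, fd_out, transform):
    """Copy fd_in to fd_out through `transform`, one chunk at a time."""
    total = 0
    while True:
        chunk = os.read(fd_in, CHUNK)
        if not chunk:  # EOF on the input descriptor
            return total
        out = transform(chunk)
        written = 0
        while written < len(out):  # os.write() may write only part of the data
            written += os.write(fd_out, out[written:])
        total += len(chunk)


def placeholder_cipher(chunk):
    # NOT encryption: a reversible XOR stand-in for the real cipher call.
    return bytes(b ^ 0x5A for b in chunk)


src_r, src_w = os.pipe()
dst_r, dst_w = os.pipe()
os.write(src_w, b"pipe payload")
os.close(src_w)

copied = transform_fd(src_r, dst_w, placeholder_cipher)
os.close(dst_w)
result = placeholder_cipher(os.read(dst_r, CHUNK))  # undo the stand-in

os.close(src_r)
os.close(dst_r)
```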
This patch extends CRIT with the ability to decode encrypted images. When `cipher.img` is present, crit will load the corresponding private key (from /etc/pki/criu/private/key.pem), decrypt the cipher token and use it to decrypt the protobuf entries in the image that is being decoded. Signed-off-by: Radostin Stoyanov <[email protected]>
cr_system() and cr_system_userns() are used to run external executables such as tar, ip, and iptables. These external tools are used to create image files in 3rd party format (i.e., raw images). In order to encrypt the output of these tools, and to decrypt their input, we replace the corresponding input/output file descriptor with a pipe, and perform encryption/decryption of the data. Signed-off-by: Radostin Stoyanov <[email protected]>
We use the AES-XTS block cipher to encrypt memory pages as it is designed to encrypt fixed-size blocks of data (e.g. memory pages), allows the use of hardware acceleration available in modern CPUs, and uses a single initialization vector (IV), instead of a per-page nonce, to ensure that encrypting the same plaintext with the same key results in different ciphertexts. In particular, XTS uses two 256-bit AES keys. One key is used to perform block encryption, and the other is used to encrypt a so-called "tweak value". The encrypted tweak value is further modified (with a Galois polynomial function) and XOR-ed with both the plaintext and ciphertext of each block. This method ensures that encrypting multiple blocks with identical data will produce different ciphertext. Since CRIU restores memory pages in the restorer context, this PIE code cannot be linked with libraries such as GnuTLS to perform decryption. Instead, we introduce a helper process to decrypt memory page data. The restorer context communicates with this helper process using pipes. It sends the function arguments to be used by preadv() and receives back its return value. The decrypted data is transferred to the target address space with process_vm_writev. Suggested-by: Daiki Ueno <[email protected]> Signed-off-by: Radostin Stoyanov <[email protected]>
The AES-XTS cipher does not provide integrity verification. In this patch we add a verification mechanism based on the HMAC-SHA-256 algorithm. In order to support iterative checkpointing and memory deduplication with encrypted memory, and to avoid storing an HMAC for each memory page, we compute the XOR of the HMAC values of all memory pages and store this value in cipher.img. The XOR computation also allows us to address the problem that memory pages are read during restore in a different order than they are written during checkpoint. In addition, to ensure that memory pages are restored in the correct order, we include the PID and VMA address associated with each page in the HMAC computation. The following example illustrates the HMAC value computation:
    H_n = HMAC(PID + VMA + MEMORY + KEY)
    hmac_value = H_1 ^ H_2 ^ ... ^ H_n
- PID: PID associated with the memory page
- VMA: virtual memory address associated with the memory page
- KEY: secret key
- H_n: HMAC value of the n-th memory page
- hmac_value: value stored in cipher.img during checkpoint, and used for integrity verification during restore
Signed-off-by: Radostin Stoyanov <[email protected]>
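The aggregate described in this commit message can be sketched with the Python standard library. The byte encoding of PID and VMA (32-bit and 64-bit little-endian), the function names, and the sample key and addresses are assumptions for illustration; the patch itself computes the HMAC in C.

```python
import hashlib
import hmac
import struct


def page_hmac(key, pid, vma, page):
    """H_i: HMAC-SHA-256 over the PID, VMA address and page contents."""
    # Assumed encoding: PID as 32-bit, VMA as 64-bit, both little-endian.
    msg = struct.pack("<IQ", pid, vma) + page
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def aggregate_hmac(key, pages):
    """XOR the per-page HMACs; `pages` yields (pid, vma, data) tuples."""
    acc = bytes(hashlib.sha256().digest_size)  # 32 zero bytes
    for pid, vma, data in pages:
        acc = xor_bytes(acc, page_hmac(key, pid, vma, data))
    return acc


key = b"\x01" * 32  # placeholder key for the example
pages = [(100, 0x7F0000000000, b"A" * 4096), (100, 0x7F0000001000, b"B" * 4096)]

dump_value = aggregate_hmac(key, pages)
restore_value = aggregate_hmac(key, reversed(pages))  # pages read in another order

# Same contents placed at swapped addresses.
swapped = [(100, 0x7F0000001000, b"A" * 4096), (100, 0x7F0000000000, b"B" * 4096)]
swapped_value = aggregate_hmac(key, swapped)
```

XOR is commutative, so `restore_value` equals `dump_value` regardless of read order, while `swapped_value` differs because the address is part of each HMAC input. A real comparison would use a constant-time check such as `hmac.compare_digest`.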
Measure the time for data encryption and decryption with stream and block ciphers. Signed-off-by: Radostin Stoyanov <[email protected]>
This script, similar to ssh-keygen and certtool, makes it easier to generate and install certificate and key to enable encryption support with CRIU. Signed-off-by: Radostin Stoyanov <[email protected]>
rst0git force-pushed the encrypted-images branch from 6045665 to 3936cbf on July 10, 2024 12:52
 * restorer can wait() for it when the restore stage is done.
 */
ta->helpers = (pid_t *)rst_mem_align_cpos(RM_PRIVATE);
child = rst_mem_alloc(sizeof(*child), RM_PRIVATE);
Check failure: Code scanning / CodeQL, Inconsistent nullness check (Error)
The result of this call to rst_mem_alloc is not checked for null, but 91% of calls to rst_mem_alloc check for null.
This pull request extends CRIU with support for encrypted images. A new CLI option, -e|--encrypt, is used to enable this functionality with the dump command.

The implementation is based on the existing integration with GnuTLS, using ChaCha20-Poly1305 for protobuf and raw images, and AES-XTS for memory pages. The symmetric keys used for encryption are randomly generated, encrypted with a public key loaded from an X.509 certificate, and stored in cipher.img. During restore, if cipher.img exists, CRIU will load a corresponding private key from a PEM file and decrypt the symmetric keys.

Usage example:
The following figure shows the results of performance evaluation, where CRIUsec includes the changes in this pull request, CRIU is used without encryption as a baseline, and GnuPG, OpenSSL, age are alternative solutions used with post-dump action-script for comparison.