v255 batch #403

bluca · 2024-05-26T12:09:55Z

No description provided.

… is missing Currently, SLEEP_NOT_ENOUGH_SWAP_SPACE (ENOSPC) is returned on all sorts of error conditions. But one important case that's worth differentiating from that is when the resume device is manually specified yet missing. Closes #32644 (cherry picked from commit 40eb83a)

Otherwise we might fail if PID 1 is currently accessing these files. Fixes #32692 (hopefully) (cherry picked from commit 65690de)

This can change between the call to homectl inspect and userdbctl user so let's ignore it along with the other disk fields. Fixes #32727 (cherry picked from commit 6c5d4f0)

This fixes the build with old toolchains (kernel headers from Linux < 4.2), which do not have a definition for NFPROTO_NETDEV. (cherry picked from commit 41a94ae)

(cherry picked from commit 4591eff)

…acquired prefix Previously, even if a DNS server was in the acquired prefix, the route to the server might have had a gateway address. This makes the prefix route, which is always configured, be handled the same way as static routes, so that no gateway is used if the prefix route is the most suitable route to the destination. The same change is also applied to routes to NTP servers and to semi-static routes. Fixes a regression introduced by 0ce86f5. Fixes #32715. (cherry picked from commit 0f3116f)

(cherry picked from commit e97bb36)

Also this makes several checks more strict. (cherry picked from commit 24e3792)

This should be useful when the test runs as a service, e.g. when running on a mkosi image. (cherry picked from commit e92d7b7)

This adds checks for the kernel bug caused by torvalds/linux@3ddc223, which will be fixed by https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/ (cherry picked from commit d22f2fb)

Follow-up for 9de324c. (cherry picked from commit a937fa9)

The state might be "freezing-by-parent" as well so let's take that into account. Fixes #32746 (cherry picked from commit 034e85c)

… destroy a curl context on exit If we destroy both an event loop and a curl context object at the same time, then we get into this weird situation where curl wants us to reconfigure a timeout event source right before destruction, which sd-event will refuse, however, since it is already being shut down. Hence, catch that and simply don't bother adjusting the timeout, since we cannot get back from there anyway. (cherry picked from commit c5ecf09)

The test-event test seems to be taking quite a bit more time than the other 'simple tests', which usually complete in < 1s. In case of a slower or loaded machine the default 30s timeout is not enough. (cherry picked from commit 381c3b6)

We want to enable running tests as part of the build, but our builds run in VMs with networking disabled. (cherry picked from commit 19614a0)

(cherry picked from commit f7a6418)

Fixes #32808. (cherry picked from commit 05e64ea)

(cherry picked from commit d02a41a)

If tests are run during build time, without an already installed systemd they fail to resolve the sysusersdir and tpmfilesdir pkg-config variables. (cherry picked from commit 2aee829)

Fixes #32837. (cherry picked from commit 60dbecf)

.osrel is also optional, but sd-boot and bootctl require it. So, let's keep the .osrel section mandatory, at least for now. Fixes #32774. (cherry picked from commit 2e93331)

Otherwise we log a noisy error when we get ECONNRESET. (cherry picked from commit 2540036)

Avoid regressions like systemd/systemd#32856 Follow-up for 2ef7cdc (cherry picked from commit 88e7911)

Previously, one of the test routes had the same address as both destination and gateway. Even for a test case, that's super spurious. Let's use a different address. (cherry picked from commit cd65075)

(cherry picked from commit cad510b)

(cherry picked from commit 5573263)

Fixes #32695. (cherry picked from commit 71f0487)

Otherwise, expected lines may not be processed or not sync()ed to disk. Fixes #32712. (cherry picked from commit c22a112)

Fixes #32731. (cherry picked from commit 272aae3)

Fixes #32697. (cherry picked from commit 0664c1c)

Due to a bug in kernel 6.9 caused by torvalds/linux@8debcf5, the net_id udev builtin does not work for netdevsim interfaces. So, eni99np1 cannot be used with kernel 6.9 anymore. Workaround for #32910. (cherry picked from commit f1f1be7)

Makes it easier to switch for debugging (cherry picked from commit 5002b57)

Helped track down issue with session tracking (cherry picked from commit c275e01)

When running inside an LXC container the 'su' process will not be part of any unit or slice. manager_get_user_by_pid() which was used until v255 (included) does not fail if it cannot find a unit/slice, but simply returns 'not found'. Do the same in manager_get_session_by_pidref(). This was not detected as Semaphore CI does not reboot the testbed before the logind test, so the session is started by the old logind from the base distro, instead of the one being tested. Follow-up for 8494f56 Follow-up for 5099a50 Fixes systemd/systemd#32929 (cherry picked from commit eb56b56)

Otherwise, journal entries that arrive during the sleep may not be read. Follow-up for c22a112. (cherry picked from commit 123acb2)

Fixes #32936. (cherry picked from commit 125cca1)

Coverity gets confused since the iterator change, so add an assert to indicate that this is allocated if n_old_groups is > 0. CID#1545922 Follow-up for 125cca1 (cherry picked from commit 5e30e6e)

Fixes systemd/systemd#32932 (comment). (cherry picked from commit f8ef1df)

Addresses: systemd/systemd#32907 (comment) (cherry picked from commit d3c14f7)

…rted by btrfs Fixup for e3828d7, as requested in systemd/systemd#32892 (comment). (cherry picked from commit 055b465)

(cherry picked from commit d735753)

Follow-up for ade0789. The change in behavior was partly intentional, as I think if both --wait and --pty are used, manually disconnecting from the PTY forwarder should not result in systemd-run exiting with a "Finished with ..." log. But we should check for --wait here. Closes #32953 (cherry picked from commit 2b4a691)

…pipe, and --wait (cherry picked from commit d73a47d)

Fixes systemd/systemd#32680 (comment).
===
May 21 02:45:08 TEST-74-AUX-UTILS.sh[2475]: + mountpoint /tmp/tmp.eaRV7lSbX2/mnt
May 21 02:45:08 TEST-74-AUX-UTILS.sh[2476]: /tmp/tmp.eaRV7lSbX2/mnt is not a mountpoint
May 21 02:45:08 TEST-74-AUX-UTILS.sh[2449]: + systemd-mount /dev/loop0 /tmp/tmp.eaRV7lSbX2/mnt
May 21 02:45:08 systemd-mount[2477]: Failed to start transient mount unit: Unit tmp-tmp.eaRV7lSbX2-mnt.mount was already loaded or has a fragment file.
===
(cherry picked from commit 4a8ca3c)

Hopefully fixes issues like systemd/systemd#32680 (comment) systemd/systemd#32680 (comment) (cherry picked from commit e504f5a)

Otherwise, when stopping the service, the last command may not be started yet, and the service manager may not send SIGTERM signal to the last command, but send SIGKILL on timeout.
===
May 21 08:23:24 test19-exit-cgroup.sh[437]: + disown
May 21 08:23:24 test19-exit-cgroup.sh[438]: + sleep infinity
May 21 08:23:24 test19-exit-cgroup.sh[437]: + systemd-notify --ready
May 21 08:23:24 test19-exit-cgroup.sh[437]: + sleep infinity
May 21 08:23:24 test19-exit-cgroup.sh[441]: + systemctl stop one
May 21 08:23:24 test19-exit-cgroup.sh[443]: + sleep infinity
(snip)
May 21 08:23:24 systemd[1]: one.service: Changed running -> stop-sigterm
May 21 08:23:24 systemd[1]: Stopping one.service - /tmp/test19-exit-cgroup.sh "systemctl stop one"...
May 21 08:23:24 systemd[1]: Received SIGCHLD from PID 441 (systemctl).
May 21 08:23:24 systemd[1]: Child 437 (bash) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 437 belongs to one.service.
May 21 08:23:24 systemd[1]: one.service: Main process exited, code=killed, status=15/TERM (success)
May 21 08:23:24 systemd[1]: Child 439 (bash) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 439 belongs to one.service.
May 21 08:23:24 systemd[1]: Child 441 (systemctl) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 441 belongs to one.service.
May 21 08:23:24 systemd[1]: Child 442 (bash) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 442 belongs to one.service.
(snip)
May 21 08:24:54 systemd[1]: one.service: State 'stop-sigterm' timed out. Killing.
May 21 08:24:54 systemd[1]: one.service: Killing process 443 (sleep) with signal SIGKILL.
May 21 08:24:54 systemd[1]: one.service: Changed stop-sigterm -> stop-sigkill
May 21 08:24:54 systemd[1]: Received SIGCHLD from PID 443 (sleep).
May 21 08:24:54 systemd[1]: Child 443 (sleep) died (code=killed, status=9/KILL)
May 21 08:24:54 systemd[1]: one.service: Child 443 belongs to one.service.
May 21 08:24:54 systemd[1]: one.service: Control group is empty.
May 21 08:24:54 systemd[1]: one.service: Failed with result 'timeout'.
May 21 08:24:54 systemd[1]: one.service: Service restart not allowed.
May 21 08:24:54 systemd[1]: one.service: Changed stop-sigkill -> failed
May 21 08:24:54 systemd[1]: one.service: Job 738 one.service/stop finished, result=done
May 21 08:24:54 systemd[1]: Stopped one.service - /tmp/test19-exit-cgroup.sh "systemctl stop one".
May 21 08:24:54 systemd[1]: one.service: Unit entered failed state.
May 21 08:24:54 systemd[1]: one.service: Releasing resources...
===
Fixes #32947. (cherry picked from commit a5edb9b)

When cryptsetup runs, udevd detects two inotify events for the underlying device. On a sufficiently fast host, the expected symlinks based on UUID and disk label are created by the second event. While processing a uevent for a device, udevd disables the inotify watch for that device. On a slow system, the second inotify event may arrive while a udev worker is still processing the synthesized uevent triggered by the first inotify event. Hence, no synthesized uevent is generated for the second inotify event, and the expected symlinks are never created. To prevent the issue, we need to lock the device while the cryptsetup command is running. Fixes #32913. (cherry picked from commit be43c9b)
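The idea of holding a lock on the device while a command runs can be sketched in C as follows. This is only an illustration, not the change made by the commit: it assumes the BSD-lock convention for block devices (an exclusive `flock()` on the device node), and `run_with_device_locked()`, `locked_fn_t` and `probe_lock_held()` are hypothetical names introduced here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_CLOEXEC, flock() and mkstemp() under strict compiler modes */
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical callback type: the work to do while the lock is held. */
typedef int (*locked_fn_t)(void *userdata);

/* Hypothetical helper: take an exclusive BSD lock on the device node, run fn(),
 * then release the lock. Returns fn()'s result, or a negative errno on failure. */
int run_with_device_locked(const char *devnode, locked_fn_t fn, void *userdata) {
        int fd, r;

        assert(devnode);
        assert(fn);

        fd = open(devnode, O_RDONLY | O_CLOEXEC | O_NONBLOCK);
        if (fd < 0)
                return -errno;

        if (flock(fd, LOCK_EX) < 0) {
                r = -errno;
                close(fd);
                return r;
        }

        r = fn(userdata);

        close(fd); /* closing the only descriptor for this open file releases the lock */
        return r;
}

/* Illustrative callback: returns 1 if someone else holds a BSD lock on the
 * given path, 0 if the lock is free, or a negative errno on failure. */
int probe_lock_held(void *userdata) {
        const char *path = userdata;
        int fd, r;

        fd = open(path, O_RDONLY | O_CLOEXEC | O_NONBLOCK);
        if (fd < 0)
                return -errno;

        if (flock(fd, LOCK_EX | LOCK_NB) < 0)
                r = errno == EWOULDBLOCK ? 1 : -errno;
        else
                r = 0;

        close(fd);
        return r;
}
```

In a test script the same effect would more likely come from a wrapper command around cryptsetup; the C form only makes the lock's lifetime explicit: it is taken before the command starts and dropped after it returns.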

Follow-up for a610ba0. Fixes #32890. (cherry picked from commit 87ed87e)

As per the documentation, EACCES is only returned when F_SETLK is used, and only on some platforms, which doesn't seem to include Linux: https://github.com/torvalds/linux/blob/master/fs/locks.c

F_OFD_SETLK is documented to only return EAGAIN, and F_SETLKW/F_OFD_SETLKW are blocking operations, so this logic doesn't apply to them in the first place. Hence, only automatically convert EACCES into EAGAIN for F_SETLK operations, and propagate the original error in the other cases.

This is important because in some cases we catch permission errors and gracefully fall back, which is not possible if the original error is lost. This is an issue in practice because, due to a kernel bug present before v6.2, AppArmor denies locking on file descriptors to LXC containers. We support all currently maintained LTS kernels, including v6.1, where despite a lot of effort and attempts over almost a year, the bugfix still hasn't been backported, as it is complex and requires large changes to AppArmor.

On affected kernels, all services running with PrivateNetwork=yes fail and do not recover, instead of the normal behaviour of gracefully downgrading to PrivateNetwork=no. The integration tests in the Debian CI fail due to this issue: https://ci.debian.net/packages/s/systemd/testing/arm64/46828037/ (cherry picked from commit 06384eb)
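The rule described above (convert EACCES to EAGAIN only for F_SETLK, propagate everything else unchanged) can be sketched as below. This is a minimal illustration under stated assumptions, not the patched systemd code: `lock_errno_normalize()` and `try_lock()` are hypothetical names, and the whole-file lock range is chosen only for simplicity.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* keeps POSIX declarations visible under strict compiler modes */
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: decide which errno to report after fcntl(fd, cmd, ...)
 * failed with 'err'. Only F_SETLK may report "lock is held" as EACCES, so only
 * there is EACCES folded into EAGAIN. For every other command, EACCES is
 * passed through, so callers can still see a genuine permission error. */
int lock_errno_normalize(int cmd, int err) {
        if (cmd == F_SETLK && err == EACCES)
                return EAGAIN;

        return err;
}

/* Hypothetical wrapper: request a lock of the given type (F_RDLCK, F_WRLCK or
 * F_UNLCK) on the whole file. Returns 0 on success, negative errno on failure. */
int try_lock(int fd, int cmd, short type) {
        struct flock fl = {
                .l_type = type,
                .l_whence = SEEK_SET,
                .l_start = 0,
                .l_len = 0, /* 0 means "until end of file" */
        };

        assert(fd >= 0);

        if (fcntl(fd, cmd, &fl) < 0)
                return -lock_errno_normalize(cmd, errno);

        return 0;
}
```

With this shape, a caller that retries on `-EAGAIN` and falls back on `-EACCES` keeps working for F_SETLK, while a denial coming from a blocking or OFD lock request is no longer mistaken for contention.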

When running in LXC with AppArmor we'll most likely get an error when creating a network namespace due to a kernel regression in < v6.2 affecting AppArmor, resulting in denials. Like other tests, avoid failing in case of permission issues and handle it gracefully. (cherry picked from commit 6ab21f2)

We want to avoid reinitialization of our global variables with static storage duration in case we get dlopened multiple times by the same application. This will avoid potential resource leaks that could have happened otherwise (e.g. leaking journal socket fd). (cherry picked from commit 9d8533b)

Before:
/etc/kernel/install.conf:6: Unknown key name 'asdf' in section '(null)', ignoring.
After:
/etc/kernel/install.conf:6: Unknown key 'asdf', ignoring.
Also make the message a bit better. (cherry picked from commit 600a740)

(cherry picked from commit 5f5ee2e)

So, we need to try to read timezone several times. Also, on failure, show journal of timedated instead of hostnamed, as the timezone is handled by timedated. Hopefully fixes #33007. (cherry picked from commit 1ef586a)

See also: https://lore.kernel.org/r/[email protected] (cherry picked from commit 100bed7)

keszybz · 2024-05-27T07:52:01Z

CI failures appear unrelated.

YHNdnzj and others added 30 commits May 26, 2024 12:11

TEST-81-GENERATORS: Do lazy unmounts

88371d8

Otherwise we might fail if PID 1 is currently accessing these files. Fixes #32692 (hopefully) (cherry picked from commit 65690de)

TEST-46-HOMED: Ignore "Disk Usage" field as well

d8173fc

This can change between the call to homectl inspect and userdbctl user so let's ignore it along with the other disk fields. Fixes #32727 (cherry picked from commit 6c5d4f0)

basic/linux: Copy netfilter.h to the source tree

b23c636

This fixes the build with old toolchains (kernel headers from Linux < 4.2), which do not have a definition for NFPROTO_NETDEV. (cherry picked from commit 41a94ae)

test: add basic tests for in_addr_prefix_covers_full()

3b4224b

(cherry picked from commit 4591eff)

test-network: do not fail if macvlan module is not available

24ed1bc

(cherry picked from commit e97bb36)

test-network: do not fail when /etc/protocols does not exist

5014337

Also this makes several checks more strict. (cherry picked from commit 24e3792)

test-network: introduce --no-journal option

8612a02

This should be useful when the test runs as a service, e.g. when running on a mkosi image. (cherry picked from commit e92d7b7)

test-network: check existence of kernel bug

c9ca1ad

This adds checks for the kernel bug caused by torvalds/linux@3ddc223, which will be fixed by https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/ (cherry picked from commit d22f2fb)

libcrypt-util: fix wrong errno value assignment

92776aa

Follow-up for 9de324c. (cherry picked from commit a937fa9)

TEST-38-FREEZER: Relax regex a little

720c07e

The state might be "freezing-by-parent" as well so let's take that into account. Fixes #32746 (cherry picked from commit 034e85c)

sd-event: increase test-event timeout to 120s

8a9ff4b

The test-event test seems to be taking quite a bit more time than the other 'simple tests', which usually complete in < 1s. In case of a slower or loaded machine the default 30s timeout is not enough. (cherry picked from commit 381c3b6)

libsystemd-network: skip dhcp server test in case of EAFNOSUPPORT

8aea22b

We want to enable running tests as part of the build, but our builds run in VMs with networking disabled. (cherry picked from commit 19614a0)

libsystemd-network: remove double initialization

7288f8b

(cherry picked from commit f7a6418)

home: fix ownership of files copied from skeleton directory

25a979d

Fixes #32808. (cherry picked from commit 05e64ea)

core: Fix assertion in parse_smbios_strings()

f3b72f3

(cherry picked from commit d02a41a)

test/test-rpm-macros.sh: add build directory to pkg-config search path

b7e3782

If tests are run during build time, without an already installed systemd they fail to resolve the sysusersdir and tpmfilesdir pkg-config variables. (cherry picked from commit 2aee829)

systemctl: fix "applying zero offset to null pointer" UBSan error

0d9929e

Fixes #32837. (cherry picked from commit 60dbecf)

pe-binary: .initrd section is optional for UKI

f917b42

.osrel is also optional, but sd-boot and bootctl require it. So, let's keep the .osrel section mandatory, at least for now. Fixes #32774. (cherry picked from commit 2e93331)

journal-importer: Consider ECONNRESET as EOF

4b73f85

Otherwise we log a noisy error when we get ECONNRESET. (cherry picked from commit 2540036)
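A minimal sketch of treating ECONNRESET as end-of-file when reading from a connection, so that the caller finishes cleanly instead of logging an error. The function names are hypothetical and this is not the journal-importer code itself.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* keeps POSIX declarations visible under strict compiler modes */
#endif

#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map the errno of a failed read() to a result code.
 * A peer reset is reported as 0 (EOF); anything else as a negative errno. */
int read_error_to_result(int err) {
        return err == ECONNRESET ? 0 : -err;
}

/* Hypothetical wrapper around read(): returns the number of bytes read,
 * 0 on EOF (including a reset connection), or a negative errno. */
ssize_t read_reset_as_eof(int fd, void *buf, size_t size) {
        ssize_t n;

        assert(buf);

        n = read(fd, buf, size);
        if (n < 0)
                return read_error_to_result(errno);

        return n;
}
```

A caller can then handle "peer closed cleanly" and "peer reset the connection" through the same `== 0` branch and reserve error logging for negative return values.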

test: add coverage for Compress=yes config option

7747f96

Avoid regressions like systemd/systemd#32856 Follow-up for 2ef7cdc (cherry picked from commit 88e7911)

test-network: use different destination from gateway

49c8968

Previously, one of the test routes had the same address as both destination and gateway. Even for a test case, that's super spurious. Let's use a different address. (cherry picked from commit cd65075)

test: do not fill journal with "wait"

4cd5914

(cherry picked from commit cad510b)

test: do not fill journal with diff

7b7a893

(cherry picked from commit 5573263)

test: wait for partition processed by udevd

8bbd3e2

Fixes #32695. (cherry picked from commit 71f0487)

test: sync journal before reading journal

ab9f8ea

Otherwise, expected lines may not be processed or not sync()ed to disk. Fixes #32712. (cherry picked from commit c22a112)

test: wait for slice unit being (de)activated

3fb2019

Fixes #32731. (cherry picked from commit 272aae3)

test: wait for partition device being processed by udevd

eda2c2e

Fixes #32697. (cherry picked from commit 0664c1c)

bluca requested a review from keszybz May 26, 2024 12:15

yuwata and others added 25 commits May 26, 2024 14:05

semaphore: use variable for Salsa repo URL

4e1873c

Makes it easier to switch for debugging (cherry picked from commit 5002b57)

logind: add one more debug log

faf7bde

Helped track down issue with session tracking (cherry picked from commit c275e01)

test: call journalctl --sync just before reading journals

0237ff8

Otherwise, journal entries that arrive during the sleep may not be read. Follow-up for c22a112. (cherry picked from commit 123acb2)

btrfs-util: check current offset before read

cef19f1

Fixes #32936. (cherry picked from commit 125cca1)

btrfs-util: add assert to fix Coverity warning

980d965

Coverity gets confused since the iterator change, so add an assert to indicate that this is allocated if n_old_groups is > 0. CID#1545922 Follow-up for 125cca1 (cherry picked from commit 5e30e6e)

test: extend timeout for DHCP/NDisc tests

f286e81

Fixes systemd/systemd#32932 (comment). (cherry picked from commit f8ef1df)

test: add a brief comment for the chattr check

552c269

Addresses: systemd/systemd#32907 (comment) (cherry picked from commit d3c14f7)

shared/mountpoint-util: for old kernels, assume "norecovery" is suppo…

134d51f

…rted by btrfs Fixup for e3828d7, as requested in systemd/systemd#32892 (comment). (cherry picked from commit 055b465)

ptyfwd: add missing assertions for pty_forward_new

bffb7ce

(cherry picked from commit d735753)

man/systemd-run: beef up info regarding interaction between --pty, --…

f11f8cb

…pipe, and --wait (cherry picked from commit d73a47d)

test: wait for loop/backing_file attribute being removed

43c7ebb

Hopefully fixes issues like systemd/systemd#32680 (comment) systemd/systemd#32680 (comment) (cherry picked from commit e504f5a)

test: also flush and rotate journal before read

958215f

Follow-up for a610ba0. Fixes #32890. (cherry picked from commit 87ed87e)

man: mention that NFTSet is only available for system services

af65d03

(cherry picked from commit 5f5ee2e)

test: applying timezone is asynchronous

f1ae3ac

So, we need to try to read timezone several times. Also, on failure, show journal of timedated instead of hostnamed, as the timezone is handled by timedated. Hopefully fixes #33007. (cherry picked from commit 1ef586a)

blockdev-util: "partscan" sysattr now directly shows the enabled state

7cdfa27

See also: https://lore.kernel.org/r/[email protected] (cherry picked from commit 100bed7)

bluca force-pushed the v255-stable branch from 289c679 to 7cdfa27 May 26, 2024 13:05

keszybz approved these changes May 27, 2024


keszybz merged commit 41fb19e into systemd:v255-stable May 27, 2024
42 of 44 checks passed
