Recreate /var #143

Merged
merged 5 commits into from
Dec 19, 2023

Conversation

wjt
Member

@wjt wjt commented Dec 19, 2023

Sadly, https://github.com/endlessm/eos-ostree-builder/pull/189 broke the image builder: we create the buildroot by deploying an ostree, and not all of the steps that set up /var on a fresh system are performed here.

https://phabricator.endlessm.com/T4482

@wjt
Member Author

wjt commented Dec 19, 2023

This is a draft because the build now fails later on while running update-catalog, triggered by some package installing an SGML catalog I guess:

Processing triggers for sgml-base (1.31) ...
cannot open /var/lib/sgml-base/supercatalog.new for writing: No such file or directory at /usr/sbin/update-catalog line 313.
dpkg: error processing package sgml-base (--configure):
 installed sgml-base package post-installation script subprocess returned error exit status 2
Errors were encountered while processing:
 docutils-common
 python3-docutils
 awscli
 sgml-base

@wjt wjt force-pushed the T4482-recreate-var branch from 5973ac8 to aaa5698 Compare December 19, 2023 10:49
@wjt
Member Author

wjt commented Dec 19, 2023

Alright, after creating a load of random files, I now get this far:

+ 10:45:25 eib_image: ostree --repo=/var/cache/eos-image-builder/tmp/ostree-co/ostree/repo pull-local --disable-fsync --remote=eos /var/cache/eos-image-builder/content/ostree/eos os/eos/amd64/master
9079 metadata, 100452 content objects imported; 0 bytes content written
+ 10:45:31 eib_image: ostree --repo=/var/cache/eos-image-builder/tmp/ostree-co/ostree/repo pull eos ostree-metadata
error: open(O_TMPFILE): No such file or directory

Since a8d0732 we create the buildroot
with ostree, not mmdebstrap.

https://phabricator.endlessm.com/T4482
@wjt wjt force-pushed the T4482-recreate-var branch from aaa5698 to 0172f66 Compare December 19, 2023 11:04
@wjt
Member Author

wjt commented Dec 19, 2023

This is effectively the same change as https://github.com/endlessm/eos-container-builder/pull/26

Comment on lines +192 to +196
# update-catalog requires /var/lib/sgml-base to exist
os.makedirs(os.path.join(self.builddir, 'var/lib/sgml-base'))

# update-xmlcatalog requires /var/lib/xml-core to exist
os.makedirs(os.path.join(self.builddir, 'var/lib/xml-core'))
Member Author

@wjt wjt Dec 19, 2023

The comment in run-build seems prescient:

        # Pull the ostree. Ideally the eosminbase ostree would be used to
        # minimize the buildroot, but that's not released and would fail with
        # --use-production-ostree.

These packages are not part of eosminbase; so I believe these steps would not be necessary if we used that for the buildroot.

Member Author

Actually I have a vague memory of seeing errors from sgml-base in my toolboxes.

Member

Are the errors fatal? Ideally those would have tmpfiles snippets so we didn't need to do anything like this. Or the scripts would, you know, attempt to create the directories instead of assuming they exist? But meh, this is fine.

Member Author

Yeah, unfortunately they are fatal, and a trigger failing is in turn fatal to the apt-get install invocation.

I agree that ideally they would have tmpfiles snippets – though I think the Way of the Future in OSTree is to put it in /usr/share/factory or whatever it's called?
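
To illustrate the tmpfiles approach discussed in this thread, here is a minimal sketch that renders tmpfiles.d `d` (create directory) lines for the two directories in question. The helper name `render_tmpfiles_snippet` and the mode and ownership defaults are assumptions for illustration; nothing like this ships in sgml-base, xml-core, or this PR.

```python
def render_tmpfiles_snippet(directories, mode="0755", user="root", group="root"):
    """Render tmpfiles.d 'd' lines, one per directory path.

    Hypothetical helper: the mode, user and group defaults are assumptions.
    """
    lines = []
    for path in directories:
        # Fields: type, path, mode, user, group, age ('-' means no age-based cleanup)
        lines.append(f"d {path} {mode} {user} {group} -")
    return "\n".join(lines) + "\n"


snippet = render_tmpfiles_snippet(["/var/lib/sgml-base", "/var/lib/xml-core"])
print(snippet, end="")
```

With snippets like these installed by the packages themselves, the directories would be recreated whenever `/var` is populated, and the explicit `os.makedirs` calls would no longer be needed.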

@wjt wjt force-pushed the T4482-recreate-var branch from 0172f66 to b3059c0 Compare December 19, 2023 11:19
wjt added 4 commits December 19, 2023 11:20
We recently merged a change to completely wipe `/var` in the ostree,
replacing it with an empty directory. This is because on a real system,
`/var` from the ostree is not used: instead a persistent, read-write
directory is mounted over it.

It turns out that `apt update` is happy to create its own directory
hierarchy below `/var/lib`, but if `/var/lib` itself does not exist it
fails with:

    E: List directory /var/lib/apt/lists/partial is missing. - Acquire (2: No such file or directory)

Similarly, installing packages requires `/var/lib/dpkg` to exist, but we
have moved this to `/usr`:

    E: Could not open lock file /var/lib/dpkg/lock-frontend - open (2: No such file or directory)
    E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), are you root?

On a real Endless OS system, `/var/lib` is initialised at boot by
`systemd-tmpfiles`. Do the same here. Move the dpkg database back to its
rightful home. Remove the tmpfiles snippet that eos-ostree-builder adds
to deal with the changes we are undoing.

https://phabricator.endlessm.com/T4482
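
The step described above, populating `/var` in the buildroot the way a booted system would, could look roughly like the sketch below. It assumes `systemd-tmpfiles` is available on the build host and uses its `--create`, `--root=` and `--prefix=` options. The function names and the exact set of flags are illustrative assumptions, not the code merged in this PR.

```python
import subprocess


def tmpfiles_create_command(builddir, prefix="/var"):
    """Build a systemd-tmpfiles command line that operates on builddir.

    Hypothetical helper: the flag selection is an assumption.
    """
    return [
        "systemd-tmpfiles",
        "--create",             # create the files and directories the rules describe
        f"--root={builddir}",   # resolve rules and paths relative to the buildroot
        f"--prefix={prefix}",   # only apply rules for paths under this prefix
    ]


def populate_var(builddir):
    """Run systemd-tmpfiles against the buildroot; raises CalledProcessError on failure."""
    subprocess.run(tmpfiles_create_command(builddir), check=True)


print(tmpfiles_create_command("/path/to/buildroot"))
```

Keeping command construction separate from execution makes the invocation easy to inspect or log before anything touches the buildroot.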
Installing an SGML catalog causes update-catalog to be run. This Perl
script requires `/var/lib/sgml-base` to exist. This directory is part of
the `sgml-base` package but of course is deleted with the rest of `/var`.

https://phabricator.endlessm.com/T4482
Installing, I guess, an XML DTD causes update-xmlcatalog to be run.
This Perl script requires `/var/lib/xml-core` to exist. This directory
is part of the `xml-core` package but of course is deleted with the rest
of `/var`.

https://phabricator.endlessm.com/T4482
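
As a sketch of the directory recreation these two commits describe, the variant below creates both directories under a buildroot and passes `exist_ok=True` so that repeated runs do not fail. The function name `ensure_var_dirs` and the use of `exist_ok=True` are assumptions for illustration; the diff in this PR calls `os.makedirs` without it.

```python
import os
import tempfile


def ensure_var_dirs(builddir, relative_dirs=("var/lib/sgml-base", "var/lib/xml-core")):
    """Create the given directories below builddir, tolerating existing ones.

    Hypothetical helper: not the code merged in this PR.
    """
    created = []
    for rel in relative_dirs:
        path = os.path.join(builddir, rel)
        # exist_ok=True makes the call idempotent across repeated builds
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created


# Exercise the helper twice against a throwaway directory standing in for the buildroot
with tempfile.TemporaryDirectory() as builddir:
    ensure_var_dirs(builddir)
    paths = ensure_var_dirs(builddir)
    all_exist = all(os.path.isdir(p) for p in paths)
```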
@wjt wjt force-pushed the T4482-recreate-var branch from b3059c0 to e38381b Compare December 19, 2023 11:20
@wjt wjt marked this pull request as ready for review December 19, 2023 11:34
@wjt wjt requested a review from dbnicholson December 19, 2023 11:34
Member

@dbnicholson dbnicholson left a comment

Thanks for this. I did not anticipate this fallout, even though I should have, since I worked through it in the container builder and this is basically the same thing.

A future improvement would be to actually use the containers instead of bespoke chroot setups. Then the container builder would be the one place where you'd do the conversion from immutable OS to mutable OS.

@dbnicholson dbnicholson merged commit f7b216b into master Dec 19, 2023
2 checks passed
@dbnicholson dbnicholson deleted the T4482-recreate-var branch December 19, 2023 17:12
@wjt
Member Author

wjt commented Dec 19, 2023

> A future improvement would be to actually use the containers instead of bespoke chroot setups. Then the container builder would be the one place where you'd do the conversion from immutable OS to mutable OS.

That would indeed be good!
