Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export appstream catalogue for preinstalled Flatpak apps #126

Merged

Conversation

wjt
Copy link
Member

@wjt wjt commented Nov 10, 2023

Although we have added the name and summary for each app to the JSON manifest, there is no substitute for having all the data. XML compresses very well!

It is better to capture this at build time rather than fetching it on the fly later because apps can change or disappear, but we would like to be able to describe exactly what is in any given image.

https://phabricator.endlessm.com/T35013

@wjt wjt force-pushed the T35013-export-appstream-catalog-for-preinstalled-apps branch from 54f58a5 to e9abef8 Compare November 10, 2023 16:28
config/defaults.ini Outdated Show resolved Hide resolved
@wjt wjt force-pushed the T35013-export-appstream-catalog-for-preinstalled-apps branch 2 times, most recently from f2dfac6 to 8b743a5 Compare November 10, 2023 17:38
@wjt wjt marked this pull request as ready for review November 10, 2023 17:38
@wjt wjt force-pushed the T35013-export-appstream-catalog-for-preinstalled-apps branch from 8b743a5 to f6da493 Compare November 10, 2023 17:42
@wjt
Copy link
Member Author

wjt commented Nov 10, 2023

Triggered https://ci.endlessos.org/job/image-build-amd64/28044 to test this

Copy link
Member

@dbnicholson dbnicholson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not using compose to build the catalog is probably a good idea as we know the components are legal enough.

hooks/image/70-flatpak-appstream-catalog Outdated Show resolved Hide resolved
@wjt wjt force-pushed the T35013-export-appstream-catalog-for-preinstalled-apps branch from f6da493 to 6a6a8b8 Compare November 13, 2023 16:15
@wjt
Copy link
Member Author

wjt commented Nov 13, 2023

Triggered https://ci.endlessos.org/job/image-build-amd64/28072 to test this again

Copy link
Member

@dbnicholson dbnicholson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems nice. I left some minor comments and questions, but this seems good enough to go if the test build succeeds.

hooks/image/70-flatpak-appstream-catalog Show resolved Hide resolved
hooks/image/70-flatpak-appstream-catalog Outdated Show resolved Hide resolved
hooks/image/70-flatpak-appstream-catalog Show resolved Hide resolved
Although we have added the name and summary for each app to the JSON
manifest, there is no substitute for having all the data.

It is better to capture this at build time rather than fetching it on
the fly later because apps can change or disappear, but we would like to
be able to describe exactly what is in any given image.

The Flatpak repos we use have already done the hard work of
validating every app and runtime's exported AppStream metainfo. So
we can use the local copy of each remote's AppStream to find the
<component> for each installed app and runtime, and concatenate these
into a new catalog. This also avoids any issues where our local version
of AppStream tools and libraries are more strict than those used on the
remotes.

XML compresses reasonably well. Running this script against the apps
installed on my development system, the uncompressed file is 7.0 MiB.
Compressing with pigz at the maximum compression level (as implemented
here) comes out at 2.0 MiB and takes 77ms. Compressing with zstd at its
default compression level (3) comes out at 1.9 MiB and takes 68ms.
Compressing with zstd at its maximum compression level (19 if you don't
pass --ultra which comes with some caveats) reduces the file to 1.4 MiB
but takes 3.7s. Honestly I was hoping that zstd would perform better so
I would have an excuse to start introducing it into eos-image-builder
but it's just not worth it for these minimal gains; plus, the
[AppStream Catalog XML specification][0] specifies GZip compression, so
there's that.

[0]: https://www.freedesktop.org/software/appstream/docs/chap-CatalogData.html

https://phabricator.endlessm.com/T35013
@wjt wjt force-pushed the T35013-export-appstream-catalog-for-preinstalled-apps branch from 6a6a8b8 to 676ba2d Compare November 14, 2023 10:21
Previously, the HTTP headers for the gzipped AppStream catalog on S3
looked like this:

    HTTP/1.1 200 OK
    Accept-Ranges: bytes
    Content-Length: 216930
    Content-Type: application/xml
    Date: Tue, 14 Nov 2023 10:40:50 GMT
    ETag: "88efbfd50d380deaf29d0c8c9a9a0b6f"
    Last-Modified: Mon, 13 Nov 2023 17:27:33 GMT
    Server: AmazonS3
    x-amz-expiration: expiry-date="Thu, 14 Dec 2023 00:00:00 GMT", rule-id="nightly"
    x-amz-id-2: jJi5BW3s1nMqQxHfEbUwFM0hnzvKgeHT4bRfOekIkCqOtkQhsIiIhwLapdhvvq5WEoh/1UdT3Zo=
    x-amz-request-id: 2VNRZHTWG471BZMQ
    x-amz-server-side-encryption: AES256
    x-amz-version-id: ARqOJCikc6hWy_b7n8a4Bj068vI1FWBo

Note that the Content-Type is application/xml, and there is no
Content-Encoding header. This is incorrect because the XML is
compressed. It makes web browsers sad because they try to parse the
compressed data directly as XML.

I could see two ways to fix this:

1. Set Content-Encoding: gzip so that user agents know that they are
   getting a compressed XML document.
2. Set Content-Type to application/gzip.

I liked the idea of the former, and tested manually reuploading
https://s3-us-west-2.amazonaws.com/images-dl.endlessm.com/nightly/eos-amd64-amd64/master/base/231113-161734/eos-master-amd64-amd64.231113-161734.base.appstream.xml.gz
with --content-encoding=gzip. This has the advantage that you can browse
the file in your web browser and it renders it nicely as formatted XML.
However, if you save the file you get bad results in at least two
browsers:

1. Firefox saves the original compressed file with the incorrect name
   eos-master-amd64-amd64.231113-161734.base.appstream.xml.gz.gz (note
   the double extension, even though the file has only one layer of
   compression)
2. Chromium saves the uncompressed file as
   eos-master-amd64-amd64.231113-161734.base.appstream.xml.gz (note the
   .gz extension even though the file is uncompresssed)

So we take the second route. This way both browsers just save the
compressed file to disk.

https://phabricator.endlessm.com/T35013
@wjt
Copy link
Member Author

wjt commented Nov 14, 2023

I pushed one more commit to make S3 speak the truth about the file's content-type.

Test build at https://ci.endlessos.org/job/image-build-amd64/28082

Copy link
Member

@dbnicholson dbnicholson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I agree on the Content-Type handling. We/I played around with setting Content-Encoding: gzip in the past. It think it would work correctly if you had Content-Type: application/xml and the filename didn't have the .gz suffix. The purpose of Content-Encoding is that the client is in charge of decoding on transfer, which isn't what you wanted here. The nice thing when you do it in S3 is that clients can request with Accept-Encoding: gzip and S3 doesn't have to spend time compressing it on the fly. But, again, the intention here is that the user receives a gzipped XML file.

@dbnicholson dbnicholson merged commit d7a19cb into master Nov 16, 2023
2 checks passed
@dbnicholson dbnicholson deleted the T35013-export-appstream-catalog-for-preinstalled-apps branch November 16, 2023 06:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants