-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Export appstream catalogue for preinstalled Flatpak apps #126
Export appstream catalogue for preinstalled Flatpak apps #126
Conversation
54f58a5
to
e9abef8
Compare
f2dfac6
to
8b743a5
Compare
8b743a5
to
f6da493
Compare
Triggered https://ci.endlessos.org/job/image-build-amd64/28044 to test this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not using compose to build the catalog is probably a good idea as we know the components are legal enough.
f6da493
to
6a6a8b8
Compare
Triggered https://ci.endlessos.org/job/image-build-amd64/28072 to test this again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems nice. I left some minor comments and questions, but this seems good enough to go if the test build succeeds.
Although we have added the name and summary for each app to the JSON manifest, there is no substitute for having all the data. It is better to capture this at build time rather than fetching it on the fly later because apps can change or disappear, but we would like to be able to describe exactly what is in any given image. The Flatpak repos we use have already done the hard work of validating every app and runtime's exported AppStream metainfo. So we can use the local copy of each remote's AppStream to find the <component> for each installed app and runtime, and concatenate these into a new catalog. This also avoids any issues where our local version of AppStream tools and libraries are more strict than those used on the remotes. XML compresses reasonably well. Running this script against the apps installed on my development system, the uncompressed file is 7.0 MiB. Compressing with pigz at the maximum compression level (as implemented here) comes out at 2.0 MiB and takes 77ms. Compressing with zstd at its default compression level (3) comes out at 1.9 MiB and takes 68ms. Compressing with zstd at its maximum compression level (19 if you don't pass --ultra which comes with some caveats) reduces the file to 1.4 MiB but takes 3.7s. Honestly I was hoping that zstd would perform better so I would have an excuse to start introducing it into eos-image-builder but it's just not worth it for these minimal gains; plus, the [AppStream Catalog XML specification][0] specifies GZip compression, so there's that. [0]: https://www.freedesktop.org/software/appstream/docs/chap-CatalogData.html https://phabricator.endlessm.com/T35013
6a6a8b8
to
676ba2d
Compare
Previously, the HTTP headers for the gzipped AppStream catalog on S3 looked like this: HTTP/1.1 200 OK Accept-Ranges: bytes Content-Length: 216930 Content-Type: application/xml Date: Tue, 14 Nov 2023 10:40:50 GMT ETag: "88efbfd50d380deaf29d0c8c9a9a0b6f" Last-Modified: Mon, 13 Nov 2023 17:27:33 GMT Server: AmazonS3 x-amz-expiration: expiry-date="Thu, 14 Dec 2023 00:00:00 GMT", rule-id="nightly" x-amz-id-2: jJi5BW3s1nMqQxHfEbUwFM0hnzvKgeHT4bRfOekIkCqOtkQhsIiIhwLapdhvvq5WEoh/1UdT3Zo= x-amz-request-id: 2VNRZHTWG471BZMQ x-amz-server-side-encryption: AES256 x-amz-version-id: ARqOJCikc6hWy_b7n8a4Bj068vI1FWBo Note that the Content-Type is application/xml, and there is no Content-Encoding header. This is incorrect because the XML is compressed. It makes web browsers sad because they try to parse the compressed data directly as XML. I could see two ways to fix this: 1. Set Content-Encoding: gzip so that user agents know that they are getting a compressed XML document. 2. Set Content-Type to application/gzip. I liked the idea of the former, and tested manually reuploading https://s3-us-west-2.amazonaws.com/images-dl.endlessm.com/nightly/eos-amd64-amd64/master/base/231113-161734/eos-master-amd64-amd64.231113-161734.base.appstream.xml.gz with --content-encoding=gzip. This has the advantage that you can browse the file in your web browser and it renders it nicely as formatted XML. However, if you save the file you get bad results in at least two browsers: 1. Firefox saves the original compressed file with the incorrect name eos-master-amd64-amd64.231113-161734.base.appstream.xml.gz.gz (note the double extension, even though the file has only one layer of compression) 2. Chromium saves the uncompressed file as eos-master-amd64-amd64.231113-161734.base.appstream.xml.gz (note the .gz extension even though the file is uncompresssed) So we take the second route. This way both browsers just save the compressed file to disk. https://phabricator.endlessm.com/T35013
I pushed one more commit to make S3 speak the truth about the file's content-type. Test build at https://ci.endlessos.org/job/image-build-amd64/28082 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I agree on the Content-Type
handling. We/I played around with setting Content-Encoding: gzip
in the past. It think it would work correctly if you had Content-Type: application/xml
and the filename didn't have the .gz
suffix. The purpose of Content-Encoding
is that the client is in charge of decoding on transfer, which isn't what you wanted here. The nice thing when you do it in S3 is that clients can request with Accept-Encoding: gzip
and S3 doesn't have to spend time compressing it on the fly. But, again, the intention here is that the user receives a gzipped XML file.
Although we have added the name and summary for each app to the JSON manifest, there is no substitute for having all the data. XML compresses very well!
It is better to capture this at build time rather than fetching it on the fly later because apps can change or disappear, but we would like to be able to describe exactly what is in any given image.
https://phabricator.endlessm.com/T35013