Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: range requests for wheel #66

Merged
merged 8 commits into from
Nov 2, 2023

Conversation

baszalmstra
Copy link
Contributor

An implementation to use range requests to read metadata from wheels. This speeds up metadata fetching for large wheels significantly.

@baszalmstra baszalmstra requested review from tdejager and wolfv October 28, 2023 13:16
crates/rattler_installs_packages/src/wheel.rs Show resolved Hide resolved
Comment on lines +158 to +159
// Get the size of the entry plus the header + size of the filename. We should also actually
// include bytes for the extra fields but we don't have that information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens now that we don't know this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cross our fingers that the entry doesnt include extra fields. But because we also align to the bufreader size most likely we dont need to do another range request anyway. In the worst case another range request is performed.

// Make sure we have the back part of the stream.
// Because the zip index is at the back
stream
.prefetch(stream.len().saturating_sub(16384)..stream.len())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is 16384?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good question! Thats my best guess at the size of the central archive directory (the index). Sometimes its more, which then causes another range requests but most of the times it less.

@wolfv
Copy link
Member

wolfv commented Nov 2, 2023

Ouch we have a conflict

@tdejager
Copy link
Contributor

tdejager commented Nov 2, 2023

Which is now fixed.

@tdejager tdejager merged commit 0d31366 into prefix-dev:main Nov 2, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants