Package.getFileData is a bottleneck sometimes #111
Comments
That's the annoying thing with tar files: they don't have an index. If libarchive has the right API for that (which I assume it does), we could read through the archive in its entirety once and create a hash map with the right offsets. We could then immediately jump to the correct positions.
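For reference, a single sequential pass only has to read the entry headers; the payloads can be skipped. Below is a minimal C sketch of such an enumeration pass against libarchive's C API (asgen itself is written in D, so this is an illustration of the libarchive calls, not project code); the printed `(index, path)` pairs stand in for the proposed hash map, and whether the recorded positions can later be used for direct seeking depends on the compression in use.

```c
/* Minimal sketch (not asgen code): enumerate an archive once with libarchive,
 * recording each entry's ordinal and path while skipping the payloads. */
#include <archive.h>
#include <archive_entry.h>
#include <stdio.h>

int main (int argc, char **argv)
{
    if (argc < 2)
        return 1;

    struct archive *a = archive_read_new ();
    archive_read_support_filter_all (a);
    archive_read_support_format_all (a);
    if (archive_read_open_filename (a, argv[1], 16 * 1024) != ARCHIVE_OK) {
        fprintf (stderr, "%s\n", archive_error_string (a));
        return 1;
    }

    struct archive_entry *entry;
    long index = 0;
    while (archive_read_next_header (a, &entry) == ARCHIVE_OK) {
        /* In a real implementation this would go into a hash map
         * (path -> index/offset) instead of stdout. */
        printf ("%ld %s\n", index++, archive_entry_pathname (entry));
        archive_read_data_skip (a); /* skip the payload: cheap compared to extraction */
    }
    archive_read_free (a);
    return 0;
}
```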
Or maybe just extract the archive if we detect that …
Skipping over large data chunks is waaaaaay faster than extracting anything and dealing with disk write and read I/O.
Should this be done inside …?
I'd say yes, because the data structure needed would hold very specific information about the individual archive.
I was unable to find such an API. The only function that actually uses an …
Maybe we can count the number of translation files on the first …
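One way to realize the "read all of them at once" idea is a single pass that extracts every entry on a wanted-list and skips everything else, so `m` requested files cost one O(n) scan instead of `m` separate scans. A hedged C sketch follows; the function name `read_many`, the `struct blob` type, and the linear wanted-list check are illustrative and not part of asgen's API.

```c
/* Hypothetical sketch: one libarchive pass that extracts all wanted entries.
 * The caller provides `out` with room for n_wanted blobs. */
#include <archive.h>
#include <archive_entry.h>
#include <stdlib.h>
#include <string.h>

struct blob { char *path; void *data; size_t size; };

/* Returns the number of wanted entries found; fills `out[i]` for each hit. */
size_t read_many (const char *archive_path,
                  const char **wanted, size_t n_wanted,
                  struct blob *out)
{
    struct archive *a = archive_read_new ();
    archive_read_support_filter_all (a);
    archive_read_support_format_all (a);
    if (archive_read_open_filename (a, archive_path, 16 * 1024) != ARCHIVE_OK) {
        archive_read_free (a);
        return 0;
    }

    struct archive_entry *entry;
    size_t found = 0;
    while (found < n_wanted &&
           archive_read_next_header (a, &entry) == ARCHIVE_OK) {
        const char *path = archive_entry_pathname (entry);
        size_t i;
        for (i = 0; i < n_wanted; i++)
            if (strcmp (path, wanted[i]) == 0)
                break;
        if (i == n_wanted) {
            archive_read_data_skip (a);   /* not interesting: skip cheaply */
            continue;
        }
        size_t size = (size_t) archive_entry_size (entry);
        void *buf = malloc (size ? size : 1);
        if (archive_read_data (a, buf, size) >= 0) {
            out[found].path = strdup (path);
            out[found].data = buf;
            out[found].size = size;
            found++;
        } else {
            free (buf);
        }
    }
    archive_read_free (a);
    return found;
}
```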
While testing my own asgen backend I tried processing a package for the Stellarium project. It took more than 2 hours to process it. A little debugging revealed the cause.

When asgen runs `compose` on the package, its callbacks repeatedly call `Package.getFileData()`, which is implemented in essentially the same way for all backends. Reading the `ArchiveDecompressor.readData()` code reveals that it iterates over the whole archive looking for the requested file. This results in `O(n*m)` complexity, where `n` is the number of files in the package and `m` is the number of translation files.

This clearly needs some caching/optimization, which can be made backend-agnostic, but since this is the first time I'm working with D, I decided to report the issue first and hear options or suggestions.
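To illustrate the per-call cost described above: a sequential lookup in a tar necessarily looks roughly like the C sketch below (illustrative only, not the actual D code in `ArchiveDecompressor.readData()`). Every call walks entries from the beginning of the archive until the requested path turns up, which is why `m` such calls over a package with `n` entries add up to O(n*m).

```c
/* Illustrative sketch of a naive per-call lookup over an archive:
 * scan entries from the start until the requested path matches. */
#include <archive.h>
#include <archive_entry.h>
#include <stdlib.h>
#include <string.h>

void *read_one (const char *archive_path, const char *fname, size_t *size_out)
{
    struct archive *a = archive_read_new ();
    archive_read_support_filter_all (a);
    archive_read_support_format_all (a);
    if (archive_read_open_filename (a, archive_path, 16 * 1024) != ARCHIVE_OK) {
        archive_read_free (a);
        return NULL;
    }

    struct archive_entry *entry;
    void *buf = NULL;
    while (archive_read_next_header (a, &entry) == ARCHIVE_OK) {
        if (strcmp (archive_entry_pathname (entry), fname) != 0) {
            archive_read_data_skip (a);   /* skipped entries pile up on every call */
            continue;
        }
        size_t size = (size_t) archive_entry_size (entry);
        buf = malloc (size ? size : 1);
        if (archive_read_data (a, buf, size) < 0) {
            free (buf);
            buf = NULL;
        } else if (size_out) {
            *size_out = size;
        }
        break;
    }
    archive_read_free (a);
    return buf;
}
```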