WIP: Add osmium support for handling different kind of files #15
Thank you very much for your work! I will look over the code in the next few days. How did you change the general workflow in OsmBuilder? Have you run any tests regarding memory consumption and parsing times? Is XML parsing now faster or slower than before? In general, I am still a bit hesitant to use libosmium here. It's a huge additional dependency. In particular, it introduces Boost as a dependency, which I would like to avoid. If the main goal is to support .pbf files, I still think it would be a better approach to just parse the .pbf files directly. But maybe I am wrong :)
This is still a WIP; so far, the only change is that the data is read through libosmium.
Have you run any tests regarding memory consumption and parsing times?
Is XML parsing now faster or slower than before?
The idea was to keep the application "logic" as you have written it since there is still time required for me to understand in detail what is done there. Also please don't hesitate to:
Honestly might be good to:
This is a very practical application that adds a huge benefit for processing GTFS data for agencies that do not generate shapes for their GTFS feeds.
One more note: the pull request also contains some clang code improvement suggestions.
@patrickbr I'm curious what's holding this PR back. Is it that you haven't had the time/energy/motivation to review it yet, or is it the general direction (e.g. the Boost dependency) that you're unhappy with? I'm currently map-matching many GTFS feeds using pfaedle (thanks for this tool btw!), and it has to re-read a 12 GB OSM XML file for every GTFS feed. I hope that reading ~700 MB of
I also noticed that pfaedle seems to read this file multiple times, once per matching iteration. In my case, it reads & parses the 12 GB
I must admit that having to handle a > 15 GB bz2 file instead of a 4.5 GB pbf file makes this lib harder to try. Thanks for the work on this PR!
Thank you again for all your efforts here. I have been hesitant to merge this PR because it would add major dependencies (libosmium and Boost). I am not happy with that. Also, it was opened before a major refactoring and rewrite of large parts of the tool in 2021. The more sophisticated OSM formats (o5m, protobuf) are not that hard to parse, and I would still prefer a simple solution which just reads these formats directly, without going through libosmium. The main benefit that libosmium adds besides format parsing is reference resolution and the construction of ready-to-use geometric objects. The techniques to do that are already there in the pfaedle code; all that is missing is a drop-in replacement of the XML parser with an o5m or protobuf parser. I have been working on that for a few months now.
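To give a rough idea of what "reading these formats directly" involves at the lowest level, here is a minimal sketch of the integer decoding a protobuf-based reader needs: base-128 varints, zigzag-encoded signed values, and delta-coded ID lists (as used, to my understanding, for dense node IDs in the PBF format). This is not pfaedle code and not the maintainer's parser. The function names `readVarint`, `zigzagDecode`, and `decodeDeltaIds` are made up for illustration, and a real reader would additionally have to handle blob framing, decompression, and the protobuf field structure.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes one base-128 varint starting at buf[pos].
// On success, stores the value in out, advances pos past the varint and
// returns true. Returns false if the input is truncated or if the varint
// is longer than 10 bytes (pos may already have been advanced then).
bool readVarint(const std::vector<uint8_t>& buf, size_t& pos, uint64_t& out) {
  uint64_t value = 0;
  for (int shift = 0; shift < 70; shift += 7) {
    if (pos >= buf.size()) return false;  // truncated input
    const uint8_t byte = buf[pos++];
    // The low 7 bits carry payload, least significant group first.
    value |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {  // high bit clear: this was the last byte
      out = value;
      return true;
    }
  }
  return false;  // more than 10 bytes: malformed
}

// Hypothetical helper: maps a zigzag-encoded value back to a signed one
// (0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...).
int64_t zigzagDecode(uint64_t n) {
  return static_cast<int64_t>(n >> 1) ^ -static_cast<int64_t>(n & 1);
}

// Hypothetical helper: decodes a packed run of zigzag varints in which each
// value is the difference to the previous ID. Stops at the first malformed
// varint and returns what was decoded so far.
std::vector<int64_t> decodeDeltaIds(const std::vector<uint8_t>& packed) {
  std::vector<int64_t> ids;
  int64_t current = 0;
  size_t pos = 0;
  while (pos < packed.size()) {
    uint64_t raw = 0;
    if (!readVarint(packed, pos, raw)) break;
    current += zigzagDecode(raw);
    ids.push_back(current);
  }
  return ids;
}
```

For example, the bytes `0x14 0x02 0x01` decode to the deltas 10, 1, -1, so `decodeDeltaIds` would return the IDs 10, 11, 10. Routines like these need nothing beyond the standard library, which is the appeal of the direct approach compared to pulling in libosmium and Boost.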
This is still a work in progress. I would appreciate review and feedback, since some of the naming is not yet clear.
Closes: #10