Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to osm2pgsql flex backend #4112

Open
9 of 16 tasks
pnorman opened this issue Apr 7, 2020 · 13 comments
Open
9 of 16 tasks

Move to osm2pgsql flex backend #4112

pnorman opened this issue Apr 7, 2020 · 13 comments
Assignees
Labels

Comments

@pnorman
Copy link
Collaborator

pnorman commented Apr 7, 2020

osm2pgsql has a new backend, the flex backend. This is a successor to the multi backend, and allows the style-supplied transformation to configure much more, including defining what tables to create.

Importantly for us, it allows callbacks for us to include information from relations on the geometries created from ways. This will allow us to unblock #596 and has the potential to let us deduplicate admin geometries at import time.

Besides offering features we need, developing transforms for it will help osm2pgsql uncover any issues in the backend and help with adoption of its new features.

As part of this, I'd like to make a few changes

  • Bring parent route relation information onto linear transportation elements (roads and rail) from ways
  • Build dedicated tables for transportation elements. This is the only table which will have z_order or the parent route relation information
  • Create a new table with geometries from ways that are members of admin relations and with fields populated from parent relations
  • Drop planet_osm_roads

This will leave us with with five tables

Table name Contents
planet_osm_point Current contents
planet_osm_polygon Current contents including admin areas as polygons without highway areas
planet_osm_line Current contents without roads, railways, or admin lines from relations
planet_osm_transport_line Roads and rail with information from parent route relations and z_order
planet_osm_transport_polygon Roads and rail polygons with z_order

This will allow us to

  • use route relations (Render ref from route relationship #596);
  • reduce the width of the current tables by only having dedicated oneway and per-mode access columns on roads, and not having addr, amenity, etc on roads;
  • deduplicate admin boundaries;

This is a long-running project, but because we can reuse our existing Lua code there's less work than the initial move to Lua transforms. I think we're also better at project management now. There are a few milestones, and some tasks external to osm-carto.

  • Develop partial v1 transforms
    • Replicate existing behavior
    • Add new tables
  • Take feedback back to osm2pgsql devs and have it addressed (external)
  • Develop complete v2 transforms and stand up a worldwide test instance
    • Fix bugs identified from v1 and any API changes
    • Remove features and columns from existing tables and change MML SQL
  • Take feedback back to osm2pgsql devs and have it addressed (external)
  • Have osm2pgsql release a version with any fixes needed (external)
  • Stand up a worldwide test instance
  • Get osm2pgsql backported to Ubuntu 20.04 and Debian Stable (external)
  • Release a new major version of osm-carto

Until we take the step of removing features from existing tables and changing the SQL, it's easy to maintain the fork. Once we take that step we start incurring cost from having to port changes from one branch to the other.

cc @joto @gravitystorm @lonvia

@pnorman pnorman added the lua label Apr 7, 2020
@pnorman pnorman self-assigned this Apr 7, 2020
@joto
Copy link
Contributor

joto commented Apr 7, 2020

As author of the flex backend code I welcome everybody who wants to try it out. Just be aware that it is still marked experimental and is not in a released osm2pgsql version yet. There are still some rough edges to smooth out. That being said, it would, of course, help to have more eyes on this to find any problems and move the new backend forward and get it released. Being able to support the needs of openstreetmap-carto is definitely very high on our list of things to do.

I would recommend to not use the stage 2 processing (for the relation tags on ways) at first, but try the flex backend without it. We don't know yet what the performance implications of the two-stage processing are in a real-world scenario and how solid the updating mechanism is, so just be careful with that. But even without that, the flex backend has many advantages.

@mboeringa
Copy link

mboeringa commented Apr 7, 2020

planet_osm_line Current contents without roads, railways, or admin lines from relations

@pnorman , for backwards compatibility with other styles and other uses of the current OpenStreetMap database as imported with the 'openstreetmap-carto' style, I strongly recommend to not remove roads, railway, or admin lines from the planet_osm_line table, but just leave the contents of the existing planet_osm_polygon/line/point tables as it is currently.

This allows current users that rely on a minimal three table setup of the tables (polygon/line/point), to continue doing that. Currently, the planet_osm_line table has all the information you need to also render roads, and you can thus ignore the planet_osm_road table and do your own selection of "roads" from it through SQL. Changing the contents of planet_osm_line to exclude roads and railways will block that option.

I know this means you can't optimize the planet_osm_line table, but I do think this is better for downstream users relying on a basic three table setup.

@gravitystorm
Copy link
Owner

I would recommend to not use the stage 2 processing (for the relation tags on ways) at first, but try the flex backend without it. We don't know yet what the performance implications of the two-stage processing are in a real-world scenario and how solid the updating mechanism is, so just be careful with that.

I think the only thing I have to add here is that the logic required for advanced relation processing in a flex lua file becomes quite complex quite quickly, and fairly impenetrable for someone like me who doesn't really know how osm2pgsql works internally. For example, knowing what features to mark, what should be done in which stage, etc. Then there's the challenge of ensuring that the lua code also processes updates correctly, again with things like marking the right features and re-processing them. While re-running an import of an extract is straightforward, checking updates is not.

I've found that I need to treat the flex lua file as code, rather than configuration, and I've ended up writing rspec integration tests to check that my lua code does the right thing for imports and updates (i.e. by creating mock xml and osc files, running the osm2pgsql binary, and then inspecting contents of the database, for each test). I hope this will be unnecessary in future, as more of the processing complexity is made available in the included helper functions (like grab_tag, which came out of my lua work and avoids some pass-by-reference pitfalls).

I can do things with flex that used to require having a custom fork of osm2pgsql, and I've got some stuff in development that wasn't possible even with that. But so far having this test framework has been useful both while developing my lua code, and also for retesting as osm2pgsql gets updated.

@pnorman
Copy link
Collaborator Author

pnorman commented Apr 27, 2020

I've created a new long-running branch that currently reimplements existing behavior with the flex backend.

I'm treating it as code and have fairly comprehensive unit tests. These don't require running osm2pgsql, but test the lua code, mocking out the osm2pgsql supplied objects and functions.

@pnorman
Copy link
Collaborator Author

pnorman commented Apr 27, 2020

@pnorman , for backwards compatibility with other styles and other uses of the current OpenStreetMap database as imported with the 'openstreetmap-carto' style, I strongly recommend to not remove roads, railway, or admin lines from the planet_osm_line table, but just leave the contents of the existing planet_osm_polygon/line/point tables as it is currently.

This allows current users that rely on a minimal three table setup of the tables (polygon/line/point), to continue doing that. Currently, the planet_osm_line table has all the information you need to also render roads, and you can thus ignore the planet_osm_road table and do your own selection of "roads" from it through SQL. Changing the contents of planet_osm_line to exclude roads and railways will block that option.

I know this means you can't optimize the planet_osm_line table, but I do think this is better for downstream users relying on a basic three table setup.

We're going to be breaking backwards compatibility regardless with the changes to the columns in those tables. Is there a use-case for keeping roads in planet_osm_line without columns that other styles depend on?

@mboeringa
Copy link

mboeringa commented Apr 27, 2020

Is there a use-case for keeping roads in planet_osm_line without columns that other styles depend on?

That depends. If you intend to remove the tags hstore column as well from planet_osm_line, than yes, there is not much sense in maintaining the roads in planet_osm_line, because there will be no way to restore the old behavior. On the other hand, if the tags column is still there, any missing columns could relatively easily be extracted from hstore by changing SQL.

Maybe it is better to ask for feedback about this on the mailing list or forums though.

I've created a [new long-running branch]([https://github.com/gravitystorm/openstreetmap-carto/tree/flex/master]](https://github.com/gravitystorm/openstreetmap-carto/tree/flex/master%5D) that currently reimplements existing behavior with the flex backend.

Thanks, that will definitely be of use to some, and likely makes the "backwards compatibility" issue also referred to above, largely a non-issue.

Will there also soon be a branch showing off some of the proposed changes in table structure as in your first post in this thread (#4112 (comment))? Would be interesting to see.

@pnorman
Copy link
Collaborator Author

pnorman commented Apr 28, 2020

  • Bring parent route relation information onto linear transportation elements (roads and rail) from ways

I gave this a think, and there are a few problems with this approach that I solved by instead making a planet_osm_route table that contains route membership information. This has a few advantages.

  • It avoids phase 2 processing, which would be expensive because every road would need to be flagged.
  • A linestring can have many parent relations, and this preserves that relationship better.
  • Network information would be difficult to condense into one column.
  • A separate table doesn't restrict what type of route relations are processed, allowing for walking and bicycle relations

@mboeringa

This comment has been minimized.

@jeisenbe
Copy link
Collaborator

I note that the flex backend was released in 1.3.0 of osm2pgsql a few months ago, and recently updated in release 1.4.0: https://github.com/openstreetmap/osm2pgsql/releases/tag/1.4.0 - and there is now extensive documentation at https://osm2pgsql.org/

@mboeringa
Copy link

@jeisenbe , I have just released an updated version of the flex script, that adds some nice new options, and additionally fixing a blocking issue due to the use of an outdated 'multi' option in the existing lua file. This also fixes the issue of the 'network' column not being set in the new 'route' table that is part of this flex version, as I noted in #4112 (comment).

See here: #4320

@pnorman
Copy link
Collaborator Author

pnorman commented Apr 16, 2021

I have the admin stuff half done locally. I've written the code to do it, but need to finish writing the test cases.

@pnorman
Copy link
Collaborator Author

pnorman commented Jun 26, 2021

I have the admin stuff half done locally. I've written the code to do it, but need to finish writing the test cases.

This is done and pushed.

@joto
Copy link
Contributor

joto commented Jun 4, 2024

I am restarting this effort to switch to osm2pgsql flex. It is sufficiently different from what was discussed in this issue that I created a new isse, see #4977 for all the details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants