Reenable stops import to Nominatim/Photon #68

Open
hbruch opened this issue Nov 16, 2024 · 2 comments
Contributor

hbruch commented Nov 16, 2024

In the past, a list of official stops extracted from DELFI zHV was used to enhance existing OSM stops with their ref:IFOPT or, where no OSM stop existed, to add them as Nominatim places, so that they became available in Photon via its Nominatim import.

This functionality should be reenabled for the existing installation. Probably, the nominatim_add_ifopt.py script needs to be updated.

Contributor Author

hbruch commented Nov 19, 2024

Instead of the former nvbw_osm_matches.csv.gzip, https://data.mfdz.de/mfdz/delfi_osm_matches.csv.gz should be used.

Note: it currently contains the full match results, i.e. it includes matches with low confidence as well as DELFI zHV stops which could not be matched and are apparently unserved. I suggest handling the stops depending on their match_state:

MATCHED: A match with sufficiently high confidence could be established. Information that is not already present should be copied from zHV to Nominatim.
NO_MATCH: No match could be established, but according to GTFS, the stop is served. The stop should be copied from zHV to Nominatim. In (rare?) cases, this could create duplicates of stops which exist in OSM but were not matched, e.g. because their distance to the zHV stop is >400m.
NO_MATCH_BUT_OTHER_PLATFORM_MATCHED: No match could be established, but according to GTFS, the stop is served and other QUAIs were matched successfully. The stop should be copied from zHV to Nominatim. This could create duplicates, e.g. where one real-world platform has been recorded in zHV with different DHIDs.
NO_MATCH_AND_SEEMS_UNSERVED: The stop could not be matched and seems unserved (no GTFS trips reference it). Sadly, zHV does not report whether a stop is out of service, only its last served date, which can be wrong... => do not import to Nominatim.
MATCHED_THOUGH_REVERSED_DIR: Stops were matched, but the directions of their referenced trips seem to be reversed. Though DHIDs may be swapped, I suggest copying missing information to Nominatim. (Question: would IFOPTs be corrected on a new import run if a later CSV has corrected data?)
MATCHED_THOUGH_DISTANT: High probability that either there is a matching issue or the zHV position is very far (>200m) from the matched OSM stop. Imho, this would justify ignoring the match.
MATCHED_THOUGH_OSM_NO_NAME: The OSM stop itself had no name. Information that is not already present should be copied from zHV to Nominatim.
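The handling suggested above could be sketched roughly as follows. This is only an illustration, not the actual script: the action names ("enhance", "add", "skip") and the helper functions are hypothetical, and it assumes the CSV exposes the state in a column called match_state. The sample IDs are made up.

```python
import csv
import io

# match_state -> suggested handling. The action names are hypothetical labels:
#   "enhance": copy missing information from zHV onto the matched OSM stop
#   "add":     add the zHV stop as a new Nominatim place
#   "skip":    do not import
ACTIONS = {
    "MATCHED": "enhance",
    "NO_MATCH": "add",
    "NO_MATCH_BUT_OTHER_PLATFORM_MATCHED": "add",
    "NO_MATCH_AND_SEEMS_UNSERVED": "skip",
    "MATCHED_THOUGH_REVERSED_DIR": "enhance",
    "MATCHED_THOUGH_DISTANT": "skip",
    "MATCHED_THOUGH_OSM_NO_NAME": "enhance",
}


def action_for(match_state):
    """Return the suggested action; unknown states are skipped to be safe."""
    return ACTIONS.get(match_state, "skip")


def plan(rows):
    """Group rows (dicts with a 'match_state' key) by suggested action."""
    grouped = {"enhance": [], "add": [], "skip": []}
    for row in rows:
        grouped[action_for(row["match_state"])].append(row)
    return grouped


if __name__ == "__main__":
    # Made-up sample data standing in for delfi_osm_matches.csv.gz
    sample = io.StringIO(
        "GlobaleId,match_state\n"
        "de:00000:1,MATCHED\n"
        "de:00000:2,NO_MATCH\n"
        "de:00000:3,MATCHED_THOUGH_DISTANT\n"
    )
    for action, rows in plan(csv.DictReader(sample)).items():
        print(action, [row["GlobaleId"] for row in rows])
```

For the real file, the same csv.DictReader could be fed from gzip.open(path, "rt", newline="").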

Contributor

lonvia commented Nov 20, 2024

For information: the script is currently for the "old" mode of updating in Photon which hasn't worked properly since Nominatim 4.0. So the main work will be in adapting to the "new" method.

Regarding the new export file

The following columns are now named differently:

  • globaleID -> GlobaleId
  • lat -> zhv_lat
  • lon -> zhv_lon

I've renamed them in the script. Let me know if you prefer to rename them in the export.
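For illustration, the rename could also be handled in one place on the reading side, so that rows from the old and the new export look the same to the rest of the script. This is a minimal sketch, not the script's actual code: normalize_row is a hypothetical helper, and the old longitude column name is assumed to be lon.

```python
# Old column name -> new column name, as listed above.
# Assumption: the old longitude column was called "lon".
RENAMED_COLUMNS = {
    "globaleID": "GlobaleId",
    "lat": "zhv_lat",
    "lon": "zhv_lon",
}


def normalize_row(row):
    """Return a copy of a CSV row dict that uses the new column names.

    Columns that were not renamed are passed through unchanged, so the
    function is a no-op for rows that already use the new names.
    """
    return {RENAMED_COLUMNS.get(key, key): value for key, value in row.items()}


if __name__ == "__main__":
    old_style = {"globaleID": "de:00000:1", "lat": "48.0", "lon": "9.0"}
    print(normalize_row(old_style))
```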

The script is also able to add additional address information (fields "Landkreis", "Gemeinde", "Ortsteil"). If we want to keep that, I'd need these fields in the new export as well.

We could even go a bit further and tell Nominatim not to do any address computation at all and only use the information from these fields when importing into Photon. This should work out-of-the-box with current Nominatim/Photon. It's possible that this is better for searching in Photon because the street names that Nominatim adds might just add confusion. The downside is that PV stops from OSM will look different from the internal ones.

Matching

I'll add a list of match types to ignore to the script, so you can easily change the list later.
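One possible shape for such a list, as a sketch only: a default list in the script that can be overridden on the command line. The --ignore-match-state option and the default values are assumptions for illustration (the defaults follow the two states suggested for skipping earlier in this thread); the real script may expose this differently.

```python
import argparse

# Assumed defaults: the two states suggested for skipping in this thread.
DEFAULT_IGNORED = ["NO_MATCH_AND_SEEMS_UNSERVED", "MATCHED_THOUGH_DISTANT"]


def parse_args(argv=None):
    """Parse the (hypothetical) command line of the import script."""
    parser = argparse.ArgumentParser(description="Import zHV stops")
    parser.add_argument(
        "--ignore-match-state",
        action="append",
        dest="ignored",
        default=None,
        metavar="STATE",
        help="match_state to skip; may be given several times "
             "(replaces the built-in default list)",
    )
    args = parser.parse_args(argv)
    if args.ignored is None:
        args.ignored = list(DEFAULT_IGNORED)
    return args


def should_import(row, ignored):
    """True if the row's match_state is not on the ignore list."""
    return row["match_state"] not in ignored


if __name__ == "__main__":
    args = parse_args()
    print("ignoring:", args.ignored)
```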

MATCHED_THOUGH_OSM_NO_NAME needs implementing. Should be simple enough. Just be aware that the additional data will be overwritten when the OSM data changes. So you'd need to run the script again after each OSM update to reinstate the names.
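The "only fill in what is missing" rule could look roughly like this. It is a sketch on plain dicts and does not touch Nominatim itself; fill_missing and the sample values are hypothetical. Because existing OSM values always win, running it again after an OSM update gives the same result.

```python
def fill_missing(osm_tags, zhv_values):
    """Return OSM tags with empty or absent keys filled from zHV.

    Values already present in OSM are never overwritten.
    """
    merged = dict(osm_tags)
    for key, value in zhv_values.items():
        if value and not merged.get(key):
            merged[key] = value
    return merged


if __name__ == "__main__":
    # Made-up example: the OSM stop has an IFOPT but no name.
    osm = {"name": "", "ref:IFOPT": "de:00000:1"}
    zhv = {"name": "Hauptbahnhof", "ref:IFOPT": "de:00000:2"}
    print(fill_missing(osm, zhv))
```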
