Enable postalcode lookup #310
base: master
This diff enables lookup in the `postalcode` layer, guarded by a config flag. Because the postal code data can amount to a significant size (12.6 GB as of today), this setting is disabled by default. The new `lookupPostalCode` setting affects only the "local resolver" mode. If it's set to true, it will start the `postalcode` layer worker. The layer is considered "untrusted", just like the `neighbourhood` layer.
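A minimal sketch of the opt-in gating described above, not the code from this diff. `selectLayers` and `DEFAULT_LAYERS` are hypothetical names, the default layer list is illustrative, and the flag is spelled `lookupPostalCode` as in the description:

```javascript
// Hypothetical helper: decide which layers get a worker in "local resolver" mode.
// The default layer list below is illustrative, not the module's real list.
const DEFAULT_LAYERS = ['neighbourhood', 'locality', 'region', 'country'];

function selectLayers(adminLookupConfig = {}) {
  const layers = [...DEFAULT_LAYERS];
  // Opt-in only: the postalcode data is large, so the layer is added
  // solely when the flag is explicitly set to boolean true.
  if (adminLookupConfig.lookupPostalCode === true) {
    layers.push('postalcode');
  }
  return layers;
}
```

With an empty config the `postalcode` layer is left out, so existing installations would see no change in behaviour.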
Nice, fairly simple change but effective 👍
As you mentioned it's important to default this behaviour to off but allow opt-in.
This will be beneficial for smaller geographies and for testing the quality of algorithmically assigned postcodes.
Worth mentioning that there are two 'postcode' properties in Pelias: the most common is an 'address property'; the other is in the admin hierarchy, which allows records to be parented by a postcode but has never really been used.
For that reason it might be required to 'special-case' postcode such that the hierarchy label is duplicated in the address properties. IIRC all the search queries target
Nice, this feature has been quite in-demand over the years. Postal codes can be very tricky for admin lookup, which is part of why we've held off, but I don't think it hurts anything to merge it with the default set to disabled. Some ideas of future work that we might want (they should all come with further discussion):
I'll just link some relevant issues too, since we've discussed a lot of this before:
@octmoraru one question, when you tested this, did the PIP service downloader give you any issues? As I recall it's set to default to admin only (it won't download the postalcode databases): https://github.com/pelias/pip-service/blob/f89c0f1e0796a388f1000629ff8113ddbc2db295/bin/download#L3
Thanks @missinglink and @orangejulius for your quick reactions!
You mean doing that in the importers, right? Indeed, that would be quite neat as it has the potential to fill some gaps in source dataset. However, even with the current implementation, not all is lost. If I read this code correctly, at runtime, for most (all?) endpoints
Yeah, that sounds reasonable. I guess that would be quite easy to implement by adding a new setting and changing the logic in src/pip/readStream.js
Thanks for sharing, it didn't cross my mind to search in pelias/pelias 🤦
I guess I've been using the whosonfirst downloader directly which by default downloads everything 😀 https://github.com/pelias/whosonfirst/blob/22f7ad76dbf65bf25f0f4a2af352ad45b72a2a94/bin/download
Hi @orangejulius - what else would you like to see in this PR to have it merged? 🙂
Have you tested that this works as expected? I believe that this alone will not fix any issues in Pelias results since the PIP service is responsible for setting these fields but not this one, which is used for search and display.
Ah, yeah so as @missinglink points out there are a couple objectives that one might wish to achieve:
Adding postal codes to address or other records
Having postal code geometries available via the PIP service is a prerequisite for this, but not enough to solve this issue on its own. Additional work would be required to make sure that, at import time, postal codes are added to the appropriate fields on each individual address record.
Allowing coarse reverse geocoding to return postalcodes
This would let one look up the postalcode for a given lat/lon, assuming the postalcode geometry is trustworthy. It sounds like @octmoraru says this works after this PR. In my estimation, that's the less common of the two needs, but it's better than nothing.
12.6GB of RAM is quite a bit, so it might not be a good idea to enable this feature in many cases. I'm on the fence if supporting something that will be a non-recommended configuration is worth it.
Indeed, my primary intent for this PR was to make postalcodes available in
True, @orangejulius @missinglink - how would you feel if I were to duplicate
I feel like the topic of whether algorithmically assigned postcodes are considered equal to authoritative postcodes is subjective, regional and domain-specific. Some installations may prefer the existing behavior where we only display authoritative postcodes, some may prefer to have increased coverage at the cost of potential inaccuracies.
For that reason I would like to put it behind a
Another option which we've discussed in the past is to introduce a new field called
This second option would allow us to make this the default immediately but indicate to the consumer a criterion which they could filter on after parsing the geojson result.
Thanks for making it clearer for me! So out of these two options:
and
which one would you prefer? I can probably work at implementing either one of them in the near future.
I think the two solutions aren't really different, more that the latter is an extension of the former. So at minimum:
In this situation you & others can elect to opt in to this feature in your own installations of Pelias, for testing, production, or whatever, by setting the value to
Then we live with it for a few months; hopefully you provide some feedback about how well it worked out for you and we evaluate how good/bad it is. Then later this year we approach the idea of making it the default setting.
I think you should focus on the first two steps since that solves your business need and is also easy to merge, since it will be a no-op for everyone else (for now).
I really like this PR for its simplicity and backward compatibility 👍
👋 I did some work for the Pelias project and would love for everyone to have a look at it and provide feedback.

Here's the reason for this change 🚀
Currently `wof-admin-lookup` does not support lookups in the `postalcode` layer.

Here's what actually got changed 👏
This diff enables lookup in the `postalcode` layer, guarded by a config flag. Because the postal code data can amount to a significant size (~12.6 GB for the global postalcode dataset as of today), this setting is disabled by default.
The new `lookupPostalCode` setting affects only the "local resolver" mode. If it's set to true, it will start the `postalcode` layer worker. The layer is considered "untrusted", just like the `neighbourhood` layer.

Here's how others can test the changes 👀
I've tested this diff locally by using pip-service:
- `pip` with the local `wof-admin-lookup` version by running:
  - `npm link` in the `wof-admin-lookup` folder
  - `npm link pelias-wof-admin-lookup` in the `pip-service` folder
- `pip-service/sqlite` (I used France, as well as the "latest" sqlite from https://geocode.earth/data/whosonfirst/)
- `imports.adminLookup.lookupPostalCodes` to `true` in `pelias.json`, then start PIP
- `postalcode` layer: http://localhost:3102/2.2441473904672127/48.9052128712992?layers=postalcode

On my machine, the entire global postal code dataset was loaded in ~6 minutes.