I played around a bit with libpostal and pypostal, and I am quite impressed. Kudos!
My country is
Germany
Here's how I'm using libpostal
I am thinking about including it in the Odoo instance at my workplace, but I am still in the (private) exploratory stage. The use case would be deduplicating purchased leads.
Here's what I did
I tried using expand_address (from pypostal, but this is not a pypostal issue). This is the normalize_address function:
def normalize_address(address: Address, languages: list[str]) -> NormalizedAddress:
    """Normalizes the fields of an address that make sense to be normalized, adds fields to the dict of the address with normalized values"""
    normalized_address = NormalizedAddress(
        id=address.id,
        name=address.name,
        email=address.email,
        company=address.company,
        street=address.street,
        building_number=address.building_number,
        postcode=address.postcode,
        city=address.city,
        state=address.state,
        country=address.country,
    )
    normalized_address.normalized_name = normalize_string(address.name.value)
    normalized_address.normalized_email = normalize_email_address(address.email.value)
    normalized_address.normalized_company = normalize_string(address.company.value)
    normalized_address.normalized_street = normalize_address_string(
        address.street,
        languages=languages
    )
    normalized_address.normalized_building_number = normalize_string(address.building_number)
    normalized_address.normalized_postcode = normalize_string(address.postcode)
    normalized_address.normalized_city = normalize_string(address.city)
    normalized_address.normalized_state = normalize_state_string(address.state, languages=languages)
    normalized_address.normalized_country = normalize_country_string(
        address.country,
        languages=languages
    )
    return normalized_address
And this is the normalize_country_string function:
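import postal.expand
import postal.parser

def normalize_country_string(country: str, languages: list[str]) -> str:
    """Normalize a country string, e.g. "U.S.A.", by parsing and expanding it with pypostal"""
    # parse_address returns a list of (value, label) tuples
    parsed_country = postal.parser.parse_address(
        country,
        language=languages[0],
    )
    # Expand the value of the first parsed component and keep the first expansion
    expanded_country = postal.expand.expand_address(parsed_country[0][0], languages=languages)
    return expanded_country[0]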
Here's what I got
It worked well with street addresses, for example expanding sqr to Square.
But for U.S.A. I got back usa.
Here's what I was expecting
I was not able to expand the abbreviation U.S.A. to, for example, United States of America or another representation. Maybe the library is not intended to do so, which would be completely fine with me. I was just wondering whether I made an error or whether this is intentional.
Here's what I think could be improved
More documentation, and maybe reStructuredText docstrings instead of something doxygen-like in the Python parts, because Python tools (e.g. PyCharm) can parse them better.
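For illustration only, this is roughly what a reStructuredText-style docstring looks like, using the field-list convention (`:param:`, `:return:`) that Sphinx and IDEs such as PyCharm recognize. The function `trim_and_lower` is a hypothetical example and not part of libpostal or pypostal.

```python
def trim_and_lower(value: str) -> str:
    """Return a whitespace-trimmed, lowercased copy of ``value``.

    :param value: The raw string to normalize.
    :type value: str
    :return: The normalized string.
    :rtype: str
    """
    # Hypothetical helper, only here to carry the docstring example
    return value.strip().lower()
```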
Geographic expansions are mostly not included, since the implementation is simplistic: it just takes the Cartesian product of the expansions to create keys for matching, which can lead to a lot of irrelevant results across languages. The library's not really designed for standardizing addresses for display, though it can be kludged into doing so. You can always just include your own dictionary of expansions and look up the parsed result in that.
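A minimal sketch of the "own dictionary" suggestion, assuming a small hand-maintained table of country names. The names `COUNTRY_EXPANSIONS`, `normalize_key`, and `expand_country` are hypothetical and not part of libpostal or pypostal, and the table entries are only examples. The input could be the country component returned by the parser, or the output of expand_address.

```python
import re

# Hypothetical lookup table: normalized key -> canonical display name.
# Extend it with whatever variants show up in your own data.
COUNTRY_EXPANSIONS = {
    "usa": "United States of America",
    "us": "United States of America",
    "united states": "United States of America",
    "uk": "United Kingdom",
    "de": "Germany",
    "deutschland": "Germany",
}


def normalize_key(text: str) -> str:
    """Drop periods, collapse whitespace, and lowercase the text."""
    text = text.replace(".", "")
    return re.sub(r"\s+", " ", text).strip().lower()


def expand_country(text: str) -> str:
    """Look up the normalized text; fall back to the normalized key itself."""
    key = normalize_key(text)
    return COUNTRY_EXPANSIONS.get(key, key)
```

For deduplication, falling back to the normalized key means unknown values still compare consistently even when the table has no entry for them.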