libpostal is an impressive library with many useful applications, but one I would caution against is address matching, at least as the 'primary' approach.
The problem is that the hardest-to-parse addresses are often also the hardest to match, which makes the problem somewhat circular. I wrote more about this in a recent blog post on address matching: https://www.robinlinacre.com/address_matching/
Ameo · 2h ago
I used this at a previous company with quite good success.
With relatively minimal effort, I was able to spin up a little standalone container that wrapped around the service and exposed a basic API to parse a raw address string and return it as structured data.
Address parsing is definitely an extremely complex problem space with practically infinite edge cases, but libpostal does just about as well as I could expect it to.
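A rough sketch of what a wrapper like that could look like, assuming the `postal` Python bindings (pypostal) on top of libpostal and nothing but the standard-library HTTP server. The handler, the port, and the helper names `to_structured` and `parse` are made up for illustration; this is not the code that was actually deployed.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def to_structured(pairs):
    """Collapse (value, label) pairs into a dict, joining repeated labels."""
    out = {}
    for value, label in pairs:
        out[label] = f"{out[label]} {value}" if label in out else value
    return out


def parse(raw):
    # Imported lazily so the module still loads where libpostal is not installed.
    # parse_address returns a list of (value, label) tuples.
    from postal.parser import parse_address
    return to_structured(parse_address(raw))


class ParseHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: POST a raw address string, get JSON back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length).decode("utf-8")
        body = json.dumps(parse(raw)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 8080 is an arbitrary choice for the sketch.
    HTTPServer(("0.0.0.0", 8080), ParseHandler).serve_forever()
```

Keeping the tuple-to-dict step separate from the libpostal call means the response shape can be tested without the (large) libpostal data files installed.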
degamad · 2h ago
Ditto - I was impressed with how well it handled the weird edge cases in our data.
They've managed to create a great working implementation of a very, very small model of a very specific subset of language.
jandrese · 4h ago
Wow, ambitious project. Anybody who has tried to verify addresses can tell you that the staggering number of different formats and conventions around the world makes it an almost intractable problem. So many countries have wildly informal standards, with people putting down whatever they want because the mailman "just knows".
monero-xmr · 3h ago
Maxmind is the quintessential example of what devs want to build in their heart of hearts. Low-touch sales but b2b. Almost a monopoly. Prints money for decades. Not a public company so they never increase costs to a usurious amount. Open source never quite meets the level needed.
degamad · 1h ago
Previously:
<https://news.ycombinator.com/item?id=18775099> Libpostal: A C library for parsing/normalizing street addresses around the world - 117 points by polm23 on Dec 29, 2018 (25 comments)
<https://news.ycombinator.com/item?id=11173920> Libpostal: international street address parsing in C trained on OpenStreetMap (mapzen.com) 74 points by riordan on Feb 25, 2016 (7 comments)
Discussed on HN here: https://news.ycombinator.com/item?id=8907301