Question about --name-only matching #10

jesseclark · 2018-05-25T18:38:37Z

Hello,

I just ran the command-line dedupe tool over a dataset with the --name-only flag and got the results below. (I am parsing the JSON and dumping it to the screen for quick review, so the format is different.)

Osceola High
1111 Oak Ridge Dr, Osceola, WI 54020

SAME
>>> Osceola Middle
>>> 1029 Oak Ridge Dr, Osceola, WI 54020
>>> 1.0
>>> Osceola Elementary
>>> 250 10th Ave E, Osceola, WI 54020
>>> 1.0
>>> Osceola Intermediate
>>> 949 Education Ave, Osceola, WI 54020
>>> 0.9999840824

Could you help me understand why these four names, which seem substantially different at a glance, get such high similarity scores from the deduper?

Thanks!
