
Consider weighting the title more than the year when processing lists of names #130

Open
chazlarson opened this issue Nov 25, 2021 · 1 comment


@chazlarson

A recent list resulted in the following:

Green Room (2015)
matched Shelter 2015
should have been Green Room (2016)

The Conformist (1971)
missed
should have been The Conformist (1970)

Seven (1995)
matched Seven Landscapes 1995
should have been Se7en (1995)

Looper (2015)
matched Little Loopers 2015
should have been Looper (2012)

Shallow Grave (1995)
matched Clueless (1995)
should have been Shallow Grave (1994)

The Place Beyond The Pines (2012)
missed
should have been The Place Beyond The Pines (2013)

In all of these cases, the title is plainly not a match. It seems like searching on the title first, then comparing years among the results, would have a higher hit rate. For example, there is only one result for "The Place Beyond the Pines", and searching for most of the others by title turns up the correct film within a year of the requested one. That seems like a better guess than choosing "Clueless" in place of "Shallow Grave".
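To make the suggestion concrete, here is a rough sketch of title-first matching. It is not the project's actual code: `best_match`, `title_similarity`, and the `min_title_score` threshold are hypothetical names and values, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever similarity measure the tool really uses. Candidates are filtered by title similarity first, and the year is only used to break ties.

```python
from difflib import SequenceMatcher


def normalize(title):
    # Lowercase and collapse whitespace so "The Pines" and "the  pines" compare equal.
    return " ".join(title.lower().split())


def title_similarity(a, b):
    # Ratio in [0, 1]; 1.0 means the normalized titles are identical.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(title, year, candidates, min_title_score=0.6):
    """Pick the candidate whose title is closest; use the year only as a tie-breaker.

    `candidates` is a list of (title, year) tuples, e.g. search results.
    `min_title_score` is an assumed cutoff, not a value taken from the project.
    Returns None when no candidate title is similar enough.
    """
    best = None
    best_key = None
    for cand_title, cand_year in candidates:
        score = title_similarity(title, cand_title)
        if score < min_title_score:
            continue  # the title is too different, whatever the year says
        # Higher title score wins; among equal scores, the smaller year gap wins.
        key = (score, -abs(cand_year - year))
        if best_key is None or key > best_key:
            best, best_key = (cand_title, cand_year), key
    return best


results = [("Clueless", 1995), ("Shallow Grave", 1994)]
print(best_match("Shallow Grave", 1995, results))
```

With this ordering, a same-year entry with an unrelated title is rejected outright, and an off-by-one year on an exact title match is still accepted. A real implementation would probably also want an upper limit on the year gap.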

@TheUltimateC0der
Owner

Calculating the similarity between words is a fairly CPU-intensive task, but that is generally not a problem. I will take a look at what I can do.
