Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement domainExtractor for image and title, with a single implementation wikipedia #30

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

danielgranat
Copy link

First thanks for sharing this code. It's exactly what i needed, and i didn't find anything i liked in NodeJS.

This is more a request to get feedback then actual pull request.
The problem i encountered is that wikipedia does not work with the image and title extraction that's implemented now.
The Image is not in the header, but is the first image in the '.infobox'.
Title- When splitting the Title, the longest part is usually not the important part, like in 'Thomas Edison - Wikipedia, the free encyclopedia'
Trying to tackle this problem i saw 2 options:

  1. Re-factor the current extraction implementation to support wikipedia structure. I don't think it's a good option. First it will cause the code to be less readable. Second, what will happen when i need more customization?!
  2. Second option is to have something like domain specific plugins.

Obviously i decided to use the second option.

There is still work to be done and issues to address, but I would like to get your input on the proposed solution.

Thank you for your time!

@ageitgey
Copy link
Owner

Hey, thanks for the PR. I do agree that domain-specific plugins are a better path here than hacks on top of hacks in the main code.

I'll take a look at the PR in detail when I have a little free time and let you know what I think.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants