include_images=True is set, but no images appear in the output #610
Comments
I can indeed reproduce the bug. Images are not my priority; the corresponding code mostly consists of a series of contributions and is not perfect. Let's see if someone can improve on this.
Thank you very much. I found that for most sites no images are extracted; this is just one such case.
Try it:
|
Thanks, it worked. I modified the trafilatura source code and was able to solve part of the problem, but as I kept using it I realized that most URLs still didn't work properly. Too many adaptations were needed, so I gave up.
For further reference: see also #662.
That's my code:
```python
from trafilatura import fetch_url, extract

url = 'https://shumeipai.nxez.com/2020/06/11/stanford-pupper-assembly-tutorial.html'
downloaded = fetch_url(url)
result = extract(downloaded, output_format='markdown', favor_recall=True, include_images=True, include_links=True)
```
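To narrow down a report like this, it can help to compare the images present in the raw HTML with the ones that survive in the Markdown output. The following is a minimal sketch using only the standard library; it is not part of trafilatura. The helper names `images_in_html` and `images_in_markdown` are hypothetical, and checking the `data-src` attribute is an assumption about how some pages lazy-load images, not a confirmed cause of this bug.

```python
import re
from html.parser import HTMLParser

# Matches Markdown image syntax: ![alt](target) with an optional title
MD_IMAGE = re.compile(r'!\[[^\]]*\]\(([^)\s]+)[^)]*\)')


class ImgCollector(HTMLParser):
    """Collect candidate image URLs from <img> tags in raw HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != 'img':
            return
        attributes = dict(attrs)
        # Assumption: 'data-src' is checked as a common lazy-loading attribute.
        src = attributes.get('src') or attributes.get('data-src')
        if src:
            self.sources.append(src)


def images_in_html(html):
    """Return image URLs found in <img> tags of the given HTML string."""
    collector = ImgCollector()
    collector.feed(html)
    return collector.sources


def images_in_markdown(markdown):
    """Return image targets found in Markdown text (empty list for None)."""
    return MD_IMAGE.findall(markdown or '')
```

With the snippet above, calling `images_in_html(downloaded)` and `images_in_markdown(result)` would show how many images the page contains versus how many reached the output, which makes it easier to tell whether images are dropped during extraction or were never in the fetched HTML.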