Fix Elsevier onion's ring API wrapping #18
base: master
This reverts commit a393d17.
add a generateIDs parameter and process
Intercepts the envelope returned by the API and extracts the article element for further processing.
(cherry picked from commit cc0617f)
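To illustrate the unwrapping idea from the commit message above, here is a minimal Python sketch. It is not the stylesheet change made in this PR; the function names, the `article` local name used as the target, and the sample envelope (including its namespaces) are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def extract_article(xml_text, target="article"):
    """Return the first element whose local name equals `target`.

    The search is namespace-agnostic and walks the tree in document
    order, so any number of wrapper layers around the article is skipped.
    Returns None when no such element exists.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if local_name(elem.tag) == target:
            return elem
    return None


# Hypothetical envelope, not a real Elsevier API response.
sample = (
    '<wrapper xmlns="urn:example:envelope">'
    "<meta>x</meta>"
    '<payload><article xmlns="urn:example:jats">'
    "<front/><body><p>Hi</p></body>"
    "</article></payload>"
    "</wrapper>"
)

article = extract_article(sample)
print(local_name(article.tag))
```

Matching on the local name rather than a fully qualified tag keeps the sketch independent of whichever namespaces the envelope happens to declare.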
Thanks for this fix! I'm trying it with the Elsevier TDM responses and have a prefix like this
It doesn't quite work with the huggingface instance you set up though; I'm getting back
And then the connection is closed. I'll try to set up a Docker instance of your PR locally later, but thought I'd check with you first on this.
@avsm The huggingface instance has been redeployed with a different image. It was set up for testing the feature, but then I needed it back, so it is expected that it no longer works.
Thanks, I set up my own Docker instance to test it out. The patch does work on some files, but for a lot I'm getting this backtrace now.
The file itself does pass XML validation (and has the fulltext response header, since that is logged by the XSLT sheet). I can do some more debugging on this tomorrow.
Can you please send me one file that works and one that does not, at luca AT sciencialab.com? I'll try to have a look this week.
Thanks @lfoppiano -- email sent.
@avsm I checked the files you sent, and the ones that fail don't have any body; they seem to contain only the bibliographic information. This particular (right side) article does not look like a scientific article, though:
Quite right, I thought I'd filtered on article type, but in some cases the Elsevier API also seems to return a blank body where the PDF article isn't OCRed (for old papers). I'll try it tomorrow on clean papers after refining the query, but this PR is clearly an improvement already and good to merge from my perspective. Thanks for looking at those example files so promptly!
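Since the failing files turned out to have no body, one way to screen responses before conversion is to check for a non-empty body element. This is a rough sketch, not part of the PR: the helper name is hypothetical, and it assumes the full text sits in an element whose local name is `body`.

```python
import xml.etree.ElementTree as ET


def has_body_text(xml_text):
    """Return True if any element with local name 'body' contains
    non-whitespace text, ignoring namespaces."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag.rsplit("}", 1)[-1] == "body":
            # itertext() yields the text of the element and all descendants.
            if "".join(elem.itertext()).strip():
                return True
    return False


# Hypothetical inputs: one with full text, one with metadata only.
with_text = "<article><front/><body><p>Full text here.</p></body></article>"
metadata_only = "<article><front><title>T</title></front></article>"

print(has_body_text(with_text), has_body_text(metadata_only))
```

Files for which this returns `False` could be set aside rather than sent to the conversion service.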
This PR fixes the onion ring of flavourless JATS added by the Elsevier API
Can be tested on
Credits to: @laurentromary