Problem with import leading to failed deduplication #96

Closed
rootsandberries opened this issue Feb 17, 2023 · 9 comments

Comments

@rootsandberries
Collaborator

This is the same problem as issue #91 (which I mis-described as an import problem, so I'm creating a new, more aptly named issue here). @LukasWallrich you said you fixed it, but when I updated the package on my computer and ran the assessing databases vignette, I now get zero overlap for PsycInfo. Before the update I was seeing overlap locally, but not in the online vignette.

@LukasWallrich
Collaborator

At least we are now getting consistency ;/

Could you identify a pair of records that are wrongly not recognised as overlapping?

@rootsandberries
Collaborator Author

I'm attaching a CSV file with the PsycInfo records that overlap with Web of Science (both records of each pair are in the file, and Zotero identifies them as duplicates). I should also mention that Trevor is running into something similar: a large number of Dimensions records that should overlap are not being identified as duplicates. He may post a separate issue!

psyc_wos_duplicates.csv

@TNRiley
Collaborator

TNRiley commented Feb 22, 2023

This was indeed related to #99. It appears that the raw Dimensions .ris and the raw PsycInfo .ris both had some type of issue. Importing the .ris into EndNote first and then simply exporting it again resolves the problem. We still need to understand why this happens with specific metadata, and potentially write a fix for it OR add a check/warning.
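The thread never identifies what exactly was wrong with the raw .ris files, so the following is only a rough sketch of what a pre-import check/warning could look like. It assumes nothing more than the generic RIS layout (two-character tags followed by two spaces and a hyphen, records opened by `TY` and closed by `ER`). The function name `check_ris` is hypothetical and not part of CiteSource or synthesisr, and the sketch is in Python purely for illustration.

```python
import re

# A RIS tag line: two-character tag, two spaces, a hyphen, then an optional value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")


def check_ris(text):
    """Return a list of warning strings for the contents of a RIS export.

    Illustrative structural check only; it does not validate field contents.
    """
    warnings = []
    in_record = False
    # Drop a leading byte-order mark, which some exports include.
    lines = text.lstrip("\ufeff").splitlines()
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip()
        if not line:
            continue  # blank lines between records are harmless
        match = TAG_LINE.match(line)
        if match is None:
            # Could be a wrapped value or a malformed tag; flag it for review.
            warnings.append(f"line {lineno}: no RIS tag found")
            continue
        tag = match.group(1)
        if tag == "TY":
            if in_record:
                warnings.append(f"line {lineno}: new record starts before ER")
            in_record = True
        elif tag == "ER":
            if not in_record:
                warnings.append(f"line {lineno}: ER without a preceding TY")
            in_record = False
        elif not in_record:
            warnings.append(f"line {lineno}: tag {tag} outside a record")
    if in_record:
        warnings.append("file ends inside a record (missing ER)")
    return warnings
```

An empty result only means the file is structurally plausible. Any warnings would suggest trying the EndNote round-trip described above before importing.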

@TNRiley TNRiley closed this as completed Feb 22, 2023
@LukasWallrich LukasWallrich reopened this Feb 23, 2023
@LukasWallrich LukasWallrich changed the title from "No overlap detected when there should be" to "Problem with import leading to failed deduplication" Feb 23, 2023
@LukasWallrich
Collaborator

Reopened and renamed this - will have a look now.

@LukasWallrich

This comment was marked as outdated.

@LukasWallrich
Collaborator

Please ignore my comment - just saw that Trevor replaced the problematic file. Will check with the previous one.

@LukasWallrich
Collaborator

It looks like I reintroduced a bug that had previously been fixed in synthesisr. I'm really not sure where I even got the old code version from. But this is fixed now, and it works with PsycInfo.

@TNRiley can you please check whether it also works with the Dimensions file you had trouble with? Then we can actually close this :)

@TNRiley
Collaborator

TNRiley commented Feb 23, 2023

@LukasWallrich I'm now getting hung up on read_citations: it starts to run but just sits there. I'll check back later to see if there is an error; currently it is just running forever.

@TNRiley
Collaborator

TNRiley commented Feb 23, 2023

Looks good! The new import ran a bit longer, but the fix looks good!

@TNRiley TNRiley closed this as completed Feb 23, 2023