Deduplication & Table errors when uploading files without URL field (3 Web of Science .ris example) #179
I get an error because the function looks for the cite_string column, which does not exist in the dataframes. Is this because it is not declared in the function the way cite_source and cite_label are? (Line 41 in 415f910)

I've uploaded the .ris files and a script in the test file folder.
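For reference, a minimal sketch of the kind of guard that might address the missing column, assuming the deduplication step receives a data frame of citations. The column names cite_source, cite_label, and cite_string come from this thread; the helper name and everything else is illustrative:

```r
# Illustrative helper: make sure the metadata columns the pipeline expects
# exist before deduplication, filling any missing ones with NA.
ensure_cite_columns <- function(citations,
                                required = c("cite_source", "cite_label", "cite_string")) {
  for (col in setdiff(required, names(citations))) {
    citations[[col]] <- NA_character_
  }
  citations
}
```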
Running these files in R, rather than in the Shiny app, throws an error for the record-level table. It appears that the URL column is the issue. Looking at the data, the URL field is missing from the start in all of these WoS files, and in others as well. I think the second error, with the table, occurs because this is the first time I've run a test with ONLY WoS .ris files. When files from other databases are included, the URL column is added and the WoS citations simply end up with NA. Most likely the way to fix this is to make the record-level table not rely on the URL column.

Error in `dplyr::mutate()`:
ℹ In argument: `reference = generate_apa_reference(...)`.
Caused by error in `.data$url`:
! Column `url` not found in `.data`.
Run `rlang::last_trace()` to see where the error occurred.
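A sketch of one way the reference-generation step could tolerate a missing url column, assuming it works on a data frame of citation metadata. Only the url column name and generate_apa_reference() come from the error above; the helper and the call pattern are assumptions:

```r
# Illustrative guard: read url defensively so a citation set exported
# without a URL field yields NA values instead of an error.
get_url <- function(data) {
  if ("url" %in% names(data)) data[["url"]] else rep(NA_character_, nrow(data))
}

# Hypothetical use before building references (the real
# generate_apa_reference() signature may differ):
# citations$url <- get_url(citations)
# citations$reference <- generate_apa_reference(citations)
```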
@LukasWallrich can you take a look at how the record_level_table and generate_apa_reference functions can be changed to work when there is no URL column because it is not included in any of the citation files/metadata? I've tried but have not been successful. I'm also removing the testing script from the test folder due to the CMD check failure; I'll keep the .ris files in "shinytest" in the test folder.
@TNRiley I changed record_level_table(); can you check whether the issue persists?
@LukasWallrich I'm still running into an error in both the Shiny app and R. I've been unsuccessful in troubleshooting; it's hung up on the weblink column needing to be a character type, and despite converting and checking, I still get the error.
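For reference, a minimal sketch of the kind of type conversion described above, assuming a tibble named unique_citations with a weblink column (both names are taken from this thread; the empty-string handling is illustrative):

```r
library(dplyr)

# Coerce weblink to character and treat empty strings as missing values
unique_citations <- unique_citations %>%
  mutate(weblink = na_if(as.character(weblink), ""))

# Confirm the column now has the expected type
stopifnot(is.character(unique_citations$weblink))
```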
Also, record_level_table() is written to use the "citations" tibble, but all the tests I have been running and all the examples in the vignettes use "unique_citations", which seems correct. I believe the record_level_table function needs to be corrected so it points to the deduplicated unique_citations data, but I need to review this further.
Sorry about that - this bug fix was too ad hoc. I have now moved the error and type checking into the generate_apa_reference() function, and also fixed the reference generation for single-name authors. I also added tests that should prevent URL and type errors (or other missing-column issues) from recurring. @TNRiley can you test it again, and also keep an eye out for references that are misformatted?
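For reference, a minimal sketch of the kind of regression test that could keep missing-column errors from coming back, written with testthat. generate_apa_reference() is named in this thread, but the column set, the call signature, and the expectation are assumptions:

```r
library(testthat)
library(tibble)

test_that("reference generation tolerates records without a url column", {
  # Illustrative WoS-style record exported without a URL field
  wos_like <- tibble(
    author  = "Smith, J.",
    year    = "2020",
    title   = "An example record",
    journal = "Journal of Examples"
  )
  # Assumed call pattern; the real signature may differ
  expect_no_error(generate_apa_reference(wos_like))
})
```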
I tested it in the local Shiny app and it worked; the small sample set of citations I viewed also looked good and was formatted correctly.
This appears to be limited to a specific case study I'm working to document. These errors do not come up when I use other .ris files. My thought is that something is happening with the WoS identifier or something related.

To reproduce: run three searches in Web of Science (simple variations on a string strategy) and export the full record for each.

1st error: occurs if you do any manual deduplication; all tables and visuals will show an error.
2nd error: constant regardless of whether you do manual deduplication; the individual record table throws an error.

Note: when deduplicating, I do get the pop-up saying that of the 1566 records there were 642 unique records, which makes no sense, as this is the number of records from v1 and there is complete overlap between v2 and v3. Furthermore, there is 1 set of potential duplicates that for some reason was not automatically identified despite all data coming from WoS (this could be a metadata issue, so it would need to be reviewed later).