-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase robustness of calculate_record_counts #161
Comments
Also, I don't think it (the record summary table) currently works correctly to compare labels or strings rather than sources - can that be? In my test, I get NAs in the last two columns. |
I'm trying to think of a use case outside of comparing the sources. I don't see any need for it to compare strings or labels.
Edited it does rely on the label being "search" which could potentially be removed without other changes. I'll take a look at this in a bit more depth. |
Looked at this again and remembered why I had set this to filter by the search label. The table is used to show the impact of sources/methods. If we remove the need for the label, it will show any screened or final data as well, which I currently don't see a use case for and believe it would just clutter things up. I think there are many ways in which we could change this to ensure it is usable for other use cases, but I'm not able to envision those at this time. I can add the following warning to the function and details to the documentation. if (!"search" %in% n_unique$cite_label) { #' @details |
If the data is set up correctly, final and screened should not be sources
but labels, and always be a subset of search - so that filtering for search
should not do anything?
For that reason, I would find it convenient if you could just upload data
to shiny without having to label everything (but 2) as search. Would it
make sense to use blank labels instead of search if search does not exist?
…On Wed, 21 Jun 2023, 12:26 Trevor Riley, ***@***.***> wrote:
Does this function need to rely on a search label? I would have expected
it to work without any labels. If it needs the label, that should be
documented, and a warning be issued when there is no search label.
Looked at this again and remembered why I had set this to filter by the
search label. The table is used to show the impact of sources/methods. If
we remove the need for the label, it will show any screened or final data
as well, which I currently don't see a use case for and believe it would
just clutter things up. I think there are many ways in which we could
change this to ensure it is usable for other use cases, but I'm not able to
envision those at this time.
I can add the following warning to the function and details to the
documentation.
if (!"search" %in% n_unique$cite_label) {
warning("Source's must be tagged as 'search' in the 'cite_label' field.")
#' @details <https://github.com/details>
#' The function works on three main inputs: unique_citations, citations,
and n_unique. Each of these dataframes contains
#' a column corresponding to the database source.
#'
#' Additionally, n_unique dataframe contains two specific columns:
'cite_source' and 'cite_label', provided by the user.
#' 'cite_source' column represents the source of the citation and
'cite_label' represents another variable assigned to the
#' citation (normally search, screened, final).
#'
#' The function groups by 'cite_source' and filters by 'cite_label' ==
"search". These steps are essential for correctly
#' calculating unique records count.
—
Reply to this email directly, view it on GitHub
<#161 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AOK6NGNSGZOBMP5MUDHULRTXMLK57ANCNFSM6AAAAAAZJAVZDQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
The labels would be set as search, screened, or final. We only want to see this information broken down by source, so the function is filtering for only the search records. If we didn't filter by search the table would pull the screened and final data too, which we don't want. Does this make sense, I'm not sure if I'm misunderstanding your point ;) |
I forgot that the new Typing in "search" many times just does not seem to be very user-friendly on Shiny - but if we can document clearly that this should be done (best on the page where users upload data so that this is correct pre-deduplication) and throw a warning when it is not present then that's good enough for now. |
I'm not sure I can think of a way around it currently. On the current web shiny there is an option to add a label before you click upload file. If we could add this function to the new shiny where someone could select multiple files and add a label to all of them at the same time, that would make it easy. We'd still need to add the information on the upload page that lets users know that they should use "search" for source files, I think that would work. |
@LukasWallrich just circling back to this one and I wanted to see if you still wanted to make adjustments here. Maybe we can run through some use case examples in the shiny once it is complete and see if this requires changes? |
Closing this as changes would require multiple functions to be significantly changed, beyond this function. I'll add instructions for users. If needed we can look at the internal structure of the count functions in the future. |
Does this function need to rely on a search label? I would have expected it to work without any labels. If it needs the label, that should be documented, and a warning be issued when there is no search label.
The text was updated successfully, but these errors were encountered: