Detect duplicate samples upon file upload #598

Closed
vdkkia opened this issue May 11, 2021 · 3 comments · Fixed by #1520

@vdkkia
Collaborator

vdkkia commented May 11, 2021

  • Add the ability to detect duplicate samples within a project on upload, i.e. samples for which all sample_attributes are the same.
  • Let the user select what to do with the duplicate samples (skip or update).
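
A minimal sketch of the check requested above, assuming each sample can be represented as a plain dictionary of attribute values. The names `Sample`, `attribute_key`, and `split_duplicates` are hypothetical and not taken from the SEEK codebase; comparing values as strings is also an assumption, made because spreadsheet cells often arrive as text.

```python
from typing import Any

# Hypothetical representation: attribute title -> value.
Sample = dict[str, Any]


def attribute_key(sample: Sample) -> tuple:
    """Build a hashable key from all attribute values, independent of column order."""
    # Assumption: values are compared by their string form.
    return tuple(sorted((name, str(value)) for name, value in sample.items()))


def split_duplicates(
    existing: list[Sample], uploaded: list[Sample]
) -> tuple[list[Sample], list[Sample]]:
    """Return (new_rows, duplicate_rows) for an upload."""
    seen = {attribute_key(s) for s in existing}
    new_rows: list[Sample] = []
    duplicates: list[Sample] = []
    for row in uploaded:
        key = attribute_key(row)
        if key in seen:
            duplicates.append(row)
        else:
            new_rows.append(row)
            seen.add(key)  # also catches repeats within the upload itself
    return new_rows, duplicates
```

The `duplicate_rows` list is what the user would then be asked about: with "skip" the rows are dropped, with "update" they are passed on to whatever update step the application provides.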
@stuzart
Member

stuzart commented May 27, 2021

The duplication would be within the context of the sample type and project.

@stuzart stuzart added this to the v1.12.0 milestone Jun 24, 2021
@stuzart stuzart removed this from the v1.12.0 milestone Jun 16, 2022
@rabuono
Collaborator

rabuono commented Apr 11, 2023

As part of Samples WG:

The scenario assumes a generally accepted “ID”. Whether it is a new UUID or the classical SEEK ID makes no difference to the scenario.
Suggestion: ideally, the ID / UUID column in the Excel spreadsheet is locked so that it cannot be changed.

  1. A Sample Type with one Sample (Sample ID is 1) exists.

  2. A spreadsheet containing the Sample with ID=1 is uploaded to the instance for Sample extraction. The spreadsheet has an ID column; in this example the ID value is 1.

    • The spreadsheet is either downloaded from the instance (not currently possible; a new feature is needed for this), or it is the downloaded spreadsheet that the User has continued to fill in offline.
  3. Options

    1. ID column not changed. A row with ID=1 exists → Sample exists!
      To allow updating of the Sample with ID=1, check the attribute values:
      All values are the same: SEEK does nothing.
      At least one attribute value has changed (not the ID column): ask the user whether Sample ID=1 should be updated. This is a batch operation: either all changes are applied or none.

    2. There is no value for ID → Sample does not exist: create the Sample.
      To avoid (erroneous) duplication of sample information by the User, an extra step can be added to check whether that combination of values already exists in another registered Sample of that Sample Type:
      Yes: “This Sample might already exist, are you sure you want to create it again?”
      No: create the Sample.

    3. There is a value for ID that does not exist in the database → Sample does not exist. Throw the error “This Sample does not exist”. A User is not allowed to set a desired value for ID.
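
The three options above can be sketched as a per-row decision function. This is only an illustration under stated assumptions: `Action`, `classify_row`, and the `registered` mapping (ID → attribute values for one Sample Type) are hypothetical names, not part of SEEK, and the batch confirmation step is left to the caller.

```python
from enum import Enum
from typing import Any, Optional


class Action(Enum):
    NOTHING = "nothing"                      # option 1, values unchanged
    ASK_UPDATE = "ask_update"                # option 1, at least one value changed
    CREATE = "create"                        # option 2, no matching values found
    ASK_CREATE = "ask_create_possible_dup"   # option 2, same values already registered
    ERROR = "sample_does_not_exist"          # option 3, unknown ID


def classify_row(
    row_id: Optional[int],
    values: dict[str, Any],
    registered: dict[int, dict[str, Any]],
) -> Action:
    """Decide what to do with one spreadsheet row (hypothetical helper)."""
    if row_id is None:
        # Option 2: no ID, so the row describes a new Sample.
        if any(values == other for other in registered.values()):
            return Action.ASK_CREATE
        return Action.CREATE
    if row_id not in registered:
        # Option 3: users may not choose their own ID.
        return Action.ERROR
    # Option 1: the Sample exists; compare attribute values.
    if registered[row_id] == values:
        return Action.NOTHING
    return Action.ASK_UPDATE
```

Because updates are meant to be all-or-nothing, a caller would likely classify every row first, collect the `ASK_UPDATE` rows, and ask the user once for the whole batch rather than per row.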

Samples should be unique within a Sample Type across all items inside a Study.

@rabuono rabuono moved this to In Progress in DataHub May 3, 2023
@rabuono rabuono added this to DataHub May 3, 2023
@rabuono rabuono moved this from In Progress to Todo in DataHub May 3, 2023
@stuzart stuzart added this to the v1.14.0 milestone May 3, 2023
@kdp-cloud kdp-cloud moved this from Todo to In Progress in DataHub Jun 1, 2023
@kdp-cloud kdp-cloud self-assigned this Jun 1, 2023
@rabuono rabuono moved this to In Progress in Samples Working Group Jun 21, 2023
@kdp-cloud kdp-cloud moved this from In Progress to Coded in DataHub Aug 7, 2023
@floradanna
Collaborator

Testing document: Samples download and upload - DataHub

@kdp-cloud kdp-cloud linked a pull request Aug 23, 2023 that will close this issue
@kdp-cloud kdp-cloud modified the milestones: v1.14.0, v1.15.0 Sep 7, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in Samples Working Group Sep 20, 2023
@kdp-cloud kdp-cloud moved this from Coded to Released in DataHub Sep 27, 2023
@kdp-cloud kdp-cloud moved this from Released to Merged in DataHub Sep 27, 2023
@kdp-cloud kdp-cloud moved this from Merged to Released in DataHub Jul 24, 2024