Investigate differences between Semi-supervised and Hybrid #35

Open
maximskorik opened this issue May 31, 2022 · 2 comments
Labels
documentation (Improvements or additions to documentation), invalid (This doesn't seem right)

Comments

@maximskorik
Member

maximskorik commented May 31, 2022

hybrid.R is a refactored version of the Hybrid method, which was originally implemented in semi.sup.R. Both scripts follow the same procedure:

  1. feature extraction
  2. time adjustment
  3. feature alignment
  4. feature augmenting from a table of known compounds
  5. weaker-signal recovery
  6. second time adjustment
  7. second feature alignment
  8. feature augmenting from a table of known compounds

A more detailed description is available on Emory's website.


However, the original and refactored tools produce different outputs. hybrid.R tends to detect more features and to shift the retention times of features that match between the two outputs. Features that differ often do so only at the 3rd or 4th decimal place of m/z. This is apparent from the fact that rounding m/z to fewer decimal places leads to more matching features between the two outputs (example).
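The rounding comparison described above can be sketched as follows. This is a minimal illustration in Python rather than R, not the script used for this issue: the function name `mz_overlap` and the m/z values are made up, and it assumes "overlap" means the number of distinct m/z values found in both outputs.

```python
def mz_overlap(mz_a, mz_b, decimals=None):
    """Count distinct m/z values present in both lists.

    If decimals is given, values are rounded to that many decimal places
    first, so features that round to the same value collapse into one.
    """
    if decimals is None:
        set_a, set_b = set(mz_a), set(mz_b)
    else:
        set_a = {round(mz, decimals) for mz in mz_a}
        set_b = {round(mz, decimals) for mz in mz_b}
    return len(set_a & set_b)


# Made-up m/z values, for illustration only (not taken from QC_1/QC_2).
semi_sup_mz = [100.12346, 200.54321, 300.11111]
hybrid_mz = [100.12346, 200.54318, 300.11142, 400.00000]

for decimals in (None, 4, 3):
    print(decimals, mz_overlap(semi_sup_mz, hybrid_mz, decimals))
```

With these toy values the overlap grows as precision drops, which mirrors the pattern reported in this issue. Note that rounding can still miss close pairs that fall on opposite sides of a rounding boundary.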


Since semi.sup.R is part of apLCMS Two-Step Hybrid, understanding where the two tools differ would make it possible to replace semi.sup() with hybrid() and to split apLCMS Two-Step Hybrid into separate Galaxy tools.


Inputs: https://umsa.cerit-sc.cz/u/hechth/h/unnamed-history-2 QC_1.raw & QC_2.raw

Output difference between the two methods:

| tool | 1st step of two-step method (semi.sup.R) | hybrid |
|---|---|---|
| number of features | 2770 | 2964 |
| overlap (m/z) | 1066 | 1066 |
| overlap, m/z rounded to 4th decimal place | 1911 | 1911 |
| overlap, m/z rounded to 3rd decimal place | 2690 | 2690 |

Average rt deviation between matching peaks: 36.4 s
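An average rt deviation of this kind could be computed along the following lines. This is a hedged Python sketch, not the code behind the reported number. It assumes that features are paired by m/z rounded to 3 decimal places and that "deviation" means the absolute difference in retention time. The function name `mean_rt_deviation` and the sample features are hypothetical.

```python
from statistics import mean


def mean_rt_deviation(features_a, features_b, decimals=3):
    """Mean absolute rt difference (s) between features paired by rounded m/z.

    Each feature is an (mz, rt) tuple. If several features in one list
    round to the same m/z, the last one wins; a real comparison would
    need an explicit rule for such ties.
    """
    rt_a = {round(mz, decimals): rt for mz, rt in features_a}
    rt_b = {round(mz, decimals): rt for mz, rt in features_b}
    shared = rt_a.keys() & rt_b.keys()
    if not shared:
        return None  # nothing to compare
    return mean(abs(rt_a[key] - rt_b[key]) for key in shared)


# Made-up (m/z, rt in seconds) pairs, for illustration only.
semi_sup_features = [(100.1234, 60.0), (200.5432, 120.0)]
hybrid_features = [(100.1233, 70.0), (200.5431, 150.0), (300.0000, 10.0)]

print(mean_rt_deviation(semi_sup_features, hybrid_features))
```

Using the signed difference instead of the absolute one would show whether hybrid.R shifts retention times consistently in one direction.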
maximskorik added the documentation and invalid labels on May 31, 2022
@hechth
Member

hechth commented Jun 14, 2022

@maximskorik do we have a test comparing semi.sup and hybrid?

@maximskorik
Member Author

> @maximskorik do we have a test comparing semi.sup and hybrid?

@hechth no. I never pushed that branch and imprudently deleted it.
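For reference, a comparison test of the kind asked about could look roughly like the sketch below. It is written in Python rather than R and is not the deleted branch. It pairs m/z values within an absolute tolerance instead of rounding them, so close values on either side of a rounding boundary still match. The names `count_matches` and `outputs_agree`, the tolerance of 0.001, and the 0.9 threshold are all assumptions.

```python
def count_matches(mz_a, mz_b, tol=0.001):
    """Greedily pair sorted m/z values that differ by at most tol."""
    a, b = sorted(mz_a), sorted(mz_b)
    i = j = matched = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol:
            # Close enough: count the pair and consume both values.
            matched += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matched


def outputs_agree(mz_a, mz_b, tol=0.001, min_fraction=0.9):
    """True if at least min_fraction of the smaller list found a partner."""
    smaller = min(len(mz_a), len(mz_b))
    if smaller == 0:
        return False
    return count_matches(mz_a, mz_b, tol) / smaller >= min_fraction


# Made-up values: the first pair rounds differently at 3 decimal places
# (100.0 vs 100.001) but lies well within the tolerance.
semi_sup_mz = [100.0004, 200.0000, 300.0000]
hybrid_mz = [100.0006, 200.0002, 350.0000]

print(count_matches(semi_sup_mz, hybrid_mz))
print(outputs_agree(semi_sup_mz, hybrid_mz))
```

A real test would load the feature tables produced by semi.sup() and hybrid() for the same inputs and would likely compare retention times as well.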
