-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update retention times to alkane retention time indices for GC-MS #731
Comments
I was able to implement the second option; however, am now receiving the following error when I try to do my peak picking: Error: BiocParallel errors |
Hi, the function expects strictly ordered RT for spectra. Could it be that you output integers as RT, and that two consecutive spectra have the same value ? Can you crank up the decimals, even if they were useless for a human observer ? Yours, Steffen |
Hi, I just realised that possibly the PR #732 by @philouail might solve a similar use case to yours ? Yours, Steffen |
Thanks Steffen! PR #732 is a potential approach that could work, although I'm not sure if our alkane mixes were run frequently enough to be used for alignment. I've actually gotten my code to work after increasing the number of decimals, and it seems to be greatly improving our alignment (less missing values) so far. Thanks for all your help. |
Note: we are currently adding the function to force retention times to be sorted to |
Hello!
We are running large studies (>1000 samples) on GC-HRMS. Peak detection has been working beautifully. Over the course of the run, we have drifts in retention times due to column aging and periodic column changes. To help account for this effect, we run alkane mixes within each batch (96 samples). These are used to both monitor retention time changes, and calculate retention time indices for comparison to spectral databases. To improve alignment of our data, I would like to loop each batch and replace the raw retention times with retention time indices calculated from the alkane mix in that batch. Once updated, we will extract and align peaks using the traditional XCMS workflow.
My question is what is the most appropriate way to do this? I could think of a few ways, with varying efficiency.
Read in files for each batch using readMSData(). Replace raw retention time with rtime(raw_data)<- Calc_RI. Write to new mzML file using writeMSData(). Obviously this would be very slow and inefficient.
Read in files for all samples readMSData(). Replace raw retention time OR adjusted retention time with rtime(raw_data)<- Calc_RI seperately for each batch. Continue with data extraction. Adjusted retention time would be preferred so the original retention time could be retained
Read in files for all samples readMSData(). Detect peaks using findChromPeaks. Replace peak retention times with Calc_RI. Continue with data extraction. However, I could not get this to work, since it would not let me assign adjusted retention times manually.
Any ideas on how to best do this? One concern I had is how functions that call back to the mzML file (such as fillChromPeaks()) would handle the very different retention time indices. Another issue I'm running into is it won't let me subset XCMSnExp by samples using the [] command. Based on the documentation, I thought this was possible.
Thanks!
SessionInfo() is below.
R version 4.3.3 (2024-02-29)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.5
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] Spectra_1.12.0 microbenchmark_1.4.10 doParallel_1.0.17 iterators_1.0.14 foreach_1.5.2 xcms_4.0.2
[7] MSnbase_2.28.1 ProtGenerics_1.34.0 S4Vectors_0.40.2 mzR_2.36.0 Rcpp_1.0.12 Biobase_2.62.0
[13] BiocGenerics_0.48.1 BiocParallel_1.36.0
loaded via a namespace (and not attached):
[1] bitops_1.0-7 rlang_1.1.3 magrittr_2.0.3 clue_0.3-65 MassSpecWavelet_1.68.0
[6] matrixStats_1.2.0 compiler_4.3.3 vctrs_0.6.5 pkgconfig_2.0.3 MetaboCoreUtils_1.10.0
[11] crayon_1.5.2 XVector_0.42.0 utf8_1.2.4 preprocessCore_1.64.0 MultiAssayExperiment_1.28.0
[16] zlibbioc_1.48.0 GenomeInfoDb_1.38.7 progress_1.2.3 DelayedArray_0.28.0 prettyunits_1.2.0
[21] cluster_2.1.6 R6_2.5.1 RColorBrewer_1.1-3 limma_3.58.1 GenomicRanges_1.54.1
[26] SummarizedExperiment_1.32.0 snow_0.4-4 IRanges_2.36.0 Matrix_1.6-5 splines_4.3.3
[31] igraph_2.0.2 tidyselect_1.2.0 abind_1.4-5 codetools_0.2-19 affy_1.80.0
[36] lattice_0.22-5 tibble_3.2.1 plyr_1.8.9 survival_3.5-8 pillar_1.9.0
[41] affyio_1.72.0 BiocManager_1.30.22 MatrixGenerics_1.14.0 MALDIquant_1.22.2 ncdf4_1.22
[46] generics_0.1.3 RCurl_1.98-1.14 hms_1.1.3 ggplot2_3.5.0 munsell_0.5.0
[51] scales_1.3.0 MsExperiment_1.4.0 glue_1.7.0 MsFeatures_1.10.0 lazyeval_0.2.2
[56] tools_4.3.3 mzID_1.40.0 robustbase_0.99-2 QFeatures_1.12.0 vsn_3.70.0
[61] RANN_2.6.1 fs_1.6.3 XML_3.99-0.16.1 grid_4.3.3 impute_1.76.0
[66] MsCoreUtils_1.14.1 colorspace_2.1-0 GenomeInfoDbData_1.2.11 cli_3.6.2 fansi_1.0.6
[71] S4Arrays_1.2.1 dplyr_1.1.4 AnnotationFilter_1.26.0 pcaMethods_1.94.0 gtable_0.3.4
[76] DEoptimR_1.1-3 digest_0.6.34 SparseArray_1.2.4 multtest_2.58.0 lifecycle_1.0.4
[81] statmod_1.5.0 MASS_7.3-60.0.1
The text was updated successfully, but these errors were encountered: