q-values after consensus peak normalization are all 1 #160

kelly-sovacool · 2023-12-05T01:04:49Z

Regardless of dataset used.

This causes an error when running chipseeker_peakplot:

Command error:
  Warning message:
  package 'dplyr' was built under R version 4.3.2
  GRanges object with 58429 ranges and 2 metadata columns:
            seqnames            ranges strand |      pvalue    qvalue
               <Rle>         <IRanges>  <Rle> |   <numeric> <numeric>
        [1]     chr1   3671441-3671716      * | 4.16506e-12         1
        [2]     chr1   4142568-4142910      * | 2.12097e-11         1
        [3]     chr1   4228249-4228601      * | 7.22163e-12         1
        [4]     chr1   4297125-4297414      * | 3.61542e-12         1
        [5]     chr1   4332522-4332831      * | 1.30793e-11         1
        ...      ...               ...    ... .         ...       ...
    [58425]     chrY 90804913-90805435      * | 3.87469e-11         1
    [58426]     chrY 90807619-90808047      * | 4.66823e-11         1
    [58427]     chrY 90808620-90809110      * | 4.36221e-11         1
    [58428]     chrY 90810807-90811291      * | 3.35543e-11         1
    [58429]     chrY 90828800-90829091      * | 4.38834e-12         1
    -------
    seqinfo: 21 sequences from an unspecified genome; no seqlengths
  chr1 dosen't contain signal higher than 1
  chr10 dosen't contain signal higher than 1
  chr11 dosen't contain signal higher than 1
  chr12 dosen't contain signal higher than 1
  chr13 dosen't contain signal higher than 1
  chr14 dosen't contain signal higher than 1
  chr15 dosen't contain signal higher than 1
  chr16 dosen't contain signal higher than 1
  chr17 dosen't contain signal higher than 1
  chr18 dosen't contain signal higher than 1
  chr19 dosen't contain signal higher than 1
  chr2 dosen't contain signal higher than 1
  chr3 dosen't contain signal higher than 1
  chr4 dosen't contain signal higher than 1
  chr5 dosen't contain signal higher than 1
  chr6 dosen't contain signal higher than 1
  chr7 dosen't contain signal higher than 1
  chr8 dosen't contain signal higher than 1
  chr9 dosen't contain signal higher than 1
  chrX dosen't contain signal higher than 1
  chrY dosen't contain signal higher than 1
  Error in UseMethod("group_by") :
    no applicable method for 'group_by' applied to an object of class "list"
  Calls: covplot -> getChrCov -> %>% -> summarise -> group_by
  Execution halted

The text was updated successfully, but these errors were encountered:

kelly-sovacool · 2023-12-05T14:59:58Z

Could this be a bug in how normalized q-values are calculated, or is it a feature of this normalization method?

https://github.com/CCBR/nf-modules/blob/60d50f4c45a50378cad70b49013f51750617caaa/modules/CCBR/custom/normalizepeaks/templates/normalize_peaks.R#L45-L61

kelly-sovacool · 2024-01-03T14:00:51Z

We are building a chipseq analysis pipeline and want to be able to create a consensus peak set across samples. We found a normalization method to account for variation in read depth and quality across samples that takes the -log10(p-value) and divides that by the sum of all -log10(p-values) divided by 1 million to create a "score per million". However, we'd also like to have normalized q-values (a.k.a FDR / Benjamini-Hochberg). Our first attempt to calculate normalized q-values resulted in all of them being nearly 1. We're not sure if there's a problem in how we are calculating it or if it's a feature of this normalization method.

Description of the normalization method from the supplementary methods from Corces et al. 2018:

Our implementation in R:

#' Normalization method from Corces et al.
#' https://doi.org/10.1126/science.aav1898
corces <- function(x) {
  return(x / sum(x) / 10^6)
}

#' Normalize peaks
normalize <- function(peak_dat, norm_method = corces) {
  return(
    peak_dat %>%
      mutate(
        pvalue_norm = norm_method(pvalue_neglog10),
        qvalue_norm = 10^(-pvalue_norm) %>% p.adjust(method = "BH") %>% -log10(.)
      ) %>%
      select(-c(pvalue, qvalue)) %>%
      rename(pvalue = pvalue_norm, qvalue = qvalue_norm)
  )
}

kelly-sovacool added the bug Something isn't working label Dec 5, 2023

kelly-sovacool changed the title ~~q-values after consensus normalization are all 1~~ q-values after consensus peak normalization are all 1 Dec 5, 2023

kelly-sovacool mentioned this issue Dec 8, 2023

Normalize p-values for consensus peaks #151

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

q-values after consensus peak normalization are all 1 #160

q-values after consensus peak normalization are all 1 #160

kelly-sovacool commented Dec 5, 2023

kelly-sovacool commented Dec 5, 2023

kelly-sovacool commented Jan 3, 2024 •

edited

Loading

q-values after consensus peak normalization are all 1 #160

q-values after consensus peak normalization are all 1 #160

Comments

kelly-sovacool commented Dec 5, 2023

kelly-sovacool commented Dec 5, 2023

kelly-sovacool commented Jan 3, 2024 • edited Loading

kelly-sovacool commented Jan 3, 2024 •

edited

Loading