Whether to use FindAllMarkers features or AverageExpression features for displaying over-represented motifs based on chromVAR data #1784
Unanswered
revolvefire
asked this question in
Q&A
Replies: 0 comments
Hello @timoast
Thanks for the wonderful tool again!
I am trying to figure out the important motifs for each cluster and create a heatmap based on the results.
I ran the default RunChromVAR function:
skin <- RunChromVAR(
  object = skin,
  genome = BSgenome.Mmusculus.UCSC.mm10
)
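Then I ran FindAllMarkers on the chromvar assay:
differential.activity <- FindAllMarkers(
  object = skin,
  assay = "chromvar",
  only.pos = TRUE,
  mean.fxn = rowMeans,   # average the z-scores directly instead of log-transforming
  fc.name = "avg_diff"   # label the effect size as a difference, not a fold change
)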
I then selected the top motifs per cluster (the top 4 in this case) from the FindAllMarkers results.
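For readers who want to reproduce the selection step, here is a minimal base-R sketch of picking the top motifs per cluster. It runs on a toy data frame that only mimics the shape of the FindAllMarkers output (columns `cluster`, `gene`, `avg_diff`, `p_val`); the motif IDs, the values, and the helper name `top_n_per_cluster` are all hypothetical, and breaking p-value ties by `avg_diff` is an assumption, not necessarily what I actually did.

```r
# Toy stand-in for the data frame returned by FindAllMarkers
# (fc.name = "avg_diff" as in the call above). All values are made up.
markers <- data.frame(
  cluster  = rep(c("0", "1"), each = 6),
  gene     = c(paste0("MA000", 1:6), paste0("MA001", 1:6)),
  avg_diff = c(2.1, 0.4, 3.3, 1.8, 0.9, 2.7, 1.2, 4.0, 0.3, 2.2, 3.1, 0.8),
  p_val    = c(0, 1e-5, 0, 0, 1e-3, 0, 1e-8, 0, 0.2, 0, 0, 1e-4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n best rows per cluster.
top_n_per_cluster <- function(df, n = 4) {
  # Sort by cluster, then p-value, then descending effect size, so that
  # tied p-values (including exact zeros) are broken by avg_diff.
  df <- df[order(df$cluster, df$p_val, -df$avg_diff), ]
  # Position of each row within its cluster after sorting.
  pos <- ave(seq_len(nrow(df)), df$cluster, FUN = seq_along)
  df[pos <= n, ]
}

top4 <- top_n_per_cluster(markers, n = 4)
print(top4[, c("cluster", "gene", "avg_diff", "p_val")])
```

The `gene` column of `top4` could then be passed as the `features` for the heatmap.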
What I had in mind was to draw a heatmap using AverageExpression:
agg <- AverageExpression(
  skin,
  assay = "chromvar",
  return.seurat = TRUE,
  layer = "data"
)
I planned to use the top motifs I found with FindAllMarkers as the features.
But now I am wondering whether I should simply rank the averaged z-scores of motifs per cluster from AverageExpression and use them as the features for the heatmap.
What I want is to display, in a heatmap, the most over-represented motif in a pseudo-bulk manner per cluster.
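To make the two candidate rankings concrete, here is a small base-R sketch on a made-up motif-by-cluster matrix of averaged z-scores. The matrix values, the motif and cluster names, and the "specificity" score (cluster mean minus the mean of the other clusters) are illustrative assumptions, not anything computed by Signac or Seurat.

```r
# Hypothetical pseudo-bulk matrix: rows are motifs, columns are clusters,
# values are per-cluster mean chromVAR deviation z-scores (made up).
avg_z <- matrix(
  c( 3.0,  2.8,  2.9,
     2.5, -1.0, -0.8,
    -0.5,  2.0, -1.2,
     0.2,  0.1,  1.9),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("motifA", "motifB", "motifC", "motifD"),
    c("c0", "c1", "c2")
  )
)

# Return the names of the n largest entries of a named vector.
top_names <- function(z, n = 2) names(sort(z, decreasing = TRUE))[seq_len(n)]

# Option 1: rank by the raw averaged z-score within each cluster.
top_by_mean <- apply(avg_z, 2, top_names)

# Option 2: rank by cluster mean minus the mean of all other clusters,
# a crude one-vs-rest contrast similar in spirit to avg_diff.
others_mean <- (rowSums(avg_z) - avg_z) / (ncol(avg_z) - 1)
specificity <- avg_z - others_mean
top_by_spec <- apply(specificity, 2, top_names)

print(top_by_mean)
print(top_by_spec)
```

In this toy case motifA has a high averaged z-score everywhere, so it tops every cluster under option 1, while option 2 puts a different cluster-specific motif first in each cluster. That difference is the heart of my question.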
I'm slightly unsure whether using FindAllMarkers on chromVAR z-scores is appropriate in the first place, even with mean.fxn = rowMeans and fc.name = "avg_diff" set.
Do you think the Wilcoxon-based p-value is appropriate for these ChromVAR z-scores?
On a side note, I found that almost 30 motifs per cluster have a p-value of 0.000000e+00, and I was wondering whether this is unusual.
Also on a side note, how are these top markers still ranked when they all have the same p-value of 0.000000e+00? Do they still differ in their actual p-values in a way that isn't reflected in the output?
(FYI, the top 5 ranked motifs per cluster generally made biological sense according to my knowledge of the specific tissue).
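One plausible explanation for the exact zeros, sketched below, is floating-point underflow: a double cannot represent positive values much below about 4.9e-324, so any smaller p-value is stored as 0. The test statistics in this example are hypothetical and the snippet only illustrates the numerical effect in base R; it makes no claim about how Seurat computes or sorts its p-values internally.

```r
# Two hypothetical, extremely significant test statistics.
z <- c(-40, -60)

# On the natural scale both tail probabilities underflow to exactly 0.
p <- pnorm(z)
print(p)

# On the log scale they remain finite and clearly different.
log_p <- pnorm(z, log.p = TRUE)
print(log_p)

# Smallest positive normalized double, for reference.
print(.Machine$double.xmin)
```

If this is what is happening, the zeros would hide real differences between motifs, and any ordering among them would have to come from another column such as avg_diff.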
Back to the original question: do you think I should proceed with the original plan of using the FindAllMarkers results as features, which accounts for the comparison between each cluster and the rest (and potentially gives a more accurate ranking of the over-represented motifs per cluster)? Or should I create a new ranking based solely on the AverageExpression results and use that instead? chromVAR scores are background-corrected, so perhaps the second approach is not completely wrong either.
Thank you!