Whether to use FindAllMarkers features or AverageExpression features for displaying over-represented motifs based on chromVAR data #1784

revolvefire · 2024-09-19T14:14:34Z

Hello @timoast

Thanks for the wonderful tool again!

I am trying to figure out the important motifs for each cluster and create a heatmap based on the results.

I ran the default RunChromVAR function:

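library(Signac)
library(Seurat)
library(BSgenome.Mmusculus.UCSC.mm10)

# Compute per-cell motif deviation z-scores; results go into the "chromvar" assay
skin <- RunChromVAR(
  object = skin,
  genome = BSgenome.Mmusculus.UCSC.mm10
)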

Then I ran FindAllMarkers:

# mean.fxn = rowMeans reports a difference in mean z-score instead of a log fold change
differential.activity <- FindAllMarkers(
  object = skin,
  only.pos = TRUE,
  mean.fxn = rowMeans,
  fc.name = "avg_diff"
)

I then selected the top motifs per cluster (the top 4 in this case) from the FindAllMarkers results.
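To make the selection step concrete, here is a minimal base R sketch on a made-up table. The `markers` data frame only mimics the columns a FindAllMarkers result would have with `fc.name = "avg_diff"`; its values, and the helper `top_n_per_cluster`, are hypothetical and only for illustration. Ties in the adjusted p-value are broken by the larger `avg_diff`, which is one possible choice rather than what Seurat does internally.

```r
# Toy stand-in for a FindAllMarkers result; all values are made up.
markers <- data.frame(
  cluster   = rep(c("0", "1"), each = 5),
  gene      = paste0("MA", sprintf("%04d", 1:10), ".1"),
  avg_diff  = c(2.1, 3.4, 0.8, 1.9, 2.7, 1.2, 4.0, 3.1, 0.5, 2.2),
  p_val_adj = c(0, 0, 1e-50, 0, 1e-20, 1e-10, 0, 0, 0.01, 1e-30),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n best motifs per cluster.
top_n_per_cluster <- function(df, n = 4) {
  # Sort within cluster by adjusted p-value, then by decreasing avg_diff
  df <- df[order(df$cluster, df$p_val_adj, -df$avg_diff), ]
  # Rank rows within each cluster and keep the first n
  rank_in_cluster <- ave(seq_len(nrow(df)), df$cluster, FUN = seq_along)
  df[rank_in_cluster <= n, ]
}

top4 <- top_n_per_cluster(markers, n = 4)
top_features <- unique(top4$gene)  # motif IDs to pass on as heatmap features
```

With a real result, `top_features` would hold the motif IDs to use as the heatmap rows.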

What I had in mind was to draw a heatmap using AverageExpression:

agg <- AverageExpression(
  skin,
  assay = "chromvar",
  return.seurat = TRUE,
  layer = "data"
)

I planned to use the top motifs I found with FindAllMarkers as the features.

But now I am wondering whether I should simply rank the averaged z-scores of motifs per cluster from AverageExpression and use them as the features for the heatmap.
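The difference between the two rankings can be sketched on a toy matrix in base R. Everything here is made up for illustration: `z` stands in for a motif-by-cell z-score matrix, `clusters` for the cluster labels, and `top1` is a hypothetical helper. Ranking by the per-cluster mean picks motifs that are high in absolute terms, while ranking by the difference against the remaining cells picks motifs that are specific to the cluster.

```r
# Toy motif-by-cell z-score matrix; M1 is high in every cell.
z <- matrix(
  c( 3,  3,  3,  3,  3,  3,
     2,  2,  2, -1, -1, -1,
    -1, -1, -1,  2,  2,  2,
     0,  0,  0,  0,  0,  0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("M", 1:4), paste0("cell", 1:6))
)
clusters <- c("A", "A", "A", "B", "B", "B")

# Ranking 1: mean z-score per cluster (pseudo-bulk average)
avg <- sapply(unique(clusters), function(cl) {
  rowMeans(z[, clusters == cl, drop = FALSE])
})

# Ranking 2: mean in the cluster minus mean in all other cells
diff_vs_rest <- sapply(unique(clusters), function(cl) {
  rowMeans(z[, clusters == cl, drop = FALSE]) -
    rowMeans(z[, clusters != cl, drop = FALSE])
})

# Hypothetical helper: name of the top-scoring motif in each column
top1 <- function(m) apply(m, 2, function(col) names(which.max(col)))

top_by_mean <- top1(avg)           # M1 wins in both clusters
top_by_diff <- top1(diff_vs_rest)  # M2 for cluster A, M3 for cluster B
```

In this toy case the mean-based ranking returns the same motif for every cluster, so a heatmap built from it would show little cluster-to-cluster contrast.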

What I want is to display, in a heatmap, the most over-represented motif in a pseudo-bulk manner per cluster.

I'm slightly unsure whether running FindAllMarkers on chromVAR z-scores is appropriate in the first place, even with mean.fxn = rowMeans and fc.name = "avg_diff" set.

1. Do you think the Wilcoxon-based p-value is appropriate for these chromVAR z-scores?
2. On a side note, I found that almost 30 motifs per cluster have a p-value of 0.000000e+00. Is that unusual?
2-2. Also on a side note: how are these top markers still ranked when they all have the same p-value of 0.000000e+00? Do their actual p-values still differ by small amounts that the output does not reflect? (FYI, the top 5 ranked motifs per cluster generally made biological sense based on what I know about this tissue.)
3. Back to the original question: should I proceed with the original plan and use the FindAllMarkers results as features, which accounts for the comparison between each cluster and the rest (and potentially gives a more accurate ranking of the over-represented motifs per cluster)? Or should I create a new ranking based solely on the AverageExpression results and use that instead? chromVAR scores are background-corrected, so maybe that approach is not completely wrong?
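On the p-values of exactly 0: double-precision numbers cannot represent arbitrarily small positive values, so a sufficiently small p-value is stored as 0. The base R sketch below illustrates this, along with one way to break the resulting ties using the effect size. It does not reproduce how FindAllMarkers computes or orders its results; the vectors `p_val` and `avg_diff` are made up.

```r
# A product that is far too small for a double collapses to exactly 0.
tiny <- 1e-200 * 1e-200
tiny == 0  # TRUE

# The same happens to a tail probability far out in the distribution,
# while its logarithm is still representable.
p_tail     <- pnorm(-40)
log_p_tail <- pnorm(-40, log.p = TRUE)

# Made-up results for three motifs whose p-values all underflowed to 0.
p_val    <- c(0, 0, 0)
avg_diff <- c(1.5, 3.2, 2.4)

# order() leaves ties in their original order, so p_val alone cannot rank them.
order(p_val)

# Adding the effect size as a secondary key gives a meaningful ranking.
rank_by_effect <- order(p_val, -avg_diff)
```

Here `rank_by_effect` places the second motif first because it has the largest `avg_diff` among the tied p-values.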

Thank you!
