
For the GSEApy ssGSEA module, why should all genes in the gene sets be present in the gene expression table? #203

Answered by zqfang
joshscurll asked this question in Q&A

Hi @joshscurll, sorry for the late reply.

You are right. Mathematically, there is no requirement to input all expressed genes. It's also not required for any programmatic (implementation) reason.

You can use whatever gene list you think is reasonable for the calculation.

What I meant is that all genes in the gene_set file (GMT) should ideally be found in the gene expression table. Internally, a gene that is missing from the expression table is thrown away, so some interesting genes may be dropped simply because they are not found there.
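To see which genes would be dropped before running anything, you could compare the gene set against the expression table yourself. This is only a minimal sketch in plain Python, not GSEApy's internal code: the helper names `parse_gmt_line` and `split_by_presence`, the set name, and the gene lists are all made up for illustration. It assumes the usual GMT layout (set name, description, then genes, tab-separated).

```python
def parse_gmt_line(line):
    """Parse one GMT line: name <TAB> description <TAB> gene1 <TAB> gene2 ..."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], set(fields[2:])


def split_by_presence(gene_set, expressed_genes):
    """Return (kept, dropped): gene-set members found / not found in the table."""
    expressed = set(expressed_genes)
    kept = sorted(gene_set & expressed)
    dropped = sorted(gene_set - expressed)
    return kept, dropped


# Hypothetical gene set and expression-table index, for illustration only.
name, genes = parse_gmt_line("HYPOTHETICAL_SET\tsome description\tTP53\tBRCA1\tMYC")
kept, dropped = split_by_presence(genes, ["TP53", "MYC", "GAPDH"])
print(name, "kept:", kept, "dropped:", dropped)
```

If `dropped` contains genes you care about, that is the case where a filtered expression table silently changes what the gene set means.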

Another reason to include all expressed genes is to "enrich" the given gene_set against a reasonable background data distribution for the calculation.
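A toy example may help show why the background matters. The sketch below uses a simplified, unweighted running-sum statistic in the spirit of GSEA; it is an assumption made for illustration and not the weighted ssGSEA score that GSEApy computes. The function name `running_sum_es` and the expression values are hypothetical.

```python
def running_sum_es(expression, gene_set):
    """Simplified, unweighted running-sum enrichment statistic.

    expression: dict mapping gene -> expression value for one sample.
    gene_set:   set of gene names.
    Returns the running-sum value with the largest absolute deviation from
    zero, or None when there are no hits or no background (non-set) genes.
    """
    # Rank genes from highest to lowest expression.
    ranked = sorted(expression, key=expression.get, reverse=True)
    hits = [g in gene_set for g in ranked]
    n_hit = sum(hits)
    n_miss = len(ranked) - n_hit
    if n_hit == 0 or n_miss == 0:
        return None  # the statistic is undefined without hits or background

    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up on a gene-set member, step down on a background gene.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical sample: A and B are the gene set, C, D, E are background.
sample = {"A": 9.0, "B": 7.0, "C": 5.0, "D": 3.0, "E": 1.0}
gene_set = {"A", "B"}

with_background = running_sum_es(sample, gene_set)
only_set_genes = running_sum_es({g: sample[g] for g in gene_set}, gene_set)
print(with_background, only_set_genes)
```

With the background genes present, the set members sit at the top of the ranking and the statistic has something to be compared against. With only the set's own genes in the table, this toy statistic has no background at all and cannot be computed.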

But again, it's not mandatory.

Hope it helps.

I will upd…


Answer selected by joshscurll