Model Terminology

Key Terms

The celda package is a reference implementation of several Bayesian hierarchical models for clustering single-cell RNA-seq data. Our group uses specific terminology for the data being modeled and for parts of the models themselves. Please use the terms below in issues, questions, and code contributions:

Cell Population: a specific cluster of cells; one cell cluster label amongst those returned from celda_C / celda_CG
Gene Module: a specific cluster of genes; one gene cluster amongst those returned from celda_G / celda_CG

Model Parameters / Shorthands

The code that implements the celda models uses several shorthand terms. Each is explained below:

Variables

C = Cell
S or s = Sample
G = Gene
TS = Transcriptional State
CP = Cell Population
n = counts of transcripts
m = counts of cells
K = Total number of cell populations
L = Total number of transcriptional states
nM = Number of cells
nG = Number of genes
nS = Number of samples
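
As a rough illustration of how the scalar shorthands relate to a dataset, the sketch below derives them from a toy genes-by-cells count matrix in base R. The object names `counts`, `z`, `y`, and `s`, and all of the values, are assumptions made for this example rather than anything the package is confirmed to use. It also assumes every cell population and transcriptional state has at least one member, so that K and L can be read off the labels.

```r
# Toy data: every value here is made up for illustration.
set.seed(1)
counts <- matrix(rpois(6 * 8, lambda = 3), nrow = 6, ncol = 8)  # genes x cells

z <- c(1, 1, 2, 2, 3, 3, 1, 2)  # hypothetical cell population (CP) label per cell
y <- c(1, 1, 2, 2, 2, 1)        # hypothetical transcriptional state (TS) label per gene
s <- c(1, 1, 1, 1, 2, 2, 2, 2)  # hypothetical sample (S) label per cell

nG <- nrow(counts)       # number of genes
nM <- ncol(counts)       # number of cells
nS <- length(unique(s))  # number of samples
K  <- length(unique(z))  # total number of cell populations
L  <- length(unique(y))  # total number of transcriptional states
```

In a real model fit, K and L would normally be chosen by the user rather than inferred from the labels.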

Count matrices

All n.* variables contain counts of transcripts

n.CP.by.TS = Number of counts in each Cell Population per Transcriptional State
n.TS.by.C = Number of counts in each Transcriptional State per Cell
n.CP.by.G = Number of counts in each Cell Population per Gene
n.by.G = Number of counts per gene (i.e. rowSums)
n.by.TS = Number of counts per Transcriptional State

All m.* variables contain counts of cells

m.CP.by.S = Number of cells in each Cell Population per Sample
nG.by.TS = Number of genes in each Transcriptional State
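
The definitions above can be sketched in base R as sums of a genes-by-cells count matrix grouped by cluster label. This is a minimal illustration, not the package's actual implementation. The inputs `counts`, `z`, `y`, and `s` are hypothetical, and the sketch assumes each name reads as rows-by-columns (for example, `n.CP.by.TS` has one row per cell population and one column per transcriptional state).

```r
# Toy inputs (all hypothetical): 4 genes x 6 cells.
counts <- matrix(1:24, nrow = 4, ncol = 6)
y <- c(1L, 1L, 2L, 2L)          # transcriptional state label per gene
z <- c(1L, 1L, 2L, 2L, 3L, 3L)  # cell population label per cell
s <- c(1L, 2L, 1L, 2L, 1L, 2L)  # sample label per cell
L <- length(unique(y))          # number of transcriptional states

# n.* variables: counts of transcripts
n.TS.by.C  <- rowsum(counts, group = y)         # sum gene rows within each TS (L x nM)
n.CP.by.G  <- rowsum(t(counts), group = z)      # sum cells within each CP (K x nG)
n.CP.by.TS <- rowsum(t(n.TS.by.C), group = z)   # sum cells within each CP (K x L)
n.by.G     <- rowSums(counts)                   # total counts per gene
n.by.TS    <- as.vector(rowsum(n.by.G, group = y))  # total counts per TS

# m.* variables: counts of cells
m.CP.by.S <- table(z, s)                        # cells in each CP per sample (K x nS)

# Number of genes assigned to each transcriptional state
nG.by.TS <- tabulate(y, nbins = L)
```

Since every transcript belongs to exactly one gene and one cell, `sum(n.CP.by.TS)`, `sum(n.by.G)`, and `sum(n.by.TS)` should all equal `sum(counts)`, which makes a convenient sanity check.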
