Request to understand the LMM models in alpha, beta diversity and differential abundance. #67

OrsonMM · 2024-10-28T14:37:00Z

Dear team MicrobiomeStat,

I appreciate your software contribution very much. I am new to linear mixed models.
Could you please tell me whether I am setting up my data correctly?

In my experiment, I have these variables:

asv variable: taxonomic abundances from the DADA2 output
treat variable: 4 treatments (A, B, C, and D)
time variable: 3 time points (1, 2, 3)
sample_treatment_time variable: 5 independent samples per treatment, each with its replicates over time (60 samples in total)

My question is: which ASVs in the community are affected by Treat, Time, or the Treat:Time interaction?

I am entering my variables into the MicrobiomeStat model as follows:

group.var = Treat
subject.var = sample_treatment_time
time.var = Time

Could you please explain what form the equation takes?

From the manual, I am not sure whether the same model is used for alpha diversity, beta diversity, and differential abundance of ASVs.

My understanding is that it uses: y ~ time.var + group.var + time.var:group.var + (1 | subject.var)
Is that correct?

Regards

cafferychen777 · 2024-10-28T15:26:46Z

Dear Orson,

Thank you for your interest in MicrobiomeStat and for reaching out with your question about Linear Mixed Models (LMM). We appreciate your detailed description of your experimental design.

From your description, I can see you have:

4 treatments (A, B, C, D)
3 time points
5 independent samples per treatment with replicates over time
A total of 60 samples

While the model formula you suggested (y ~ time.var + group.var + time.var:group.var + (1|subject.var)) is generally appropriate for longitudinal microbiome data analysis, to better assist you, could you please specify which MicrobiomeStat function(s) you are using?

Each function might have slightly different implementations to accommodate the specific needs of alpha diversity, beta diversity, and differential abundance analyses.

Once you clarify which function(s) you're working with, I can provide more specific guidance about the model implementation.

Best regards

OrsonMM · 2024-10-28T15:57:00Z

Hi Caffery Yang,

Thanks for the rapid response.

From your response, I understand that each function uses a different model equation.
These are the functions I have the most doubts about:

alpha diversity

alpha_time_diversity <- generate_alpha_trend_test_long(
  data.obj = rarefy_data_genus,
  alpha.name = c("shannon", "simpson", "observed_species", "chao1", "ace","pielou"),
  depth = NULL,
  time.var = "Time",
  subject.var = "sample_treatment_time",
  group.var = "Treat",
  adj.vars = NULL
  )

Beta diversity

beta_diversity <- generate_beta_trend_test_long(
  data.obj = rarefy_data_genus,
  dist.obj = NULL,
  subject.var = "sample_treatment_time",   # random effect: I don't understand whether this is a random slope or a random intercept
  time.var = "Time", # fixed effect
  group.var = "Treat",
  adj.vars = NULL,
  dist.name = c("Jaccard")
)

beta_diversity_volatility <- generate_beta_volatility_test_long(
  data.obj = rarefy_data_genus,
  dist.obj = NULL,
  subject.var = "sample_treatment_time",
  time.var = "Time",
  group.var = "Treat",
  adj.vars = NULL,
  dist.name = c("BC","Jaccard","UniFrac","JS")
)

DA

Here, I preferred to use linda because I can specify the equation myself.
(But I am not sure whether it is correct.)

model_1 <- linda(
  feature.dat = genus_normalizated_data$feature.tab,
  meta.dat = genus_data$meta.dat,
  formula = '~ Time + Treat + Treat:Time + (1 | sample_treatment_time)', 
  feature.dat.type = c('proportion'),
  prev.filter = 0.1,
  mean.abund.filter = 0,
  max.abund.filter = 0,
  is.winsor = TRUE,
  outlier.pct = 0.03,
  adaptive = TRUE,
  zero.handling = c('imputation'),
  pseudo.cnt = 0.5,
  corr.cut = 0.1,
  p.adj.method = "fdr",
  alpha = 0.05,
  n.cores = 20,
  verbose = TRUE
)

cafferychen777 · 2024-10-28T16:08:26Z

Hi Orson,

Thank you for your detailed follow-up questions about the model equations in MicrobiomeStat. I'll explain how each function implements its statistical models:

Alpha Diversity Analysis
For your alpha_time_diversity call, the function implements a linear mixed effects model of the form:

alpha_diversity ~ Treat * Time + (1 + Time | Sample_Time)

This model includes:

Fixed effects: Treatment, Time, and their interaction (Treat * Time)
Random effects: Both random intercepts AND random slopes for Time nested within each Sample
This allows each sample to have its own trajectory over time
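
To see which fixed-effect coefficients `Treat * Time` expands to, the design matrix can be inspected with base R. This is only a sketch on a made-up metadata table: the column names `Treat`, `Time`, `Rep`, and `Subject` and the 4 x 5 x 3 layout are assumptions taken from the design described in this thread, not MicrobiomeStat internals. Whether Time enters as a number or as a category is something to check against the function documentation.

```r
# Hypothetical metadata: 4 treatments x 5 subjects x 3 timepoints = 60 rows
meta <- expand.grid(Time = 1:3, Rep = 1:5, Treat = c("A", "B", "C", "D"))
meta$Subject <- paste0(meta$Treat, meta$Rep)

# Time as a numeric covariate: one time slope per treatment
X_numeric <- model.matrix(~ Treat * Time, data = meta)
print(colnames(X_numeric))

# Time as a categorical factor: one effect per timepoint and treatment
X_factor <- model.matrix(~ Treat * factor(Time), data = meta)
print(colnames(X_factor))
```

With numeric Time there are 8 fixed-effect columns (intercept, 3 treatment contrasts, 1 time slope, 3 interaction slopes); with categorical Time there are 12. The random part, (1 + Time | subject), adds a subject-specific intercept and slope on top of these.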

Beta Diversity Analysis
For your beta_diversity call, the function attempts two model structures in order of complexity:

First tries:

Jaccard_distance ~ Treat * Time + (1 + Time | Sample_Time)

If that fails to converge, automatically simplifies to:

Jaccard_distance ~ Treat * Time + (1 | Sample_Time)
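
The "try random slopes first, then fall back to random intercepts" pattern can be sketched with lme4 on simulated data. This illustrates the general idea only and is not the code MicrobiomeStat runs: the data frame `d`, the helper `fit_with_fallback`, and the choice to treat a warning, an error, or a singular fit as the fallback trigger are all assumptions.

```r
# Sketch only: needs the lme4 package; every name below is hypothetical
if (requireNamespace("lme4", quietly = TRUE)) {
  set.seed(42)
  d <- expand.grid(Time = 1:3, Rep = 1:5, Treat = c("A", "B", "C", "D"))
  d$Subject <- factor(paste0(d$Treat, d$Rep))
  subj_shift <- rnorm(nlevels(d$Subject), sd = 0.1)  # subject-level intercepts
  d$dist <- 0.5 + 0.05 * d$Time + subj_shift[as.integer(d$Subject)] +
    rnorm(nrow(d), sd = 0.05)

  fit_with_fallback <- function(data) {
    # First attempt: random intercept and random slope for Time
    full <- tryCatch(
      lme4::lmer(dist ~ Treat * Time + (1 + Time | Subject), data = data),
      warning = function(w) NULL,
      error = function(e) NULL
    )
    if (!is.null(full) && !lme4::isSingular(full)) return(full)
    # Fallback: random intercept only
    lme4::lmer(dist ~ Treat * Time + (1 | Subject), data = data)
  }

  fit <- fit_with_fallback(d)
  print(lme4::fixef(fit))
}
```

Either way the fixed-effect part stays the same; only the random-effect structure is simplified.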

For your beta_diversity_volatility call, this is actually a different type of analysis. It:

First calculates volatility (rate of change between consecutive timepoints) for each subject
Then fits a simple linear model: volatility ~ Treat
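
The two steps can be sketched in base R on toy data. The volatility definition used here (mean Jaccard distance between consecutive timepoints, divided by the time gap) is an assumption made for illustration, and the exact calculation inside MicrobiomeStat may differ; `meta`, `pa`, and `jaccard` are hypothetical names.

```r
set.seed(1)
# Hypothetical design: 4 treatments x 5 subjects x 3 timepoints
meta <- expand.grid(Time = 1:3, Rep = 1:5, Treat = c("A", "B", "C", "D"))
meta$Subject <- paste0(meta$Treat, meta$Rep)

# Toy presence/absence table: one row per sample, 30 taxa
pa <- matrix(rbinom(nrow(meta) * 30, 1, 0.4), nrow = nrow(meta))

# Jaccard distance between two presence/absence vectors
jaccard <- function(x, y) {
  n_union <- sum(x | y)
  if (n_union == 0) return(0)
  1 - sum(x & y) / n_union
}

# Step 1: per-subject volatility across consecutive timepoints
volatility <- sapply(split(seq_len(nrow(meta)), meta$Subject), function(idx) {
  idx <- idx[order(meta$Time[idx])]
  steps <- seq_len(length(idx) - 1)
  d <- sapply(steps, function(i) jaccard(pa[idx[i], ], pa[idx[i + 1], ]))
  mean(d / diff(meta$Time[idx]))
})
vol <- data.frame(Subject = names(volatility), volatility = unname(volatility))
vol$Treat <- meta$Treat[match(vol$Subject, meta$Subject)]

# Step 2: simple linear model with one volatility value per subject
fit <- lm(volatility ~ Treat, data = vol)
print(summary(fit)$coefficients)
```

Because each subject contributes a single volatility value, no random effect is needed in the second step.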

Differential Abundance Analysis (linda)
Your formula is well-structured:

abundance ~ Time + Treat + Treat:Time + (1 | Sample_Time)

This model:

Tests main effects of Time and Treatment
Tests their interaction
Includes random intercepts for each Sample
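The function also applies a CLR transformation to the abundances and handles zeros and outliers according to the settings in your call (zero.handling = 'imputation', is.winsor = TRUE, outlier.pct = 0.03)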
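
For readers unfamiliar with the CLR (centered log-ratio) transformation, here is a minimal base R sketch. It is not linda's implementation: the `clr` helper, the toy count table, and the use of a 0.5 pseudo-count for zeros are assumptions for illustration (the call above uses imputation instead).

```r
# Toy count table laid out like feature.dat: taxa in rows, samples in columns
set.seed(7)
counts <- matrix(rpois(6 * 4, lambda = 5), nrow = 6,
                 dimnames = list(paste0("taxon", 1:6), paste0("sample", 1:4)))

clr <- function(x, pseudo = 0.5) {
  log_x <- log(x + pseudo)               # pseudo-count avoids log(0)
  sweep(log_x, 2, colMeans(log_x), "-")  # subtract each sample's mean log
}

clr_counts <- clr(counts)
print(round(colSums(clr_counts), 10))  # CLR values sum to zero within a sample
```

The mixed model is then fitted to these transformed values, one taxon at a time, rather than to the raw proportions.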

Some suggestions for your analysis:

For the alpha and beta trend analyses, the default inclusion of random slopes is appropriate for longitudinal data but may not converge with only 3 timepoints. Don't worry if this happens - the functions will automatically simplify to random intercepts.
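Make sure your subject variable (sample_treatment_time in your calls, abbreviated as "Sample_Time" in the formulas here) identifies each repeatedly measured sample. Each independent sample should keep the same identifier at all of its timepoints, rather than receiving a new label for each of the 60 measurements.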
For linda, you could consider matching the alpha/beta diversity models by using:

~ Time + Treat + Treat:Time + (1 + Time | Sample_Time)

Though your current random intercept model is also perfectly valid.
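
One quick way to check the identifier is to count metadata rows per subject. The sketch below uses a made-up table; `subject_id` and `sample_id` are hypothetical column names, and the 4 x 5 x 3 layout is assumed from the design described in this thread.

```r
# Hypothetical metadata: 4 treatments x 5 subjects x 3 timepoints = 60 rows
meta <- expand.grid(Time = 1:3, Rep = 1:5, Treat = c("A", "B", "C", "D"))

# Identifier shared by all timepoints of the same independent sample
meta$subject_id <- paste(meta$Treat, meta$Rep, sep = "_")
# Identifier that is unique to every single measurement
meta$sample_id <- paste(meta$Treat, meta$Rep, meta$Time, sep = "_")

obs_per_subject <- table(meta$subject_id)
obs_per_sample <- table(meta$sample_id)

print(length(obs_per_subject))  # number of subjects
print(range(obs_per_subject))   # rows per subject
print(length(obs_per_sample))   # number of unique samples
```

If the variable passed as subject.var behaves like `sample_id` here (60 levels with one row each), a random intercept per level cannot be separated from residual noise, so an identifier like `subject_id` (20 levels with 3 rows each) is what the models need.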

Overall, your implementation looks appropriate for your experimental design (4 treatments, 3 timepoints, 5 replicates per treatment-timepoint combination). Let me know if you need any clarification about specific aspects of these models.

Best regards,
Chen

cafferychen777 · 2024-10-28T16:09:57Z

PS: I'd like to encourage you to explore MicrobiomeStat's rich visualization capabilities to complement your statistical analyses.

OrsonMM · 2024-10-28T16:27:04Z

I appreciate your help so much, @cafferychen777.

OrsonMM changed the title ~~Request to understand the MLM models in alpha, beta diversity and differential abundance.~~ Request to understand the LMM models in alpha, beta diversity and differential abundance. Oct 28, 2024

cafferychen777 added the "question" label (Further information is requested) Oct 28, 2024
