This vignette provides all the reproducible code for our article:
Yiwen Wang, Kim-Anh Lê Cao
Abstract: Microbial communities have been increasingly studied in recent years to investigate their role in ecological habitats. However, microbiome studies are difficult to reproduce or replicate as they may suffer from confounding factors that are unavoidable in practice, and originate from biological, technical or computational sources. In this review, we define batch effects as unwanted variation introduced by confounding factors that are not related to any factors of interest. Computational and analytical methods are required to remove or account for batch effects. However, inherent microbiome data characteristics (e.g. sparse, compositional, multivariate) challenge the development and application of batch effect adjustment methods to either account or correct for batch effects. We present commonly encountered sources of batch effects that we illustrate in several case studies. We discuss the limitations of current methods, which often have assumptions that are not met due to the peculiarities of microbiome data. We provide practical guidelines for assessing the efficiency of the methods based on visual and numerical outputs and a thorough tutorial to reproduce the analyses conducted in this review.
Keywords: Unwanted variation; batch sources; systematic batch effects; methods selection; methods assessment
This article has been published and is available as:
Yiwen Wang, Kim-Anh Lê Cao, Managing batch effects in microbiome data, Briefings in Bioinformatics, Volume 21, Issue 6, November 2020, Pages 1954-1970.
Link: https://doi.org/10.1093/bib/bbz105.
We have also built a bookdown version: https://evayiwenwang.github.io/Managing_batch_effects/.