Merge pull request #31 from tcrombie/dev/WF

update article
AndersenLab · Oct 10, 2023 · 957453c
2 parents 41af4a5 + 475aa8f

Showing 11 changed files with 458 additions and 212 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,27 +1,27 @@
 Package: easyXpress
 Title: A Package to Read, Process, and Analyze Worm Data Generated from CellProfiler’s WormToolbox
-Version: 1.1.0
+Version: 2.0.0
 Authors@R: c(
-    person(given = "Joy",
-           family = "Nyaanga",
-           role = c("aut", "cre"),
-           email = "[email protected]",
-           comment = c(ORCID = "0000-0002-1402-9213")),
     person(given = "Tim",
            family = "Crombie",
-           role = "aut",
+           role = c("aut", "cre"),
            email = "[email protected]",
            comment = c(ORCID = "0000-0002-5645-4154")),
+    person(given = "Joy",
+           family = "Nyaanga",
+           role = "aut",
+           email = "[email protected]",
+           comment = c(ORCID = "0000-0002-1402-9213")),
     person(given = "Sam",
            family = "Widmayer",
            role = "aut",
            email = "[email protected]",
            comment = c(ORCID = "0000-0002-1200-4768")))
 Description: A workflow for the reading, processing, and visualization
-    of images obtained from the Molecular Devices ImageExpress Nano Imager
-    and processed with CellProfiler's WormToolbox.
-    Includes a powerful suite of functions for the rapid processing
-    and analysis of large high-throughput image based datasets.
+    of nematode morphology data extracted from images using  
+    CellProfiler's WormToolbox.
+    Includes a powerful suite of functions to process
+    large high-throughput image based datasets.
 Depends:
     R (>= 3.5.0)
 Imports:

diff --git a/R/delta.R b/R/delta.R
@@ -6,7 +6,7 @@
 #' @param data A data frame output from any \code{WF} function.
 #' @param ... <[`dynamic-dots`][rlang::dyn-dots]> Variable(s) used to group data. It is recommended to group data to independent bleaches for all strains. Variable names in data are supplied separated by commas and without quotes.
 #' For example, the typical variables for grouping are \code{Metadata_Experiment, bleach, strain}.
-#' @param WF Select \code{"filter"} or \code{"ignore"}. The default, \code{"filter"}, will filter out all flagged wells before calculating the delta from control.
+#' @param WF Select \code{"filter"} or \code{"ignore"}. The default, \code{"filter"}, will filter out all flagged wells before calculating the delta from control, if present.
 #' \code{"ignore"} will calculate the delta including all flagged data. Be careful using \code{"ignore"}, it is only included for diagnostic purposes.
 #' @param vars The well summary statistics to perform the delta calculation on. These are supplied in a character vector. For example, the default is set to \code{c("median_wormlength_um", "cv_wormlength_um")}.
 #' @param doseR Logical, is this dose response data? The default, \code{doseR = FALSE},
@@ -51,10 +51,18 @@ delta <- function(data, ..., WF = "filter", vars = c("median_wormlength_um", "cv
     stop(message(message(paste0(capture.output(knitr::kable(example)), collapse = "\n"))))
   }
 
+  # get any user flags from data
+  uf1 <- names(data %>% dplyr::select(contains("_WellFlag")))
+
   # filter wells if needed
   if(WF == "filter") {
+    if(length(uf1) == 0) {
+      message(glue::glue("No flagged wells detected."))
+      d <- data
+    } else {
     d <- easyXpress::filterWF(data = data, rmVars = T)
     message(glue::glue("Flagged wells are filtered from the data."))
+    }
   } else {
     d <- data
     message(glue::glue('Flagged wells are NOT being filtered from the data. This is NOT the recommended approach. Please consider WF = "filter".'))

diff --git a/R/regEff.R b/R/regEff.R
@@ -11,7 +11,7 @@
 #' \code{<c.var>_reg_coeff}, and \code{<c.var>_reg_sig}.
 #' 2) A diagnostic plot, \code{<out>$p1}, showing the effect of the confounding variable (\code{c.var}) on the dependent variable (\code{d.var}).
 #' 3) A diagnositc plot, \code{<out>$p2}, showing the regression coefficients of the confounding variable (\code{c.var}) on the y-axis and the dependent variable (\code{d.var}) on the x-axis.
-#' 4) A list, \code{<out>$models} of model outputs for each group An ANOVA summary table with the effects.
+#' 4) A list, \code{<out>$models} of model objects for each group.
 #' @export
 
 regEff <- function(data, ..., d.var, c.var) {

diff --git a/README.md b/README.md
@@ -1,10 +1,10 @@
 # easyXpress <img src="man/figures/logo.png" alt="hex" align = "right" width="130" />
 
 ## Overview 
-This package is designed for the reading, processing, and visualization of images obtained from the Molecular Devices ImageExpress Nano Imager, and processed with CellProfiler's WormToolbox.
+This package is designed for reading, processing, and visualizing of nematode morphology data extracted from images using CellProfiler's WormToolbox.
 
 ## Installation
-`easyXpress` is specialized for use with CellProfiler generated worm image data. The package is rather specific to use in the Andersen Lab and, therefore, is not available from CRAN. To install easyXpress you will need the [`devtools`](https://github.com/hadley/devtools) package. You can install `devtools` and `easyXpress` using the commands below:
+`easyXpress` is specialized for use with image data produced by the [`cellprofiler-nf` nextflow pipeline](https://github.com/AndersenLab/cellprofiler-nf). To install `easyXpress` you will need the [`devtools`](https://github.com/hadley/devtools) package. You can install `devtools` and `easyXpress` using the commands below:
 
 ```r
 install.packages("devtools")
@@ -18,21 +18,16 @@ The functionality of the package can be broken down into three main goals:
 
 + Flagging and pruning anomalous data points.
 
-+ Generating diagnositic images.
++ Generating diagnostic images.
 
-For more information about implementing CellProfiler to generate data used by the `easyXpress` package, see [`AndersenLab/cellprofiler-nf`](https://github.com/AndersenLab/cellprofiler-nf).
+For more information about implementing `cellprofiler-nf` to generate data used by the `easyXpress` package, see [`AndersenLab/cellprofiler-nf`](https://github.com/AndersenLab/cellprofiler-nf).
 
 ## Directory structure
 
-Because so much information must be transferred alongside the plate data, the directory structure 
-from which you are reading is critically important. Below is an example of a correct project directory structure. 
-The `cp_data` directory contains an `.RData` file sourced directly from the default output folder for a CellProfiler run. 
-The `processed_images` directory contains `.png` files from the CellProfiler run. There should be one `.png` file 
-for each well included in your analysis. The `design` directory contains the `.csv` file having all the variables necessary
-to describe your experiment (i.e. drug names, drug concentrations, strain names, food types, etc.).    
+The directory structure holding data is critically important. Below is an example of a correct project directory structure. 
+The `cp_data` directory contains an `.RData` file output by `cellprofiler-nf`. The `processed_images` directory contains `_overlay.png` files output by `cellprofiler-nf`. There should be one `.png` file for each well included in your analysis. The `design` directory contains a `.csv` with all the variables necessary to describe your experiment (i.e. experiment names, drug names, drug concentrations, strain names, food types, etc.).    
 
-If you do not have condition information (i.e. drug names, drug concentrations, strain names, food types, etc.) 
-you do not need the `design` directory.
+If you do not have condition information you do not need the `design` directory.
 
 ```
 /projects/20200812_example
@@ -64,25 +59,12 @@ and experiment name separated by underscores.
 
 ### File naming
 
-The processed image files should be formatted with the experiment data, name of the experiment, the plate number, 
-the magnification used for imaging, and the well name. All processed image files must be saved as `.png` files. 
-In the file named `20191119-growth-p01-m2x_A01_overlay.png` the first section `20191119` is the experiment date, 
-`growth` is the name of the experiment, `p01` is the plate number, `m2x` is the magnification used for imaging, 
-and `A01` is the well name.
+The processed image files should be formatted with the experiment data, name of the experiment, the plate number, the magnification used for imaging, and the well name. All processed image files must be saved as `.png` files. In the file named `20191119-growth-p01-m2x_A01_overlay.png` the first section `20191119` is the experiment date, `growth` is the name of the experiment, `p01` is the plate number, `m2x` is the magnification used for imaging, and `A01` is the well name.
 
 ## Package Overview 
-The complete easyXpress package consists of nine functions: 
-`readXpress`, `modelSelection`, `edgeFlag`, `setFlags`, `process`, `Xpress`, `viewPlate`, `viewWell`, and `viewDose`.
+The `easyXpress` package consists of six function classes that work together to clean and process experimental data. The `tidy` functions will help pre-process raw images to get them ready for submission to the `cellprofiler-nf` pipeline. The `ObjectFlag` or `OF` functions help to flag problematic data output from `cellprofiler-nf`. The `WellFlag` or `WF` functions work to flag anomalous summary statistics for micro-plate wells. Throughout the data cleaning workflow, the `check` and `view` function classes are used to validate whether the flag functions are properly applied. All other functions serve to facilitate the cleaning process and do not have a standardized naming convention.
 
-For more detailed information regarding use of these functions, see the vignette: **A walk-through of easyXpress**.
-This can be done in R -
-
-```r
-library(easyXpress)
-browseVignettes(package = "easyXpress")
-```
-
-<img src="man/figures/Overview.png" width=600 />
+For more detailed information regarding use of these functions, see the article: **Dose Response Processing**.
 
 ### Citation
 

diff --git a/man/delta.Rd b/man/delta.Rd
diff --git a/man/easyXpress-package.Rd b/man/easyXpress-package.Rd
diff --git a/man/regEff.Rd b/man/regEff.Rd
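
The guard added to `delta()` in `R/delta.R` only calls `filterWF()` when at least one column name contains `_WellFlag`. A minimal sketch of that check in base R follows. The helper name `has_well_flags`, the `outlier_WellFlag` column, and the toy values are assumptions for illustration and are not part of `easyXpress`.

```r
# Hypothetical helper for illustration only; it is not an easyXpress function.
# It mirrors the check added to delta(): is any "_WellFlag" column present?
has_well_flags <- function(data) {
  # dplyr::contains() ignores case by default, so compare lower-cased names.
  any(grepl("_wellflag", tolower(names(data)), fixed = TRUE))
}

# Toy data frames with made-up values.
flagged <- data.frame(
  well = c("A01", "A02"),
  median_wormlength_um = c(512.3, 498.7),
  outlier_WellFlag = c(NA, "outlier")
)
unflagged <- flagged[, c("well", "median_wormlength_um")]

has_well_flags(flagged)    # flag column present: delta() would call filterWF()
has_well_flags(unflagged)  # no flag columns: delta() would pass the data through
```

With this guard, data that never went through a `WF` function takes the "No flagged wells detected." branch and skips `filterWF()`.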