From 2d9a2d3589e11d1459f68a2f39bfcbfe8ea02afd Mon Sep 17 00:00:00 2001
From: nicholascarey
Date: Sat, 29 Aug 2020 13:48:47 +0100
Subject: [PATCH] Increment version number

---
 DESCRIPTION  |   2 +-
 R/peaks.R    | 151 ++++++++++++++++++++++++++++-----------------------
 man/peaks.Rd |  99 ++++++++++++++++++---------------
 3 files changed, 140 insertions(+), 112 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index cf8ccb9..d5b9160 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,7 +1,7 @@
 Package: caRey
 Type: Package
 Title: Collection of Useful R Functions
-Version: 0.2.0
+Version: 0.2.1
 Author: Nicholas Carey
 Maintainer: Nicholas Carey
 Description: Collection of Useful R Functions
diff --git a/R/peaks.R b/R/peaks.R
index f695af4..935846c 100644
--- a/R/peaks.R
+++ b/R/peaks.R
@@ -5,54 +5,62 @@
 #' @details `peaks` scans a vector of numeric values and identifies peaks and
 #' troughs.
 #'
-#' The required factor (`span`) and two optional factors (smoothing and
+#' The required input (`span`) and two optional inputs (smoothing and
 #' `height`), need to be balanced to successfully identify peaks and troughs.
-#' Characteristics such as data noisiness, amplitude, etc will affect how
-#' important these are and how successful the identification process is.
+#' Characteristics such as data noisiness, amplitude, wavelength etc will
+#' affect how important each of these are and how successful the
+#' identification process is.
 #'
-#' **Span**
+#' **span**
 #'
-#' The most important parameter in determining a peak is the \code{span}. This
-#' sets the threshold for identification; to be designated a peak, a value
-#' (after smoothing) must be the highest value within the `span` window
-#' (lowest value for troughs).
+#' The most important parameter in determining peaks is the `span`, which sets
+#' the threshold for identification. A rolling window of width `span` moves
+#' across the data, and to be designated a peak a value (after any smoothing)
+#' must be the highest value within that window (or the lowest value for
+#' troughs). The `span` window can be entered as an integer number of values
+#' (e.g. `span = 11`), or if between 0 and 1 a proportion of the total data
+#' length (e.g. `span = 0.1`). Note: strictly speaking, the function tests
+#' `floor(span/2)` values before and after each central value, therefore any
+#' even `span` inputs are rounded up. That is `span = 10` and `span = 11` will
+#' both result in an effective moving window of 11 values, with the central
+#' value tested against the 5 values before and after it.
 #'
 #' **Smoothing**
 #'
-#' For noisy data there is optional smoothing functionality via
-#' `smooth_method`. See [smooth()] for the methods available and appropriate
-#' `smooth.n` values. `smooth.method = "spline"` works particularly well for
-#' oscillating data.
+#' For noisy data there is optional smoothing functionality via the
+#' `smooth_method` input. See [smooth()] for the methods available and
+#' appropriate `smooth.n` values. `smooth.method = "spline"` works
+#' particularly well for oscillating data.
 #'
-#' **Height**
+#' **Prominence**
 #'
-#' Peak identification is also affected by the optional `height` argument.
-#' This sets a threshold of 'prominence' for peak identification (or 'depth'
-#' for a trough). This should be a value between 0 and 1 describing a
-#' proportion of the total `range` of the data. Only peaks or troughs
-#' containing a proportional range equal to or above this within their `span`
+#' Peak identification is also affected by the optional `prominence` argument.
+#' This sets a threshold of 'prominence' for peak identification (equivalent
+#' to 'depth' for a trough). This should be a value between 0 and 1 describing
+#' a proportion of the total range of the data. Only peaks or troughs
+#' containing a range equal to or above this proportion within their `span`
 #' window will be retained. Essentially the higher this is set, the more
-#' prominent the peaks, or deeper the troughs, must be to be identified. To
-#' help with appropriate values, range elements are included in the output.
-#' Peaks and troughs can have different `height` thresholds: simply enter them
-#' as a vector of two values in peak-trough order. E.g. `height = c(0.2,
-#' 0.1)`.
+#' prominent the peaks must be to be identified (or deeper the troughs). To
+#' help with choosing appropriate values, ranges for peaks and troughs are
+#' included in the output. Peaks and troughs can have different `prominence`
+#' thresholds: simply enter them as a vector of two values in peak-trough
+#' order. E.g. `prominence = c(0.2, 0.1)`.
 #'
-#' **Matching values**
+#' **Equal values within a `span` window**
 #'
-#' If there happens to be equal values within the `span` range, the first
-#' occurrence is designated as the peak. If peaks are being missed because of
-#' this, use a lower `span`. If there are many instances of equal values, try
-#' smoothing the data.
+#' If there happens to be equal values within a `span` window, the first
+#' occurrence is designated as the peak or trough. If peaks or troughs are
+#' being missed because of this, use a lower `span`. If there are many
+#' instances of equal values, try smoothing the data.
 #'
 #' **Partial windows**
 #'
 #' A rolling window of width `span` is used across the data to identify peaks.
 #' At the start and end of the data vector this window will overlap the start
 #' and end. The default `partial = TRUE` input tells the function to attempt
-#' to identify peaks near the start/end of the data where the `span` input
-#' would not supply enough data points. Change this to `FALSE` if you see odd
-#' matching behaviour at the start or end of the data.
+#' to identify peaks near the start/end of the data where the `span` width
+#' window is not complete. Change this to `FALSE` if you see odd matching
+#' behaviour at the start or end of the data.
 #'
 #' **Plot**
 #'
@@ -93,21 +101,23 @@
 #'
 #' - `call` - the function call
 #'
-#' @usage peaks(x, span = NULL, partial = TRUE, height = NULL, smooth.method =
-#' NULL, smooth.n = NULL, plot = TRUE, plot.which = "b")
+#' @usage peaks(x, span = NULL, prominence = NULL, partial = TRUE, smooth.method
+#' = NULL, smooth.n = NULL, plot = TRUE, plot.which = "b")
 #'
 #' @param x numeric. A numeric vector.
-#' @param span integer. Sets window size for peak (or trough) identification; to
-#' be designated a peak, a value must have \code{span} *lower* values on
-#' *both* sides of it (higher for troughs).
-#' @param partial logical. Default TRUE. Should the function attempt to identify
-#' peaks or troughs at the start or end of the vector where `span` is
-#' truncated. See Details.
-#' @param height numeric. Value between 0 and 1. Sets threshold for peak or
+#' @param span numeric. Sets window size for peak (or trough) identification; to
+#' be designated a peak, a value must be the highest value (lowest for
+#' troughs) within a rolling window of width `span`. Can be entered as either
+#' a window of number of values, or value between 0 and 1 of proportion of
+#' total data length. See Details.
+#' @param prominence numeric. Value between 0 and 1. Sets threshold for peak or
 #' trough 'prominence'. See Details.
+#' @param partial logical. Default TRUE. Should the function attempt to identify
+#' peaks or troughs at the start or end of the vector where the `span` window
+#' is truncated? See Details.
 #' @param smooth.method string. Method by which to smooth data before peak
-#' identification. Optional, default is `NULL`. See [smooth()].
-#' @param smooth.n string. Smoothing factor. See [smooth()].
+#' identification. Optional. Default is `NULL`. See [smooth()].
+#' @param smooth.n numeric. Smoothing factor. See [smooth()].
 #' @param plot logical. Plots the result.
 #' @param plot.which string. What to plot: "p" for peaks, "t" for troughs, or
 #' the default "b" for both.
@@ -117,14 +127,14 @@
 #'
 #' @author Nicholas Carey - \email{nicholascarey@gmail.com}
 #' @importFrom zoo rollapply
+#' @importFrom dplyr between
 #' @md
 #' @export

-
 peaks <- function(x,
                   span = NULL,
+                  prominence = NULL,
                   partial = TRUE,
-                  height = NULL,
                   smooth.method = NULL,
                   smooth.n = NULL,
                   plot = TRUE,
@@ -135,20 +145,27 @@ peaks <- function(x,

   ## Stop if no span
   if(is.null(span)) stop("peaks: please enter a 'span' value.")
-
-  ## make heights zero if null
-  if(is.null(height)) {
-    height_p <- 0
-    height_t <- 0
-    ## separate to peak and trough heights
-  } else if(length(height) == 1) {
-    height_p <- height
-    height_t <- height
-  } else if(length(height) == 2) {
-    height_p <- height[1]
-    height_t <- height[2]
+  if(!(dplyr::between(span, 0, 1)) && span %% 1 != 0)
+    stop("peaks: 'span' should be a value between 0 and 1, or an integer greater than 1.")
+
+  ## make span
+  if(dplyr::between(span, 0, 1))
+    span <- round(length(x) * span)
+  span <- floor(span/2)
+
+  ## make prominence zero if null
+  if(is.null(prominence)) {
+    prominence_p <- 0
+    prominence_t <- 0
+    ## separate to peak and trough prominence
+  } else if(length(prominence) == 1) {
+    prominence_p <- prominence
+    prominence_t <- prominence
+  } else if(length(prominence) == 2) {
+    prominence_p <- prominence[1]
+    prominence_t <- prominence[2]
   } else {
-    stop("peaks: 'height' input should be NULL, a single numeric value, or vector of two values.")
+    stop("peaks: 'prominence' input should be NULL, a single numeric value, or vector of two values.")
   }

   ## set logicals
@@ -167,13 +184,13 @@ peaks <- function(x,

   df <- data.frame(test_val, z)

-  # Height range ------------------------------------------------------------
+  # prominence range ------------------------------------------------------------
   ## determine y range within each span
-  heightrange <- zoo::rollapply(z, width = span * 2 + 1, FUN = range, align = "center", partial = TRUE)
-  heightrange <- abs(heightrange[,1] - heightrange[,2])
+  prominencerange <- zoo::rollapply(z, width = span * 2 + 1, FUN = range, align = "center", partial = TRUE)
+  prominencerange <- abs(prominencerange[,1] - prominencerange[,2])

   ## convert to proportion of total
-  heightrange <- heightrange / abs(diff(range(z)))
+  prominencerange <- prominencerange / abs(diff(range(z)))

   # Detect peaks ------------------------------------------------------------

@@ -199,9 +216,9 @@ peaks <- function(x,
   ## index of peaks
   peaks <- which(peaks)

-  ## subset to peaks with range greater than 'height' input
-  peaks <- peaks[heightrange[peaks] >= height_p]
-  peakrange <- heightrange[peaks]
+  ## subset to peaks with range greater than 'prominence' input
+  peaks <- peaks[prominencerange[peaks] >= prominence_p]
+  peakrange <- prominencerange[peaks]

   # Detect troughs ----------------------------------------------------------

@@ -222,13 +239,13 @@ peaks <- function(x,
   }

   troughs <- which(troughs)
-  troughs <- troughs[heightrange[troughs] >= height_t]
-  troughrange <- heightrange[troughs]
+  troughs <- troughs[prominencerange[troughs] >= prominence_t]
+  troughrange <- prominencerange[troughs]

   # Assemble both for output ------------------------------------------------

   both <- sort(c(peaks,troughs))
-  bothrange <- heightrange[both]
+  bothrange <- prominencerange[both]

   # Assemble output ---------------------------------------------------------

diff --git a/man/peaks.Rd b/man/peaks.Rd
index 29c1617..0ba59d5 100644
--- a/man/peaks.Rd
+++ b/man/peaks.Rd
@@ -4,27 +4,29 @@
 \alias{peaks}
 \title{peaks}
 \usage{
-peaks(x, span = NULL, partial = TRUE, height = NULL, smooth.method =
-  NULL, smooth.n = NULL, plot = TRUE, plot.which = "b")
+peaks(x, span = NULL, prominence = NULL, partial = TRUE, smooth.method
+  = NULL, smooth.n = NULL, plot = TRUE, plot.which = "b")
 }
 \arguments{
 \item{x}{numeric. A numeric vector.}

-\item{span}{integer. Sets window size for peak (or trough) identification; to
-be designated a peak, a value must have \code{span} \emph{lower} values on
-\emph{both} sides of it (higher for troughs).}
+\item{span}{numeric. Sets window size for peak (or trough) identification; to
+be designated a peak, a value must be the highest value (lowest for
+troughs) within a rolling window of width \code{span}. Can be entered as either
+a window of number of values, or value between 0 and 1 of proportion of
+total data length. See Details.}

-\item{partial}{logical. Default TRUE. Should the function attempt to identify
-peaks or troughs at the start or end of the vector where \code{span} is
-truncated. See Details.}
-
-\item{height}{numeric. Value between 0 and 1. Sets threshold for peak or
+\item{prominence}{numeric. Value between 0 and 1. Sets threshold for peak or
 trough 'prominence'. See Details.}

+\item{partial}{logical. Default TRUE. Should the function attempt to identify
+peaks or troughs at the start or end of the vector where the \code{span} window
+is truncated? See Details.}
+
 \item{smooth.method}{string. Method by which to smooth data before peak
-identification. Optional, default is \code{NULL}. See \code{\link[=smooth]{smooth()}}.}
+identification. Optional. Default is \code{NULL}. See \code{\link[=smooth]{smooth()}}.}

-\item{smooth.n}{string. Smoothing factor. See \code{\link[=smooth]{smooth()}}.}
+\item{smooth.n}{numeric. Smoothing factor. See \code{\link[=smooth]{smooth()}}.}

 \item{plot}{logical. Plots the result.}

@@ -38,53 +40,62 @@ Identify peaks and troughs in oscillating data.
 \code{peaks} scans a vector of numeric values and identifies peaks and
 troughs.

-The required factor (\code{span}) and two optional factors (smoothing and
+The required input (\code{span}) and two optional inputs (smoothing and
 \code{height}), need to be balanced to successfully identify peaks and troughs.
-Characteristics such as data noisiness, amplitude, etc will affect how
-important these are and how successful the identification process is.
-
-\strong{Span}
-
-The most important parameter in determining a peak is the \code{span}. This
-sets the threshold for identification; to be designated a peak, a value
-(after smoothing) must be the highest value within the \code{span} window
-(lowest value for troughs).
+Characteristics such as data noisiness, amplitude, wavelength etc will
+affect how important each of these are and how successful the
+identification process is.
+
+\strong{span}
+
+The most important parameter in determining peaks is the \code{span}, which sets
+the threshold for identification. A rolling window of width \code{span} moves
+across the data, and to be designated a peak a value (after any smoothing)
+must be the highest value within that window (or the lowest value for
+troughs). The \code{span} window can be entered as an integer number of values
+(e.g. \code{span = 11}), or if between 0 and 1 a proportion of the total data
+length (e.g. \code{span = 0.1}). Note: strictly speaking, the function tests
+\code{floor(span/2)} values before and after each central value, therefore any
+even \code{span} inputs are rounded up. That is \code{span = 10} and \code{span = 11} will
+both result in an effective moving window of 11 values, with the central
+value tested against the 5 values before and after it.

 \strong{Smoothing}

-For noisy data there is optional smoothing functionality via
-\code{smooth_method}. See \code{\link[=smooth]{smooth()}} for the methods available and appropriate
-\code{smooth.n} values. \code{smooth.method = "spline"} works particularly well for
-oscillating data.
+For noisy data there is optional smoothing functionality via the
+\code{smooth_method} input. See \code{\link[=smooth]{smooth()}} for the methods available and
+appropriate \code{smooth.n} values. \code{smooth.method = "spline"} works
+particularly well for oscillating data.

-\strong{Height}
+\strong{Prominence}

-Peak identification is also affected by the optional \code{height} argument.
-This sets a threshold of 'prominence' for peak identification (or 'depth'
-for a trough). This should be a value between 0 and 1 describing a
-proportion of the total \code{range} of the data. Only peaks or troughs
-containing a proportional range equal to or above this within their \code{span}
+Peak identification is also affected by the optional \code{prominence} argument.
+This sets a threshold of 'prominence' for peak identification (equivalent
+to 'depth' for a trough). This should be a value between 0 and 1 describing
+a proportion of the total range of the data. Only peaks or troughs
+containing a range equal to or above this proportion within their \code{span}
 window will be retained. Essentially the higher this is set, the more
-prominent the peaks, or deeper the troughs, must be to be identified. To
-help with appropriate values, range elements are included in the output.
-Peaks and troughs can have different \code{height} thresholds: simply enter them
-as a vector of two values in peak-trough order. E.g. \code{height = c(0.2, 0.1)}.
+prominent the peaks must be to be identified (or deeper the troughs). To
+help with choosing appropriate values, ranges for peaks and troughs are
+included in the output. Peaks and troughs can have different \code{prominence}
+thresholds: simply enter them as a vector of two values in peak-trough
+order. E.g. \code{prominence = c(0.2, 0.1)}.

-\strong{Matching values}
+\strong{Equal values within a \code{span} window}

-If there happens to be equal values within the \code{span} range, the first
-occurrence is designated as the peak. If peaks are being missed because of
-this, use a lower \code{span}. If there are many instances of equal values, try
-smoothing the data.
+If there happens to be equal values within a \code{span} window, the first
+occurrence is designated as the peak or trough. If peaks or troughs are
+being missed because of this, use a lower \code{span}. If there are many
+instances of equal values, try smoothing the data.

 \strong{Partial windows}

 A rolling window of width \code{span} is used across the data to identify peaks.
 At the start and end of the data vector this window will overlap the start
 and end. The default \code{partial = TRUE} input tells the function to attempt
-to identify peaks near the start/end of the data where the \code{span} input
-would not supply enough data points. Change this to \code{FALSE} if you see odd
-matching behaviour at the start or end of the data.
+to identify peaks near the start/end of the data where the \code{span} width
+window is not complete. Change this to \code{FALSE} if you see odd matching
+behaviour at the start or end of the data.

 \strong{Plot}