From 4cde13f7fbbc0e9b8ff2b64563c5d6a2b4f38f15 Mon Sep 17 00:00:00 2001
From: Louis Aslett
Date: Wed, 13 Nov 2024 15:22:22 +0000
Subject: [PATCH] Tweaks to added documentation

---
 R/CalcTraces.R    | 14 ++++++++++----
 R/CladeMat.R      | 11 +++++++----
 R/FB.R            |  9 +++++++--
 man/Backward.Rd   |  5 ++++-
 man/CalcTraces.Rd | 14 ++++++++++----
 man/CladeMat.Rd   | 11 +++++++----
 man/Forward.Rd    |  5 ++++-
 7 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/R/CalcTraces.R b/R/CalcTraces.R
index 2262f15..6248959 100644
--- a/R/CalcTraces.R
+++ b/R/CalcTraces.R
@@ -1,8 +1,8 @@
 #' Fast Calculation of Matrix Trace and Hilbert Schmidt Norm
 #'
-#' Provides multithreaded calculation of trace and Hilbert Schmidt Norm of a matrix PMP (where P is a projection matrix) without explicitly forming PMP.
+#' Provides multithreaded calculation of trace and Hilbert Schmidt Norm of a matrix \eqn{PMP} (where \eqn{P} is a projection matrix and \eqn{M} is real symmetric) without explicitly forming \eqn{PMP}.
 #'
-#' P here is assumed to have the form I-QQ' for some matrix Q of orthogonal columns
+#' \eqn{P} here is assumed to have the form \eqn{I-QQ'} for some matrix \eqn{Q} of orthogonal columns.
 #'
 #' @param M
 #' a real symmetric R matrix
@@ -13,13 +13,19 @@
 #' @param J
 #' `crossprod(Q, M)`
 #' @param from_recipient
-#' haplotype index at which to start trace calculation -- useful for distributed computation (experimental feature, more documentation to come TODO)
+#' haplotype index at which to start trace calculation --- useful for distributed computation (experimental feature, more documentation to come)
 #' @param nthreads
 #' the number of CPU cores to use.
 #' By default uses the `parallel` package to detect the number of physical cores.
 #'
 #' @return
-#' a list containing three elements, the first is the trace `tr(PMP)`, the second is the *squared* Hilbert Schmidt Norm of PMP `tr((PMP)'PMP)`, the third is the diag of `PMP`.
+#' A list containing three elements:
+#'
+#' \describe{
+#' \item{`trace`}{the trace, \eqn{\mathrm{tr}(PMP)};}
+#' \item{`hsnorm2`}{the *squared* Hilbert Schmidt Norm of \eqn{PMP}, \eqn{\mathrm{tr}((PMP)'PMP)};}
+#' \item{`diag`}{the diagonal of \eqn{PMP}.}
+#' }
 #'
 #' @examples
 #' # TODO
diff --git a/R/CladeMat.R b/R/CladeMat.R
index c6982d0..d00d4b6 100644
--- a/R/CladeMat.R
+++ b/R/CladeMat.R
@@ -18,17 +18,20 @@
 #' a matrix with half the number of rows and columns as the corresponding forward/backward tables.
 #' This matrix is overwritten in place with the clade matrix result for performance reasons.
 #' @param unit.dist
-#' the change in distance that is expected to correspond to a single mutation (typically \eqn{-log(\mu)}) for the LS model)
+#' the change in distance that is expected to correspond to a single mutation (typically \eqn{-\log(\mu)}) for the LS model)
 #' @param thresh
-#' a regularization parameter: differences distances must exceed this threshold (in `unit.dist` units) in order to used in forming the local relatedness matrix. Defaults to `0.2`.
+#' a regularization parameter: differences distances must exceed this threshold (in `unit.dist` units) in order to used in forming the local relatedness matrix.
+#' Defaults to `0.2`.
 #' @param max1var
-#' a logical regularization parameter. When TRUE, differences in distances exceeding 1 `unit.dist` are set to 1 (so that any edge in the latent ancestral tree with multiple mutations on them are treated as if only one mutation was on it).
+#' a logical regularization parameter.
+#' When `TRUE`, differences in distances exceeding 1 `unit.dist` are set to 1 (so that any edge in the latent ancestral tree with multiple mutations on them are treated as if only one mutation was on it).
 #' @param nthreads
 #' the number of CPU cores to use.
 #' By default uses the `parallel` package to detect the number of physical cores.
 #'
 #' @return
-#' A list, the first element contains a list of tied nearest neighbors (one for each haplotype). Other elements return other information to allow for efficient removal of singletons and sprigs by [PruneCladeMat()].
+#' A list, the first element contains a list of tied nearest neighbours (one for each haplotype).
+#' Other elements of the returned list are for internal use by [PruneCladeMat()] to allow for efficient removal of singletons and sprigs.
 #'
 #' @examples
 #' # TODO
diff --git a/R/FB.R b/R/FB.R
index 5d3116b..1023c80 100644
--- a/R/FB.R
+++ b/R/FB.R
@@ -6,9 +6,12 @@
 #' `Forward` implements the forward algorithm to advance the Li and Stephens rescaled hidden Markov model forward probabilities to a new target variant.
 #' Naturally, this can only propagate a table to variants downstream of its current position.
 #'
-#' For mathematical details please see Section 2 of the kalis paper (https://doi.org/10.1186/s12859-024-05688-8).
+#' For mathematical details please see Section 2 of the kalis paper (Aslett and Christ, 2024).
 #' Note that the precise formulation of the forward equation is determined by whether the flag `use.spiedel` is set in the parameters provided in `pars`.
 #'
+#' @references
+#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
+#'
 #' @param fwd a `kalisForwardTable` object, as returned by
 #' [MakeForwardTable()].
 #' @param pars a `kalisParameters` object, as returned by
@@ -111,7 +114,7 @@ Forward <- function(fwd,
 #' variant.
 #' Naturally, this can only propagate a table to variants upstream of its current position.
 #'
-#' For mathematical details please see Section 2 of the kalis paper (https://doi.org/10.1186/s12859-024-05688-8).
+#' For mathematical details please see Section 2 of the kalis paper (Aslett and Christ, 2024).
 #' Note that the precise formulation of the backward equation is determined by whether the flag `use.spiedel` is set in the parameters provided in `pars`.
 #'
 #' **Beta-theta space**
@@ -123,6 +126,8 @@ Forward <- function(fwd,
 #' A backward table in beta-theta space (with `beta.theta = TRUE`) can be propagated to an upstream variant without incorporating that variant, thereby moving to beta space (`beta.theta = FALSE`), and vice versa.
 #' However, while a backward table in beta space (`beta.theta = FALSE`) can be updated to incorporate the current variant, a backward table that is already in beta-theta space can not move to beta space without changing variants -- that would involve "forgetting" the current variant (see Examples).
 #'
+#' @references
+#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
 #'
 #' @param bck a `kalisBackwardTable` object, as returned by
 #' [MakeBackwardTable()].
diff --git a/man/Backward.Rd b/man/Backward.Rd
index 7cc3256..021e524 100644
--- a/man/Backward.Rd
+++ b/man/Backward.Rd
@@ -45,7 +45,7 @@ The table is updated in-place.
 variant.
 Naturally, this can only propagate a table to variants upstream of its current position.
 
-For mathematical details please see Section 2 of the kalis paper (https://doi.org/10.1186/s12859-024-05688-8).
+For mathematical details please see Section 2 of the kalis paper (Aslett and Christ, 2024).
 Note that the precise formulation of the backward equation is determined by whether the flag \code{use.spiedel} is set in the parameters provided in \code{pars}.
 
 \strong{Beta-theta space}
@@ -99,6 +99,9 @@ bck
 try(Backward(bck, pars, 125, beta.theta = FALSE))
 bck
 
+}
+\references{
+Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", \emph{BMC Bioinformatics}, \strong{25}(1). Available at: \doi{10.1186/s12859-024-05688-8}.
 }
 \seealso{
 \code{\link[=MakeBackwardTable]{MakeBackwardTable()}} to generate a backward table;
diff --git a/man/CalcTraces.Rd b/man/CalcTraces.Rd
index db508ef..cf45ffe 100644
--- a/man/CalcTraces.Rd
+++ b/man/CalcTraces.Rd
@@ -22,19 +22,25 @@ CalcTraces(
 
 \item{J}{\code{crossprod(Q, M)}}
 
-\item{from_recipient}{haplotype index at which to start trace calculation -- useful for distributed computation (experimental feature, more documentation to come TODO)}
+\item{from_recipient}{haplotype index at which to start trace calculation --- useful for distributed computation (experimental feature, more documentation to come\if{html}{\out{}})}
 
 \item{nthreads}{the number of CPU cores to use.
 By default uses the \code{parallel} package to detect the number of physical cores.}
 }
 \value{
-a list containing three elements, the first is the trace \code{tr(PMP)}, the second is the \emph{squared} Hilbert Schmidt Norm of PMP \verb{tr((PMP)'PMP)}, the third is the diag of \code{PMP}.
+A list containing three elements:
+
+\describe{
+\item{\code{trace}}{the trace, \eqn{\mathrm{tr}(PMP)};}
+\item{\code{hsnorm2}}{the \emph{squared} Hilbert Schmidt Norm of \eqn{PMP}, \eqn{\mathrm{tr}((PMP)'PMP)};}
+\item{\code{diag}}{the diagonal of \eqn{PMP}.}
+}
 }
 \description{
-Provides multithreaded calculation of trace and Hilbert Schmidt Norm of a matrix PMP (where P is a projection matrix) without explicitly forming PMP.
+Provides multithreaded calculation of trace and Hilbert Schmidt Norm of a matrix \eqn{PMP} (where \eqn{P} is a projection matrix and \eqn{M} is real symmetric) without explicitly forming \eqn{PMP}.
 }
 \details{
-P here is assumed to have the form I-QQ' for some matrix Q of orthogonal columns
+\eqn{P} here is assumed to have the form \eqn{I-QQ'} for some matrix \eqn{Q} of orthogonal columns.
 }
 \examples{
 # TODO
diff --git a/man/CladeMat.Rd b/man/CladeMat.Rd
index 70836af..1d7eee6 100644
--- a/man/CladeMat.Rd
+++ b/man/CladeMat.Rd
@@ -25,17 +25,20 @@ This table must be at the same variant location as argument \code{fwd}.}
 \item{M}{a matrix with half the number of rows and columns as the corresponding forward/backward tables.
 This matrix is overwritten in place with the clade matrix result for performance reasons.}
 
-\item{unit.dist}{the change in distance that is expected to correspond to a single mutation (typically \eqn{-log(\mu)}) for the LS model)}
+\item{unit.dist}{the change in distance that is expected to correspond to a single mutation (typically \eqn{-\log(\mu)}) for the LS model)}
 
-\item{thresh}{a regularization parameter: differences distances must exceed this threshold (in \code{unit.dist} units) in order to used in forming the local relatedness matrix. Defaults to \code{0.2}.}
+\item{thresh}{a regularization parameter: \if{html}{\out{}} differences distances must exceed this threshold (in \code{unit.dist} units) in order to used in forming the local relatedness matrix.
+Defaults to \code{0.2}.}
 
-\item{max1var}{a logical regularization parameter. When TRUE, differences in distances exceeding 1 \code{unit.dist} are set to 1 (so that any edge in the latent ancestral tree with multiple mutations on them are treated as if only one mutation was on it).}
+\item{max1var}{a logical regularization parameter.
+When \code{TRUE}, differences in distances exceeding 1 \code{unit.dist} are set to 1 (so that any edge in the latent ancestral tree with multiple mutations on them are treated as if only one mutation was on it).}
 
 \item{nthreads}{the number of CPU cores to use.
 By default uses the \code{parallel} package to detect the number of physical cores.}
 }
 \value{
-A list, the first element contains a list of tied nearest neighbors (one for each haplotype). Other elements return other information to allow for efficient removal of singletons and sprigs by \code{\link[=PruneCladeMat]{PruneCladeMat()}}.
+A list, the first element contains a list of tied nearest neighbours (one for each haplotype).
+Other elements of the returned list are for internal use by \code{\link[=PruneCladeMat]{PruneCladeMat()}} to allow for efficient removal of singletons and sprigs.
 }
 \description{
 Constructs a clade matrix using forward and backward tables.
diff --git a/man/Forward.Rd b/man/Forward.Rd
index aa6aad1..7784f31 100644
--- a/man/Forward.Rd
+++ b/man/Forward.Rd
@@ -40,7 +40,7 @@ The table is updated in-place.
 \code{Forward} implements the forward algorithm to advance the Li and Stephens rescaled hidden Markov model forward probabilities to a new target variant.
 Naturally, this can only propagate a table to variants downstream of its current position.
 
-For mathematical details please see Section 2 of the kalis paper (https://doi.org/10.1186/s12859-024-05688-8).
+For mathematical details please see Section 2 of the kalis paper (Aslett and Christ, 2024).
 Note that the precise formulation of the forward equation is determined by whether the flag \code{use.spiedel} is set in the parameters provided in \code{pars}.
 }
 \examples{
@@ -68,6 +68,9 @@ fwd
 Forward(fwd, pars, 50)
 fwd
 
+}
+\references{
+Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", \emph{BMC Bioinformatics}, \strong{25}(1). Available at: \doi{10.1186/s12859-024-05688-8}.
 }
 \seealso{
 \code{\link[=MakeForwardTable]{MakeForwardTable()}} to generate a forward table;