Commit

Add kalis paper to all v1 functions and tidy other references in v1 function docs
louisaslett committed Nov 13, 2024
1 parent 1ad8095 commit a9ed50a
Showing 22 changed files with 102 additions and 25 deletions.
9 changes: 9 additions & 0 deletions R/CacheHaplotypes.R
@@ -48,6 +48,9 @@ assign("L", NA, envir = pkgVars) # must be integer
#'
#'
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @param haps can be the name of a file from which the haplotypes are to be read, or can be an R matrix containing only 0/1s.
#' See Details section for supported file types.
#' @param loci.idx an optional vector of indices specifying the variants to load into the cache, indexed from 1.
@@ -219,6 +222,9 @@ CacheHaplotypes.err <- function(err) {
#' To achieve higher performance, kalis internally represents haplotypes in an efficient raw binary format in memory which cannot be directly viewed or manipulated in R.
#' This function enables you to copy whole or partial views of haplotypes/variants out of this low-level format and into a standard R matrix of 0's and 1's.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @param loci.idx which variants to retrieve from the cache, specified as a (vector) index.
#' This enables specifying variants by offset in the order they were loaded into the cache (from 1 to the number of variants).
#' @param hap.idx which haplotypes to retrieve from the cache, specified as a (vector) index.
@@ -295,6 +301,9 @@ QueryCache <- function(loci.idx = NULL, hap.idx = NULL) {
#' In particular, this cache sits outside R's memory management and will never be garbage collected (unless R is quit or the package is unloaded).
#' Therefore, this function is provided to enable freeing the memory used by this cache.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @return Nothing is returned.
#'
#' @seealso [CacheHaplotypes()] to create a haplotype cache;
3 changes: 3 additions & 0 deletions R/CacheSummary.R
@@ -1,5 +1,8 @@
#' Retrieve information about the haplotype cache
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @return
#' `CacheSummary()` prints information about the current state of the kalis cache.
#' Also invisibly returns a vector giving the dimensions of the cached haplotype data (num variants, num haplotypes), or `NULL` if the cache is empty.
3 changes: 3 additions & 0 deletions R/IndividualSequenceIO_H5.R
@@ -11,6 +11,9 @@
#'
#' Note that if `hdf5.file` exists but does not contain a dataset named `haps`, then `WriteHaplotypes` will simply create a `haps` dataset within the existing file.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @param hdf5.file the name of the file which the haplotypes are to be written to.
#' @param haps a vector or a matrix where each column is a haplotype to be stored in the file `hdf5.file`.
#' @param hap.ids a character vector naming haplotypes when writing, or which haplotypes are to be read.
25 changes: 13 additions & 12 deletions R/Parameters.R
@@ -6,11 +6,16 @@
#'
#' **NOTE:** the corresponding haplotype data *must* have already been inserted into the kalis cache by a call to [CacheHaplotypes()], since this function performs checks to confirm the dimensionality matches.
#'
#' TODO: add kalis paper cross ref.
#' See page 3 in Supplemental Information for the original ChromoPainter paper (Lawson et al., 2012) for motivation behind our parameterisation, which is as follows:
#'
#' \deqn{\rho = 1 - \exp(-s \times cM^\gamma)}{\rho = 1 - exp(-s * cM^\gamma)}
#'
#' For a complete description, see the main kalis paper, Aslett and Christ (2024).
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' Lawson, D.J., Hellenthal, G., Myers, S.R. and Falush, D. (2012). "Inference of population structure using dense haplotype data", *PLoS Genetics*, **8**(1). Available at: \doi{10.1371/journal.pgen.1002453}.
#'
#' @param cM a vector specifying the recombination distance between variants in centimorgans.
#' Note element i of this vector should be the distance between variants `i` and `i+1` (not `i` and `i-1`), and thus length one less than the number of variants.
@@ -25,10 +30,6 @@
#'
#' @seealso [Parameters()] to use the resulting recombination probabilities to construct a `kalisParameters` object.
#'
#' @references
#' Lawson, D. J., Hellenthal, G., Myers, S., & Falush, D. (2012). Inference of
#' population structure using dense haplotype data. *PLoS genetics*, **8**(1).
#'
#' @examples
#' # Load the mini example data and recombination map from the package built-in
#' # dataset
#' data("SmallHaps")
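
Note on the CalcRho hunk above: the documented parameterisation, rho = 1 - exp(-s * cM^gamma), can be tried out in plain R without the package. This is a minimal sketch only. The helper name `rho_sketch` and the distances are hypothetical, made up for illustration, and the sketch does not reproduce the `floor` argument of `CalcRho()`, whose behaviour is not shown in this diff.

```r
# Hypothetical helper (NOT the kalis implementation): evaluates
# rho = 1 - exp(-s * cM^gamma) elementwise over a vector of distances.
rho_sketch <- function(cM, s = 1, gamma = 1) {
  stopifnot(is.numeric(cM), all(cM >= 0), s >= 0, gamma >= 0)
  1 - exp(-s * cM^gamma)
}

# Made-up recombination distances (in centimorgans) between 4 consecutive
# variants. Element i is the distance between variants i and i+1, so the
# vector has one fewer entry than there are variants.
cM  <- c(0.01, 0.5, 2)
rho <- rho_sketch(cM, s = 1, gamma = 1)

# One recombination probability per gap between adjacent variants
stopifnot(length(rho) == length(cM))
print(rho)
```

With `s = 1` and `gamma = 1` this reduces to 1 - exp(-cM), so larger distances give probabilities closer to 1 and a zero distance gives exactly 0.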
@@ -119,6 +120,13 @@ CalcRho <- function(cM = 0, s = 1, gamma = 1, floor = TRUE) {
#'
#' Note that there is a computational cost associated with non-uniform copying probabilities, so it is recommended to leave the default of uniform probabilities when appropriate (**Note:** *do not* specify a uniform matrix when uniform probabilities are intended, since this would end up incurring the computational cost of non-uniform probabilities).
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' Lawson, D.J., Hellenthal, G., Myers, S.R. and Falush, D. (2012). "Inference of population structure using dense haplotype data", *PLoS Genetics*, **8**(1). Available at: \doi{10.1371/journal.pgen.1002453}.
#'
#' Speidel, L., Forest, M., Shi, S. and Myers, S.R. (2019). "A method for genome-wide genealogy estimation for thousands of samples", *Nature Genetics*, **51**, p. 1321-1329. Available at: \doi{10.1038/s41588-019-0484-x}.
#'
#' @param rho recombination probability vector (must be \eqn{L-1} long).
#' See [CalcRho()] for assistance constructing this from a recombination
#' map/distances.
@@ -140,13 +148,6 @@ CalcRho <- function(cM = 0, s = 1, gamma = 1, floor = TRUE) {
#' @seealso [MakeForwardTable()] and [MakeBackwardTable()] which construct table objects which internally reference a parameters environment;
#' [Forward()] and [Backward()] which propagate those tables according to the Li and Stephens model.
#'
#' @references
#' Lawson, D. J., Hellenthal, G., Myers, S., & Falush, D. (2012). Inference of
#' population structure using dense haplotype data. *PLoS genetics*, **8**(1).
#'
#' Speidel, L., Forest, M., Shi, S., & Myers, S. (2019). A method for
#' genome-wide genealogy estimation for thousands of samples. *Nature Genetics*, **51**(1321–1329).
#'
#' @examples
#' # Load the mini example data and recombination map from the package built-in
#' # dataset
#' data("SmallHaps")
14 changes: 11 additions & 3 deletions R/Probs.R
@@ -43,6 +43,9 @@
#' Typically, that is simply \eqn{N \times N}{N x N} for \eqn{N} haplotypes.
#' However, if kalis is being run in a distributed manner, `M` will be a \eqn{N \times R}{N x R} matrix where \eqn{R} is the number of recipient haplotypes on the current machine.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @param fwd a forward table as returned by [MakeForwardTable()] and propagated to a target variant by [Forward()].
#' Must be at the same variant as `bck` (unless `bck` is in "beta-theta space" in which case it must be downstream ... see [Backward()] for details).
#' @param bck a backward table as returned by [MakeBackwardTable()] and propagated to a target variant by [Backward()].
@@ -164,6 +167,11 @@ PostProbs <- function(fwd, bck, unif.on.underflow = FALSE, M = NULL, beta.theta.
#' Typically, that is simply \eqn{N \times N}{N x N} for \eqn{N} haplotypes.
#' However, if kalis is being run in a distributed manner, `M` will be a \eqn{N \times R}{N x R} matrix where \eqn{R} is the number of recipient haplotypes on the current machine.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' Speidel, L., Forest, M., Shi, S. and Myers, S.R. (2019). "A method for genome-wide genealogy estimation for thousands of samples", *Nature Genetics*, **51**, p. 1321-1329. Available at: \doi{10.1038/s41588-019-0484-x}.
#'
#' @param fwd a forward table as returned by [MakeForwardTable()] and propagated to a target variant by [Forward()].
#' Must be at the same variant as `bck` (unless `bck` is in "beta-theta space" in which case it must be downstream ... see [Backward()] for details).
#' @param bck a backward table as returned by [MakeBackwardTable()] and propagated to a target variant by [Backward()].
@@ -184,9 +192,6 @@ PostProbs <- function(fwd, bck, unif.on.underflow = FALSE, M = NULL, beta.theta.
#'
#' If you wish to plot this matrix or perform clustering, you may want to symmetrize the matrix first.
#'
#' @references
#' Speidel, L., Forest, M., Shi, S., & Myers, S. (2019). A method for genome-wide genealogy estimation for thousands of samples. *Nature Genetics*, **51**(1321–1329).
#'
#' @seealso
#' [PostProbs()] to calculate the posterior marginal probabilities \eqn{p_{ji}}{p_(j,i)};
#' [Forward()] to propagate a Forward table to a new variant;
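
Note on the DistMat hunk above: the docs suggest symmetrizing the matrix before plotting or clustering but do not say how. One common choice, assumed here and not taken from the kalis source, is to average the matrix with its transpose. The sketch below uses a made-up 3 x 3 matrix in place of real `DistMat()` output.

```r
# Made-up asymmetric "distance" matrix standing in for DistMat() output
set.seed(1)
d <- matrix(runif(9), nrow = 3)
diag(d) <- 0

# Assumed symmetrization: average each entry with its mirrored entry.
# Other choices (e.g. pmin(d, t(d)) or pmax(d, t(d))) are equally possible.
d_sym <- 0.5 * (d + t(d))

stopifnot(isSymmetric(d_sym))
print(d_sym)
```

Averaging keeps the diagonal unchanged and gives equal weight to both directions; whether that is appropriate depends on the downstream analysis.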
@@ -319,6 +324,9 @@ input_checks_for_probs_and_dist_mat <- function(fwd,bck,beta.theta.opts = NULL)
#'
#' Clusters the given distance matrix and generates a heatmap to display it.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @param x
#' a distance matrix, such as returned by [DistMat()].
#' @param cluster.method
4 changes: 3 additions & 1 deletion R/SmallHaps-data.R
@@ -14,7 +14,9 @@
#' @keywords datasets
#'
#' @references
#' Kelleher, J., Etheridge, A. M., & McVean, G. (2016). Efficient coalescent simulation and genealogical analysis for large sample sizes. *PLoS computational biology*, **12**(5).
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' Kelleher, J., Etheridge, A.M. and McVean, G. (2016) "Efficient coalescent simulation and genealogical analysis for large sample sizes", *PLoS Computational Biology*, **12**(5). Available at: \doi{10.1371/journal.pcbi.1004842}.
#'
#' @examples
#' data("SmallHaps")
12 changes: 12 additions & 0 deletions R/TableMaker.R
@@ -10,6 +10,9 @@
#'
#' Since each column corresponds to an independent Li and Stephens hidden Markov model (i.e. for each recipient), it is possible to create a partial forward table object which corresponds to a subset of recipients using the `from_recipient` and `to_recipient` arguments.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @param pars a `kalisParameters` object specifying the genetics parameters to be associated with this forward table.
#' These parameters can be set up by using the [Parameters()] function.
#' @param from_recipient first recipient haplotype included if creating a partial forward table.
@@ -130,6 +133,9 @@ print.kalisForwardTable <- function(x, ...) {
#'
#' Since each column corresponds to an independent Li and Stephens hidden Markov model (i.e. for each recipient), it is possible to create a partial backward table object which corresponds to a subset of recipients using the `from_recipient` and `to_recipient` arguments.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @param pars a `kalisParameters` object specifying the genetics parameters to be associated with this backward table.
#' These parameters can be set up by using the [Parameters()] function.
#' @param from_recipient first recipient haplotype included if creating a partial backward table.
@@ -249,6 +255,9 @@ print.kalisBackwardTable <- function(x, ...) {
#'
#' This function is therefore designed to enable explicit copying of tables.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @param to a `kalisForwardTable` or `kalisBackwardTable` object which is to be copied into.
#' @param from a `kalisForwardTable` or `kalisBackwardTable` object which is to be copied from.
#'
@@ -340,6 +349,9 @@ CopyTable <- function(to, from) {
#' It is *much* faster to reset a forward/backward table rather than remove and make a new one.
#' This function marks a table as reset so that it will be propagated as if freshly allocated.
#'
#' @references
#' Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", *BMC Bioinformatics*, **25**(1). Available at: \doi{10.1186/s12859-024-05688-8}.
#'
#' @param tbl a `kalisForwardTable` or `kalisBackwardTable` object
#' which is to be reset.
#'
3 changes: 3 additions & 0 deletions man/CacheHaplotypes.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions man/CacheSummary.Rd


8 changes: 5 additions & 3 deletions man/CalcRho.Rd


3 changes: 3 additions & 0 deletions man/ClearHaplotypeCache.Rd


3 changes: 3 additions & 0 deletions man/CopyTable.Rd


4 changes: 3 additions & 1 deletion man/DistMat.Rd


3 changes: 3 additions & 0 deletions man/MakeBackwardTable.Rd


3 changes: 3 additions & 0 deletions man/MakeForwardTable.Rd


8 changes: 4 additions & 4 deletions man/Parameters.Rd

