From 3095cc43395c90e093a43a0c2145a96cde86fb39 Mon Sep 17 00:00:00 2001
From: "Documenter.jl"
Date: Tue, 18 Jun 2024 13:07:01 +0000
Subject: [PATCH] build based on 8a8c9b0

---
 dev/.documenter-siteinfo.json | 2 +-
 dev/IO/index.html | 2 +-
 dev/ancestors/index.html | 2 +-
 dev/examples/index.html | 2 +-
 dev/framework/index.html | 2 +-
 dev/index.html | 24 ++++++++++++------------
 dev/models/index.html | 6 +++---
 dev/optimization/index.html | 2 +-
 dev/simulation/index.html | 2 +-
 dev/viz/index.html | 6 +++---
 10 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index 9a0fc86..2f2268e 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-13T15:04:05","documenter_version":"1.4.1"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-18T13:06:57","documenter_version":"1.4.1"}}
\ No newline at end of file
diff --git a/dev/IO/index.html b/dev/IO/index.html
index 43e3419..6796ed0 100644
--- a/dev/IO/index.html
+++ b/dev/IO/index.html
@@ -1,2 +1,2 @@
-Input/Output · MolecularEvolution.jl

Input/Output

MolecularEvolution.write_nexusFunction
write_nexus(fname::String,tree::FelNode)

Writes the tree as a nexus file, suitable for opening in e.g. FigTree. Data in the node_data dictionary will be converted into annotations. Only tested for simple node_data formats and types.

source
MolecularEvolution.populate_tree!Function
populate_tree!(tree::FelNode, starting_message, names, data; init_all_messages = true, tolerate_missing = 1)

Takes a tree and a starting_message (which will serve as the memory template for populating messages all over the tree). starting_message can be a message (i.e. a vector of Partitions), but will also work with a single Partition (although the tree will still be populated with a length-1 vector of Partitions). Further, as long as obs2partition is implemented for your Partition type, the leaf nodes will be populated with the data from data, matching the names on each leaf. When a leaf on the tree has a name that doesn't match anything in names, then if

  • tolerate_missing = 0, an error will be thrown
  • tolerate_missing = 1, a warning will be thrown, and the message will be set to the uninformative message (requires identity!(::Partition) to be defined)
  • tolerate_missing = 2, the message will be set to the uninformative message, without warnings (requires identity!(::Partition) to be defined)
source
MolecularEvolution.write_fastaFunction
write_fasta(filepath::String, sequences::Vector{String}; seq_names = nothing)

Writes a fasta file from a vector of sequences, with optional seq_names.

source
+Input/Output · MolecularEvolution.jl

Input/Output

MolecularEvolution.write_nexusFunction
write_nexus(fname::String,tree::FelNode)

Writes the tree as a nexus file, suitable for opening in e.g. FigTree. Data in the node_data dictionary will be converted into annotations. Only tested for simple node_data formats and types.

source
MolecularEvolution.populate_tree!Function
populate_tree!(tree::FelNode, starting_message, names, data; init_all_messages = true, tolerate_missing = 1)

Takes a tree and a starting_message (which will serve as the memory template for populating messages all over the tree). starting_message can be a message (i.e. a vector of Partitions), but will also work with a single Partition (although the tree will still be populated with a length-1 vector of Partitions). Further, as long as obs2partition is implemented for your Partition type, the leaf nodes will be populated with the data from data, matching the names on each leaf. When a leaf on the tree has a name that doesn't match anything in names, then if

  • tolerate_missing = 0, an error will be thrown
  • tolerate_missing = 1, a warning will be thrown, and the message will be set to the uninformative message (requires identity!(::Partition) to be defined)
  • tolerate_missing = 2, the message will be set to the uninformative message, without warnings (requires identity!(::Partition) to be defined)
source
MolecularEvolution.write_fastaFunction
write_fasta(filepath::String, sequences::Vector{String}; seq_names = nothing)

Writes a fasta file from a vector of sequences, with optional seq_names.

source
diff --git a/dev/ancestors/index.html b/dev/ancestors/index.html
index 15f47b1..3de612c 100644
--- a/dev/ancestors/index.html
+++ b/dev/ancestors/index.html
@@ -66,4 +66,4 @@
 0.0305 - true value: 0.0177
 0.0913 - true value: 0.0485
 0.0542 - true value: 0.075
-0.498 - true value: 0.589

Functions

MolecularEvolution.marginal_state_dictFunction
marginal_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their marginal reconstructions (i.e. P(state|all observations,model)). A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.cascading_max_state_dictFunction
cascading_max_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their inferred ancestors under the following scheme: the state that maximizes the marginal likelihood is selected at the root, and then, for each node, the maximum likelihood state is selected conditioned on the maximized state of the parent node and the observations of all descendants. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.endpoint_conditioned_sample_state_dictFunction
endpoint_conditioned_sample_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and draws samples under the model, conditioned on the leaf observations. These samples are stored in the node_message_dict, which is returned. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
+0.498 - true value: 0.589

Functions

MolecularEvolution.marginal_state_dictFunction
marginal_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their marginal reconstructions (i.e. P(state|all observations,model)). A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.cascading_max_state_dictFunction
cascading_max_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their inferred ancestors under the following scheme: the state that maximizes the marginal likelihood is selected at the root, and then, for each node, the maximum likelihood state is selected conditioned on the maximized state of the parent node and the observations of all descendants. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.endpoint_conditioned_sample_state_dictFunction
endpoint_conditioned_sample_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and draws samples under the model, conditioned on the leaf observations. These samples are stored in the node_message_dict, which is returned. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
diff --git a/dev/examples/index.html b/dev/examples/index.html
index 4836ccb..bbf7fdc 100644
--- a/dev/examples/index.html
+++ b/dev/examples/index.html
@@ -371,4 +371,4 @@
 end
 end
Site 153: P(β>α)=0.9074
 Site 158: P(β>α)=0.9266
-Site 160: P(β>α)=0.9547

And let's visualize one of those sites:

gridplot(alpha_ind_vec,beta_ind_vec,grid_values, weighted_mat[:,160]./sum(weighted_mat[:,160]))

+Site 160: P(β>α)=0.9547

And let's visualize one of those sites:

gridplot(alpha_ind_vec,beta_ind_vec,grid_values, weighted_mat[:,160]./sum(weighted_mat[:,160]))

diff --git a/dev/framework/index.html b/dev/framework/index.html
index 4176937..c7deacf 100644
--- a/dev/framework/index.html
+++ b/dev/framework/index.html
@@ -1,2 +1,2 @@
-The MolecularEvolution.jl Framework · MolecularEvolution.jl

The MolecularEvolution.jl Framework

The organizing principle is that the core algorithms, including Felsenstein's algorithm, but also a related family of message passing algorithms and inference machinery, are implemented in a way that does not refer to any specific model or even to any particular data type.

Partitions and BranchModels

A Partition is a probabilistic representation of some kind of state. Specifically, it needs to be able to represent P(obs|state) and P(obs,state) when considered as functions of state. So it will typically be able to assign a probability to any possible value of state, and is unnormalized - not required to sum or integrate to 1 over all values of state. As an example, for a discrete state with 4 categories, this could just be a vector of 4 numbers.

For a Partition type to be usable by MolecularEvolution.jl, the combine! function needs to be implemented. If you have P(obsA|state) and P(obsB|state), then combine! calculates P(obsA,obsB|state) under the assumption that obsA and obsB are conditionally independent given state. MolecularEvolution.jl tries to avoid allocating memory, so combine!(dest,src) stores the combined Partition in dest. For a discrete state with 4 categories, this is simply element-wise multiplication of two state vectors.

A BranchModel defines how Partition distributions evolve along branches. Two functions need to be implemented: backward! and forward!. We imagine our trees with the root at the top, and forward! moves from root to tip, and backward! moves from tip to root. backward!(dest::P,src::P,m::BranchModel,n::FelNode) takes a src Partition, representing P(obs-below|state-at-bottom-of-branch), and modifies the dest Partition to be P(obs-below|state-at-top-of-branch), where the branch in question is the branch above the FelNode n. forward! goes in the opposite direction, from P(obs-above,state-at-top-of-branch) to P(obs-above,state-at-bottom-of-branch), with the Partitions now, confusingly, representing joint distributions.

Messages

Nodes on our trees work with messages, where a message is a vector of Partition structs. This is in case you wish to model multiple different data types on the same tree. Often, all the messages on the tree will just be arrays containing a single Partition, but if you're accessing them you need to remember that they're in an array!

Trees

Each node in our tree is a FelNode ("Fel" for "Felsenstein"). Each node points to its parent and to an array of its children, and stores its main vector of Partitions along with cached versions of those from its parent and children, to allow certain message passing schemes. Nodes also have a branchlength field, which tells e.g. forward! and backward! how much evolution occurs along the branch above (i.e. closer to the root) that node. They also allow for an arbitrary dictionary of node_data, in case a model needs any other branch-specific parameters.

The set of algorithms needs to know which model to use for which partition, so the assumption made is that they'll see an array of models whose order will match the partition array. In general, we might want the models to vary from one branch to another, so the central algorithms take a function that maps FelNode->Vector{<:BranchModel}. In the simpler cases where the model does not vary from branch to branch, or where there is only a single Partition, and thus a single model, the core algorithms have been overloaded to allow you to pass in a single model vector or a single model.

Algorithms

Felsenstein's algorithm recursively computes, for each node, the probability of all observations below that node, given the state at that node. Felsenstein's algorithm can be decomposed into the following combination of backward! and combine! operations:

At the root node, we wind up with $P(O_{all}|R)$, where $R$ is the state at the root, and we can compute $P(O_{all}) = \sum_{R} P(O_{all}|R) P(R)$.

Technicalities

Scaling constants

Coming soon.

Root state

Coming soon.

Functions

MolecularEvolution.combine!Function
combine!(dest::P, src::P) where P<:Partition

Combines evidence from two partitions of the same type, storing the result in dest. Note: You should overload this for your own Partition types.

source
MolecularEvolution.forward!Function
forward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)

Propagate the source partition forwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.

source
MolecularEvolution.backward!Function
backward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)

Propagate the source partition backwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.

source
+The MolecularEvolution.jl Framework · MolecularEvolution.jl

The MolecularEvolution.jl Framework

The organizing principle is that the core algorithms, including Felsenstein's algorithm, but also a related family of message passing algorithms and inference machinery, are implemented in a way that does not refer to any specific model or even to any particular data type.

Partitions and BranchModels

A Partition is a probabilistic representation of some kind of state. Specifically, it needs to be able to represent P(obs|state) and P(obs,state) when considered as functions of state. So it will typically be able to assign a probability to any possible value of state, and is unnormalized - not required to sum or integrate to 1 over all values of state. As an example, for a discrete state with 4 categories, this could just be a vector of 4 numbers.

For a Partition type to be usable by MolecularEvolution.jl, the combine! function needs to be implemented. If you have P(obsA|state) and P(obsB|state), then combine! calculates P(obsA,obsB|state) under the assumption that obsA and obsB are conditionally independent given state. MolecularEvolution.jl tries to avoid allocating memory, so combine!(dest,src) stores the combined Partition in dest. For a discrete state with 4 categories, this is simply element-wise multiplication of two state vectors.
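
As a rough illustration of the 4-category case, the sketch below defines a toy partition and an in-place combine. ToyPartition and toy_combine! are hypothetical names invented for this example; they are not the package's own types or methods, and a real Partition subtype would overload combine! instead.

```julia
# Hypothetical toy type for illustration; not part of MolecularEvolution.jl.
struct ToyPartition
    state::Vector{Float64}   # unnormalized P(obs | state), one entry per category
end

# In-place combine: afterwards dest holds P(obsA, obsB | state), assuming obsA
# and obsB are conditionally independent given state.
function toy_combine!(dest::ToyPartition, src::ToyPartition)
    dest.state .*= src.state   # element-wise product, written into dest's buffer
    return dest
end

a = ToyPartition([0.9, 0.1, 0.0, 0.0])
b = ToyPartition([0.5, 0.5, 0.2, 0.1])
toy_combine!(a, b)             # a.state is now the element-wise product
```

The result is left unnormalized, matching the description of Partitions above.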

A BranchModel defines how Partition distributions evolve along branches. Two functions need to be implemented: backward! and forward!. We imagine our trees with the root at the top, and forward! moves from root to tip, and backward! moves from tip to root. backward!(dest::P,src::P,m::BranchModel,n::FelNode) takes a src Partition, representing P(obs-below|state-at-bottom-of-branch), and modifies the dest Partition to be P(obs-below|state-at-top-of-branch), where the branch in question is the branch above the FelNode n. forward! goes in the opposite direction, from P(obs-above,state-at-top-of-branch) to P(obs-above,state-at-bottom-of-branch), with the Partitions now, confusingly, representing joint distributions.
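
To make the two directions concrete, here is a minimal sketch for a discrete-state continuous-time Markov chain, using plain vectors in place of Partitions. The rate matrix Q, the branch length t, and the names toy_backward! and toy_forward! are assumptions made for this example only; they are not the package's implementation.

```julia
using LinearAlgebra  # provides mul! and the matrix exponential exp(::Matrix)

# Hypothetical 2-state rate matrix (rows sum to zero) and branch length.
Q = [-1.0 1.0; 1.0 -1.0]
t = 0.5
P = exp(Q * t)   # P[i, j] = P(state j at bottom of branch | state i at top)

# backward: P(obs-below | top = i) = sum_j P[i, j] * P(obs-below | bottom = j)
toy_backward!(dest, src, P) = mul!(dest, P, src)

# forward: P(obs-above, bottom = j) = sum_i P(obs-above, top = i) * P[i, j]
toy_forward!(dest, src, P) = mul!(dest, transpose(P), src)

below = [1.0, 0.0]            # leaf observed in state 1
top = similar(below)
toy_backward!(top, below, P)  # conditional likelihoods at the top of the branch

above = [0.5, 0.5]            # joint distribution at the top of the branch
bottom = similar(above)
toy_forward!(bottom, above, P)
```

Both functions write into a preallocated dest, mirroring the in-place convention described above.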

Messages

Nodes on our trees work with messages, where a message is a vector of Partition structs. This is in case you wish to model multiple different data types on the same tree. Often, all the messages on the tree will just be arrays containing a single Partition, but if you're accessing them you need to remember that they're in an array!

Trees

Each node in our tree is a FelNode ("Fel" for "Felsenstein"). Each node points to its parent and to an array of its children, and stores its main vector of Partitions along with cached versions of those from its parent and children, to allow certain message passing schemes. Nodes also have a branchlength field, which tells e.g. forward! and backward! how much evolution occurs along the branch above (i.e. closer to the root) that node. They also allow for an arbitrary dictionary of node_data, in case a model needs any other branch-specific parameters.

The set of algorithms needs to know which model to use for which partition, so the assumption made is that they'll see an array of models whose order will match the partition array. In general, we might want the models to vary from one branch to another, so the central algorithms take a function that maps FelNode->Vector{<:BranchModel}. In the simpler cases where the model does not vary from branch to branch, or where there is only a single Partition, and thus a single model, the core algorithms have been overloaded to allow you to pass in a single model vector or a single model.

Algorithms

Felsenstein's algorithm recursively computes, for each node, the probability of all observations below that node, given the state at that node. Felsenstein's algorithm can be decomposed into the following combination of backward! and combine! operations:

At the root node, we wind up with $P(O_{all}|R)$, where $R$ is the state at the root, and we can compute $P(O_{all}) = \sum_{R} P(O_{all}|R) P(R)$.
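
The recursion can be written out as follows. The symbols $S_n$ (state at node $n$), $O_n$ (observations below $n$), and $t_c$ (length of the branch above child $c$) are notation introduced here for illustration; they do not appear elsewhere in these docs.

```latex
% Internal node n: one backward! per child branch (inner sum),
% then combine! across the children (outer product).
P(O_n \mid S_n) = \prod_{c \in \mathrm{children}(n)} \sum_{S_c} P(S_c \mid S_n, t_c) \, P(O_c \mid S_c)

% Root: marginalize over the root state R.
P(O_{all}) = \sum_{R} P(O_{all} \mid R) \, P(R)
```

At a leaf, $P(O_n \mid S_n)$ is given directly by the observation at that leaf.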

Technicalities

Scaling constants

Coming soon.

Root state

Coming soon.

Functions

MolecularEvolution.combine!Function
combine!(dest::P, src::P) where P<:Partition

Combines evidence from two partitions of the same type, storing the result in dest. Note: You should overload this for your own Partition types.

source
MolecularEvolution.forward!Function
forward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)

Propagate the source partition forwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.

source
MolecularEvolution.backward!Function
backward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)

Propagate the source partition backwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.

source
diff --git a/dev/index.html b/dev/index.html
index af6d6d6..5f2ea2f 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -9,29 +9,29 @@
 sample_down!(tree, bm_model)
 #And plot the log likelihood as a function of the parameter value
 ll(x) = log_likelihood!(tree,BrownianMotion(0.0,x))
-plot(0.7:0.001:1.6,ll, xlabel = "variance per unit time", ylabel = "log likelihood")

MolecularEvolution.LazyDownType

Constructors

LazyDown(stores_obs)
-LazyDown() = LazyDown(x::FelNode -> true)

Description

Indicate that we want to do a downward pass, e.g. sample_down!. The function passed to the constructor takes a node::FelNode as input and returns a Bool that decides if node stores its observations.

source
MolecularEvolution.LazyPartitionType

Constructor

LazyPartition{PType}()

Initialize an empty LazyPartition that is meant for wrapping a partition of type PType.

Description

With this data structure, you can wrap a partition of choice. The idea is that in some message passing algorithms, there is only a wave of partitions which need to actualize. For instance, a wave following a root-leaf path, or a depth-first traversal. In which case, we can be more economical with our memory consumption. With a worst case memory complexity of O(log(n)), where n is the number of nodes, functionality is provided for:

  • log_likelihood!
  • felsenstein!
  • sample_down!
Note

For successive felsenstein! calls, we need to extract the information at the root somehow after each call. This can be done with e.g. total_LL or site_LLs.

Further requirements

Suppose you want to wrap a partition of PType with LazyPartition:

  • If you're calling log_likelihood! and felsenstein!:
    • obs2partition!(partition::PType, obs) that transforms an observation to a partition.
  • If you're calling sample_down!:
    • partition2obs(partition::PType) that returns the most likely state from a partition, inverts obs2partition!.
source
MolecularEvolution.LazyUpType

Constructor

LazyUp()

Description

Indicate that we want to do an upward pass, e.g. felsenstein!.

source
Base.:==Method
==(t1, t2)
-Defaults to pointer equality
source
MolecularEvolution.SWM_prob_gridMethod
SWM_prob_grid(part::SWMPartition{PType}) where {PType <: MultiSitePartition}

Returns a matrix of probabilities for each site, for each model (in the probability domain - not logged!) as well as the log probability offsets

source
MolecularEvolution._mapreduceMethod

Internal function. Helper for bfs_mapreduce and dfs_mapreduce

source
MolecularEvolution.backward!Method
backward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)

Propagate the source partition backwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.

source
MolecularEvolution.bfs_mapreduceMethod

Performs a BFS map-reduce over the tree, starting at a given node. For each node, map_reduce is called as: map_reduce(curr_node::FelNode, prev_node::FelNode, aggregator), where prev_node is the previous node visited on the path from the start node to the current node. It is expected to update the aggregator, and not return anything.

Not exactly conventional map-reduce, as map-reduce calls may rely on state in the aggregator added by map-reduce calls on other nodes visited earlier.

source
MolecularEvolution.branchlength_optim!Method
branchlength_optim!(tree::FelNode, models; partition_list = nothing, tol = 1e-5, bl_optimizer::UnivariateOpt = GoldenSectionOpt())

Uses golden section search, or optionally Brent's method, to optimize all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (e.g. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize branch lengths with all models). tol is the absolute tolerance for the bl_optimizer, which defaults to golden section search, and has Brent's method as an option by setting bl_optimizer=BrentsMethodOpt().

source
MolecularEvolution.brents_method_minimizeMethod
brents_method_minimize(f, a::Real, b::Real, transform, t::Real; ε::Real=sqrt(eps()))

Brent's method for minimization.

Given a function f with a single local minimum in the interval (a,b), Brent's method returns an approximation of the x-value that minimizes f to an accuracy between 2tol and 3tol, where tol is a combination of a relative and an absolute tolerance, tol := ε|x| + t. ε should be no smaller than 2*eps, and preferably not much less than sqrt(eps), which is also the default value. eps is defined here as the machine epsilon in double precision. t should be positive.

The method combines the stability of a Golden Section Search and the superlinear convergence Successive Parabolic Interpolation has under certain conditions. The method never converges much slower than a Fibonacci search and for a sufficiently well-behaved f, convergence can be expected to be superlinear, with an order that's usually at least 1.3247...

Examples

julia> f(x) = exp(-x) - cos(x)
+plot(0.7:0.001:1.6,ll, xlabel = "variance per unit time", ylabel = "log likelihood")

MolecularEvolution.LazyDownType

Constructors

LazyDown(stores_obs)
+LazyDown() = LazyDown(x::FelNode -> true)

Description

Indicate that we want to do a downward pass, e.g. sample_down!. The function passed to the constructor takes a node::FelNode as input and returns a Bool that decides if node stores its observations.

source
MolecularEvolution.LazyPartitionType

Constructor

LazyPartition{PType}()

Initialize an empty LazyPartition that is meant for wrapping a partition of type PType.

Description

With this data structure, you can wrap a partition of choice. The idea is that in some message passing algorithms, there is only a wave of partitions which need to actualize. For instance, a wave following a root-leaf path, or a depth-first traversal. In which case, we can be more economical with our memory consumption. With a worst case memory complexity of O(log(n)), where n is the number of nodes, functionality is provided for:

  • log_likelihood!
  • felsenstein!
  • sample_down!
Note

For successive felsenstein! calls, we need to extract the information at the root somehow after each call. This can be done with e.g. total_LL or site_LLs.

Further requirements

Suppose you want to wrap a partition of PType with LazyPartition:

  • If you're calling log_likelihood! and felsenstein!:
    • obs2partition!(partition::PType, obs) that transforms an observation to a partition.
  • If you're calling sample_down!:
    • partition2obs(partition::PType) that returns the most likely state from a partition, inverts obs2partition!.
source
MolecularEvolution.SWM_prob_gridMethod
SWM_prob_grid(part::SWMPartition{PType}) where {PType <: MultiSitePartition}

Returns a matrix of probabilities for each site, for each model (in the probability domain - not logged!) as well as the log probability offsets

source
MolecularEvolution.backward!Method
backward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)

Propagate the source partition backwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.

source
MolecularEvolution.bfs_mapreduceMethod

Performs a BFS map-reduce over the tree, starting at a given node. For each node, map_reduce is called as: map_reduce(curr_node::FelNode, prev_node::FelNode, aggregator), where prev_node is the previous node visited on the path from the start node to the current node. It is expected to update the aggregator, and not return anything.

Not exactly conventional map-reduce, as map-reduce calls may rely on state in the aggregator added by map-reduce calls on other nodes visited earlier.
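
The calling pattern can be sketched on a toy tree. ToyNode and toy_bfs_mapreduce are hypothetical stand-ins written for this example, not the package's FelNode or bfs_mapreduce. As simplifying assumptions, the sketch only walks from a node down to its children, and passes nothing as the previous node at the start.

```julia
# Hypothetical toy node for illustration; real trees use FelNode.
mutable struct ToyNode
    name::String
    children::Vector{ToyNode}
end
ToyNode(name) = ToyNode(name, ToyNode[])

# Breadth-first traversal. map_reduce(curr, prev, aggregator) is expected to
# mutate the aggregator; its return value is ignored.
function toy_bfs_mapreduce(start::ToyNode, map_reduce, aggregator)
    queue = Tuple{ToyNode,Union{ToyNode,Nothing}}[(start, nothing)]
    while !isempty(queue)
        curr, prev = popfirst!(queue)
        map_reduce(curr, prev, aggregator)
        for child in curr.children
            push!(queue, (child, curr))
        end
    end
    return aggregator
end

# Build a small tree: root -> (a -> c, b)
root = ToyNode("root"); a = ToyNode("a"); b = ToyNode("b"); c = ToyNode("c")
push!(root.children, a, b)
push!(a.children, c)

# Each call reads the depth that an earlier call stored for the previous node.
depths = Dict{String,Int}()
record_depth(curr, prev, agg) = (agg[curr.name] = prev === nothing ? 0 : agg[prev.name] + 1)
toy_bfs_mapreduce(root, record_depth, depths)
```

The depth computation relies on the aggregator already containing the previous node's entry, which is the sense in which this differs from a conventional map-reduce.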

source
MolecularEvolution.branchlength_optim!Method
branchlength_optim!(tree::FelNode, models; partition_list = nothing, tol = 1e-5, bl_optimizer::UnivariateOpt = GoldenSectionOpt())

Uses golden section search, or optionally Brent's method, to optimize all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (e.g. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize branch lengths with all models). tol is the absolute tolerance for the bl_optimizer, which defaults to golden section search, and has Brent's method as an option by setting bl_optimizer=BrentsMethodOpt().
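
For readers unfamiliar with golden section search, a minimal standalone sketch follows. It is a textbook version written for illustration, not the code behind GoldenSectionOpt. The name golden_section_minimize and its keyword tol are assumptions of this example, and the test function is borrowed from the brents_method_minimize example in these docs.

```julia
# Textbook golden section search on [a, b]; assumes f has a single local
# minimum in that interval. tol is an absolute tolerance on the bracket width.
function golden_section_minimize(f, a::Real, b::Real; tol::Real = 1e-5)
    invphi = (sqrt(5) - 1) / 2          # 1/φ ≈ 0.618
    a, b = float(a), float(b)
    c = b - invphi * (b - a)            # lower interior point
    d = a + invphi * (b - a)            # upper interior point
    fc, fd = f(c), f(d)
    while (b - a) > tol
        if fc < fd
            # Minimum lies in [a, d]; the old c becomes the new upper point.
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else
            # Minimum lies in [c, b]; the old d becomes the new lower point.
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
        end
    end
    return (a + b) / 2
end

f(x) = exp(-x) - cos(x)
x_min = golden_section_minimize(f, -1, 2; tol = 1e-7)
```

Each iteration reuses one of the two previous function evaluations, so only one new call to f is needed per step.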

source
MolecularEvolution.brents_method_minimizeMethod
brents_method_minimize(f, a::Real, b::Real, transform, t::Real; ε::Real=sqrt(eps()))

Brent's method for minimization.

Given a function f with a single local minimum in the interval (a,b), Brent's method returns an approximation of the x-value that minimizes f to an accuracy between 2tol and 3tol, where tol is a combination of a relative and an absolute tolerance, tol := ε|x| + t. ε should be no smaller than 2*eps, and preferably not much less than sqrt(eps), which is also the default value. eps is defined here as the machine epsilon in double precision. t should be positive.

The method combines the stability of a Golden Section Search and the superlinear convergence Successive Parabolic Interpolation has under certain conditions. The method never converges much slower than a Fibonacci search and for a sufficiently well-behaved f, convergence can be expected to be superlinear, with an order that's usually at least 1.3247...

Examples

julia> f(x) = exp(-x) - cos(x)
 f (generic function with 1 method)
 
 julia> m = brents_method_minimize(f, -1, 2, identity, 1e-7)
-0.5885327257940255

From: Richard P. Brent, "Algorithms for Minimization without Derivatives" (1973). Chapter 5.

source
MolecularEvolution.cascading_max_state_dictMethod
cascading_max_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their inferred ancestors under the following scheme: the state that maximizes the marginal likelihood is selected at the root, and then, for each node, the maximum likelihood state is selected conditioned on the maximized state of the parent node and the observations of all descendants. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.char_proportionsMethod
char_proportions(seqs, alphabet::String)

Takes a vector of sequences and returns a vector of the proportion of each character across all sequences. An example alphabet argument is MolecularEvolution.AAstring.
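
The computation described here can be sketched in a few lines. toy_char_proportions is a hypothetical stand-in written for this example, not the package's implementation. In particular, silently skipping characters outside the alphabet (such as gaps) is an assumption of the sketch.

```julia
# Hypothetical stand-in: proportion of each alphabet character across all sequences.
function toy_char_proportions(seqs, alphabet::String)
    counts = zeros(length(alphabet))
    index = Dict(c => i for (i, c) in enumerate(alphabet))
    for seq in seqs, c in seq
        i = get(index, c, 0)        # 0 marks a character outside the alphabet
        i > 0 && (counts[i] += 1)   # assumption: such characters are skipped
    end
    return counts ./ sum(counts)
end

# A and C occur 3 times each, G and T once each, out of 8 counted characters.
props = toy_char_proportions(["ACGT", "AACC"], "ACGT")
```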

source
MolecularEvolution.colored_seq_drawMethod
colored_seq_draw(x, y, str::AbstractString; color_dict=Dict(), font_size=8pt, posx=hcenter, posy=vcenter)

Draw an arbitrary sequence. color_dict gives a mapping from characters to colors (default black). Default options for nucleotide colorings and amino acid colorings are given in the constants NUC_COLORS and AA_COLORS. This can be used along with compose_dict for drawing sequences at nodes in a tree (see tree_draw). Returns a Compose container.

source
MolecularEvolution.combine!Method
combine!(dest::P, src::P) where P<:Partition

Combines evidence from two partitions of the same type, storing the result in dest. Note: You should overload this for your own Partition types.

source
MolecularEvolution.deepequalsMethod
deepequals(t1, t2)

Checks whether two trees are equal by recursively calling this on all fields, except :parent, in order to prevent cycles. In order to ensure that the :parent field is not hiding something different on both trees, ensure that each is consistent first (see: istreeconsistent).

source
MolecularEvolution.discrete_name_color_dictMethod
discrete_name_color_dict(newt::AbstractTreeNode,tag_func; rainbow = false, scramble = false, darken = true, col_seed = nothing)

Takes a tree and a tag_func, which converts the leaf label into a category (ie. there should be <20 of these), and returns a color dictionary that can be used to color the leaves or bubbles.

Example tag_func: function tag_func(nam::String) return split(nam,"_")[1] end

For prettier colors, but less discrimination: rainbow = true. To randomize the rainbow color assignment: scramble = true. col_seed is currently set to white, and excluded from the list of colors, to make them more visible.

Consider making your own version of this function to customize colors as you see fit.

Example use: num_leaves = 50; Ne_func(t) = 1*(e^-t).+5.0; newt = sim_tree(num_leaves,Ne_func,1.0,nstart = rand(1:num_leaves)); newt = ladderize(newt); tag_func(nam) = mod(sum(Int.(collect(nam))),7); dic = discrete_name_color_dict(newt,tag_func,rainbow = true); tree_draw(newt,line_width = 0.5mm,label_color_dict = dic)

source
MolecularEvolution.endpoint_conditioned_sample_state_dictMethod
endpoint_conditioned_sample_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and draws samples under the model, conditioned on the leaf observations. These samples are stored in the node_message_dict, which is returned. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.felsenstein!Method
felsenstein!(node::FelNode, models; partition_list = nothing)

Should usually be called on the root of the tree. Propagates Felsenstein pass up from the tips to the root. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.felsenstein_down!Method
felsenstein_down!(node::FelNode, models; partition_list = 1:length(tree.message), temp_message = copy_message(tree.message))

Should usually be called on the root of the tree. Propagates Felsenstein pass down from the root to the tips. felsenstein!() should usually be called first. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.forward!Method
forward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)

Propagate the source partition forwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.

source
MolecularEvolution.gappy_Q_from_symmetric_rate_matrixMethod
gappy_Q_from_symmetric_rate_matrix(sym_mat, gap_rate, eq_freqs)

Takes a symmetric rate matrix and gap rate (governing mutations to and from gaps) and returns a gappy rate matrix. The equilibrium frequencies are multiplied on column-wise.

source
MolecularEvolution.get_phylo_treeMethod
get_phylo_tree(molev_root::FelNode; data_function = (x -> Tuple{String,Float64}[]))

Converts a FelNode tree to a Phylo tree. The data_function should return a list of tuples of the form (key, value) to be added to the Phylo tree data Dictionary. Any key/value pairs on the FelNode node_data Dict will also be added to the Phylo tree.

source
MolecularEvolution.golden_section_maximizeMethod

Golden section search.

Given a function f with a single local minimum in the interval [a,b], gss returns a subset interval [c,d] that contains the minimum with d-c <= tol.

Examples

julia> f(x) = -(x-2)^2
+0.5885327257940255

From: Richard P. Brent, "Algorithms for Minimization without Derivatives" (1973). Chapter 5.

source
MolecularEvolution.cascading_max_state_dictMethod
cascading_max_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their inferred ancestors under the following scheme: the state that maximizes the marginal likelihood is selected at the root, and then, for each node, the maximum likelihood state is selected conditioned on the maximized state of the parent node and the observations of all descendents. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.char_proportionsMethod
char_proportions(seqs, alphabet::String)

Takes a vector of sequences and returns a vector of the proportion of each character across all sequences. An example alphabet argument is MolecularEvolution.AAstring.
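
As a rough illustration of what such a proportion vector looks like, here is a minimal stand-alone sketch in plain Julia. The name char_proportions_sketch is hypothetical, and skipping characters that are not in the alphabet is an assumption rather than documented library behavior.

```julia
# Hypothetical helper, not the library's implementation.
function char_proportions_sketch(seqs::Vector{String}, alphabet::String)
    counts = zeros(Float64, length(alphabet))
    # Map each alphabet character to its position in the output vector.
    index = Dict(c => i for (i, c) in enumerate(alphabet))
    for seq in seqs, c in seq
        i = get(index, c, 0)
        # Assumption: characters outside the alphabet (eg. gaps) are skipped.
        i > 0 && (counts[i] += 1)
    end
    return counts ./ sum(counts)
end

props = char_proportions_sketch(["ACGT", "AACC"], "ACGT")
```

The result is ordered like the alphabet string, so props[1] is the proportion of 'A' across all sequences.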

source
MolecularEvolution.colored_seq_drawMethod
colored_seq_draw(x, y, str::AbstractString; color_dict=Dict(), font_size=8pt, posx=hcenter, posy=vcenter)

Draw an arbitrary sequence. color_dict gives a mapping from characters to colors (default black). Default options for nucleotide colorings and amino acid colorings are given in the constants NUC_COLORS and AA_COLORS. This can be used along with compose_dict for drawing sequences at nodes in a tree (see tree_draw). Returns a Compose container.

source
MolecularEvolution.combine!Method
combine!(dest::P, src::P) where P<:Partition

Combines evidence from two partitions of the same type, storing the result in dest. Note: You should overload this for your own Partition types.

source
MolecularEvolution.deepequalsMethod
deepequals(t1, t2)

Checks whether two trees are equal by recursively calling this on all fields, except :parent, in order to prevent cycles. In order to ensure that the :parent field is not hiding something different on both trees, ensure that each is consistent first (see: istreeconsistent).

source
MolecularEvolution.discrete_name_color_dictMethod
discrete_name_color_dict(newt::AbstractTreeNode,tag_func; rainbow = false, scramble = false, darken = true, col_seed = nothing)

Takes a tree and a tag_func, which converts the leaf label into a category (ie. there should be <20 of these), and returns a color dictionary that can be used to color the leaves or bubbles.

Example tag_func: function tag_func(nam::String) return split(nam,"_")[1] end

For prettier colors, but less discrimination: rainbow = true. To randomize the rainbow color assignment: scramble = true. col_seed is currently set to white, and excluded from the list of colors, to make them more visible.

Consider making your own version of this function to customize colors as you see fit.

Example use: num_leaves = 50; Ne_func(t) = 1*(e^-t).+5.0; newt = sim_tree(num_leaves,Ne_func,1.0,nstart = rand(1:num_leaves)); newt = ladderize(newt); tag_func(nam) = mod(sum(Int.(collect(nam))),7); dic = discrete_name_color_dict(newt,tag_func,rainbow = true); tree_draw(newt,line_width = 0.5mm,label_color_dict = dic)

source
MolecularEvolution.endpoint_conditioned_sample_state_dictMethod
endpoint_conditioned_sample_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and draws samples under the model, conditioned on the leaf observations. These samples are stored in the node_message_dict, which is returned. A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.felsenstein!Method
felsenstein!(node::FelNode, models; partition_list = nothing)

Should usually be called on the root of the tree. Propagates Felsenstein pass up from the tips to the root. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.
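
To make the upward pass concrete, the following is a self-contained toy version of Felsenstein's pruning recursion for one nucleotide site under the JC69 model. It does not use the library's types: ToyNode, leaf, internal, upward and jc69_P are all hypothetical names, and the uniform root frequencies are an assumption of the sketch.

```julia
# JC69 transition probabilities for a branch of length t
# (expected substitutions per site).
function jc69_P(t)
    decay = exp(-4 * t / 3)
    same = 0.25 + 0.75 * decay
    diff = 0.25 - 0.25 * decay
    return [i == j ? same : diff for i in 1:4, j in 1:4]
end

# Toy tree node: leaves carry a one-hot observation, internal nodes carry children.
struct ToyNode
    branchlength::Float64
    children::Vector{ToyNode}
    obs::Vector{Float64}   # one-hot over A, C, G, T; empty for internal nodes
end

leaf(t, state) = ToyNode(t, ToyNode[], [i == state ? 1.0 : 0.0 for i in 1:4])
internal(t, kids) = ToyNode(t, kids, Float64[])

# Upward pass: likelihood of all observations below `node`, for each state at `node`.
function upward(node::ToyNode)
    isempty(node.children) && return node.obs
    msg = ones(4)
    for child in node.children
        # Carry the child's message back along its branch, then combine the evidence.
        msg .*= jc69_P(child.branchlength) * upward(child)
    end
    return msg
end

root = internal(0.0, [leaf(0.1, 1), leaf(0.1, 1)])
site_loglik = log(sum(0.25 .* upward(root)))   # assumed uniform root frequencies
```

The per-child matrix-vector product plays the role of propagating a message along a branch, and the elementwise product plays the role of combining evidence from sibling subtrees.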

source
MolecularEvolution.felsenstein_down!Method
felsenstein_down!(node::FelNode, models; partition_list = 1:length(tree.message), temp_message = copy_message(tree.message))

Should usually be called on the root of the tree. Propagates Felsenstein pass down from the root to the tips. felsenstein!() should usually be called first. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.forward!Method
forward!(dest::Partition, source::Partition, model::BranchModel, node::FelNode)

Propagate the source partition forwards along the branch to the destination partition, under the model. Note: You should overload this for your own BranchModel types.

source
MolecularEvolution.gappy_Q_from_symmetric_rate_matrixMethod
gappy_Q_from_symmetric_rate_matrix(sym_mat, gap_rate, eq_freqs)

Takes a symmetric rate matrix and gap rate (governing mutations to and from gaps) and returns a gappy rate matrix. The equilibrium frequencies are multiplied on column-wise.

source
MolecularEvolution.get_phylo_treeMethod
get_phylo_tree(molev_root::FelNode; data_function = (x -> Tuple{String,Float64}[]))

Converts a FelNode tree to a Phylo tree. The data_function should return a list of tuples of the form (key, value) to be added to the Phylo tree data Dictionary. Any key/value pairs on the FelNode node_data Dict will also be added to the Phylo tree.

source
MolecularEvolution.golden_section_maximizeMethod

Golden section search.

Given a function f with a single local minimum in the interval [a,b], gss returns a subset interval [c,d] that contains the minimum with d-c <= tol.

Examples

julia> f(x) = -(x-2)^2
 f (generic function with 1 method)
 
 julia> m = golden_section_maximize(f, 1, 5, identity, 1e-10)
-2.0000000000051843

From: https://en.wikipedia.org/wiki/Golden-section_search
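
For readers unfamiliar with the technique, here is a minimal stand-alone golden-section search for a maximum, written in plain Julia. The name golden_max_sketch and its argument list are hypothetical; the library function takes additional arguments, as its documented example shows, and may differ in detail.

```julia
# Hypothetical stand-alone routine, not the library's implementation.
# Assumes f has a single local maximum on [a, b].
function golden_max_sketch(f, a, b; tol = 1e-10)
    invphi = (sqrt(5) - 1) / 2           # 1/φ ≈ 0.618
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    while (b - a) > tol
        if f(c) > f(d)                   # the maximum lies in [a, d]
            b = d
        else                             # the maximum lies in [c, b]
            a = c
        end
        c = b - invphi * (b - a)
        d = a + invphi * (b - a)
    end
    return (a + b) / 2                   # midpoint of the final bracket
end

f(x) = -(x - 2)^2
m = golden_max_sketch(f, 1.0, 5.0)
```

Each iteration shrinks the bracket by a factor of about 0.618, so reaching a tolerance of 1e-10 from an interval of width 4 takes roughly 50 iterations.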

source
MolecularEvolution.highlight_seq_drawMethod
highlight_seq_draw(x, y, str::AbstractString, region, basecolor, hicolor; fontsize=8pt, posx=hcenter, posy=vcenter)

Draw a sequence, highlighting the sites given in region. This can be used along with compose_dict for drawing sequences at nodes in a tree (see tree_draw). Returns a Compose container.

source
MolecularEvolution.highlight_seq_drawMethod
highlight_seq_draw(x, y, str::AbstractString, region, basecolor, hicolor; fontsize=8pt, posx=hcenter, posy=vcenter)

Draw a sequence, highlighting the sites given in region. This can be used along with compose_dict for drawing sequences at nodes in a tree (see tree_draw). Returns a Compose container.

source
MolecularEvolution.highlighter_tree_drawMethod
highlighter_tree_draw(tree, ali_seqs, seqnames, master;
     highlighter_start = 1.1, highlighter_width = 1,
     coord_width = highlighter_start + highlighter_width + 0.1,
     scale_length = nothing, major_breaks = 1000, minor_breaks = 500,
-    tree_args = NamedTuple[], legend_padding = 0.5cm, legend_colors = NUC_colors)

Draws a combined tree and highlighter plot. The vector of seqnames must match the node names in tree.

kwargs:

  • tree_args: kwargs to pass to `tree_draw()`
  • legend_colors: Mapping of characters to highlighter colors (default NT_colors)
  • scale_length: Length of the scale bar
  • highlighter_start: Canvas start for the highlighter panel
  • highlighter_width: Canvas width for the highlighter panel
  • coord_width: Total width of the canvas
  • major_breaks: Numbered breaks for sequence axis
  • minor_breaks: Ticks for sequence axis
source
MolecularEvolution.internal_message_init!Method
internal_message_init!(tree::FelNode, partition::Partition)
+    tree_args = NamedTuple[], legend_padding = 0.5cm, legend_colors = NUC_colors)

Draws a combined tree and highlighter plot. The vector of seqnames must match the node names in tree.

kwargs:

  • tree_args: kwargs to pass to `tree_draw()`
  • legend_colors: Mapping of characters to highlighter colors (default NT_colors)
  • scale_length: Length of the scale bar
  • highlighter_start: Canvas start for the highlighter panel
  • highlighter_width: Canvas width for the highlighter panel
  • coord_width: Total width of the canvas
  • major_breaks: Numbered breaks for sequence axis
  • minor_breaks: Ticks for sequence axis
source
MolecularEvolution.lazyprep!Method
lazyprep!(tree::FelNode, initial_message::Vector{<:Partition}; partition_list = 1:length(tree.message), direction::LazyDirection = LazyUp())

Extra, intermediate step of tree preparations between initializing messages across the tree and calling message passing algorithms with LazyPartition.

  1. Perform a lazysort! on tree to obtain the optimal tree for a lazy felsenstein! prop, or a sample_down!.
  2. Fix tree.parent_message to an initial message.
  3. Preallocate sufficiently many inner partitions needed for a felsenstein! prop, or a sample_down!.
  4. Specialized preparations based on the direction of the operations (forward!, backward!). LazyDown or LazyUp.

See also LazyDown, LazyUp.

source
MolecularEvolution.lazysort!Method
  • Should be run on a tree containing LazyPartitions before running felsenstein!. Sorts for a minimal count of active partitions during a felsenstein!
  • Returns the minimum length of memoryblocks (-1) required for a felsenstein! prop. We need a temporary memoryblock during backward!, hence the '-1'.
Note

Since felsenstein! uses a stack, we want to avoid having long node.children[1].children[1]... chains

source
MolecularEvolution.log_likelihood!Method
log_likelihood!(tree::FelNode, models; partition_list = nothing)

First re-computes the upward felsenstein pass, and then computes the log likelihood of this tree. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.log_likelihoodMethod
log_likelihood(tree::FelNode, models; partition_list = nothing)

Computes the log likelihood of this tree. Requires felsenstein!() to have been run. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.longest_pathMethod

Returns the longest path in a tree. For convenience, this is returned as two lists of the form [leaf_node, parent_node, .... root], where the leaf_node nodes are selected to be the furthest away.

source
MolecularEvolution.marginal_state_dictMethod
marginal_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their marginal reconstructions (ie. P(state|all observations,model)). A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.matrix_for_displayMethod
matrix_for_display(Q,labels)

Takes a numerical matrix and a vector of labels, and returns a typically mixed type matrix with the numerical values and the labels. This is to easily visualize rate matrices in eg. the REPL.

source
MolecularEvolution.mixMethod
mix(swm_part::SWMPartition{PType} ) where {PType <: MultiSitePartition}

mix collapses a Site-Wise Mixture partition to a single component partition, weighted by the site-wise likelihoods for each component, and the init weights. Specifically, it takes a SWMPartition{Ptype} and returns a PType. You'll need to have this implemented for certain helper functionality if you're playing with new kinds of SWMPartitions that aren't mixtures of DiscretePartitions.

source
MolecularEvolution.nni_optim!Method
nni_optim!(tree::FelNode, models; partition_list = nothing, tol = 1e-5)

Considers local branch swaps for all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize tree topology with all models). acc_rule allows you to specify a function that takes the current and proposed log likelihoods; the move is accepted if it returns true.

source
MolecularEvolution.partition2obsMethod
partition2obs(part::Partition)

Extracts the most likely state from a Partition, transforming it into a convenient type. For example, a NucleotidePartition will be transformed into a nucleotide sequence of type String. Note: You should overload this for your own Partition types.

source
MolecularEvolution.populate_tree!Method
populate_tree!(tree::FelNode, starting_message, names, data; init_all_messages = true, tolerate_missing = 1)

Takes a tree, and a starting_message (which will serve as the memory template for populating messages all over the tree). starting_message can be a message (ie. a vector of Partitions), but will also work with a single Partition (although the tree will still be populated with a length-1 vector of Partitions). Further, as long as obs2partition is implemented for your Partition type, the leaf nodes will be populated with the data from data, matching the names on each leaf. When a leaf on the tree has a name that doesn't match anything in names, then if

  • tolerate_missing = 0, an error will be thrown
  • tolerate_missing = 1, a warning will be thrown, and the message will be set to the uninformative message (requires identity!(::Partition) to be defined)
  • tolerate_missing = 2, the message will be set to the uninformative message, without warnings (requires identity!(::Partition) to be defined)
source
MolecularEvolution.promote_internalMethod
promote_internal(tree::FelNode)

Creates a new tree similar to the given tree, but with 'dummy' leaf nodes (w/ zero branchlength) representing each internal node (for drawing / evenly spacing labels on internal nodes).

source
MolecularEvolution.quadratic_CIMethod
quadratic_CI(f::Function,opt_params::Vector, param_ind::Int; rate_conf_level = 0.99, nudge_amount = 0.01)

Takes a NEGATIVE log likelihood function (compatible with Optim.jl), a vector of maximizing parameters, and a parameter index. Returns the quadratic confidence interval.

source
MolecularEvolution.quadratic_CIMethod
quadratic_CI(xvec,yvec; rate_conf_level = 0.99)

Takes xvec, a vector of parameter values, and yvec, a vector of log likelihood evaluations (note: NOT the negative LLs you might use with Optim.jl). Returns the confidence intervals computed by a quadratic approximation to the LL.

source
MolecularEvolution.reversibleQMethod
reversibleQ(param_vec,eq_freqs)

Takes a vector of parameters and equilibrium frequencies and returns a reversible rate matrix. The parameters are the upper triangle of the rate matrix, with the diagonal elements omitted, and the equilibrium frequencies are multiplied column-wise.

source
MolecularEvolution.root2tip_distancesMethod
root2tips(root::AbstractTreeNode)

Returns a vector of root-to-tip distances, and a node-to-index dictionary. Be aware that this dictionary will break when any of the node content (ie. anything on the tree) changes.

source
MolecularEvolution.sample_down!Method

sample_down!(root::FelNode,models,partition_list)

Generates samples under the model. The root.parent_message is taken as the starting distribution, and node.message contains the sampled messages. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.savefig_tweakSVGMethod
savefig_tweakSVG(fname, plot::Context; width = 10cm, height = 10cm, linecap_round = true, white_background = true)

Saves a figure created using the Compose approach, but tweaks the SVG after export.

eg. savefig_tweakSVG("export.svg",pl)

source
MolecularEvolution.savefig_tweakSVGMethod
savefig_tweakSVG(fname, plot::Plots.Plot; hack_bounding_box = true, new_viewbox = nothing, linecap_round = true)

Note: Might only work if you're using the GR backend!! Saves a figure created using the Phylo Plots recipe, but tweaks the SVG after export. new_viewbox needs to be an array of 4 numbers, typically starting at [0 0 plot_width*4 plot_height*4] but this lets you add shifts, in case the plot is getting cut off.

eg. savefig_tweakSVG("export.svg",pl, new_viewbox = [-100, -100, 3000, 4500])

source
MolecularEvolution.sim_treeMethod
sim_tree(add_limit::Int,Ne_func,sample_rate_func; nstart = 1, time = 0.0, mutation_rate = 1.0, T = Float64)

Simulates a tree of type FelNode{T}. Allows an effective population size function (Ne_func), as well as a sample rate function (sample_rate_func), which can also just be constants.

Ne_func(t) = (sin(t/10)+1)*100.0 + 10.0; root = sim_tree(600,Ne_func,1.0); simple_tree_draw(ladderize(root))

source
MolecularEvolution.simple_radial_tree_plotMethod
simple_radial_tree_plot(root::FelNode; canvas_width = 10cm, line_color = "black", line_width = 0.1mm)

Draws a radial tree. No frills. No labels. Canvas height is automatically determined to avoid distorting the tree.

newt = better_newick_import("((A:1,B:1,C:1,D:1,E:1,F:1,G:1):1,(H:1,I:1):1);", FelNode{Float64}); simple_radial_tree_plot(newt,line_width = 0.5mm,root_angle = 7/10)

source
MolecularEvolution.simple_tree_drawMethod

img = simple_tree_draw(tree::FelNode; canvas_width = 15cm, canvas_height = 15cm, line_color = "black", line_width = 0.1mm)

A line drawing of a tree with very few options.

img = simple_tree_draw(tree)
+Initializes the message template for each node in the tree, allocating space for each partition.
source
MolecularEvolution.lazyprep!Method
lazyprep!(tree::FelNode, initial_message::Vector{<:Partition}; partition_list = 1:length(tree.message), direction::LazyDirection = LazyUp())

Extra, intermediate step of tree preparations between initializing messages across the tree and calling message passing algorithms with LazyPartition.

  1. Perform a lazysort! on tree to obtain the optimal tree for a lazy felsenstein! prop, or a sample_down!.
  2. Fix tree.parent_message to an initial message.
  3. Preallocate sufficiently many inner partitions needed for a felsenstein! prop, or a sample_down!.
  4. Specialized preparations based on the direction of the operations (forward!, backward!). LazyDown or LazyUp.

See also LazyDown, LazyUp.

source
MolecularEvolution.lazysort!Method
  • Should be run on a tree containing LazyPartitions before running felsenstein!. Sorts for a minimal count of active partitions during a felsenstein!
  • Returns the minimum length of memoryblocks (-1) required for a felsenstein! prop. We need a temporary memoryblock during backward!, hence the '-1'.
Note

Since felsenstein! uses a stack, we want to avoid having long node.children[1].children[1]... chains

source
MolecularEvolution.log_likelihood!Method
log_likelihood!(tree::FelNode, models; partition_list = nothing)

First re-computes the upward felsenstein pass, and then computes the log likelihood of this tree. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.log_likelihoodMethod
log_likelihood(tree::FelNode, models; partition_list = nothing)

Computes the log likelihood of this tree. Requires felsenstein!() to have been run. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.longest_pathMethod

Returns the longest path in a tree. For convenience, this is returned as two lists of the form [leaf_node, parent_node, .... root], where the leaf_node nodes are selected to be the furthest away.

source
MolecularEvolution.marginal_state_dictMethod
marginal_state_dict(tree::FelNode, model; partition_list = 1:length(tree.message), node_message_dict = Dict{FelNode,Vector{Partition}}())

Takes in a tree and a model (which can be a single model, an array of models, or a function that maps FelNode->Array{<:BranchModel}), and returns a dictionary mapping nodes to their marginal reconstructions (ie. P(state|all observations,model)). A subset of partitions can be specified by partition_list, and a dictionary can be passed in to avoid re-allocating memory, in case you're running this over and over.

source
MolecularEvolution.matrix_for_displayMethod
matrix_for_display(Q,labels)

Takes a numerical matrix and a vector of labels, and returns a typically mixed type matrix with the numerical values and the labels. This is to easily visualize rate matrices in eg. the REPL.
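
A mixed-type display matrix of this kind can be sketched in a few lines of plain Julia. The layout below (blank corner cell, labels along the first row and first column) is an assumption about one plausible arrangement, and labelled_matrix_sketch is a hypothetical name.

```julia
# Hypothetical helper; the exact layout the library produces may differ.
function labelled_matrix_sketch(Q::AbstractMatrix, labels::AbstractVector)
    n = length(labels)
    M = Matrix{Any}(undef, n + 1, n + 1)
    M[1, 1] = ""                 # blank corner cell
    M[1, 2:end] .= labels        # column labels along the first row
    M[2:end, 1] .= labels        # row labels down the first column
    M[2:end, 2:end] .= Q         # numeric entries
    return M
end

Q = [-1.0 1.0; 2.0 -2.0]
M = labelled_matrix_sketch(Q, ["A", "B"])
```

Because the element type is Any, the REPL prints strings and numbers side by side, which is what makes the rate matrix easy to read.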

source
MolecularEvolution.mixMethod
mix(swm_part::SWMPartition{PType} ) where {PType <: MultiSitePartition}

mix collapses a Site-Wise Mixture partition to a single component partition, weighted by the site-wise likelihoods for each component, and the init weights. Specifically, it takes a SWMPartition{Ptype} and returns a PType. You'll need to have this implemented for certain helper functionality if you're playing with new kinds of SWMPartitions that aren't mixtures of DiscretePartitions.

source
MolecularEvolution.nni_optim!Method
nni_optim!(tree::FelNode, models; partition_list = nothing, tol = 1e-5)

Considers local branch swaps for all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize tree topology with all models). acc_rule allows you to specify a function that takes the current and proposed log likelihoods; the move is accepted if it returns true.
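
An acceptance rule of the kind described is just a two-argument function returning a Bool. The two rules below are illustrative sketches with hypothetical names. The argument order (current first, proposed second) follows the description above, and how the rule is passed to the optimizer is not shown here.

```julia
# Greedy rule: accept a swap only if it strictly improves the log likelihood.
greedy_rule(current_LL, proposed_LL) = proposed_LL > current_LL

# Metropolis-style rule (illustrative): always accept improvements,
# and accept a worse move with probability exp(proposed_LL - current_LL).
metropolis_rule(current_LL, proposed_LL) = rand() < exp(proposed_LL - current_LL)
```

A greedy rule gives a hill-climbing topology search, while a stochastic rule can escape local optima at the cost of occasionally accepting worse trees.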

source
MolecularEvolution.partition2obsMethod
partition2obs(part::Partition)

Extracts the most likely state from a Partition, transforming it into a convenient type. For example, a NucleotidePartition will be transformed into a nucleotide sequence of type String. Note: You should overload this for your own Partition types.

source
MolecularEvolution.populate_tree!Method
populate_tree!(tree::FelNode, starting_message, names, data; init_all_messages = true, tolerate_missing = 1)

Takes a tree, and a starting_message (which will serve as the memory template for populating messages all over the tree). starting_message can be a message (ie. a vector of Partitions), but will also work with a single Partition (although the tree will still be populated with a length-1 vector of Partitions). Further, as long as obs2partition is implemented for your Partition type, the leaf nodes will be populated with the data from data, matching the names on each leaf. When a leaf on the tree has a name that doesn't match anything in names, then if

  • tolerate_missing = 0, an error will be thrown
  • tolerate_missing = 1, a warning will be thrown, and the message will be set to the uninformative message (requires identity!(::Partition) to be defined)
  • tolerate_missing = 2, the message will be set to the uninformative message, without warnings (requires identity!(::Partition) to be defined)
source
MolecularEvolution.promote_internalMethod
promote_internal(tree::FelNode)

Creates a new tree similar to the given tree, but with 'dummy' leaf nodes (w/ zero branchlength) representing each internal node (for drawing / evenly spacing labels on internal nodes).

source
MolecularEvolution.quadratic_CIMethod
quadratic_CI(f::Function,opt_params::Vector, param_ind::Int; rate_conf_level = 0.99, nudge_amount = 0.01)

Takes a NEGATIVE log likelihood function (compatible with Optim.jl), a vector of maximizing parameters, and a parameter index. Returns the quadratic confidence interval.

source
MolecularEvolution.quadratic_CIMethod
quadratic_CI(xvec,yvec; rate_conf_level = 0.99)

Takes xvec, a vector of parameter values, and yvec, a vector of log likelihood evaluations (note: NOT the negative LLs you might use with Optim.jl). Returns the confidence intervals computed by a quadratic approximation to the LL.
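
One common way to build such an interval is to fit a parabola to the (parameter, log likelihood) points and report where the fitted curve drops below its peak by half a chi-square quantile. The sketch below does this in plain Julia; quadratic_CI_sketch is a hypothetical name, and whether the library uses exactly this threshold is an assumption. The default drop corresponds to a 0.99 level with one degree of freedom.

```julia
using LinearAlgebra

# Hypothetical sketch, not the library's implementation.
# `drop` is half the chi-square (1 df) quantile at the 0.99 level.
function quadratic_CI_sketch(xvec, yvec; drop = 6.634896601021214 / 2)
    # Least-squares fit of y ≈ a*x^2 + b*x + c.
    A = hcat(xvec .^ 2, xvec, ones(length(xvec)))
    a, b, c = A \ yvec
    @assert a < 0 "the fitted log likelihood must be concave"
    peak = -b / (2 * a)
    halfwidth = sqrt(-drop / a)   # solves a*(x - peak)^2 = -drop
    return (peak - halfwidth, peak + halfwidth)
end

xs = collect(0.0:0.5:4.0)
ys = -1 .* (xs .- 2) .^ 2         # a toy log likelihood peaked at x = 2
lo, hi = quadratic_CI_sketch(xs, ys)
```

The sharper the curvature at the peak (the more negative a), the narrower the resulting interval.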

source
MolecularEvolution.reversibleQMethod
reversibleQ(param_vec,eq_freqs)

Takes a vector of parameters and equilibrium frequencies and returns a reversible rate matrix. The parameters are the upper triangle of the rate matrix, with the diagonal elements omitted, and the equilibrium frequencies are multiplied column-wise.
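
The construction can be sketched in plain Julia as follows. The function name reversible_Q_sketch is hypothetical, the row-by-row ordering of the upper-triangle parameters is an assumption, and no rate normalization is applied.

```julia
# Hypothetical sketch, not the library's implementation.
function reversible_Q_sketch(param_vec::Vector{Float64}, eq_freqs::Vector{Float64})
    n = length(eq_freqs)
    @assert length(param_vec) == n * (n - 1) ÷ 2
    Q = zeros(n, n)
    k = 1
    for i in 1:n, j in (i + 1):n    # assumption: upper triangle filled row by row
        Q[i, j] = param_vec[k] * eq_freqs[j]   # column j scaled by its frequency
        Q[j, i] = param_vec[k] * eq_freqs[i]   # symmetric parameter, column i scaled
        k += 1
    end
    for i in 1:n
        Q[i, i] = -sum(Q[i, :])     # rows of a rate matrix sum to zero
    end
    return Q
end

pi_ = [0.1, 0.2, 0.3, 0.4]
Q = reversible_Q_sketch(collect(1.0:6.0), pi_)
```

Scaling a symmetric matrix column-wise by the equilibrium frequencies gives detailed balance, pi_[i] * Q[i, j] == pi_[j] * Q[j, i], which is what makes the resulting model reversible.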

source
MolecularEvolution.root2tip_distancesMethod
root2tips(root::AbstractTreeNode)

Returns a vector of root-to-tip distances, and a node-to-index dictionary. Be aware that this dictionary will break when any of the node content (ie. anything on the tree) changes.

source
MolecularEvolution.sample_down!Method

sample_down!(root::FelNode,models,partition_list)

Generates samples under the model. The root.parent_message is taken as the starting distribution, and node.message contains the sampled messages. models can either be a single model (if the messages on the tree contain just one Partition) or an array of models, if the messages have >1 Partition, or a function that takes a node, and returns a Vector{<:BranchModel} if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.savefig_tweakSVGMethod
savefig_tweakSVG(fname, plot::Context; width = 10cm, height = 10cm, linecap_round = true, white_background = true)

Saves a figure created using the Compose approach, but tweaks the SVG after export.

eg. savefig_tweakSVG("export.svg",pl)

source
MolecularEvolution.savefig_tweakSVGMethod
savefig_tweakSVG(fname, plot::Plots.Plot; hack_bounding_box = true, new_viewbox = nothing, linecap_round = true)

Note: Might only work if you're using the GR backend!! Saves a figure created using the Phylo Plots recipe, but tweaks the SVG after export. new_viewbox needs to be an array of 4 numbers, typically starting at [0 0 plot_width*4 plot_height*4] but this lets you add shifts, in case the plot is getting cut off.

eg. savefig_tweakSVG("export.svg",pl, new_viewbox = [-100, -100, 3000, 4500])

source
MolecularEvolution.sim_treeMethod
sim_tree(add_limit::Int,Ne_func,sample_rate_func; nstart = 1, time = 0.0, mutation_rate = 1.0, T = Float64)

Simulates a tree of type FelNode{T}. Allows an effective population size function (Ne_func), as well as a sample rate function (sample_rate_func), both of which can also just be constants.

Ne_func(t) = (sin(t/10)+1)*100.0 + 10.0
root = sim_tree(600,Ne_func,1.0)
simple_tree_draw(ladderize(root))

source
MolecularEvolution.simple_radial_tree_plotMethod
simple_radial_tree_plot(root::FelNode; canvas_width = 10cm, line_color = "black", line_width = 0.1mm)

Draws a radial tree. No frills. No labels. Canvas height is automatically determined to avoid distorting the tree.

newt = better_newick_import("((A:1,B:1,C:1,D:1,E:1,F:1,G:1):1,(H:1,I:1):1);", FelNode{Float64});
simple_radial_tree_plot(newt,line_width = 0.5mm,root_angle = 7/10)

source
MolecularEvolution.simple_tree_drawMethod

img = simple_tree_draw(tree::FelNode; canvas_width = 15cm, canvas_height = 15cm, line_color = "black", line_width = 0.1mm)

A line drawing of a tree with very few options.

img = simple_tree_draw(tree)
 img |> SVG("imgout.svg",10cm, 10cm)
 OR
 using Cairo
-img |> PDF("imgout.pdf",10cm, 10cm)
source
MolecularEvolution.total_LLMethod

total_LL(p::Partition)

If called on the root, it returns the log likelihood associated with that partition. Can be overloaded for complex partitions without straightforward site log likelihoods.

source
MolecularEvolution.tree2distancesMethod
tree2distances(root::AbstractTreeNode)

Returns a distance matrix for all pairs of leaf nodes, and a node-to-index dictionary. Be aware that this dictionary will break when any of the node content (ie. anything on the tree) changes.

source
MolecularEvolution.tree2shared_branch_lengthsMethod
tree2shared_branch_lengths(root::AbstractTreeNode)

Returns a distance matrix for all pairs of leaf nodes, and a node-to-index dictionary. Be aware that this dictionary will break when any of the node content (ie. anything on the tree) changes.

source
MolecularEvolution.tree_drawMethod
tree_draw(tree::FelNode;
     canvas_width = 15cm, canvas_height = 15cm,
     stretch_for_labels = 2.0, draw_labels = true,
     line_width = 0.1mm, font_size = 4pt,
@@ -60,10 +60,10 @@
 img |> SVG("imgout.svg",10cm, 10cm)
 OR
 using Cairo
-img |> PDF("imgout.pdf",10cm, 10cm)
source
MolecularEvolution.tree_polish!Method

tree_polish!(newt, models; tol = 10^-4, verbose = 1, topology = true)

Takes a tree and a model function, and optimizes branch lengths and, optionally, topology. Returns final LL. Set verbose=0 to suppress output. Note: This is not intended for an exhaustive tree search (which requires different heuristics), but rather to polish a tree that is already relatively close to the optimum.

source
MolecularEvolution.unc2probvecMethod
unc2probvec(v)

Takes an array of N-1 unbounded values and returns an array of N values that sums to 1. Typically useful for optimizing over categorical probability distributions.

source
MolecularEvolution.univariate_maximizeMethod
univariate_maximize(f, a::Real, b::Real, transform, optimizer::BrentsMethodOpt, t::Real; ε::Real=sqrt(eps))

Maximizes f(x) using Brent's method. See ?brents_method_minimize.

source
MolecularEvolution.univariate_maximizeMethod
univariate_maximize(f, a::Real, b::Real, transform, optimizer::GoldenSectionOpt, tol::Real)

Maximizes f(x) using a Golden Section Search. See ?golden_section_maximize.

Examples

julia> f(x) = -(x-2)^2
+img |> PDF("imgout.pdf",10cm, 10cm)
source
MolecularEvolution.tree_polish!Method

tree_polish!(newt, models; tol = 10^-4, verbose = 1, topology = true)

Takes a tree and a model function, and optimizes branch lengths and, optionally, topology. Returns final LL. Set verbose=0 to suppress output. Note: This is not intended for an exhaustive tree search (which requires different heuristics), but rather to polish a tree that is already relatively close to the optimum.

source
MolecularEvolution.unc2probvecMethod
unc2probvec(v)

Takes an array of N-1 unbounded values and returns an array of N values that sums to 1. Typically useful for optimizing over categorical probability distributions.
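
One common way to get such a mapping is a softmax with the last logit pinned at zero. The sketch below illustrates that idea only; `sketch_unc2probvec` is a hypothetical name, and the package's unc2probvec may use a different transform.

```julia
# Hypothetical helper, NOT the package's unc2probvec.
# Maps N-1 unbounded values to N probabilities via a softmax
# whose final logit is fixed at zero.
function sketch_unc2probvec(v)
    z = vcat(v, 0.0)            # append the pinned logit
    w = exp.(z .- maximum(z))   # subtract the max for numerical stability
    return w ./ sum(w)          # normalise so the entries sum to 1
end

p = sketch_unc2probvec([0.0, 0.0])
extreme = sketch_unc2probvec([1000.0, -1000.0])
```

An optimizer can then search freely over the N-1 unbounded values while the likelihood only ever sees a valid probability vector.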

source
MolecularEvolution.univariate_maximizeMethod
univariate_maximize(f, a::Real, b::Real, transform, optimizer::BrentsMethodOpt, t::Real; ε::Real=sqrt(eps))

Maximizes f(x) using Brent's method. See ?brents_method_minimize.

source
MolecularEvolution.univariate_maximizeMethod
univariate_maximize(f, a::Real, b::Real, transform, optimizer::GoldenSectionOpt, tol::Real)

Maximizes f(x) using a Golden Section Search. See ?golden_section_maximize.

Examples

julia> f(x) = -(x-2)^2
 f (generic function with 1 method)
 
 julia> m = univariate_maximize(f, 1, 5, identity, GoldenSectionOpt(), 1e-10)
-2.0000000000051843
source
MolecularEvolution.values_from_phylo_treeMethod
values_from_phylo_tree(phylo_tree, key)
 
-Returns a list of values from the given key in the nodes of the phylo_tree, in an order that is somehow compatible with the order the nodes get plotted in.
source
MolecularEvolution.weightEMMethod
weightEM(con_lik_matrix::Array{Float64,2}, θ; conc = 0.0, iters = 500)

Takes a conditional likelihood matrix (#categories-by-sites) and a starting frequency vector θ (length(θ) = #categories) and optimizes θ (using Expectation Maximization. Maybe.). If conc > 0 then this gives something like variational bayes behavior for LDA. Maybe.

source
MolecularEvolution.write_fastaMethod
write_fasta(filepath::String, sequences::Vector{String}; seq_names = nothing)

Writes a fasta file from a vector of sequences, with optional seq_names.

source
MolecularEvolution.write_nexusMethod
write_nexus(fname::String,tree::FelNode)

Writes the tree as a nexus file, suitable for opening in eg. FigTree. Data in the node_data dictionary will be converted into annotations. Only tested for simple node_data formats and types.

source
+Returns a list of values from the given key in the nodes of the phylo_tree, in an order that is somehow compatible with the order the nodes get plotted in.
source
MolecularEvolution.weightEMMethod
weightEM(con_lik_matrix::Array{Float64,2}, θ; conc = 0.0, iters = 500)

Takes a conditional likelihood matrix (#categories-by-sites) and a starting frequency vector θ (length(θ) = #categories) and optimizes θ (using Expectation Maximization. Maybe.). If conc > 0 then this gives something like variational bayes behavior for LDA. Maybe.
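
For orientation, a plain EM update for mixture weights can be sketched as below. This is an assumption-laden illustration, not the package's weightEM: `sketch_weight_em` is a hypothetical name, and the conc argument is ignored entirely.

```julia
# Hypothetical helper, NOT the package's weightEM (no conc handling).
# con_lik_matrix[k, s] is the likelihood of site s under category k.
function sketch_weight_em(con_lik_matrix, θ; iters = 500)
    θ = copy(θ)
    for _ in 1:iters
        post = con_lik_matrix .* θ     # E-step: θ[k] scales row k
        post ./= sum(post, dims = 1)   # normalise each site's column to posteriors
        θ = vec(sum(post, dims = 2))   # M-step: expected site count per category
        θ ./= sum(θ)                   # renormalise to a frequency vector
    end
    return θ
end

# Three sites are only possible under category 1, one site only under category 2.
lik = [1.0 1.0 1.0 0.0;
       0.0 0.0 0.0 1.0]
θ_hat = sketch_weight_em(lik, [0.5, 0.5]; iters = 10)
```

In this toy case every site is assigned unambiguously, so the weights settle at the site proportions after the first iteration.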

source
MolecularEvolution.write_fastaMethod
write_fasta(filepath::String, sequences::Vector{String}; seq_names = nothing)

Writes a fasta file from a vector of sequences, with optional seq_names.

source
MolecularEvolution.write_nexusMethod
write_nexus(fname::String,tree::FelNode)

Writes the tree as a nexus file, suitable for opening in eg. FigTree. Data in the node_data dictionary will be converted into annotations. Only tested for simple node_data formats and types.

source
diff --git a/dev/models/index.html b/dev/models/index.html index 6681e7a..d6862df 100644 --- a/dev/models/index.html +++ b/dev/models/index.html @@ -1,5 +1,5 @@ -Models · MolecularEvolution.jl

Models

Coming soon.

Discrete state models

Codon models

Continuous models

Compound models

Lazy models

LazyPartition

MolecularEvolution.LazyPartitionType

Constructor

LazyPartition{PType}()

Initialize an empty LazyPartition that is meant for wrapping a partition of type PType.

Description

With this data structure, you can wrap a partition of choice. The idea is that in some message passing algorithms, there is only a wave of partitions which need to actualize. For instance, a wave following a root-leaf path, or a depth-first traversal. In which case, we can be more economical with our memory consumption. With a worst case memory complexity of O(log(n)), where n is the number of nodes, functionality is provided for:

  • log_likelihood!
  • felsenstein!
  • sample_down!
Note

For successive felsenstein! calls, we need to extract the information at the root somehow after each call. This can be done with e.g. total_LL or site_LLs.

Further requirements

Suppose you want to wrap a partition of PType with LazyPartition:

  • If you're calling log_likelihood! and felsenstein!:
    • obs2partition!(partition::PType, obs) that transforms an observation to a partition.
  • If you're calling sample_down!:
    • partition2obs(partition::PType) that returns the most likely state from a partition, inverts obs2partition!.
source

Examples

Example 1: Initializing for an upward pass

Now, we show how to wrap the CodonPartitions from Example 3: FUBAR with LazyPartition:

You simply go from initializing messages like this:

initial_partition = CodonPartition(Int64(length(seqs[1])/3))
+Models · MolecularEvolution.jl

Models

Coming soon.

Discrete state models

Codon models

Continuous models

Compound models

Lazy models

LazyPartition

MolecularEvolution.LazyPartitionType

Constructor

LazyPartition{PType}()

Initialize an empty LazyPartition that is meant for wrapping a partition of type PType.

Description

With this data structure, you can wrap a partition of choice. The idea is that in some message passing algorithms, there is only a wave of partitions which need to actualize. For instance, a wave following a root-leaf path, or a depth-first traversal. In which case, we can be more economical with our memory consumption. With a worst case memory complexity of O(log(n)), where n is the number of nodes, functionality is provided for:

  • log_likelihood!
  • felsenstein!
  • sample_down!
Note

For successive felsenstein! calls, we need to extract the information at the root somehow after each call. This can be done with e.g. total_LL or site_LLs.

Further requirements

Suppose you want to wrap a partition of PType with LazyPartition:

  • If you're calling log_likelihood! and felsenstein!:
    • obs2partition!(partition::PType, obs) that transforms an observation to a partition.
  • If you're calling sample_down!:
    • partition2obs(partition::PType) that returns the most likely state from a partition, inverts obs2partition!.
source

Examples

Example 1: Initializing for an upward pass

Now, we show how to wrap the CodonPartitions from Example 3: FUBAR with LazyPartition:

You simply go from initializing messages like this:

initial_partition = CodonPartition(Int64(length(seqs[1])/3))
 initial_partition.state .= eq_freqs
 populate_tree!(tree,initial_partition,seqnames,seqs)

To this

initial_partition = CodonPartition(Int64(length(seqs[1])/3))
 initial_partition.state .= eq_freqs
@@ -8,5 +8,5 @@
 lazyprep!(tree, initial_partition)

By this slight modification, we go from initializing and using 554 partitions to 6 during the subsequent log_likelihood! and felsenstein! calls. There is no significant decrease in performance recorded from this switch.

Example 2: Initializing for a downward pass

Now, we show how to wrap the GaussianPartitions from Quick example: Likelihood calculations under phylogenetic Brownian motion: with LazyPartition:

You simply go from initializing messages like this:

internal_message_init!(tree, GaussianPartition())

To this (technically we only add 1 LOC)

initial_partition = GaussianPartition()
 lazy_initial_partition = LazyPartition{GaussianPartition}()
 internal_message_init!(tree, lazy_initial_partition)
-lazyprep!(tree, initial_partition, direction=LazyDown(isleafnode))
Note

Now, we provided a direction for lazyprep!. The direction is an instance of LazyDown, which was initialized with the isleafnode function. The function isleafnode dictates if a node saves its sampled observation after a down pass. If you use direction=LazyDown(), every node saves its observation.

Surrounding LazyPartition

MolecularEvolution.lazyprep!Function
lazyprep!(tree::FelNode, initial_message::Vector{<:Partition}; partition_list = 1:length(tree.message), direction::LazyDirection = LazyUp())

Extra, intermediate step of tree preparations between initializing messages across the tree and calling message passing algorithms with LazyPartition.

  1. Perform a lazysort! on tree to obtain the optimal tree for a lazy felsenstein! prop, or a sample_down!.
  2. Fix tree.parent_message to an initial message.
  3. Preallocate sufficiently many inner partitions needed for a felsenstein! prop, or a sample_down!.
  4. Specialized preparations based on the direction of the operations (forward!, backward!). LazyDown or LazyUp.

See also LazyDown, LazyUp.

source
MolecularEvolution.LazyDownType

Constructors

LazyDown(stores_obs)
-LazyDown() = LazyDown(x::FelNode -> true)

Description

Indicate that we want to do a downward pass, e.g. sample_down!. The function passed to the constructor takes a node::FelNode as input and returns a Bool that decides if node stores its observations.

source
+lazyprep!(tree, initial_partition, direction=LazyDown(isleafnode))
Note

Now, we provided a direction for lazyprep!. The direction is an instance of LazyDown, which was initialized with the isleafnode function. The function isleafnode dictates if a node saves its sampled observation after a down pass. If you use direction=LazyDown(), every node saves its observation.

Surrounding LazyPartition

MolecularEvolution.lazyprep!Function
lazyprep!(tree::FelNode, initial_message::Vector{<:Partition}; partition_list = 1:length(tree.message), direction::LazyDirection = LazyUp())

Extra, intermediate step of tree preparations between initializing messages across the tree and calling message passing algorithms with LazyPartition.

  1. Perform a lazysort! on tree to obtain the optimal tree for a lazy felsenstein! prop, or a sample_down!.
  2. Fix tree.parent_message to an initial message.
  3. Preallocate sufficiently many inner partitions needed for a felsenstein! prop, or a sample_down!.
  4. Specialized preparations based on the direction of the operations (forward!, backward!). LazyDown or LazyUp.

See also LazyDown, LazyUp.

source
MolecularEvolution.LazyDownType

Constructors

LazyDown(stores_obs)
+LazyDown() = LazyDown(x::FelNode -> true)

Description

Indicate that we want to do a downward pass, e.g. sample_down!. The function passed to the constructor takes a node::FelNode as input and returns a Bool that decides if node stores its observations.

source
diff --git a/dev/optimization/index.html b/dev/optimization/index.html index 7e5ef89..7e8ff81 100644 --- a/dev/optimization/index.html +++ b/dev/optimization/index.html @@ -70,4 +70,4 @@ LL: -3782.322906364547 LL: -3782.321183009534 LL: -3782.3210398963506 -LL: -3782.3210271696703
Warning

tree_polish! probably won't find a good tree from a completely random start. Different tree search heuristics are required for that.

Functions

MolecularEvolution.reversibleQFunction
reversibleQ(param_vec,eq_freqs)

Takes a vector of parameters and equilibrium frequencies and returns a reversible rate matrix. The parameters are the upper triangle of the rate matrix, with the diagonal elements omitted, and the equilibrium frequencies are multiplied column-wise.

source
MolecularEvolution.unc2probvecFunction
unc2probvec(v)

Takes an array of N-1 unbounded values and returns an array of N values that sums to 1. Typically useful for optimizing over categorical probability distributions.

source
MolecularEvolution.branchlength_optim!Function
branchlength_optim!(tree::FelNode, models; partition_list = nothing, tol = 1e-5, bl_optimizer::UnivariateOpt = GoldenSectionOpt())

Uses golden section search, or optionally Brent's method, to optimize all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can be a single model (if the messages on the tree contain just one Partition), an array of models (if the messages have more than one Partition), or a function that takes a node and returns a Vector{<:BranchModel}, if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize branch lengths with all models). tol is the absolute tolerance for the bl_optimizer, which defaults to golden section search; Brent's method is available by setting bl_optimizer=BrentsMethodOpt().

source
MolecularEvolution.nni_optim!Function
nni_optim!(tree::FelNode, models; partition_list = nothing, tol = 1e-5)

Considers local branch swaps for all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can be a single model (if the messages on the tree contain just one Partition), an array of models (if the messages have more than one Partition), or a function that takes a node and returns a Vector{<:BranchModel}, if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize tree topology with all models). acc_rule allows you to specify a function that takes the current and proposed log likelihoods; if it returns true, the move is accepted.

source
MolecularEvolution.tree_polish!Function

tree_polish!(newt, models; tol = 10^-4, verbose = 1, topology = true)

Takes a tree and a model function, and optimizes branch lengths and, optionally, topology. Returns final LL. Set verbose=0 to suppress output. Note: This is not intended for an exhaustive tree search (which requires different heuristics), but rather to polish a tree that is already relatively close to the optimum.

source
+LL: -3782.3210271696703
Warning

tree_polish! probably won't find a good tree from a completely random start. Different tree search heuristics are required for that.

Functions

MolecularEvolution.reversibleQFunction
reversibleQ(param_vec,eq_freqs)

Takes a vector of parameters and equilibrium frequencies and returns a reversible rate matrix. The parameters are the upper triangle of the rate matrix, with the diagonal elements omitted, and the equilibrium frequencies are multiplied column-wise.

source
MolecularEvolution.unc2probvecFunction
unc2probvec(v)

Takes an array of N-1 unbounded values and returns an array of N values that sums to 1. Typically useful for optimizing over categorical probability distributions.

source
MolecularEvolution.branchlength_optim!Function
branchlength_optim!(tree::FelNode, models; partition_list = nothing, tol = 1e-5, bl_optimizer::UnivariateOpt = GoldenSectionOpt())

Uses golden section search, or optionally Brent's method, to optimize all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can be a single model (if the messages on the tree contain just one Partition), an array of models (if the messages have more than one Partition), or a function that takes a node and returns a Vector{<:BranchModel}, if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize branch lengths with all models). tol is the absolute tolerance for the bl_optimizer, which defaults to golden section search; Brent's method is available by setting bl_optimizer=BrentsMethodOpt().

source
MolecularEvolution.nni_optim!Function
nni_optim!(tree::FelNode, models; partition_list = nothing, tol = 1e-5)

Considers local branch swaps for all branches recursively, maintaining the integrity of the messages. Requires felsenstein!() to have been run first. models can be a single model (if the messages on the tree contain just one Partition), an array of models (if the messages have more than one Partition), or a function that takes a node and returns a Vector{<:BranchModel}, if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over (but you probably want to optimize tree topology with all models). acc_rule allows you to specify a function that takes the current and proposed log likelihoods; if it returns true, the move is accepted.

source
MolecularEvolution.tree_polish!Function

tree_polish!(newt, models; tol = 10^-4, verbose = 1, topology = true)

Takes a tree and a model function, and optimizes branch lengths and, optionally, topology. Returns final LL. Set verbose=0 to suppress output. Note: This is not intended for an exhaustive tree search (which requires different heuristics), but rather to polish a tree that is already relatively close to the optimum.

source
diff --git a/dev/simulation/index.html b/dev/simulation/index.html index 8a98606..6e5aa11 100644 --- a/dev/simulation/index.html +++ b/dev/simulation/index.html @@ -58,4 +58,4 @@ df.names = [n.name for n in getleaflist(tree)] df.seqs = [partition2obs(n.message[1]) for n in getleaflist(tree)] df.mu = [partition2obs(n.message[2]) for n in getleaflist(tree)] -CSV.write("flu_sim_seq_and_bm.csv",df)

Or we could export just the sequences as .fasta

write_fasta("flu_sim_seq_and_bm.fasta",df.seqs,seq_names = df.names)

Which will look something like this, when opened in AliView

Functions

MolecularEvolution.sim_treeFunction
sim_tree(add_limit::Int,Ne_func,sample_rate_func; nstart = 1, time = 0.0, mutation_rate = 1.0, T = Float64)

Simulates a tree of type FelNode{T}. Allows an effective population size function (Ne_func), as well as a sample rate function (sample_rate_func), both of which can also just be constants.

Ne_func(t) = (sin(t/10)+1)*100.0 + 10.0
root = sim_tree(600,Ne_func,1.0)
simple_tree_draw(ladderize(root))

source
sim_tree(;n = 10)

Simulates tree with constant population size.

source
MolecularEvolution.sample_down!Function

sample_down!(root::FelNode,models,partition_list)

Generates samples under the model. root.parent_message is taken as the starting distribution, and node.message contains the sampled messages. models can be a single model (if the messages on the tree contain just one Partition), an array of models (if the messages have more than one Partition), or a function that takes a node and returns a Vector{<:BranchModel}, if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.partition2obsFunction
partition2obs(part::Partition)

Extracts the most likely state from a Partition, transforming it into a convenient type. For example, a NucleotidePartition will be transformed into a nucleotide sequence of type String. Note: You should overload this for your own Partition types.

source
+CSV.write("flu_sim_seq_and_bm.csv",df)

Or we could export just the sequences as .fasta

write_fasta("flu_sim_seq_and_bm.fasta",df.seqs,seq_names = df.names)

Which will look something like this, when opened in AliView

Functions

MolecularEvolution.sim_treeFunction
sim_tree(add_limit::Int,Ne_func,sample_rate_func; nstart = 1, time = 0.0, mutation_rate = 1.0, T = Float64)

Simulates a tree of type FelNode{T}. Allows an effective population size function (Ne_func), as well as a sample rate function (sample_rate_func), both of which can also just be constants.

Ne_func(t) = (sin(t/10)+1)*100.0 + 10.0
root = sim_tree(600,Ne_func,1.0)
simple_tree_draw(ladderize(root))

source
sim_tree(;n = 10)

Simulates tree with constant population size.

source
MolecularEvolution.sample_down!Function

sample_down!(root::FelNode,models,partition_list)

Generates samples under the model. root.parent_message is taken as the starting distribution, and node.message contains the sampled messages. models can be a single model (if the messages on the tree contain just one Partition), an array of models (if the messages have more than one Partition), or a function that takes a node and returns a Vector{<:BranchModel}, if you need the models to vary from one branch to another. partition_list (eg. 1:3 or [1,3,5]) lets you choose which partitions to run over.

source
MolecularEvolution.partition2obsFunction
partition2obs(part::Partition)

Extracts the most likely state from a Partition, transforming it into a convenient type. For example, a NucleotidePartition will be transformed into a nucleotide sequence of type String. Note: You should overload this for your own Partition types.

source
diff --git a/dev/viz/index.html b/dev/viz/index.html index b8c3ac1..10af76e 100644 --- a/dev/viz/index.html +++ b/dev/viz/index.html @@ -30,9 +30,9 @@ for n in getnodelist(tree) compose_dict[n] = (x,y)->pie_chart(x,y,d[n][1].state[:,1],size = 0.02, opacity = 0.75) end -img = tree_draw(tree,draw_labels = false, line_width = 0.5mm, compose_dict = compose_dict)

This can then be exported with:

savefig_tweakSVG("piechart_tree.svg",img)

Functions

MolecularEvolution.get_phylo_treeFunction
get_phylo_tree(molev_root::FelNode; data_function = (x -> Tuple{String,Float64}[]))

Converts a FelNode tree to a Phylo tree. The data_function should return a list of tuples of the form (key, value) to be added to the Phylo tree data Dictionary. Any key/value pairs on the FelNode node_data Dict will also be added to the Phylo tree.

source
MolecularEvolution.values_from_phylo_treeFunction
values_from_phylo_tree(phylo_tree, key)
+img = tree_draw(tree,draw_labels = false, line_width = 0.5mm, compose_dict = compose_dict)

This can then be exported with:

savefig_tweakSVG("piechart_tree.svg",img)

Functions

MolecularEvolution.get_phylo_treeFunction
get_phylo_tree(molev_root::FelNode; data_function = (x -> Tuple{String,Float64}[]))

Converts a FelNode tree to a Phylo tree. The data_function should return a list of tuples of the form (key, value) to be added to the Phylo tree data Dictionary. Any key/value pairs on the FelNode node_data Dict will also be added to the Phylo tree.

source
MolecularEvolution.values_from_phylo_treeFunction
values_from_phylo_tree(phylo_tree, key)
 
-Returns a list of values from the given key in the nodes of the phylo_tree, in an order that is somehow compatible with the order the nodes get plotted in.
source
MolecularEvolution.savefig_tweakSVGFunction
savefig_tweakSVG(fname, plot::Plots.Plot; hack_bounding_box = true, new_viewbox = nothing, linecap_round = true)

Note: Might only work if you're using the GR backend!! Saves a figure created using the Phylo Plots recipe, but tweaks the SVG after export. new_viewbox needs to be an array of 4 numbers, typically starting at [0 0 plot_width*4 plot_height*4] but this lets you add shifts, in case the plot is getting cut off.

eg. savefig_tweakSVG("export.svg",pl, new_viewbox = [-100, -100, 3000, 4500])

source
savefig_tweakSVG(fname, plot::Context; width = 10cm, height = 10cm, linecap_round = true, white_background = true)

Saves a figure created using the Compose approach, but tweaks the SVG after export.

eg. savefig_tweakSVG("export.svg",pl)

source
MolecularEvolution.tree_drawFunction
tree_draw(tree::FelNode;
+Returns a list of values from the given key in the nodes of the phylo_tree, in an order that is somehow compatible with the order the nodes get plotted in.
source
MolecularEvolution.savefig_tweakSVGFunction
savefig_tweakSVG(fname, plot::Plots.Plot; hack_bounding_box = true, new_viewbox = nothing, linecap_round = true)

Note: Might only work if you're using the GR backend!! Saves a figure created using the Phylo Plots recipe, but tweaks the SVG after export. new_viewbox needs to be an array of 4 numbers, typically starting at [0 0 plot_width*4 plot_height*4] but this lets you add shifts, in case the plot is getting cut off.

eg. savefig_tweakSVG("export.svg",pl, new_viewbox = [-100, -100, 3000, 4500])

source
savefig_tweakSVG(fname, plot::Context; width = 10cm, height = 10cm, linecap_round = true, white_background = true)

Saves a figure created using the Compose approach, but tweaks the SVG after export.

eg. savefig_tweakSVG("export.svg",pl)

source
MolecularEvolution.tree_drawFunction
tree_draw(tree::FelNode;
     canvas_width = 15cm, canvas_height = 15cm,
     stretch_for_labels = 2.0, draw_labels = true,
     line_width = 0.1mm, font_size = 4pt,
@@ -61,4 +61,4 @@
 img |> SVG("imgout.svg",10cm, 10cm)
 OR
 using Cairo
-img |> PDF("imgout.pdf",10cm, 10cm)
source
+img |> PDF("imgout.pdf",10cm, 10cm)
source