A sketch-based semi-Markov learner with approximate duration estimation.
This module implements an on-line (streaming) data structure for 'learning' semi-Markov models. A Markov model is a stochastic model used to model randomly changing systems where it is assumed that future states depend only on the current state, and not on the events (states) that occurred before it [wikipedia]. A semi-Markov process is one in which the probability of there being a change in state additionally depends on the amount of time that has elapsed since entry into the current state.
The SemiMarkov data structure within this module is designed to take in state change events (state 1 -> state 2) and their corresponding holding times (duration), and update a simple semi-Markov model. As each new event is added, the data structure:
- Updates the duration distribution of state 1, and
- Updates the transition counter of state 1 and its set of possible transition states (state 2)
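The per-event bookkeeping described above can be illustrated with a minimal sketch. This is not predicure's implementation: the class name ToySemiMarkov and its methods are hypothetical, and it stores exact holding times in plain lists, whereas the real data structure is described as sketch-based with approximate duration estimation.

```python
from collections import Counter, defaultdict


class ToySemiMarkov:
    """Hypothetical stand-in for illustration; NOT predicure's SemiMarkov."""

    def __init__(self):
        # state -> list of observed holding times (exact, not approximated)
        self.durations = defaultdict(list)
        # state -> Counter mapping each observed next state to its count
        self.transitions = defaultdict(Counter)

    def update(self, state1, state2, duration):
        """Record one event: `duration` spent in state1, then a move to state2."""
        self.durations[state1].append(duration)
        self.transitions[state1][state2] += 1

    def transition_probability(self, state1, state2):
        """Empirical probability of moving from state1 to state2."""
        counts = self.transitions.get(state1)
        if not counts:
            return 0.0
        return counts[state2] / sum(counts.values())

    def mean_duration(self, state1):
        """Average holding time observed in state1 (None if never seen)."""
        observed = self.durations.get(state1)
        if not observed:
            return None
        return sum(observed) / len(observed)


model = ToySemiMarkov()
model.update("idle", "busy", 2.0)
model.update("idle", "busy", 4.0)
model.update("idle", "off", 6.0)
print(model.transition_probability("idle", "busy"))
print(model.mean_duration("idle"))
```

Keeping every holding time is the simplest choice for a sketch like this, but memory grows with the stream; a streaming structure would typically replace the list with a bounded summary of the duration distribution.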
There are tons of resources for computing Markov models; a quick Google search will find lots of implementations.
predicure has not yet been uploaded to PyPI, as we are currently at the 'pre-release' stage*. Having said that, you should be able to install it via pip directly from the GitHub repository with:
pip install git+https://github.com/carsonfarmer/predicure.git
You can also install predicure by cloning the GitHub repository and using the setup script:
git clone https://github.com/carsonfarmer/predicure.git
cd predicure
python setup.py install
Note that predicure is written for Python 3 only. While it may work in earlier versions of Python, no attempt has been made to make it Python 2.x compatible.
* This means the API is not yet settled and is subject to crazy changes at any time!
predicure comes with a very basic range of tests. To run the tests, you can use py.test (maybe also nosetests?), which can be installed via pip using the recommended.txt file (note that this will also install numpy, scipy, and matplotlib, which are all great and useful for tests and examples):
pip install -r recommended.txt
py.test predicure
In the following examples we use the random module to generate data.
from predicure import SemiMarkov
import random
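As a rough illustration of generating data with the random module, the sketch below builds a chain of synthetic transitions with exponentially distributed holding times. The helper random_events and the (state 1, state 2, duration) tuple layout are assumptions made for this example only; they may not match the input format that SemiMarkov actually expects.

```python
import random


def random_events(states, n, seed=None):
    """Generate n synthetic (state1, state2, duration) tuples.

    Hypothetical helper: the tuple layout is an assumption for illustration.
    """
    rng = random.Random(seed)  # private generator so runs are reproducible
    current = rng.choice(states)
    events = []
    for _ in range(n):
        # Pick any state other than the current one as the next state
        next_state = rng.choice([s for s in states if s != current])
        # Holding time drawn from an exponential distribution with rate 1.0
        duration = rng.expovariate(1.0)
        events.append((current, next_state, duration))
        current = next_state  # the chain continues from the new state
    return events


events = random_events(["a", "b", "c"], 100, seed=42)
print(events[:3])
```

Each event starts in the state where the previous one ended, so the list reads as a single continuous path through the states.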
The simplest way to use a SemiMarkov data structure is to initialize one and then update it with data points (via the update method). In this first example... and add them to a SemiMarkov object:
mod = SemiMarkov().batch(data)  # Add paths all at once...
We can then add additional data and start to query the data structure. As transitions are added, the data structure responds and updates accordingly.
To illustrate the use of this data structure, here is an example plot using the data from above:
import matplotlib.pyplot as plt
plt.figure()
mod.plot()
plt.show()
Copyright © 2016, Carson J. Q. Farmer
Licensed under the BSD-2 License.