Clustering Dirichlet distributed data evolving in time, using the EM algorithm
In my master's project I extend the EM algorithm to cluster Dirichlet-distributed data that evolve over time. The project consists of the following steps:
- Programming a simple EM algorithm for normally distributed data
- Adapting the algorithm to the Dirichlet distribution
- Developing a model suitable for data consisting of observations that evolve over time
- Testing the algorithm on a synthetic data set
- Cleaning and preparing a real-life data set: marital status by age group in different countries
- Applying the algorithm to the real data
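
To illustrate the first step in the list, here is a minimal sketch of EM for a one-dimensional Gaussian mixture. It is not the project's code: the function names, the quantile-based initialisation, the fixed iteration count and the variance floor are all assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(data, k, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative sketch)."""
    n = len(data)
    ordered = sorted(data)
    # Initialisation (an assumption): means at evenly spaced quantiles,
    # a shared pooled variance, and equal mixing weights.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    pooled_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [pooled_var] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([p / total for p in joint])

        # M-step: closed-form weighted updates of weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps a component from collapsing

    return weights, means, variances
```

Calling `em_gaussian_mixture(samples, k=2)` on a list of floats returns the estimated mixing weights, means and variances. For Dirichlet-distributed data the E-step keeps the same form with the Dirichlet density in place of `normal_pdf`, but the M-step for the Dirichlet parameters has no closed-form solution, so it would need a numerical optimiser instead of the weighted averages used here.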