For the project assignment in the Big Data Processing class, we implemented the Gaussian Mixture Model (GMM) clustering algorithm, fitted with the Expectation-Maximization (EM) method. Given that the raw values in the dataset are drawn from multiple Gaussian distributions, we analyze the data to determine the number of clusters and to estimate the weights, means, and standard deviations of this multi-modal dataset.
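
As an illustration of the idea, here is a minimal sketch of EM for a one-dimensional Gaussian mixture using NumPy. It is not the project's actual code: the function names `fit_gmm_1d` and `bic`, the quantile-based initialization, and the use of the Bayesian Information Criterion (BIC) to compare cluster counts are all assumptions made for this example.

```python
import numpy as np


def fit_gmm_1d(x, k, n_iter=200, tol=1e-8):
    """Fit a k-component 1-D Gaussian mixture with EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.size

    # Assumed initialization: equal weights, means spread over the data
    # quantiles, and every component starting with the overall spread.
    weights = np.full(k, 1.0 / k)
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    stds = np.full(k, max(x.std(), 1e-6))

    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        z = (x[:, None] - means) / stds
        log_pdf = -0.5 * z**2 - np.log(stds) - 0.5 * np.log(2.0 * np.pi)
        log_joint = np.log(weights) + log_pdf
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        resp = np.exp(log_joint - log_norm[:, None])

        # Log-likelihood of the parameters used in this E-step.
        ll = log_norm.sum()

        # M-step: re-estimate weights, means, and standard deviations.
        nk = resp.sum(axis=0) + 1e-12  # guard against empty components
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        stds = np.sqrt(np.maximum(var, 1e-12))

        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    # Sort components by mean so the output order is predictable.
    order = np.argsort(means)
    return weights[order], means[order], stds[order], ll


def bic(log_likelihood, k, n):
    """BIC for a 1-D mixture: k means, k stds, and k - 1 free weights."""
    n_params = 3 * k - 1
    return -2.0 * log_likelihood + n_params * np.log(n)
```

One possible way to use the sketch is to fit several values of `k` and keep the one with the lowest BIC:

```python
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-5.0, 1.0, 600), rng.normal(5.0, 1.0, 1400)])

scores = {}
for k in (1, 2, 3):
    weights, means, stds, ll = fit_gmm_1d(data, k)
    scores[k] = bic(ll, k, data.size)
best_k = min(scores, key=scores.get)
```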
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.