This repo contains the code and some simple test cases for the paper Gaussian mixture modeling by exploiting the Mahalanobis distance. The paper's method splits a dataset drawn from a mixture of Gaussians into separate groups, each containing samples from a single Gaussian distribution. The number of groups is discovered by the algorithm automatically.
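For concreteness, here is a minimal sketch (my own, not taken from the repo) of the kind of unlabeled mixed-Gaussian dataset the test cases operate on; the component means, covariances, and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two illustrative 2-D Gaussian components with different means and covariances.
X = np.vstack([
    rng.multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.3], [0.3, 1.0]], size=500),
    rng.multivariate_normal(mean=[5.0, 5.0], cov=[[0.5, 0.0], [0.0, 2.0]], size=500),
])
# The algorithm receives only the unlabeled samples X and must recover
# the components (and their count) on its own.
```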
Here are some differences between my implementation and the original paper:
- The original algorithm can split a mixture in two ways: into two groups with different means, or into two groups sharing the same mean but having different covariance structures. Currently my implementation only performs the first kind of split
- The paper originally performs the first kind of split by choosing one dimension of the variable and a threshold; instead, I use KMeans with the number of clusters set to 2 (see the sketch after this list)
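As a rough illustration of that replacement, here is a minimal sketch of a KMeans-based split step, assuming a group is stored as an `(n_samples, n_features)` NumPy array; the helper name `split_group` is hypothetical, not the repo's actual API:

```python
import numpy as np
from sklearn.cluster import KMeans

def split_group(X: np.ndarray, random_state: int = 0):
    """Split one group into two candidate subgroups using KMeans with k=2."""
    labels = KMeans(n_clusters=2, n_init=10, random_state=random_state).fit_predict(X)
    return X[labels == 0], X[labels == 1]
```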
From the test cases in the notebook we can see the following:
- The algorithm can correctly split the mixed dataset
- The algorithm tends to oversplit, i.e. it may split a single group into multiple groups
- When given a maximum number of splits, the algorithm works properly
My understanding of the oversplit problem is that the paper uses a hard split: every iteration assigns each sample to exactly one group. Even if the probability of misassignment is small, when the number of samples is large some samples will be assigned to the wrong group, and the group receiving them gets 'polluted' and is split again. This happens especially to groups with a small variance. I will look into this and try to solve it.
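As a rough numerical illustration of this intuition (my own construction, not the paper's assignment rule), the sketch below hard-assigns samples drawn from a wide 1-D Gaussian to the closer of two known components by Mahalanobis distance; even though each individual misassignment is unlikely, a noticeable number of samples ends up polluting the narrow component:

```python
import numpy as np

rng = np.random.default_rng(0)

mu_wide, sd_wide = 0.0, 3.0      # wide component
mu_narrow, sd_narrow = 9.0, 0.5  # narrow (small-variance) component

# Samples truly belonging to the wide component.
x = rng.normal(mu_wide, sd_wide, size=100_000)

# In 1-D the Mahalanobis distance reduces to |x - mu| / sd.
d_wide = np.abs(x - mu_wide) / sd_wide
d_narrow = np.abs(x - mu_narrow) / sd_narrow

# Wide-component samples that a hard split would hand to the narrow group.
polluting = np.sum(d_narrow < d_wide)
print(f"{polluting} of {x.size} wide-group samples pollute the narrow group")
```

With these illustrative parameters, a few hundred of the 100,000 samples land in the narrow group, which would then look non-Gaussian and trigger a further split.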