Python sklearn GaussianMixture: how to get the samples/points in each cluster


Huu son Nguyen

I'm using a GMM to cluster a dataset into K groups. The model works fine, but I can't get the raw data belonging to each cluster. Can you suggest some ideas for solving this? Thank you very much.

Ash

You can do it this way (look at d0, d1 and d2).

import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.mixture import GaussianMixture

# load the iris dataset 
iris = datasets.load_iris() 

# select first two columns  
X = iris.data[:, 0:2] 

# turn it into a dataframe 
d = pd.DataFrame(X) 

# plot the data 
plt.scatter(d[0], d[1]) 

gmm = GaussianMixture(n_components=3)

# Fit the GMM to the dataset, which models
# the data as a mixture of 3 Gaussian
# distributions
gmm.fit(d)

# Assign a label to each sample 
labels = gmm.predict(d) 
d['labels'] = labels
d0 = d[d['labels'] == 0]
d1 = d[d['labels'] == 1]
d2 = d[d['labels'] == 2]

# d0, d1 and d2 now hold the samples assigned to each cluster
print(d0)
print(d1)
print(d2)

# plot the three clusters in the same plot
plt.scatter(d0[0], d0[1], c='r')
plt.scatter(d1[0], d1[1], c='yellow')
plt.scatter(d2[0], d2[1], c='g')


# print the lower bound on the log-likelihood
# reached by the best EM fit
print(gmm.lower_bound_)

# print the number of EM iterations the best fit
# needed to converge
print(gmm.n_iter_)

# In this run it needed 8 iterations to converge; the count can vary
# between runs because no random_state is set.
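
The answer above hard-codes three clusters. For an arbitrary number of groups, one possible sketch is to mask the original array once per component and collect the results in a dict. The names `K` and `clusters` and the `random_state=0` setting are assumptions made for this illustration; they are not part of the original answer.

```python
from sklearn import datasets
from sklearn.mixture import GaussianMixture

# Same data as in the answer: first two columns of the iris dataset
X = datasets.load_iris().data[:, 0:2]

K = 3  # hypothetical number of clusters; change as needed

# random_state is fixed here only to make the run repeatable (assumption)
gmm = GaussianMixture(n_components=K, random_state=0).fit(X)

# Hard assignment: index of the most probable component for each sample
labels = gmm.predict(X)

# Map each component index to the raw samples assigned to it
clusters = {k: X[labels == k] for k in range(K)}

for k, points in clusters.items():
    print(k, points.shape)
```

If the data lives in a DataFrame with a `labels` column, as in the answer, `d.groupby('labels')` should give an equivalent split without listing each label by hand.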
