K-Means Clustering in OpenCV

cv2.kmeans()

### Input parameters

1. **samples** : It should be of **np.float32** data type, and each feature should be put in a single column.
2. **nclusters(K)** : Number of clusters required at the end.
3. **criteria** : The iteration termination criteria. When these criteria are satisfied, the algorithm stops iterating. It should be a tuple of 3 parameters, ( type, max_iter, epsilon ):
    * 3.a - type of termination criteria. It takes one of 3 flags:
        * **cv2.TERM_CRITERIA_EPS** - stop the algorithm iteration if the specified accuracy, _epsilon_, is reached.
        * **cv2.TERM_CRITERIA_MAX_ITER** - stop the algorithm after the specified number of iterations, _max_iter_.
        * **cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER** - stop the iteration when either of the above conditions is met.
    * 3.b - max_iter - An integer specifying the maximum number of iterations.
    * 3.c - epsilon - Required accuracy.
4. **attempts** : The number of times the algorithm is executed using different initial labellings. The algorithm returns the labels that yield the best compactness. This compactness is returned as output.
5. **flags** : Specifies how the initial centers are chosen. Normally two flags are used for this: **cv2.KMEANS_PP_CENTERS** and **cv2.KMEANS_RANDOM_CENTERS**.
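To make the roles of _max_iter_ and _epsilon_ more concrete, here is a minimal NumPy sketch of a plain k-means loop. It is not OpenCV's implementation: the function name `lloyd_kmeans`, the `seed` argument, and the choice to compare _epsilon_ against the largest center movement in one iteration are assumptions made for illustration.

```python
import numpy as np

def lloyd_kmeans(samples, k, max_iter, epsilon, seed=0):
    """Simplified k-means loop (hypothetical helper, not cv2.kmeans)."""
    rng = np.random.default_rng(seed)
    # Random initialization: pick k distinct samples as starting centers.
    centers = samples[rng.choice(len(samples), size=k, replace=False)].copy()
    for iteration in range(1, max_iter + 1):
        # Squared distance from every sample to every center, shape (N, k).
        dists = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = samples[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        # Largest distance any center moved in this iteration.
        shift = np.sqrt(((new_centers - centers) ** 2).sum(axis=1)).max()
        centers = new_centers
        if shift < epsilon:  # "EPS" style stop: centers barely moved
            break
    # Leaving the loop without break is the "MAX_ITER" style stop.
    return labels, centers, iteration

rng = np.random.default_rng(42)
X = rng.integers(25, 50, (25, 2))
Y = rng.integers(60, 85, (25, 2))
Z = np.float32(np.vstack((X, Y)))
labels, centers, n_iter = lloyd_kmeans(Z, k=2, max_iter=10, epsilon=1.0)
print(n_iter)
print(centers)
```

Stopping on whichever condition comes first mirrors the combined flag **cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER**: the loop ends early once the centers settle, and _max_iter_ bounds the work if they do not.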

### Output parameters

1. **compactness** : The sum of squared distances from each point to its corresponding center.
2. **labels** : The label array (same as 'code' in the previous article), where each element is marked '0', '1', and so on.
3. **centers** : The array of cluster centers.
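The relationship between the three outputs can be checked by hand. The following sketch recomputes the compactness from a label array and a center array using NumPy only. The helper name `compactness_from_result` and the small hand-made arrays are hypothetical. They stand in for real `cv2.kmeans()` output, whose label array is assumed here to have shape (N, 1).

```python
import numpy as np

def compactness_from_result(samples, labels, centers):
    """Sum of squared distances from each sample to its assigned center."""
    # labels has shape (N, 1); ravel() flattens it so it can index centers.
    assigned = centers[labels.ravel()]
    diff = samples - assigned
    return float(np.sum(diff * diff))

# Hand-made stand-ins for the values cv2.kmeans() would return.
samples = np.float32([[0, 0], [0, 2], [10, 10], [10, 12]])
labels = np.int32([[0], [0], [1], [1]])
centers = np.float32([[0, 1], [10, 11]])

value = compactness_from_result(samples, labels, centers)
print(value)  # each point is 1 unit from its center, so the sum is 4.0
```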

Example:


    import numpy as np
    import cv2
    from matplotlib import pyplot as plt

    # Two groups of 25 random 2-D points each
    X = np.random.randint(25, 50, (25, 2))
    Y = np.random.randint(60, 85, (25, 2))
    Z = np.vstack((X, Y))

    # convert to np.float32
    Z = np.float32(Z)

    # define criteria and apply kmeans()
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    ret, label, center = cv2.kmeans(Z, 2, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

    # Now separate the data; note the ravel(), which flattens the label array
    A = Z[label.ravel() == 0]
    B = Z[label.ravel() == 1]

    # Plot the data
    plt.scatter(A[:, 0], A[:, 1])
    plt.scatter(B[:, 0], B[:, 1], c='r')
    plt.scatter(center[:, 0], center[:, 1], s=80, c='y', marker='s')
    plt.xlabel('Height')
    plt.ylabel('Weight')
    plt.show()

![](http://codeplanet-wordpress.stor.sinaapp.com/uploads/2015/01/wpid-d2ab590efc92b80dda0dbf8a3363cb77_f29be1bf-f12c-4f1e-b005-cfaf99fe5156.png)
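The `label.ravel()` step in the example deserves a closer look. The sketch below uses small hand-made arrays (hypothetical values, not real `cv2.kmeans()` output) and assumes the label array has shape (N, 1), so it must be flattened before it can serve as a row mask.

```python
import numpy as np

# Hand-made stand-ins: 4 samples with 2 features, and an (N, 1) label array.
Z = np.float32([[30, 40], [28, 45], [70, 80], [65, 62]])
label = np.int32([[0], [0], [1], [0]])

# ravel() turns the (4, 1) column into a flat (4,) array,
# so the comparison yields one boolean per row of Z.
mask = label.ravel() == 0
A = Z[mask]                 # rows assigned to cluster 0
B = Z[label.ravel() == 1]   # rows assigned to cluster 1

# Number of samples in each cluster.
counts = np.bincount(label.ravel(), minlength=2)
print(A.shape, B.shape, counts)
```

Without the flattening, the boolean mask would have shape (4, 1) and would not line up with the rows of `Z`.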
