All about Artificial Intelligence, Machine Learning, Deep Learning and Data Science: K-Means Clustering

Monday, 6 July 2020

K-Means Clustering

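K-Means clustering is an unsupervised learning technique. It groups data points so that points in the same cluster share similar characteristics and differ from points in other clusters.

In K-Means, K is the number of clusters, which is chosen before the algorithm runs. Each cluster is represented by its center (centroid), the mean of the points assigned to it.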
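To make the grouping idea concrete, here is a minimal NumPy sketch of the standard iterative procedure (often called Lloyd's algorithm): assign each point to its nearest center, then move each center to the mean of its points. The function name kmeans_sketch and its parameters are made up for illustration; this is not how scikit-learn implements it internally.

```python
import numpy as np

def kmeans_sketch(X, k, n_iters=100, seed=0):
    """Illustrative K-Means loop; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points picked at random
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its points
        # (an empty cluster keeps its previous center)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Stop once the centers no longer move
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
labels, centers = kmeans_sketch(X, k=2)
print(labels)
print(centers)
```

The result depends on the random starting centers, so a single run can settle on a poor grouping. Library implementations typically try several initializations and keep the best one.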

Advantages

Scales well to large datasets
Computationally efficient

Disadvantages

Choosing K: the number of clusters must be specified in advance
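One common heuristic for picking K is the "elbow" method: fit the model for several values of K and look for the point where the within-cluster sum of squares (exposed by scikit-learn as inertia_) stops dropping sharply. The sketch below assumes the same toy data used in this post; the range of K values tried is arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])

# Fit K-Means for several candidate values of K and record the inertia
inertias = {}
for k in range(1, 5):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, inertia in inertias.items():
    print(f"K={k}: inertia={inertia:.2f}")
```

Inertia always shrinks as K grows, so the goal is not the smallest value but the K after which the improvement levels off. On this data the large drop happens between K=1 and K=2.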

When to Use?

Normally distributed data (compact, roughly round clusters)
Large number of samples
Limited number of clusters

Use Cases

Document classification
Customer segmentation
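As a rough illustration of customer segmentation, the sketch below clusters a handful of made-up customers by annual spend and visits per month. The data and column meanings are hypothetical. Features are standardized first because K-Means relies on distances, so a large-valued column such as spend would otherwise dominate.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customers: [annual spend, visits per month]
customers = np.array([
    [200, 1], [250, 2], [220, 1],
    [5000, 12], [5200, 15], [4800, 14],
], dtype=float)

# Put both features on a comparable scale before clustering
scaled = StandardScaler().fit_transform(customers)

# Split the customers into two segments
segments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)
print(segments)
```

The first three customers should land in one segment and the last three in the other. Which segment gets label 0 or 1 is arbitrary.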

Python Code

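>>> from sklearn.cluster import KMeans
>>> import numpy as np
>>> # Six 2-D points in two well-separated groups (x = 1 and x = 10)
>>> X = np.array([[1, 2], [1, 4], [1, 0],
...               [10, 2], [10, 4], [10, 0]])
>>> # Fit K-Means with K = 2; random_state makes the run reproducible
>>> kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
>>> # Cluster index assigned to each training point
>>> kmeans.labels_
array([1, 1, 1, 0, 0, 0], dtype=int32)
>>> # Assign new points to the nearest cluster center
>>> kmeans.predict([[0, 0], [12, 3]])
array([1, 0], dtype=int32)
>>> # Coordinates of the two cluster centers
>>> kmeans.cluster_centers_
array([[10.,  2.],
       [ 1.,  2.]])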
