each sample is 2. The data has 4 centers, and the random_state=1 parameter seeds the random number generator so that the same dataset is generated on every run. Here, X holds the generated samples and y holds the integer label for the cluster membership of each sample. Rather than relying on these labels, we will use the k-means clustering algorithm to find the cluster centers.
X, y = make_blobs(n_samples=1000, n_features=2, centers=4, random_state=1)
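As a quick sanity check on what make_blobs() returns, the shapes and labels can be inspected directly. The snippet below is only an illustrative sketch that repeats the same call. The use of NumPy's unique() and bincount() is an addition for inspection and is not part of the original program.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Same call as in the tutorial.
X, y = make_blobs(n_samples=1000, n_features=2, centers=4, random_state=1)

# X has one row per sample and one column per feature.
print("X shape:", X.shape)

# y has one integer label per sample.
print("y shape:", y.shape)

# The distinct labels, one per center.
print("labels:", np.unique(y))

# Number of samples assigned to each center.
print("samples per label:", np.bincount(y))
```

When n_samples is an integer, the samples are divided evenly among the centers, so each of the 4 labels should cover 250 samples.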
We use a scatter plot to visualize the generated dataset.
pyplot.scatter(x=X[:, 0], y=X[:, 1])
pyplot.savefig("clusters.png")
pyplot.close()
The scatter plot looks like the following:

Now, we use the KMeans class from the sklearn.cluster module to perform k-means clustering. The fit() method learns from the dataset and finds the cluster centers. Please note that the n_clusters parameter sets the value of k, which is 4 here.
kmeans = KMeans(n_clusters=4, random_state=1)
kmeans.fit(X)
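Besides the cluster centers, a fitted KMeans object also exposes the cluster assigned to each training sample and can assign new points to the nearest center. The following is a minimal, self-contained sketch of that. Passing n_init=10 explicitly is an assumption made here, because the default value differs between scikit-learn versions, and new_points is a hypothetical name used only for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Regenerate the same dataset so this snippet runs on its own.
X, y = make_blobs(n_samples=1000, n_features=2, centers=4, random_state=1)

# n_init is set explicitly (assumption) to avoid version-dependent defaults.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=1)
kmeans.fit(X)

# One row per cluster center, one column per feature.
print("Centers shape:", kmeans.cluster_centers_.shape)

# Cluster index assigned to each of the first 10 training samples.
print("First labels:", kmeans.labels_[:10])

# Sum of squared distances from each sample to its closest center.
print("Inertia:", kmeans.inertia_)

# Assign points to the nearest learned center; here we reuse a few samples.
new_points = X[:5]
print("Predicted clusters:", kmeans.predict(new_points))
```

Note that the cluster indices found by k-means are arbitrary, so they need not match the integer labels in y even when the grouping is the same.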
We then print the cluster centers and plot them in black on top of the scatter plot of the data.
print("Cluster Centers: \n", kmeans.cluster_centers_)
pyplot.scatter(x=X[:, 0], y=X[:, 1])
pyplot.scatter(x=kmeans.cluster_centers_[:, 0], y=kmeans.cluster_centers_[:, 1], color="black")
pyplot.savefig("cluster-centers.png")
pyplot.close()
The output of the above program will look like the following: …







































