The K-Means Clustering algorithm gives a nice introduction to unsupervised machine learning. The algorithm seeks to group similar items together into a set of clusters. The number of clusters (k) is part of the input to the problem, along with the data points themselves. At its core, the algorithm repeats three basic steps (assign points to clusters, recompute the cluster centers, and check whether to stop) until either the assignments converge or some maximum number of iterations is reached. Including initialization and final output, the full procedure is as follows:
0. Initially, assign a point to each cluster (call this the center of that cluster) and set count to 0.
1. For each point in our data set, we find the "closest" cluster center to that point and assign this point to the associated cluster.
2. For each cluster, we calculate the average of the points in that cluster to determine a new cluster center.
3. Increment count.
4. If any point has changed cluster and count is less than the maximum number of iterations, go to step 1.
5. Display the clusters.
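
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a definitive implementation: the text does not say how the initial centers are chosen or what "closest" means, so this sketch assumes the initial centers are k randomly sampled data points and that distance is Euclidean. The names `k_means`, `max_iterations`, and `seed`, as well as the sample data, are made up for the example.

```python
import math
import random


def k_means(points, k, max_iterations=100, seed=None):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (centers, assignments), where assignments[i] is the index of
    the cluster that points[i] belongs to.
    """
    rng = random.Random(seed)
    # Step 0: pick k data points as the initial centers (an assumption;
    # the text only says "assign a point to each cluster").
    centers = rng.sample(points, k)
    assignments = [None] * len(points)
    count = 0

    while True:
        # Step 1: assign each point to its closest center.
        # "Closest" is assumed to mean Euclidean distance here.
        changed = False
        for i, p in enumerate(points):
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            if nearest != assignments[i]:
                assignments[i] = nearest
                changed = True

        # Step 2: move each center to the average of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # assumption: an empty cluster keeps its old center
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))

        # Steps 3 and 4: count the pass, then stop if nothing changed
        # or the iteration cap has been reached.
        count += 1
        if not changed or count >= max_iterations:
            break

    return centers, assignments


# Step 5: display the clusters for a small made-up data set.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, assignments = k_means(data, k=2, seed=0)
for c, center in enumerate(centers):
    members = [p for p, a in zip(data, assignments) if a == c]
    print(f"cluster {c}: center={center}, points={members}")
```

Because the starting centers are sampled at random, different seeds can number the clusters differently, and on less cleanly separated data they may also lead to different final groupings.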