Question

In: Computer Science

please show the differences in the kmeans clustering algorithms using graphs: hartigan wong method, macqueen method,...

please show the differences in the kmeans clustering algorithms using graphs:

hartigan wong method, macqueen method, and lloyd floyd method

Solutions

Expert Solution

Hartigan's method for k-means clusteringholds several potential advantages compared to the clas- sical and prevalent optimization heuristic known as Lloyd's algorithm. This algorithm is proven to converge in the sense that after a finite number of such trials the assignments of the data items no longer change.(Figure 2)

K-means (MacQueen, 1967) is one of the simplest unsupervised learning algorithms that solve the well known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. These centroids shoud be placed in a cunning way because of different location causes different result. So, the better choice is to place them as much as possible far away from each other. The next step is to take each point belonging to a given data set and associate it to the nearest centroid. When no point is pending, the first step is completed and an early groupage is done. At this point we need to re-calculate k new centroids as barycenters of the clusters resulting from the previous step. After we have these k new centroids, a new binding has to be done between the same data set points and the nearest new centroid. A loop has been generated. As a result of this loop we may notice that the k centroids change their location step by step until no more changes are done. In other words centroids do not move any more.
Finally, this algorithm aims at minimizing an objective function, in this case a squared error function. The objective function

,

where is a chosen distance measure between a data point and the cluster centre , is an indicator of the distance of the n data points from their respective cluster centres.

The algorithm is composed of the following steps:

  1. Place K points into the space represented by the objects that are being clustered. These points represent initial group centroids.
  2. Assign each object to the group that has the closest centroid.
  3. When all objects have been assigned, recalculate the positions of the K centroids.
  4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups from which the metric to be minimized can be calculated.

Although it can be proved that the procedure will always terminate, the k-means algorithm does not necessarily find the most optimal configuration, corresponding to the global objective function minimum. The algorithm is also significantly sensitive to the initial randomly selected cluster centres. The k-means algorithm can be run multiple times to reduce this effect.

K-means is a simple algorithm that has been adapted to many problem domains. As we are going to see, it is a good candidate for extension to work with fuzzy feature vectors.

R provides Lloyd's algorithm as an option to kmeans(); the default algorithm, by Hartigan and Wong (1979) is much smarter. Like MacQueen's algorithm (MacQueen, 1967), it updates the centroids any time a point is moved; it also makes clever (time-saving) choices in checking for the closest cluster. On the other hand Lloyd's k-means algorithm is the first and simplest of all these clustering algorithms.

Lloyd's algorithm (Lloyd, 1957) takes a set of observations or cases (think: rows of an nxp matrix, or points in Reals) and clusters them into k groups. It tries to minimize the within-cluster sum of squares

where u_i is the mean of all the points in cluster S_i. The algorithm proceeds as follows (I'll spare you the formality of the exhaustive notation):

There is a problem with R's implementation, however, and the problem arises when considering multiple starting points. I should note that it's generally prudent to consider several dierent starting points, because the algorithm is guaranteed to converge, but is not guaranteed to coverge to a global optima. This is particularly true for large, high-dimensional problems.

The link for the diagrams

https://www.ijcai.org/Proceedings/13/Papers/249.pdf


Related Solutions

1. 3 types of machine learn algorithms - regression, clustering, and classification. Please give examples to...
1. 3 types of machine learn algorithms - regression, clustering, and classification. Please give examples to each of these algorithms to explain what business question can be answered by these algorithms. 2. Please describe the overfitting issue in supervised learning, and what method do we usually use to solve it. 3. Describe the definition and difference between supervised and unsupervised learning.
Using the labor market graphs for both the classical and Keynesian models, please show the effects...
Using the labor market graphs for both the classical and Keynesian models, please show the effects of a decrease in labor demand in the economy
Using fully labeled graphs and words that clearly and fully explain these graphs show the impact...
Using fully labeled graphs and words that clearly and fully explain these graphs show the impact a tariff will have on U.S. consumers, U.S. producers, and the U.S. Treasury. Assume that before the tariff is imposed that U.S. consumers can buy goods at world prices (i.e. there are no restrictions on imports into the U.S.)
Using R to show how classification and clustering, can be applied to classify or cluster data....
Using R to show how classification and clustering, can be applied to classify or cluster data. Make sure to submit your data, results, plots, and your interpretation. a) Please use any dataset to conduct a classification analysis (logistic regression, random forest, or decision tree – better if you apply more than one method and compare) b) Please use any dataset to conduct a clustering analysis (kmeans or hierarchical or random forest – better if you apply more than one method...
Please explain the key differences between RRT* (RRT-star) and A* (A-star) algorithms.
Please explain the key differences between RRT* (RRT-star) and A* (A-star) algorithms.
Using GRAPHS and explanation, show the 3 states of the economy in the short run as...
Using GRAPHS and explanation, show the 3 states of the economy in the short run as compared to the natural rate of GDP.
Using graphs and explanation, show the 3 states of the economy in the short run as...
Using graphs and explanation, show the 3 states of the economy in the short run as compared to the natural rate of GDP.
Using graphs, show the relationship between production and costs, by using marginal product of labor (MPL),...
Using graphs, show the relationship between production and costs, by using marginal product of labor (MPL), Average Product of labor (APL), Marginal Cost (MC), and Average Cost (AC) curves?
Please show your answer and a draw the graphs. A radar unit is used to measure...
Please show your answer and a draw the graphs. A radar unit is used to measure speeds of cars on a motorway. The speeds are normally distributed with a mean of 90 km/hr and a standard deviation of 10 km/hr. What is the probability that a car picked at random is travelling at more than 100 km/hr? For a certain type of computers, the length of time bewteen charges of the battery is normally distributed with a mean of 50...
Using IS and LM graphs, explain PLEASE USE GRAPHS!!!! Monetary policy which targets the interest rate...
Using IS and LM graphs, explain PLEASE USE GRAPHS!!!! Monetary policy which targets the interest rate Monetary policy which targets the money supply When each of the two targeting strategies are the most effective in managing the business cycle
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT