Question

INTRODUCTION TO DATA MINING

Question 3: K-means clustering
Use the k-means algorithm and Euclidean distance to cluster the following seven examples into two clusters:
A1=(1, 1), A2=(1.5, 2), A3=(3,4), A4=(5,7), A5=(3.5,5), A6=(4.5,5), A7=(3.5,4.5)
Suppose that the initial seeds (centers of each cluster) are A1 and A4. Run the k-means algorithm for 2 epochs. At
the end of the second epoch, show:
a) Distance matrix by calculating Euclidean distance.
b) The new clusters (i.e. the examples belonging to each cluster)
c) The centers of the new clusters
d) Where should the algorithm stop?

Solutions

Expert Solution

At the end of the second epoch:

Let C1 and C2 denote the cluster means. After the 2nd epoch they are:

C1 = (1.25 , 1.5)

C2 = (3.9 , 5.1)

a) The distance matrix (Euclidean distance from each cluster mean to each example):

      A1    A2    A3    A4    A5    A6    A7
C1   0.56  0.56  3.05  6.66  4.16  4.78  3.75
C2   5.02  3.92  1.42  2.20  0.41  0.61  0.72

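The Euclidean distance between two points p and q is given as $d(p, q) = \sqrt{\sum_i (p_i - q_i)^2}$, where $p_i$ and $q_i$ are the ith dimensions of the points.

For example, the Euclidean distance between C1 = (1.25, 1.5) and A1 = (1, 1) can be calculated as $\sqrt{(1.25 - 1)^2 + (1.5 - 1)^2} = \sqrt{0.0625 + 0.25} = \sqrt{0.3125} \approx 0.56$.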
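The distance matrix can be reproduced with a few lines of Python. This is a minimal sketch rather than part of the original solution: the names `points`, `centers`, `euclidean` and `distance_matrix` are chosen here for illustration, and the values are rounded to two decimals to match the table.

```python
import math

# The seven examples from the question.
points = {
    "A1": (1, 1), "A2": (1.5, 2), "A3": (3, 4), "A4": (5, 7),
    "A5": (3.5, 5), "A6": (4.5, 5), "A7": (3.5, 4.5),
}
# Cluster means after the 2nd epoch, as stated in the solution.
centers = {"C1": (1.25, 1.5), "C2": (3.9, 5.1)}


def euclidean(p, q):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# One row per cluster mean, one column per example.
distance_matrix = {
    c_name: {p_name: round(euclidean(c, p), 2) for p_name, p in points.items()}
    for c_name, c in centers.items()
}

for c_name, row in distance_matrix.items():
    print(c_name, row)
```

Each printed row should agree with the corresponding row of the table in part a).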

b) The new clusters are as follows:

Each data point is assigned to the cluster whose mean is closest to it, hence:

  • C1 contains A1 and A2.
  • C2 contains A3, A4, A5, A6 and A7.
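The assignment rule (each point goes to its nearest cluster mean) can be sketched as follows. The variable names are illustrative assumptions, and `math.dist` requires Python 3.8 or later.

```python
import math

points = {
    "A1": (1, 1), "A2": (1.5, 2), "A3": (3, 4), "A4": (5, 7),
    "A5": (3.5, 5), "A6": (4.5, 5), "A7": (3.5, 4.5),
}
centers = {"C1": (1.25, 1.5), "C2": (3.9, 5.1)}

# Assign every example to the cluster whose mean is nearest.
clusters = {name: [] for name in centers}
for label, p in points.items():
    nearest = min(centers, key=lambda c: math.dist(centers[c], p))
    clusters[nearest].append(label)

print(clusters)
```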

c) The centers of the clusters after the 2nd epoch are:

C1 = (1.25 , 1.5)

C2 = (3.9 , 5.1)
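Each center is the coordinate-wise mean of the examples in its cluster. A small sketch of that update step, with a hypothetical helper `centroid` introduced only for illustration:

```python
def centroid(members):
    """Coordinate-wise mean of a list of 2-D points."""
    xs, ys = zip(*members)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Cluster memberships taken from part b).
c1 = centroid([(1, 1), (1.5, 2)])
c2 = centroid([(3, 4), (5, 7), (3.5, 5), (4.5, 5), (3.5, 4.5)])

print(c1, c2)
```

For C1 this is ((1 + 1.5) / 2, (1 + 2) / 2), and for C2 it is (19.5 / 5, 25.5 / 5).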

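d) After the 2nd epoch the algorithm should stop: with these centers every point is already closest to the mean of its own cluster (see the distance matrix in part a), so a further epoch would change neither the cluster assignments nor the centers.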
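The stopping rule can be checked by running the whole procedure from the seeds A1 and A4 until the assignments stop changing. The sketch below is one possible implementation, not the original author's code. Note one assumption: in the first epoch A3 is equidistant from both seeds (both distances are $\sqrt{13}$), and this sketch breaks the tie in favour of the first center. Breaking it the other way leads to the same centers after the 2nd epoch.

```python
import math

points = {
    "A1": (1, 1), "A2": (1.5, 2), "A3": (3, 4), "A4": (5, 7),
    "A5": (3.5, 5), "A6": (4.5, 5), "A7": (3.5, 4.5),
}


def kmeans(points, seeds, max_epochs=10):
    """Plain k-means; returns the final assignment and the centers per epoch."""
    centers = list(seeds)
    history = []          # centers recorded after each update step
    assignment = None
    for _ in range(max_epochs):
        # Assignment step; ties go to the lowest-index center (assumption).
        new_assignment = {
            label: min(range(len(centers)),
                       key=lambda i: math.dist(centers[i], p))
            for label, p in points.items()
        }
        if new_assignment == assignment:
            break         # nothing moved, so the algorithm has converged
        assignment = new_assignment
        # Update step: each center becomes the mean of its members.
        for i in range(len(centers)):
            members = [points[l] for l, c in assignment.items() if c == i]
            if members:
                centers[i] = tuple(sum(v) / len(members) for v in zip(*members))
        history.append(list(centers))
    return assignment, history


assignment, history = kmeans(points, [points["A1"], points["A4"]])
for epoch, centers in enumerate(history, start=1):
    print("epoch", epoch, centers)
```

With this tie-breaking, the centers change in epochs 1 and 2, and the third pass only confirms that no assignment changes, which is why the loop records exactly two updates.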

