Question

Use the data below and find the clusters using a single link technique. Use Euclidean distance and draw the dendrogram.

Point  X     Y
P1     0.35  0.48
P2     0.17  0.33
P3     0.30  0.28
P4     0.21  0.18
P5     0.08  0.29

Solutions

Expert Solution

Sol:

In single-link (single-linkage) clustering, the distance between two clusters is the shortest distance between any point in one cluster and any point in the other.
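
As an illustration of this definition, here is a minimal sketch in base R. The helper name single_link and the matrix pts are hypothetical and not part of the original solution; the example measures the single-link distance between the groups {P2, P5} and {P3, P4} from the question's data.

```r
# Hypothetical helper: single-link distance between two clusters,
# each given as a numeric matrix with one point per row.
single_link <- function(A, B) {
  # Pairwise Euclidean distances over all rows of A and B stacked together
  d <- as.matrix(dist(rbind(A, B), method = "euclidean"))
  nA <- nrow(A)
  # Keep only the A-to-B block and return its smallest entry
  min(d[seq_len(nA), nA + seq_len(nrow(B)), drop = FALSE])
}

# The five points from the question
pts <- rbind(
  P1 = c(0.35, 0.48),
  P2 = c(0.17, 0.33),
  P3 = c(0.30, 0.28),
  P4 = c(0.21, 0.18),
  P5 = c(0.08, 0.29)
)

# Closest pair across the two groups is P2 and P3
d_example <- single_link(pts[c("P2", "P5"), , drop = FALSE],
                         pts[c("P3", "P4"), , drop = FALSE])
round(d_example, 4)
# 0.1393
```

The helper recomputes all pairwise distances on every call, which is fine for five points but would be wasteful on large data.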

Obtain the dissimilarity matrix with the dist function, specifying method = "euclidean".

Perform hierarchical clustering on that matrix with hclust, using single linkage (method = "single").

Then draw the dendrogram with the plot function in R.

Rcode:

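# Read the five points; the first column supplies the row names P1..P5
df1 <- read.table(header = TRUE, text = "
     X     Y
P1   0.35  0.48
P2   0.17  0.33
P3   0.30  0.28
P4   0.21  0.18
P5   0.08  0.29
")
df1

# Euclidean dissimilarity matrix between the points
dm <- dist(df1, method = "euclidean")

# Agglomerative hierarchical clustering with single linkage
hc1us <- hclust(dm, method = "single")

# Dendrogram; hang = -1 aligns all leaf labels at the baseline
plot(hc1us, cex = 0.6, hang = -1)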

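Output: a dendrogram of P1 to P5 (the plot image is not reproduced here). With these data, P2 and P5 merge first at a height of about 0.098, then P3 and P4 at about 0.135, then those two clusters join at about 0.139, and finally P1 joins at about 0.206.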
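
The merge order and heights can also be checked numerically instead of being read off the plot. The following is a minimal sketch, assuming the same five points and base R only; the data frame is rebuilt so the block runs on its own, and the choice of k = 3 in cutree is just an example, not something the question asks for.

```r
# Rebuild the data so this block is self-contained
df1 <- data.frame(
  X = c(0.35, 0.17, 0.30, 0.21, 0.08),
  Y = c(0.48, 0.33, 0.28, 0.18, 0.29),
  row.names = paste0("P", 1:5)
)

hc1us <- hclust(dist(df1, method = "euclidean"), method = "single")

# Merge history: negative entries are original points, positive entries
# refer to clusters formed in earlier rows
hc1us$merge

# Heights (single-link distances) at which each merge happens
heights <- round(hc1us$height, 4)
heights
# 0.0985 0.1345 0.1393 0.2062

# Cluster labels if the tree is cut into three groups (illustrative choice)
groups <- cutree(hc1us, k = 3)
groups
```

Cutting at three groups leaves P1 on its own, with P2 and P5 in one cluster and P3 and P4 in another.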

