Question

Using the ruspini dataset provided with the cluster package in R, perform a k-means analysis. Document the findings and justify the choice of K. Hint: use data(ruspini) to load the dataset into the R workspace.

Solutions

Expert Solution

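R code:

library(cluster)   # provides the ruspini dataset
data(ruspini)      # 75 points with coordinates x and y

set.seed(1)        # k-means starts from random centers, so fix the seed
k <- 4             # a scatter plot of ruspini shows four well-separated groups
fit <- kmeans(ruspini, centers = k, nstart = 25)
fit

Printing fit shows the cluster sizes, the cluster means, the clustering vector, and the within-cluster sums of squares.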
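
The question also asks for a justification of K. One common way to support the choice is sketched below, as an illustration rather than the only valid approach: compute the total within-cluster sum of squares for a range of K and look for an "elbow", then cross-check with the average silhouette width from the cluster package. The range 1:8, the seed, nstart = 25, and the variable names ks, wss, and avg_sil are assumptions made for this example.

```r
library(cluster)
data(ruspini)

set.seed(1)  # assumed seed, only to make the random starts reproducible
ks <- 1:8    # assumed range of candidate K values

# Elbow heuristic: total within-cluster sum of squares for each K
wss <- sapply(ks, function(k) {
  kmeans(ruspini, centers = k, nstart = 25)$tot.withinss
})
plot(ks, wss, type = "b",
     xlab = "Number of clusters K",
     ylab = "Total within-cluster sum of squares")

# Cross-check: average silhouette width (defined only for K >= 2)
d <- dist(ruspini)
avg_sil <- sapply(ks[-1], function(k) {
  cl <- kmeans(ruspini, centers = k, nstart = 25)$cluster
  mean(silhouette(cl, d)[, "sil_width"])
})
names(avg_sil) <- ks[-1]
print(round(avg_sil, 3))

# Fit and plot the K = 4 solution with its centers marked
fit <- kmeans(ruspini, centers = 4, nstart = 25)
plot(ruspini, col = fit$cluster, pch = 19)
points(fit$centers, pch = 8, cex = 2)
```

If the sum-of-squares curve flattens after a given K and the silhouette width peaks at the same value, that K is a defensible choice to report. The scatter plot should then show whether the fitted clusters match the visible groups.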

