Question

Please use R or RStudio for this exercise and show everything, including the R output. Pay attention to everything in bold, please.

" The quality of Pinot Noir wine is thought to be related to the properties of clarity, aroma, body, flavor, and oakiness. Data for 38 wines are given in stat5_prob1.

(a) Fit a multiple linear regression model relating wine quality to these regressors.

(b) Construct the ANOVA table.

(c) Test for the significance of the regression at the 0.05 significance level. What conclusions can you draw?

(d) Use t tests to assess the individual contribution of each regressor to the model at the 0.05 significance level. Discuss your findings.

(e) What is the contribution of the set of clarity and aroma to the model, given that all of the other regressors are included? Perform this hypothesis test at the 0.05 significance level.

(f) Find a 95% confidence interval for the regression coefficient for flavor.

(g) Calculate R^2 and R^2 adj for this model. Compare these values to the R^2 and R^2 adj for the regression model relating wine quality to aroma and flavor. Discuss your results.

***Here is the data for the 38 wines***

# quality is y
# clarity is x1, aroma is x2, body is x3, flavor is x4, oakiness is x5.

y=c(9.8, 12.6, 11.9, 11.1, 13.3, 12.8, 12.8, 12, 13.6, 13.9, 14.4, 12.3, 16.1, 16.1, 15.5, 15.5, 13.8, 13.8, 11.3, 7.9, 15.1, 13.5, 10.8, 9.5, 12.7, 11.6, 11.7, 11.9, 10.8, 8.5, 10.7, 9.1, 12.1, 14.9, 13.5, 12.2, 10.3, 13.2)

x1=c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5, 0.8, 0.7, 1, 0.9, 1, 1, 1, 0.9, 0.9, 1, 0.7, 0.7, 1, 1, 1, 1, 1, 1, 1, 0.8, 1, 1, 0.8, 0.8, 0.8, 0.8)

x2=c(3.3, 4.4, 3.9, 3.9, 5.6, 4.6, 4.8, 5.3, 4.3, 4.3, 5.1, 3.3, 5.9, 7.7, 7.1, 5.5, 6.3, 5, 4.6, 3.4, 6.4, 5.5, 4.7, 4.1, 6, 4.3, 3.9, 5.1, 3.9, 4.5, 5.2, 4.2, 3.3, 6.8, 5, 3.5, 4.3, 5.2)

x3=c(2.8, 4.9, 5.3, 2.6, 5.1, 4.7, 4.8, 4.5, 4.3, 3.9, 4.3, 5.4, 5.7, 6.6, 4.4, 5.6, 5.4, 5.5, 4.1, 5, 5.4, 5.3, 4.1, 4, 5.4, 4.6, 4, 4.9, 4.4, 3.7, 4.3, 3.8, 3.5, 5, 5.7, 4.7, 5.5, 4.8)

x4=c(3.1, 3.5, 4.8, 3.1, 5.5, 5, 4.8, 4.3, 3.9, 4.7, 4.5, 4.3, 7, 6.7, 5.8, 5.6, 4.8, 5.5, 4.3, 3.4, 6.6, 5.3, 5, 4.1, 5.7, 4.7, 5.1, 5, 5, 2.9, 5, 3, 4.3, 6, 5.5, 4.2, 3.5, 5.7)

x5=c(4.1, 3.9, 4.7, 3.6, 5.1, 4.1, 3.3, 5.2, 2.9, 3.9, 3.6, 3.6, 4.1, 3.7, 4.1, 4.4, 4.6, 4.1, 3.1, 3.4, 4.8, 3.8, 3.7, 4, 4.7, 4.9, 5.1, 5.1, 4.4, 3.9, 6, 4.7, 4.5, 5.2, 4.8, 3.3, 5.8, 3.5)

Solutions

Expert Solution

y=c(9.8, 12.6, 11.9, 11.1, 13.3, 12.8, 12.8, 12, 13.6, 13.9, 14.4, 12.3, 16.1, 16.1, 15.5, 15.5, 13.8, 13.8, 11.3, 7.9, 15.1, 13.5, 10.8, 9.5, 12.7, 11.6, 11.7, 11.9, 10.8, 8.5, 10.7, 9.1, 12.1, 14.9, 13.5, 12.2, 10.3, 13.2)
x1=c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5, 0.8, 0.7, 1, 0.9, 1, 1, 1, 0.9, 0.9, 1, 0.7, 0.7, 1, 1, 1, 1, 1, 1, 1, 0.8, 1, 1, 0.8, 0.8, 0.8, 0.8)

x2=c(3.3, 4.4, 3.9, 3.9, 5.6, 4.6, 4.8, 5.3, 4.3, 4.3, 5.1, 3.3, 5.9, 7.7, 7.1, 5.5, 6.3, 5, 4.6, 3.4, 6.4, 5.5, 4.7, 4.1, 6, 4.3, 3.9, 5.1, 3.9, 4.5, 5.2, 4.2, 3.3, 6.8, 5, 3.5, 4.3, 5.2)

x3=c(2.8, 4.9, 5.3, 2.6, 5.1, 4.7, 4.8, 4.5, 4.3, 3.9, 4.3, 5.4, 5.7, 6.6, 4.4, 5.6, 5.4, 5.5, 4.1, 5, 5.4, 5.3, 4.1, 4, 5.4, 4.6, 4, 4.9, 4.4, 3.7, 4.3, 3.8, 3.5, 5, 5.7, 4.7, 5.5, 4.8)

x4=c(3.1, 3.5, 4.8, 3.1, 5.5, 5, 4.8, 4.3, 3.9, 4.7, 4.5, 4.3, 7, 6.7, 5.8, 5.6, 4.8, 5.5, 4.3, 3.4, 6.6, 5.3, 5, 4.1, 5.7, 4.7, 5.1, 5, 5, 2.9, 5, 3, 4.3, 6, 5.5, 4.2, 3.5, 5.7)

x5=c(4.1, 3.9, 4.7, 3.6, 5.1, 4.1, 3.3, 5.2, 2.9, 3.9, 3.6, 3.6, 4.1, 3.7, 4.1, 4.4, 4.6, 4.1, 3.1, 3.4, 4.8, 3.8, 3.7, 4, 4.7, 4.9, 5.1, 5.1, 4.4, 3.9, 6, 4.7, 4.5, 5.2, 4.8, 3.3, 5.8, 3.5)
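
# (a) Full model: quality (y) on clarity, aroma, body, flavor and oakiness
lm_y <- lm(y ~ x1 + x2 + x3 + x4 + x5)
# Coefficient t tests for (d), overall F statistic for (c), R^2 and adjusted R^2 for (g)
summary(lm_y)
# (g) Comparison model: the question asks for aroma (x2) and flavor (x4)
lm_aroma_flavor <- lm(y ~ x2 + x4)
summary(lm_aroma_flavor)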
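
The two summary() calls above give the coefficient t tests, the overall F statistic and R^2, but parts (b), (e) and (f) each need one more call. What follows is a minimal sketch of how parts (b) through (g) could be worked with base R only. The data frame `wine`, its descriptive column names, and the object names (`full`, `null_model`, `no_clar_aroma`, `reduced`, `r2`) are assumptions introduced here for readability; they are not part of the original question or solution. No output is shown because it has to come from running the code.

```r
# Data exactly as given in the question, with descriptive column names
# (the names are an assumption; the question calls them y, x1, ..., x5).
wine <- data.frame(
  quality  = c(9.8, 12.6, 11.9, 11.1, 13.3, 12.8, 12.8, 12, 13.6, 13.9, 14.4, 12.3, 16.1, 16.1, 15.5, 15.5, 13.8, 13.8, 11.3, 7.9, 15.1, 13.5, 10.8, 9.5, 12.7, 11.6, 11.7, 11.9, 10.8, 8.5, 10.7, 9.1, 12.1, 14.9, 13.5, 12.2, 10.3, 13.2),
  clarity  = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5, 0.8, 0.7, 1, 0.9, 1, 1, 1, 0.9, 0.9, 1, 0.7, 0.7, 1, 1, 1, 1, 1, 1, 1, 0.8, 1, 1, 0.8, 0.8, 0.8, 0.8),
  aroma    = c(3.3, 4.4, 3.9, 3.9, 5.6, 4.6, 4.8, 5.3, 4.3, 4.3, 5.1, 3.3, 5.9, 7.7, 7.1, 5.5, 6.3, 5, 4.6, 3.4, 6.4, 5.5, 4.7, 4.1, 6, 4.3, 3.9, 5.1, 3.9, 4.5, 5.2, 4.2, 3.3, 6.8, 5, 3.5, 4.3, 5.2),
  body     = c(2.8, 4.9, 5.3, 2.6, 5.1, 4.7, 4.8, 4.5, 4.3, 3.9, 4.3, 5.4, 5.7, 6.6, 4.4, 5.6, 5.4, 5.5, 4.1, 5, 5.4, 5.3, 4.1, 4, 5.4, 4.6, 4, 4.9, 4.4, 3.7, 4.3, 3.8, 3.5, 5, 5.7, 4.7, 5.5, 4.8),
  flavor   = c(3.1, 3.5, 4.8, 3.1, 5.5, 5, 4.8, 4.3, 3.9, 4.7, 4.5, 4.3, 7, 6.7, 5.8, 5.6, 4.8, 5.5, 4.3, 3.4, 6.6, 5.3, 5, 4.1, 5.7, 4.7, 5.1, 5, 5, 2.9, 5, 3, 4.3, 6, 5.5, 4.2, 3.5, 5.7),
  oakiness = c(4.1, 3.9, 4.7, 3.6, 5.1, 4.1, 3.3, 5.2, 2.9, 3.9, 3.6, 3.6, 4.1, 3.7, 4.1, 4.4, 4.6, 4.1, 3.1, 3.4, 4.8, 3.8, 3.7, 4, 4.7, 4.9, 5.1, 5.1, 4.4, 3.9, 6, 4.7, 4.5, 5.2, 4.8, 3.3, 5.8, 3.5)
)

# (a) Full model with all five regressors
full <- lm(quality ~ clarity + aroma + body + flavor + oakiness, data = wine)

# (b) ANOVA table. anova() on a single model reports sequential (Type I)
# sums of squares, one row per regressor plus a Residuals row; adding the
# five regressor rows gives the regression sum of squares.
aov_tab <- anova(full)
print(aov_tab)
ss_reg <- sum(aov_tab[["Sum Sq"]][1:5])
ss_res <- aov_tab[["Sum Sq"]][6]

# (c) Overall significance: F test of the full model against the
# intercept-only model (H0: all five slopes are zero).
null_model <- lm(quality ~ 1, data = wine)
overall <- anova(null_model, full)
print(overall)

# (d) Individual t tests: estimate, standard error, t value, p-value
print(coef(summary(full)))

# (e) Partial F test for clarity and aroma, given body, flavor, oakiness
no_clar_aroma <- lm(quality ~ body + flavor + oakiness, data = wine)
partial <- anova(no_clar_aroma, full)
print(partial)

# (f) 95% confidence interval for the flavor coefficient
ci_flavor <- confint(full, "flavor", level = 0.95)
print(ci_flavor)

# (g) R^2 and adjusted R^2: full model vs. the aroma + flavor model
reduced <- lm(quality ~ aroma + flavor, data = wine)
r2 <- function(m) c(R2 = summary(m)$r.squared, adjR2 = summary(m)$adj.r.squared)
print(rbind(full = r2(full), aroma_flavor = r2(reduced)))
```

For (c), (d) and (e), reject the null hypothesis when the reported p-value is below 0.05. For (g), keep in mind that R^2 can never decrease when regressors are added, so the full model's R^2 is at least as large as the two-regressor model's by construction; adjusted R^2 penalizes the extra terms and is the fairer basis for comparing the two models.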

