Question

Please answer this using Rstudio

For the oyster data, calculate regression fits (simple regression) for the 2D and 3D data

a.1) Give null and alternative hypotheses

a.2) Fit the regression model

a.3) Summarize the fit and evaluation of the regression model (is the linear relationship significant).

a.4) Calculate residuals and make a qqplot. Is the normal assumption reasonable?

Actual   2D   3D
13.04   47.907   5.136699
11.71   41.458   4.795151
17.42   60.891   6.453115
7.23   29.949   2.895239
10.03   41.616   3.672746
15.59   48.070   5.728880
9.94   34.717   3.987582
7.53   27.230   2.678423
12.73   52.712   5.481545
12.66   41.500   5.016762
10.53   31.216   3.942783
10.84   41.852   4.052638
13.12   44.608   5.334558
8.48   35.343   3.527926
14.24   47.481   5.679636
11.11   40.976   4.013992
15.35   65.361   5.565995
15.44   50.910   6.303198
5.67   22.895   1.928109
8.26   34.804   3.450164
10.95   37.156   4.707532
7.97   29.070   3.019077
7.34   24.590   2.768160
13.21   48.082   4.945743
7.83   32.118   3.138463
11.38   45.112   4.410797
11.22   37.020   4.558251
9.25   39.333   3.449867
13.75   51.351   5.609681
14.37   53.281   5.292105

Solutions

Expert Solution

a.1) Give null and alternative hypotheses

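Let the regression line be

Actual = β0 + β1 X2D + β2 X3D

Null Hypothesis H0: β1 = β2 = 0 (neither predictor has a linear relationship with Actual)

Alternative Hypothesis H1: β1 ≠ 0 or β2 ≠ 0 (at least one slope is nonzero)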

a.2) Fit the regression model

Load the data above into a data frame df. Because the column names 2D and 3D start with a digit, read.table renames them to X2D and X3D.

df = read.table("data.txt", header = TRUE)
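If data.txt is not at hand, the same data frame can be built directly from the table in the question. This is a minimal sketch that assumes the header row is exactly "Actual 2D 3D" as shown there; the name oyster_text is illustrative only.

```r
# Build the oyster data frame inline instead of reading data.txt.
# With the default check.names = TRUE, the headers 2D and 3D become X2D and X3D.
oyster_text <- "Actual 2D 3D
13.04 47.907 5.136699
11.71 41.458 4.795151
17.42 60.891 6.453115
7.23 29.949 2.895239
10.03 41.616 3.672746
15.59 48.070 5.728880
9.94 34.717 3.987582
7.53 27.230 2.678423
12.73 52.712 5.481545
12.66 41.500 5.016762
10.53 31.216 3.942783
10.84 41.852 4.052638
13.12 44.608 5.334558
8.48 35.343 3.527926
14.24 47.481 5.679636
11.11 40.976 4.013992
15.35 65.361 5.565995
15.44 50.910 6.303198
5.67 22.895 1.928109
8.26 34.804 3.450164
10.95 37.156 4.707532
7.97 29.070 3.019077
7.34 24.590 2.768160
13.21 48.082 4.945743
7.83 32.118 3.138463
11.38 45.112 4.410797
11.22 37.020 4.558251
9.25 39.333 3.449867
13.75 51.351 5.609681
14.37 53.281 5.292105"

df <- read.table(text = oyster_text, header = TRUE)

# Quick check of the structure: 30 rows, columns Actual, X2D, X3D
str(df)
```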

Fit the linear regression model with Actual as the dependent variable and all remaining columns as predictors.

model = lm(Actual ~ ., data = df)

The output of the model is,

> model

Call:
lm(formula = Actual ~ ., data = df)

Coefficients:
(Intercept)          X2D          X3D
   -0.04645      0.06815      1.93979
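The question wording mentions simple regression, while the fit above is a multiple regression with both predictors. As a hedged sketch of the simple-regression reading, each predictor can be fitted on its own. The data are re-entered as vectors so the block runs by itself, and the names fit_2d and fit_3d are illustrative rather than part of the original solution.

```r
# Oyster data from the question, entered column by column
Actual <- c(13.04, 11.71, 17.42, 7.23, 10.03, 15.59, 9.94, 7.53, 12.73, 12.66,
            10.53, 10.84, 13.12, 8.48, 14.24, 11.11, 15.35, 15.44, 5.67, 8.26,
            10.95, 7.97, 7.34, 13.21, 7.83, 11.38, 11.22, 9.25, 13.75, 14.37)
X2D <- c(47.907, 41.458, 60.891, 29.949, 41.616, 48.070, 34.717, 27.230,
         52.712, 41.500, 31.216, 41.852, 44.608, 35.343, 47.481, 40.976,
         65.361, 50.910, 22.895, 34.804, 37.156, 29.070, 24.590, 48.082,
         32.118, 45.112, 37.020, 39.333, 51.351, 53.281)
X3D <- c(5.136699, 4.795151, 6.453115, 2.895239, 3.672746, 5.728880,
         3.987582, 2.678423, 5.481545, 5.016762, 3.942783, 4.052638,
         5.334558, 3.527926, 5.679636, 4.013992, 5.565995, 6.303198,
         1.928109, 3.450164, 4.707532, 3.019077, 2.768160, 4.945743,
         3.138463, 4.410797, 4.558251, 3.449867, 5.609681, 5.292105)
df <- data.frame(Actual, X2D, X3D)

# One simple regression per predictor
fit_2d <- lm(Actual ~ X2D, data = df)
fit_3d <- lm(Actual ~ X3D, data = df)

# For each fit, the t test on the slope tests H0: slope = 0 against H1: slope != 0
summary(fit_2d)$coefficients
summary(fit_3d)$coefficients
```

Comparing summary(fit_2d)$r.squared with summary(fit_3d)$r.squared then indicates which measurement explains more of the variation in Actual by itself.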

a.3) Summarize the fit and evaluation of the regression model (is the linear relationship significant).

The estimated regression equation is,

Actual = -0.04645 + 0.06815 X2D + 1.93979 X3D

Generate the summary report of the model.

> summary(model)

Call:
lm(formula = Actual ~ ., data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-1.4490 -0.3333 -0.0215  0.3746  1.2475

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.04645    0.44195  -0.105  0.91708
X2D          0.06815    0.02297   2.967  0.00623 **
X3D          1.93979    0.20216   9.595 3.42e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5737 on 27 degrees of freedom
Multiple R-squared: 0.9651,   Adjusted R-squared: 0.9625
F-statistic: 373.3 on 2 and 27 DF, p-value: < 2.2e-16

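The p-value for the F test is < 2.2e-16, which is far below any conventional significance level (for example, 0.05). So we reject H0 and conclude that there is significant evidence that at least one of the slope coefficients (β1 for X2D, β2 for X3D) is nonzero, i.e. the linear relationship is significant. The individual t tests agree: both X2D (p = 0.00623) and X3D (p = 3.42e-10) are significant, and the model explains about 96.5% of the variation in Actual (R-squared = 0.9651).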
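The figures in the summary can be cross-checked from the reported values alone. The sketch below recomputes the F-test p-value, R-squared and adjusted R-squared from F = 373.3 on 2 and 27 degrees of freedom; the variable names are illustrative.

```r
# Values taken from the summary(model) output
f_stat   <- 373.3
df_model <- 2
df_resid <- 27

# Upper-tail p-value of the overall F test
p_value <- pf(f_stat, df_model, df_resid, lower.tail = FALSE)

# R-squared implied by the F statistic: R2 = F*df1 / (F*df1 + df2)
r2 <- f_stat * df_model / (f_stat * df_model + df_resid)

# Adjusted R-squared, with n = df1 + df2 + 1 = 30 observations
n <- df_model + df_resid + 1
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid

round(c(R2 = r2, adjusted_R2 = adj_r2), 4)  # agrees with 0.9651 and 0.9625
p_value < 2.2e-16                           # TRUE
```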

a.4) Calculate residuals and make a qqplot. Is the normal assumption reasonable?

The residuals can be extracted with model$residuals (equivalently, residuals(model)).

> model$residuals
          1           2           3           4           5           6           7           8           9
-0.14264365 -0.37059721  0.79889624 -0.38080399  0.11586169  1.24754228 -0.11466858  0.52507979 -1.44904467
         10          11          12          13          14          15          16          17          18
 0.14666218  0.80083412  0.17286799 -0.22161268 -0.72569721  0.03320704  0.57753467  0.14507837 -0.21006581
         19          20          21          22          23          24          25          26          27
 0.41597079 -0.75812123 -0.66744260  0.17888274  0.34093189  0.38584368 -0.40042976 -0.20406170 -0.09860046
         28          29          30
-0.07620812 -0.58484561  0.51964982

The normal Q-Q plot is the second diagnostic plot produced by plot(model); plot(model, which = 2) draws it on its own.

Since the points fall close to a straight line, the normality assumption seems reasonable.
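As a hedged sketch, the Q-Q plot can also be drawn directly from the residuals printed above, and the visual check can be supplemented with a Shapiro-Wilk test. The vector res is copied from the model$residuals output; the Shapiro-Wilk test is an addition here, not part of the original solution.

```r
# Residuals copied from the model$residuals output
res <- c(-0.14264365, -0.37059721,  0.79889624, -0.38080399,  0.11586169,
          1.24754228, -0.11466858,  0.52507979, -1.44904467,  0.14666218,
          0.80083412,  0.17286799, -0.22161268, -0.72569721,  0.03320704,
          0.57753467,  0.14507837, -0.21006581,  0.41597079, -0.75812123,
         -0.66744260,  0.17888274,  0.34093189,  0.38584368, -0.40042976,
         -0.20406170, -0.09860046, -0.07620812, -0.58484561,  0.51964982)

# Sanity checks: least-squares residuals (with an intercept) sum to zero, and
# sqrt(RSS / 27) reproduces the residual standard error of about 0.5737
sum(res)
rse <- sqrt(sum(res^2) / 27)
rse

# Normal Q-Q plot with a reference line
qqnorm(res)
qqline(res)

# Formal test of H0: the residuals come from a normal distribution
shapiro.test(res)
```

A large p-value from shapiro.test would be consistent with the straight-line pattern in the Q-Q plot; a small one would call the normality assumption into question.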

