Question


install.packages("mosaic")
library(mosaic)
Data=(RailTrail)
RailTrail

The data set above, RailTrail, is available in R once the mosaic package is loaded.

(a) Fit a multiple regression model that predicts the variable volume from the variables hightemp, lowtemp, cloudcover, and precip. Interpret and discuss all the necessary statistics from the output.

(b) Test whether cloudcover can be dropped from the regression model given that precipitation, hightemp, and lowtemp are retained. Use the F statistic and level of significance 0.01. State the hypotheses, p-value, and conclusion in terms of the problem. Hint: This can be achieved using ANOVA.

(c) Assess whether both lowtemp and cloudcover can be dropped from the model given that hightemp and precipitation are retained. Discuss your results giving all relevant details to your solution. This includes any graphs or plots.

Solutions


a)

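# install.packages("mosaic")  # run once if the package is not installed
library(mosaic)

d = RailTrail  # data set made available by loading mosaic

# Full model: trail volume on temperature, cloud cover, and precipitation.
# attach(d) is not needed because lm() is given data = d.
model = lm(volume ~ hightemp + lowtemp + cloudcover + precip, data = d)
summary(model)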

#output

Call:
lm(formula = volume ~ hightemp + lowtemp + cloudcover + precip,
    data = d)

Residuals:
     Min       1Q   Median       3Q      Max
-269.447  -37.449    4.186   41.178  299.266

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   35.308     59.796   0.590   0.5564
hightemp       6.571      1.153   5.699  1.7e-07 ***
lowtemp       -1.290      1.387  -0.930   0.3551
cloudcover    -7.501      3.851  -1.948   0.0547 .
precip      -100.616     42.064  -2.392   0.0190 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 93.2 on 85 degrees of freedom
Multiple R-squared: 0.4894,   Adjusted R-squared: 0.4654
F-statistic: 20.37 on 4 and 85 DF, p-value: 8.537e-12

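From the Coefficients table

The p-value 0.5564 belongs to the intercept, not to a predictor. It only says that the intercept (35.308) is not significantly different from zero; it does not tell us whether volume is linearly related to the independent variables.

hightemp: estimate 6.571, p-value 1.7e-07 < 0.01. Holding the other variables fixed, each additional degree of high temperature is associated with about 6.6 more trail users. This is the only predictor that is significant at the 0.01 level.

lowtemp: estimate -1.290, p-value 0.3551. Not significant once the other predictors are in the model.

cloudcover: estimate -7.501, p-value 0.0547. Each additional unit of cloud cover is associated with about 7.5 fewer users. Significant at the 0.10 level, but not at 0.05 or 0.01.

precip: estimate -100.616, p-value 0.0190. Each additional unit of precipitation is associated with about 100.6 fewer users. Significant at the 0.05 level, but not at 0.01.

Residual standard error: 93.2 on 85 degrees of freedom, so observed volumes typically differ from the fitted values by about 93 users.

Multiple R-squared: 0.4894, which indicates that 48.94% of the variation in volume is explained by the four independent variables together. The adjusted R-squared, which penalizes the number of predictors, is 0.4654.

The p-value for the overall F statistic (20.37 on 4 and 85 DF) is 8.537e-12 < 0.01, so we reject the hypothesis that all four slope coefficients are zero. At least one of the independent variables is useful for predicting volume; this test does not say that every one of them is significant.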
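The quantities discussed in part (a) can also be pulled out of the fitted model directly instead of being read off the printed summary. The following is a minimal sketch, assuming the mosaic package is installed; the object names coef_table, ci99, r2, and adj_r2 are illustrative and not part of the original solution. The 99% confidence level is chosen here only to match the 0.01 significance level used in the question.

```r
library(mosaic)  # loading mosaic makes the RailTrail data set available

# Refit the full model so that this block runs on its own
model <- lm(volume ~ hightemp + lowtemp + cloudcover + precip, data = RailTrail)

# Coefficient table as a matrix: estimate, standard error, t value, p-value
coef_table <- coef(summary(model))
print(round(coef_table, 4))

# 99% confidence intervals for the coefficients; an interval that contains 0
# corresponds to a t test that is not significant at the 0.01 level
ci99 <- confint(model, level = 0.99)
print(ci99)

# Proportion of variation in volume explained by the model
r2     <- summary(model)$r.squared
adj_r2 <- summary(model)$adj.r.squared
cat("R-squared:", r2, " Adjusted R-squared:", adj_r2, "\n")
```

Reading the intervals alongside the p-values gives the same conclusions as the table, but also shows how precisely each slope is estimated.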

(b) Test whether cloudcover can be dropped from the regression model given that precipitation, hightemp, and lowtemp are retained.

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   35.308     59.796   0.590   0.5564
hightemp       6.571      1.153   5.699  1.7e-07 ***
lowtemp       -1.290      1.387  -0.930   0.3551
cloudcover    -7.501      3.851  -1.948   0.0547 .
precip      -100.616     42.064  -2.392   0.0190 *

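State the hypotheses

H0: the coefficient of cloudcover is 0, given that precipitation, hightemp, and lowtemp are in the model (cloudcover can be dropped).
H1: the coefficient of cloudcover is not 0 (cloudcover should be retained).

Use the F statistic and level of significance 0.01

Dropping a single variable is a partial F test with 1 and 85 degrees of freedom. For one coefficient, the partial F statistic equals the square of its t value, so F = (-1.948)^2 ≈ 3.79, and its p-value is the same as the p-value of the t test.

p-value = 0.0547

Conclusion: Since the p-value 0.0547 > 0.01, we fail to reject H0. At the 0.01 level of significance there is not enough evidence that cloudcover improves the prediction of volume once precipitation, hightemp, and lowtemp are in the model, so cloudcover can be dropped.

Note that the F-statistic of 20.37 on 4 and 85 DF (p-value 8.537e-12) in the summary output tests all four independent variables jointly. It shows that at least one of them is useful, but it is not the test for dropping cloudcover alone.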
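The hint in part (b) points to ANOVA: the test compares the model without cloudcover to the full model. A minimal sketch of that comparison, assuming the mosaic package is installed; the names full, reduced, anova_b, f_b, p_b, t_cloud, and drop_cloudcover are illustrative and not from the original solution.

```r
library(mosaic)  # loading mosaic makes the RailTrail data set available

# Full model and the reduced model with cloudcover removed
full    <- lm(volume ~ hightemp + lowtemp + cloudcover + precip, data = RailTrail)
reduced <- lm(volume ~ hightemp + lowtemp + precip, data = RailTrail)

# Partial F test: does adding cloudcover reduce the residual sum of squares
# by more than chance would explain?
anova_b <- anova(reduced, full)
print(anova_b)

# Row 2 of the table holds the comparison of the two models
f_b <- anova_b$F[2]
p_b <- anova_b$`Pr(>F)`[2]

# For a single dropped variable, F equals the squared t value from summary(full)
t_cloud <- coef(summary(full))["cloudcover", "t value"]
cat("F =", f_b, " t^2 =", t_cloud^2, " p-value =", p_b, "\n")

# Decision at the 0.01 level: TRUE means cloudcover can be dropped
drop_cloudcover <- p_b > 0.01
print(drop_cloudcover)
```

Because only one variable is removed, this F test and the t test for cloudcover in the coefficients table always agree.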

(c) Assess whether both lowtemp and cloudcover can be dropped from the model given that hightemp and precipitation are retained. Discuss your results giving all relevant details to your solution. This includes any graphs or plots

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   35.308     59.796   0.590   0.5564
hightemp       6.571      1.153   5.699  1.7e-07 ***
lowtemp       -1.290      1.387  -0.930   0.3551
cloudcover    -7.501      3.851  -1.948   0.0547 .
precip      -100.616     42.064  -2.392   0.0190 *

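The individual p-values for lowtemp (0.3551) and cloudcover (0.0547) are both greater than 0.01, which suggests that each variable could be dropped. However, each t test assesses one variable while the other is still in the model, so these two p-values do not by themselves test dropping both at once.

The hypotheses for the joint test are H0: the coefficients of lowtemp and cloudcover are both 0, given that hightemp and precipitation are in the model, against H1: at least one of the two coefficients is not 0. The appropriate test is a partial F test with 2 and 85 degrees of freedom that compares the reduced model volume ~ hightemp + precip with the full model, for example using anova(). That F statistic is not part of the output shown here, so the conclusion that both variables can be dropped should be confirmed by running this comparison and by checking residual plots of the reduced model.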
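Part (c) asks about dropping two variables together and mentions graphs or plots. A minimal sketch of the joint partial F test and the standard diagnostic plots, assuming the mosaic package is installed; all object names (full, reduced, anova_c, rss_r, rss_f, q, df_f, f_c, p_c) are illustrative. No result is quoted here because this comparison does not appear in the output above.

```r
library(mosaic)  # loading mosaic makes the RailTrail data set available

# Full model and the reduced model without lowtemp and cloudcover
full    <- lm(volume ~ hightemp + lowtemp + cloudcover + precip, data = RailTrail)
reduced <- lm(volume ~ hightemp + precip, data = RailTrail)

# Joint partial F test of H0: both dropped coefficients are 0
anova_c <- anova(reduced, full)
print(anova_c)

# The same F statistic computed by hand from the residual sums of squares:
# F = ((RSS_reduced - RSS_full) / q) / (RSS_full / df_full)
rss_r <- sum(resid(reduced)^2)
rss_f <- sum(resid(full)^2)
q     <- 2                   # number of variables dropped
df_f  <- df.residual(full)   # residual degrees of freedom of the full model
f_c   <- ((rss_r - rss_f) / q) / (rss_f / df_f)
p_c   <- pf(f_c, q, df_f, lower.tail = FALSE)
cat("F =", f_c, "on", q, "and", df_f, "DF, p-value =", p_c, "\n")

# Diagnostic plots for the reduced model: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage
par(mfrow = c(2, 2))
plot(reduced)
par(mfrow = c(1, 1))
```

If p_c is greater than 0.01, H0 is not rejected and both variables can be dropped at that level. The plots help check that the reduced model still meets the usual assumptions of roughly constant variance and approximately normal residuals.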

