Question

To be completed using RStudio. (Please display all R code used.)

Regression

Is there a relationship between the number of stories a building has and its height? Some statisticians compiled data on a set of n = 60 buildings reported in the World Almanac. You will use the data set to decide whether height (in feet) can be predicted from the number of stories.

Use the data from buildings.txt.
(Note that this is a text file, so use the appropriate instruction. If you are having trouble uploading the data, open it to see its contents and type the data in: one vector for heights and one vector for stories. Ignore the year data.)

buildings.txt

YEAR   Height   Stories
1990   770   54
1980   677   47
1990   428   28
1989   410   38
1966   371   29
1976   504   38
1974   1136   80
1991   695   52
1982   551   45
1986   550   40
1931   568   49
1979   504   33
1988   560   50
1973   512   40
1981   448   31
1983   538   40
1968   410   27
1927   409   31
1969   504   35
1988   777   57
1987   496   31
1960   386   26
1984   530   39
1976   360   25
1920   355   23
1931   1250   102
1989   802   72
1907   741   57
1988   739   54
1990   650   56
1973   592   45
1983   577   42
1971   500   36
1969   469   30
1971   320   22
1988   441   31
1989   845   52
1973   435   29
1987   435   34
1931   375   20
1931   364   33
1924   340   18
1931   375   23
1991   450   30
1973   529   38
1976   412   31
1990   722   62
1983   574   48
1984   498   29
1986   493   40
1986   379   30
1992   579   42
1973   458   36
1988   454   33
1979   952   72
1972   784   57
1930   476   34
1978   453   46
1978   440   30
1977   428   21

  1. Draw a scatterplot with stories on the x-axis and height on the y-axis. Describe the trend, strength and shape of the relationship between stories and height.

  2. Find the linear correlation coefficient between these variables. How does it support the description you gave in part 1?

  3. Draw diagnostic plots (a plot of stories vs. residuals, and a normal probability plot for the residuals). Do assumptions appear to be satisfied?

  4. Obtain a 95% confidence interval for the true value of the slope. How does the interval support your conclusion in (e)?

  5. What is the estimated height of a building that is 45 stories high? Write a concluding sentence supported by your results above.

Solutions

Expert Solution
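
The question asks for the data to be read from buildings.txt as a text file. A minimal sketch of that step follows, assuming the file is whitespace-separated with the header row YEAR Height Stories as shown in the question. The commented read.table() call is what would be run against the real file; the temporary file and the names demo_file and buildings_demo are hypothetical and only make the example self-contained.

```r
# Assumption: buildings.txt is in the working directory and has the header
# row "YEAR Height Stories" followed by one building per line.
# buildings <- read.table("buildings.txt", header = TRUE)

# Self-contained demonstration on the first three rows of the file.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("YEAR   Height   Stories",
             "1990   770   54",
             "1980   677   47",
             "1990   428   28"), demo_file)

# read.table() splits on any run of whitespace by default.
buildings_demo <- read.table(demo_file, header = TRUE)
str(buildings_demo)

# The YEAR column is ignored, as the question instructs.
height_demo  <- buildings_demo$Height
stories_demo <- buildings_demo$Stories
```

If the file cannot be read, the question's fallback is to type the heights and stories in as two vectors.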

From the scatterplot, height increases as the number of stories increases, so the trend is positive. The relationship between stories and height is strong, and its shape is linear.
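
The scatterplot and the linear correlation coefficient can be obtained along the following lines. This is a sketch that types the 60 observations in as two vectors, which the question allows. The names height, stories and r are chosen here for illustration. For a simple linear regression, the square of r equals the R-Sq value in the regression output.

```r
# Heights (feet) and stories, typed in row order from buildings.txt.
height <- c(770, 677, 428, 410, 371, 504, 1136, 695, 551, 550,
            568, 504, 560, 512, 448, 538, 410, 409, 504, 777,
            496, 386, 530, 360, 355, 1250, 802, 741, 739, 650,
            592, 577, 500, 469, 320, 441, 845, 435, 435, 375,
            364, 340, 375, 450, 529, 412, 722, 574, 498, 493,
            379, 579, 458, 454, 952, 784, 476, 453, 440, 428)
stories <- c(54, 47, 28, 38, 29, 38, 80, 52, 45, 40,
             49, 33, 50, 40, 31, 40, 27, 31, 35, 57,
             31, 26, 39, 25, 23, 102, 72, 57, 54, 56,
             45, 42, 36, 30, 22, 31, 52, 29, 34, 20,
             33, 18, 23, 30, 38, 31, 62, 48, 29, 40,
             30, 42, 36, 33, 72, 57, 34, 46, 30, 21)

# Scatterplot with stories on the x-axis and height on the y-axis.
plot(stories, height,
     xlab = "Number of stories", ylab = "Height (feet)",
     main = "Height versus stories")

# Pearson linear correlation coefficient.
r <- cor(stories, height)
print(r)
```

A value of r close to +1 would support the description of a strong, positive, linear relationship.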

Regression Analysis: Height versus Stories

The regression equation is
Height = 90.3 + 11.3 Stories

Predictor   Coef      SE Coef   T       P
Constant    90.31     20.96     4.31    0.000
Stories     11.2924   0.4844    23.31   0.000

S = 58.3259   R-Sq = 90.4%   R-Sq(adj) = 90.2%
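
The regression output above is not in R's format, so the following sketch shows how an equivalent fit might be obtained in R with lm(). The data frame name buildings and the object name fit are assumptions made for this example. If the data are entered correctly, summary(fit) should report the same coefficients, standard errors and R-squared.

```r
# Heights (feet) and stories, typed in row order from buildings.txt.
height <- c(770, 677, 428, 410, 371, 504, 1136, 695, 551, 550,
            568, 504, 560, 512, 448, 538, 410, 409, 504, 777,
            496, 386, 530, 360, 355, 1250, 802, 741, 739, 650,
            592, 577, 500, 469, 320, 441, 845, 435, 435, 375,
            364, 340, 375, 450, 529, 412, 722, 574, 498, 493,
            379, 579, 458, 454, 952, 784, 476, 453, 440, 428)
stories <- c(54, 47, 28, 38, 29, 38, 80, 52, 45, 40,
             49, 33, 50, 40, 31, 40, 27, 31, 35, 57,
             31, 26, 39, 25, 23, 102, 72, 57, 54, 56,
             45, 42, 36, 30, 22, 31, 52, 29, 34, 20,
             33, 18, 23, 30, 38, 31, 62, 48, 29, 40,
             30, 42, 36, 33, 72, 57, 34, 46, 30, 21)
buildings <- data.frame(height, stories)

# Least-squares fit of height on stories.
fit <- lm(height ~ stories, data = buildings)

# Coefficient table, residual standard error and R-squared.
print(summary(fit))

# Intercept and slope on their own.
b <- coef(fit)
print(b)
```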

Comment: In the plot of residuals against stories, if about four points in the upper-right quadrant (0 to 90 degrees) are set aside, the remaining residuals show a slight negative linear pattern. Taken as a whole, however, the residuals are scattered randomly, which is consistent with the residuals being independent of stories.

In the normal probability plot, the residuals are consistent with a normal distribution at the 0.05 level of significance.

Based on these two plots, the assumptions appear to be satisfied.
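
The two diagnostic plots discussed above can be drawn roughly as follows. This is a sketch, with the object names fit and res chosen for illustration. The Shapiro-Wilk test at the end is an added assumption: the solution does not say which normality check was used, so it stands in as one possible numerical check at the 0.05 level.

```r
# Heights (feet) and stories, typed in row order from buildings.txt.
height <- c(770, 677, 428, 410, 371, 504, 1136, 695, 551, 550,
            568, 504, 560, 512, 448, 538, 410, 409, 504, 777,
            496, 386, 530, 360, 355, 1250, 802, 741, 739, 650,
            592, 577, 500, 469, 320, 441, 845, 435, 435, 375,
            364, 340, 375, 450, 529, 412, 722, 574, 498, 493,
            379, 579, 458, 454, 952, 784, 476, 453, 440, 428)
stories <- c(54, 47, 28, 38, 29, 38, 80, 52, 45, 40,
             49, 33, 50, 40, 31, 40, 27, 31, 35, 57,
             31, 26, 39, 25, 23, 102, 72, 57, 54, 56,
             45, 42, 36, 30, 22, 31, 52, 29, 34, 20,
             33, 18, 23, 30, 38, 31, 62, 48, 29, 40,
             30, 42, 36, 33, 72, 57, 34, 46, 30, 21)

fit <- lm(height ~ stories)
res <- resid(fit)

# Residuals against stories: look for random scatter around zero.
plot(stories, res,
     xlab = "Number of stories", ylab = "Residual",
     main = "Residuals versus stories")
abline(h = 0, lty = 2)

# Normal probability (Q-Q) plot of the residuals with a reference line.
qqnorm(res)
qqline(res)

# Assumed extra check, not taken from the original solution:
# Shapiro-Wilk test of normality for the residuals.
sw <- shapiro.test(res)
print(sw)
```

A curved or funnel-shaped pattern in the first plot, or points bending away from the line in the second, would cast doubt on the linearity, constant-variance or normality assumptions.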

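The 95% confidence interval for the true value of the slope, using the estimate and standard error from the regression output with t(0.025, 58) = 2.002, is

11.2924 ± 2.002 × 0.4844 = (10.32, 12.26).

The 95% confidence interval does not include zero. Hence, we can conclude that the number of stories has a significant effect on height at the 0.05 level of significance.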
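
In R, the interval for the slope can be obtained with confint(). The sketch below also recomputes it by hand as estimate ± t × standard error with n − 2 = 58 degrees of freedom, so the two results can be compared. The object names fit, ci and manual are chosen for this example.

```r
# Heights (feet) and stories, typed in row order from buildings.txt.
height <- c(770, 677, 428, 410, 371, 504, 1136, 695, 551, 550,
            568, 504, 560, 512, 448, 538, 410, 409, 504, 777,
            496, 386, 530, 360, 355, 1250, 802, 741, 739, 650,
            592, 577, 500, 469, 320, 441, 845, 435, 435, 375,
            364, 340, 375, 450, 529, 412, 722, 574, 498, 493,
            379, 579, 458, 454, 952, 784, 476, 453, 440, 428)
stories <- c(54, 47, 28, 38, 29, 38, 80, 52, 45, 40,
             49, 33, 50, 40, 31, 40, 27, 31, 35, 57,
             31, 26, 39, 25, 23, 102, 72, 57, 54, 56,
             45, 42, 36, 30, 22, 31, 52, 29, 34, 20,
             33, 18, 23, 30, 38, 31, 62, 48, 29, 40,
             30, 42, 36, 33, 72, 57, 34, 46, 30, 21)

fit <- lm(height ~ stories)

# 95% confidence interval for the slope, directly from the fitted model.
ci <- confint(fit, "stories", level = 0.95)
print(ci)

# The same interval by hand: estimate +/- t critical value * standard error.
est    <- coef(fit)[["stories"]]
se     <- summary(fit)$coefficients["stories", "Std. Error"]
t_crit <- qt(0.975, df = df.residual(fit))
manual <- est + c(-1, 1) * t_crit * se
print(manual)
```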

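The estimated height of a building that is 45 stories high is

Ŷ = 90.31 + 11.29 × 45 = 598.36 feet.

For a building with 45 stories, the estimated mean height is about 598.36 feet. Together with the significant slope and R-Sq = 90.4%, this supports predicting height from the number of stories.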
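
The same estimate can be obtained in R with predict(). In this sketch the object names fit, new_building and pred are assumptions for illustration. Because predict() uses the unrounded coefficients, its result may differ slightly in the decimals from the hand calculation above.

```r
# Heights (feet) and stories, typed in row order from buildings.txt.
height <- c(770, 677, 428, 410, 371, 504, 1136, 695, 551, 550,
            568, 504, 560, 512, 448, 538, 410, 409, 504, 777,
            496, 386, 530, 360, 355, 1250, 802, 741, 739, 650,
            592, 577, 500, 469, 320, 441, 845, 435, 435, 375,
            364, 340, 375, 450, 529, 412, 722, 574, 498, 493,
            379, 579, 458, 454, 952, 784, 476, 453, 440, 428)
stories <- c(54, 47, 28, 38, 29, 38, 80, 52, 45, 40,
             49, 33, 50, 40, 31, 40, 27, 31, 35, 57,
             31, 26, 39, 25, 23, 102, 72, 57, 54, 56,
             45, 42, 36, 30, 22, 31, 52, 29, 34, 20,
             33, 18, 23, 30, 38, 31, 62, 48, 29, 40,
             30, 42, 36, 33, 72, 57, 34, 46, 30, 21)

fit <- lm(height ~ stories)

# The column name in newdata must match the predictor name in the formula.
new_building <- data.frame(stories = 45)

# Point estimate of the mean height for a 45-story building.
pred <- predict(fit, newdata = new_building)
print(pred)

# Optional: 95% confidence interval for that mean height.
print(predict(fit, newdata = new_building, interval = "confidence"))
```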

