Question

A data set is provided, entitled oldfaithful_asst, on the duration and height of the Old Faithful geyser in the Yellowstone National Park.

  1. Construct a scatterplot of the variables “duration” and “height” using Excel or any other software (such as SPSS or Minitab). Title the graph “Scatterplot 1 Old Faithful” and label both axes.
  2. There seems to be an outlier in the data set. Although an outlier is not a detriment to the data analysis, as part of an exercise, identify the outlier and delete it (simply erase its value; do not replace it with zero).
     • Show the new (second) scatterplot and title the graph “Scatterplot 2 Old Faithful.”
     • Describe the pattern that emerges. What might this relationship imply?
  3. Compute the correlation coefficient between the two variables and interpret this correlation. Refer to both the strength and direction of the correlation in your interpretation. Also interpret the correlation in terms of r-squared (coefficient of determination).
  4. Conduct a regression to predict the duration of an eruption from its height, and interpret the sample slope coefficient in words.
  5. What is the predicted duration for an eruption with a height of 115?

Old Faithful data set to answer problem:

Duration Height
240 140
237 140
250 148
243 130
255 125
120 110
260 136
178 125
259 115
245 120
234 120
213 120
255 150
235 140
250 136
110 120
245 148
269 130
251 130
234 136
252 130
254 115
273 136
266 130
284 138
252 120
269 120
250 120
261 95
253 140
255 125
280 130
270 130
241 110
272 110
294 125
440 250
220 150
253 130
245 120
274 95

Solutions

Expert Solution

a)

b)

First, the data point (Duration = 440, Height = 250) was removed as an outlier.
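The solution does not say how the outlier was identified. One common screen, shown here as a minimal sketch and not necessarily the method used above, flags any observation whose height lies more than three sample standard deviations from the mean height. The threshold of 3 and the helper name `flag_outliers` are assumptions made for illustration.

```python
from statistics import mean, stdev

# (duration, height) pairs copied from the data set in the question.
DATA = [
    (240, 140), (237, 140), (250, 148), (243, 130), (255, 125), (120, 110),
    (260, 136), (178, 125), (259, 115), (245, 120), (234, 120), (213, 120),
    (255, 150), (235, 140), (250, 136), (110, 120), (245, 148), (269, 130),
    (251, 130), (234, 136), (252, 130), (254, 115), (273, 136), (266, 130),
    (284, 138), (252, 120), (269, 120), (250, 120), (261, 95), (253, 140),
    (255, 125), (280, 130), (270, 130), (241, 110), (272, 110), (294, 125),
    (440, 250), (220, 150), (253, 130), (245, 120), (274, 95),
]


def flag_outliers(pairs, threshold=3.0):
    """Return the pairs whose height z-score exceeds the threshold in absolute value."""
    heights = [h for _, h in pairs]
    m, s = mean(heights), stdev(heights)  # stdev is the sample standard deviation
    return [(d, h) for d, h in pairs if abs((h - m) / s) > threshold]


outliers = flag_outliers(DATA)
cleaned = [p for p in DATA if p not in outliers]

print(outliers)      # [(440, 250)]
print(len(cleaned))  # 40
```

With this rule only (440, 250) is flagged, leaving 40 observations for the rest of the analysis.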

With the outlier removed, Duration shows no clear change as Height increases.

This pattern indicates that there is little or no linear relationship between the two variables.

c)

Pearson correlation of Duration and Height = 0.092
The correlation indicates a very weak positive linear relationship between the two variables.

r² ≈ 0.0084 (0.008464 if the rounded value r = 0.092 is squared)

In terms of the coefficient of determination, only about 0.84% of the variability in Duration is explained by Height.
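These figures can be checked against the raw data. The sketch below computes Pearson's r directly from its definition using only the Python standard library, on the 40 observations that remain after (440, 250) is dropped. The function name `pearson_r` is illustrative.

```python
from math import sqrt

# (duration, height) pairs from the question, with the outlier (440, 250) removed.
DATA = [
    (240, 140), (237, 140), (250, 148), (243, 130), (255, 125), (120, 110),
    (260, 136), (178, 125), (259, 115), (245, 120), (234, 120), (213, 120),
    (255, 150), (235, 140), (250, 136), (110, 120), (245, 148), (269, 130),
    (251, 130), (234, 136), (252, 130), (254, 115), (273, 136), (266, 130),
    (284, 138), (252, 120), (269, 120), (250, 120), (261, 95), (253, 140),
    (255, 125), (280, 130), (270, 130), (241, 110), (272, 110), (294, 125),
    (220, 150), (253, 130), (245, 120), (274, 95),
]


def pearson_r(pairs):
    """Pearson correlation: S_dh / sqrt(S_dd * S_hh), using sums of centered products."""
    n = len(pairs)
    mean_d = sum(d for d, _ in pairs) / n
    mean_h = sum(h for _, h in pairs) / n
    s_dh = sum((d - mean_d) * (h - mean_h) for d, h in pairs)
    s_dd = sum((d - mean_d) ** 2 for d, _ in pairs)
    s_hh = sum((h - mean_h) ** 2 for _, h in pairs)
    return s_dh / sqrt(s_dd * s_hh)


r = pearson_r(DATA)
r_squared = r ** 2
print(f"r = {r:.3f}, r^2 = {r_squared:.4f}")  # r = 0.092, r^2 = 0.0084
```

Squaring the unrounded r gives roughly 0.0084, which agrees with the 0.84% quoted above. The figure 0.008464 is what results from squaring the already rounded 0.092.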

d)

The regression equation is given by

Duration = 212.8 + 0.253 Height

The slope coefficient is 0.253: for each one-unit increase in Height, the predicted Duration increases by 0.253 units on average.
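The fitted equation can be reproduced with ordinary least squares. This is a minimal sketch in plain Python, assuming the 40 observations left after the outlier is removed; the function name `fit_line` is illustrative, and statistical packages such as Minitab or Excel should give the same coefficients.

```python
# (duration, height) pairs from the question, with the outlier (440, 250) removed.
DATA = [
    (240, 140), (237, 140), (250, 148), (243, 130), (255, 125), (120, 110),
    (260, 136), (178, 125), (259, 115), (245, 120), (234, 120), (213, 120),
    (255, 150), (235, 140), (250, 136), (110, 120), (245, 148), (269, 130),
    (251, 130), (234, 136), (252, 130), (254, 115), (273, 136), (266, 130),
    (284, 138), (252, 120), (269, 120), (250, 120), (261, 95), (253, 140),
    (255, 125), (280, 130), (270, 130), (241, 110), (272, 110), (294, 125),
    (220, 150), (253, 130), (245, 120), (274, 95),
]


def fit_line(pairs):
    """Least-squares fit of duration on height: duration = intercept + slope * height."""
    n = len(pairs)
    mean_d = sum(d for d, _ in pairs) / n
    mean_h = sum(h for _, h in pairs) / n
    s_hd = sum((h - mean_h) * (d - mean_d) for d, h in pairs)
    s_hh = sum((h - mean_h) ** 2 for _, h in pairs)
    slope = s_hd / s_hh
    intercept = mean_d - slope * mean_h  # the fitted line passes through the means
    return intercept, slope


intercept, slope = fit_line(DATA)
print(f"Duration = {intercept:.1f} + {slope:.3f} * Height")
# prints: Duration = 212.8 + 0.253 * Height
```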

e)

The predicted value of Duration when Height = 115 is 212.8 + 0.253 × 115 = 241.895.
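The arithmetic can be written as a small function. This sketch uses the rounded coefficients from the regression equation above; the names `INTERCEPT`, `SLOPE`, and `predict_duration` are illustrative.

```python
# Rounded coefficients taken from the regression equation in part d).
INTERCEPT = 212.8
SLOPE = 0.253


def predict_duration(height):
    """Predicted duration from the fitted line: intercept + slope * height."""
    return INTERCEPT + SLOPE * height


predicted = predict_duration(115)
print(round(predicted, 3))  # 241.895
```

A height of 115 lies inside the range of heights observed once the outlier is removed (95 to 150), so this prediction is an interpolation. Given how weak the correlation is, the result is close to the mean duration of about 245.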
