Fuming because you are stuck in traffic? Roadway congestion is a costly item, both in time wasted and fuel wasted. Let x represent the average annual hours per person spent in traffic delays and let y represent the average annual gallons of fuel wasted per person in traffic delays. A random sample of eight cities showed the following data.

x (hr)   27   5  18  39  23  25  17   5
y (gal)  46   3  31  57  33  39  27   9

(a) Draw a scatter diagram for the data. Verify that Σx = 159, Σx² = 4067, Σy = 245, Σy² = 9755, and Σxy = 6276. Compute r.

The data in part (a) represent average annual hours lost per person and average annual gallons of fuel wasted per person in traffic delays. Suppose that instead of using average data for different cities, you selected one person at random from each city and measured the annual number of hours lost x for that person and the annual gallons of fuel wasted y for the same person.

x (hr)   21   4  22  44  15  29   2  38
y (gal)  63   8  15  55  21  30   4  73

(b) Compute x̄ and ȳ for both sets of data pairs and compare the averages. Compute the sample standard deviations sx and sy for both sets of data pairs and compare the standard deviations. In which set are the standard deviations for x and y larger?
The standard deviations for x and y are larger for the first set of data.
The standard deviations for x and y are larger for the second set of data.
The standard deviations for x and y are the same for both sets of data.

Look at the defining formula for r. Why do smaller standard deviations sx and sy tend to increase the value of r?
Multiplying by smaller numbers results in a larger value.
Dividing by smaller numbers results in a larger value.
Multiplying by smaller numbers results in a smaller value.
Dividing by smaller numbers results in a smaller value.

(c) Make a scatter diagram for the second set of data pairs. Verify that Σx = 175, Σx² = 5391, Σy = 269, Σy² = 13,969, and Σxy = 8072. Compute r.

(d) Compare r from part (a) with r from part (b). Do the data for averages have a higher correlation coefficient than the data for individual measurements?
Yes, the data for averages have a higher correlation coefficient than the data for individual measurements.
No, the data for averages do not have a higher correlation coefficient than the data for individual measurements.

List some reasons why you think hours lost per individual and fuel wasted per individual might vary more than the same quantities averaged over all the people in a city.
In this problem,
a.
We first draw a scatter diagram of the data, with x (hours) on the horizontal axis and y (gallons) on the vertical axis.
Summing the data confirms that Σx = 159, Σx² = 4067, Σy = 245, Σy² = 9755, and Σxy = 6276.
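Now, the Pearson correlation coefficient can be computed from these sums, with n = 8 data pairs, as
r = [nΣxy − (Σx)(Σy)] / [√(nΣx² − (Σx)²) · √(nΣy² − (Σy)²)]
Applying this formula to the data set, we get
r = (8·6276 − 159·245) / [√(8·4067 − 159²) · √(8·9755 − 245²)] = 11253 / (√7255 · √18015) ≈ 0.9843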
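As a quick check of the arithmetic, here is a minimal Python sketch that plugs the verified sums into the computation formula for r. The function name `pearson_r_from_sums` is only an illustrative choice and does not come from the problem.

```python
import math

def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson r from summary sums (computation formula)."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Sums for the eight city averages (first data set)
r1 = pearson_r_from_sums(8, 159, 245, 4067, 9755, 6276)
print(round(r1, 4))  # 0.9843
```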
The question then changes the setup: instead of using average data for different cities, one person is selected at random from each city, and the annual hours lost x and the annual gallons of fuel wasted y are measured for that same person.
b.
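We need to compare the averages, using x̄ = Σx/n and ȳ = Σy/n with n = 8.
For the first data set, x̄ = 159/8 = 19.875 and ȳ = 245/8 = 30.625.
For the second data set, x̄ = 175/8 = 21.875 and ȳ = 269/8 = 33.625.
The averages are fairly close: the second set's means are higher by only 2 hours and 3 gallons.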
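The sample means can be checked directly from the raw data. This is a small sketch using Python's standard `statistics` module; the variable names (`x1`, `y1`, `x2`, `y2`, `means`) are just labels chosen here for the two data sets.

```python
from statistics import mean

# First data set: city averages
x1 = [27, 5, 18, 39, 23, 25, 17, 5]
y1 = [46, 3, 31, 57, 33, 39, 27, 9]
# Second data set: one randomly chosen person per city
x2 = [21, 4, 22, 44, 15, 29, 2, 38]
y2 = [63, 8, 15, 55, 21, 30, 4, 73]

means = {
    "Data 1": (mean(x1), mean(y1)),
    "Data 2": (mean(x2), mean(y2)),
}
for label, (x_bar, y_bar) in means.items():
    print(label, x_bar, y_bar)
# Data 1 19.875 30.625
# Data 2 21.875 33.625
```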
Next, the sample standard deviations are:
For the first data set, sx ≈ 11.382 and sy ≈ 17.936.
For the second data set, sx ≈ 14.942 and sy ≈ 26.522.
Conclusion: The standard deviations for x and y are larger for the second set of data.
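The standard deviations can be reproduced with `statistics.stdev`, which uses the sample (n − 1) definition. This is a sketch for checking the values, with variable names chosen here for illustration.

```python
from statistics import stdev

x1 = [27, 5, 18, 39, 23, 25, 17, 5]
y1 = [46, 3, 31, 57, 33, 39, 27, 9]
x2 = [21, 4, 22, 44, 15, 29, 2, 38]
y2 = [63, 8, 15, 55, 21, 30, 4, 73]

# stdev() divides by n - 1, i.e. it is the sample standard deviation
sx1, sy1 = stdev(x1), stdev(y1)
sx2, sy2 = stdev(x2), stdev(y2)

print(round(sx1, 3), round(sy1, 3))  # 11.382 17.936
print(round(sx2, 3), round(sy2, 3))  # 14.942 26.522
print(sx2 > sx1 and sy2 > sy1)       # True: second set is more spread out
```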
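For the second data set, r ≈ 0.789, which is lower than the first set's r ≈ 0.984.
In the defining formula for r, the standard deviations sx and sy are in the denominator. Dividing by smaller numbers results in a larger value, so smaller standard deviations tend to increase r and larger ones tend to decrease it.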
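For reference, the defining formula the question alludes to is usually written as below. This is the standard textbook form, stated here as background rather than taken from the answer, with n the number of data pairs.

```latex
r = \frac{1}{n-1} \sum \left(\frac{x - \bar{x}}{s_x}\right)\left(\frac{y - \bar{y}}{s_y}\right)
  = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n-1)\, s_x\, s_y}
```

The question says "tend to" because a change in spread also changes the sum of cross products in the numerator, so the effect of sx and sy on r is a tendency rather than a guarantee.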
c.
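We draw the scatter diagram for the second data set in the same way as in part a.
Next, summing the second data set confirms that Σx = 175, Σx² = 5391, Σy = 269, Σy² = 13,969, and Σxy = 8072.
Using the same computation formula for r with n = 8, we get r = (8·8072 − 175·269) / [√(8·5391 − 175²) · √(8·13,969 − 269²)] = 17501 / (√12503 · √39391) ≈ 0.789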
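The sums and r for the second data set can be verified from the raw data pairs. The sketch below recomputes each sum and then applies the computation formula; the variable names are illustrative only.

```python
import math

# Second data set: one randomly chosen person per city
x2 = [21, 4, 22, 44, 15, 29, 2, 38]
y2 = [63, 8, 15, 55, 21, 30, 4, 73]

n = len(x2)
sum_x = sum(x2)
sum_y = sum(y2)
sum_x2 = sum(x * x for x in x2)
sum_y2 = sum(y * y for y in y2)
sum_xy = sum(x * y for x, y in zip(x2, y2))
print(sum_x, sum_x2, sum_y, sum_y2, sum_xy)  # 175 5391 269 13969 8072

# Computation formula for the Pearson correlation coefficient
r2 = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(round(r2, 3))  # 0.789
```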
d.
Yes, the data for averages have a higher correlation coefficient than the data for individual measurements.
There are several reasons why hours lost per individual and fuel wasted per individual might vary more than the same quantities averaged over all the people in a city. Some people switch off their engines while waiting at a signal, and the vehicles people drive differ in how much fuel they use. A city average pools many people, so individual extremes tend to cancel out, whereas a single randomly chosen person may be far above or below that average.
Thank you!!