Fuming because you are stuck in traffic? Roadway congestion is a costly item, both in time wasted and fuel wasted. Let x represent the average annual hours per person spent in traffic delays and let y represent the average annual gallons of fuel wasted per person in traffic delays. A random sample of eight cities showed the following data.
x (hr) | 29 | 5 | 23 | 37 | 18 | 23 | 19 | 5
y (gal) | 45 | 3 | 31 | 53 | 31 | 38 | 28 | 9
(a) Draw a scatter diagram for the data. Verify that Σx = 159, Σx² = 4003, Σy = 238, Σy² = 9074, and Σxy = 6003. Compute r.
The data in part (a) represent average annual hours lost per person and average annual gallons of fuel wasted per person in traffic delays. Suppose that instead of using average data for different cities, you selected one person at random from each city and measured the annual number of hours lost x for that person and the annual gallons of fuel wasted y for the same person.
x (hr) | 22 | 4 | 19 | 40 | 19 | 26 | 2 | 39
y (gal) | 63 | 8 | 12 | 54 | 21 | 35 | 4 | 71
(b) Compute x̄ and ȳ for both sets of data pairs and compare the averages. Compute the sample standard deviations sx and sy for both sets of data pairs and compare the standard deviations. In which set are the standard deviations for x and y larger?
The standard deviations for x and y are larger for the first set of data.
The standard deviations for x and y are larger for the second set of data.
The standard deviations for x and y are the same for both sets of data.
Look at the defining formula for r. Why do smaller standard deviations sx and sy tend to increase the value of r?
Dividing by smaller numbers results in a smaller value.
Multiplying by smaller numbers results in a larger value.
Multiplying by smaller numbers results in a smaller value.
Dividing by smaller numbers results in a larger value.
(c) Make a scatter diagram for the second set of data pairs. Verify that Σx = 171, Σx² = 5023, Σy = 268, Σy² = 13,816, and Σxy = 7892. Compute r.
(d) Compare r from part (a) with r from part (b). Do the data for averages have a higher correlation coefficient than the data for individual measurements?
No, the data for averages do not have a higher correlation coefficient than the data for individual measurements.
Yes, the data for averages have a higher correlation coefficient than the data for individual measurements.
List some reasons why you think hours lost per individual and fuel wasted per individual might vary more than the same quantities averaged over all the people in a city.
Data 1:
[Scatter plot of y (gal) against x (hr) for Data 1]
The following table shows the calculations:
Pair | X | Y | X^2 | Y^2 | XY
1 | 29 | 45 | 841 | 2025 | 1305
2 | 5 | 3 | 25 | 9 | 15
3 | 23 | 31 | 529 | 961 | 713
4 | 37 | 53 | 1369 | 2809 | 1961
5 | 18 | 31 | 324 | 961 | 558
6 | 23 | 38 | 529 | 1444 | 874
7 | 19 | 28 | 361 | 784 | 532
8 | 5 | 9 | 25 | 81 | 45
Total | 159 | 238 | 4003 | 9074 | 6003
Sample size: n=8
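The coefficient of correlation is:
r = [nΣxy − (Σx)(Σy)] / [√(nΣx² − (Σx)²) · √(nΣy² − (Σy)²)]
= [8(6003) − (159)(238)] / [√(8(4003) − 159²) · √(8(9074) − 238²)]
= 10182 / (√6743 · √15948) ≈ 0.982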
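As a cross-check, the computational formula for r can be evaluated directly from the column totals in the table above. The following is a minimal Python sketch; the function name `pearson_r_from_sums` is an illustrative choice, not something from the original solution.

```python
import math

def pearson_r_from_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Sample correlation coefficient from column totals (computational formula)."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Totals for Data 1 (city averages), copied from the Total row of the table.
r_data1 = pearson_r_from_sums(n=8, sum_x=159, sum_x2=4003,
                              sum_y=238, sum_y2=9074, sum_xy=6003)
print(round(r_data1, 3))  # 0.982
```

Working from the totals mirrors the hand calculation, so each intermediate quantity can be compared with the table.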
The following table shows the calculations:
Pair | X | Y | (x-mean)^2 | (y-mean)^2
1 | 29 | 45 | 83.265625 | 232.5625
2 | 5 | 3 | 221.265625 | 715.5625
3 | 23 | 31 | 9.765625 | 1.5625
4 | 37 | 53 | 293.265625 | 540.5625
5 | 18 | 31 | 3.515625 | 1.5625
6 | 23 | 38 | 9.765625 | 68.0625
7 | 19 | 28 | 0.765625 | 3.0625
8 | 5 | 9 | 221.265625 | 430.5625
Total | 159 | 238 | 842.875 | 1993.5
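Mean:
x̄ = Σx/n = 159/8 = 19.875, ȳ = Σy/n = 238/8 = 29.75
The standard deviation:
sx = √[Σ(x − x̄)²/(n − 1)] = √(842.875/7) ≈ 10.97, sy = √[Σ(y − ȳ)²/(n − 1)] = √(1993.5/7) ≈ 16.88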
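The means and sample standard deviations for Data 1 can be checked with Python's standard `statistics` module. This is a sketch for verification only; the variable names are illustrative, and `statistics.stdev` uses the n − 1 divisor, matching the sample standard deviation asked for in part (b).

```python
import statistics

# Data 1: city averages from the problem statement.
x1 = [29, 5, 23, 37, 18, 23, 19, 5]
y1 = [45, 3, 31, 53, 31, 38, 28, 9]

mean_x1 = statistics.mean(x1)
mean_y1 = statistics.mean(y1)
s_x1 = statistics.stdev(x1)  # sample standard deviation (divides by n - 1)
s_y1 = statistics.stdev(y1)

print(mean_x1, mean_y1)                # 19.875 29.75
print(round(s_x1, 2), round(s_y1, 2))  # 10.97 16.88
```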
Data 2:
[Scatter plot of y (gal) against x (hr) for Data 2]
The following table shows the calculations:
Pair | X | Y | X^2 | Y^2 | XY
1 | 22 | 63 | 484 | 3969 | 1386
2 | 4 | 8 | 16 | 64 | 32
3 | 19 | 12 | 361 | 144 | 228
4 | 40 | 54 | 1600 | 2916 | 2160
5 | 19 | 21 | 361 | 441 | 399
6 | 26 | 35 | 676 | 1225 | 910
7 | 2 | 4 | 4 | 16 | 8
8 | 39 | 71 | 1521 | 5041 | 2769
Total | 171 | 268 | 5023 | 13816 | 7892
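The totals in the last row of this table can be verified from the raw data pairs. A short Python sketch follows; the dictionary keys are illustrative names, not notation from the original.

```python
# Data 2: individual measurements from the problem statement.
x2 = [22, 4, 19, 40, 19, 26, 2, 39]
y2 = [63, 8, 12, 54, 21, 35, 4, 71]

totals = {
    "sum_x": sum(x2),
    "sum_y": sum(y2),
    "sum_x2": sum(x * x for x in x2),
    "sum_y2": sum(y * y for y in y2),
    "sum_xy": sum(x * y for x, y in zip(x2, y2)),
}
print(totals)
# {'sum_x': 171, 'sum_y': 268, 'sum_x2': 5023, 'sum_y2': 13816, 'sum_xy': 7892}
```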
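The coefficient of correlation is:
r = [nΣxy − (Σx)(Σy)] / [√(nΣx² − (Σx)²) · √(nΣy² − (Σy)²)]
= [8(7892) − (171)(268)] / [√(8(5023) − 171²) · √(8(13816) − 268²)]
= 17308 / (√10943 · √38704) ≈ 0.841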
The following table shows the calculations (using ȳ = 268/8 = 33.5):
Pair | X | Y | (x-mean)^2 | (y-mean)^2
1 | 22 | 63 | 0.390625 | 870.25
2 | 4 | 8 | 301.890625 | 650.25
3 | 19 | 12 | 5.640625 | 462.25
4 | 40 | 54 | 346.890625 | 420.25
5 | 19 | 21 | 5.640625 | 156.25
6 | 26 | 35 | 21.390625 | 2.25
7 | 2 | 4 | 375.390625 | 870.25
8 | 39 | 71 | 310.640625 | 1406.25
Total | 171 | 268 | 1367.875 | 4838
Mean:
x̄ = Σx/n = 171/8 = 21.375, ȳ = Σy/n = 268/8 = 33.5
The standard deviation:
sx = √[Σ(x − x̄)²/(n − 1)] = √(1367.875/7) ≈ 13.98, sy = √[Σ(y − ȳ)²/(n − 1)] = √(4838/7) ≈ 26.29
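The averages are similar for the two data sets (x̄: 19.875 vs. 21.375; ȳ: 29.75 vs. 33.5), but the standard deviations for x and y are larger for the second set of data (sx: 10.97 vs. 13.98; sy: 16.88 vs. 26.29).
In the defining formula r = Σ(x − x̄)(y − ȳ) / [(n − 1)·sx·sy], the standard deviations appear in the denominator, so dividing by smaller numbers results in a larger value.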
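To see where sx and sy enter the calculation, r can also be computed from its defining formula, r = Σ(x − x̄)(y − ȳ) / [(n − 1)·sx·sy]. The sketch below applies it to both data sets; the function name `pearson_r_defining` is an illustrative choice.

```python
import statistics

def pearson_r_defining(xs, ys):
    """Correlation via the defining formula, with s_x and s_y in the denominator."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / ((n - 1) * s_x * s_y)

datasets = {
    "Data 1": ([29, 5, 23, 37, 18, 23, 19, 5], [45, 3, 31, 53, 31, 38, 28, 9]),
    "Data 2": ([22, 4, 19, 40, 19, 26, 2, 39], [63, 8, 12, 54, 21, 35, 4, 71]),
}
results = {}
for label, (xs, ys) in datasets.items():
    results[label] = (round(statistics.stdev(xs), 2),
                      round(statistics.stdev(ys), 2),
                      round(pearson_r_defining(xs, ys), 3))
    print(label, *results[label])
# Data 1 10.97 16.88 0.982
# Data 2 13.98 26.29 0.841
```

The numerator differs between the two data sets as well, so the larger denominator is only part of the explanation for the lower r in Data 2.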
----------------
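(d)
Yes, the data for averages have a higher correlation coefficient (r ≈ 0.982) than the data for individual measurements (r ≈ 0.841).
Individual values vary more because personal factors such as commute distance, route, vehicle fuel efficiency, and driving habits differ from person to person; averaging over all the people in a city smooths out these differences.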
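The effect of averaging on variability can be illustrated with a small simulation. This is a sketch under loudly stated assumptions: the delay distribution (mean 20 hours, standard deviation 12 hours), the 200 cities, and the 50 people per city are hypothetical values chosen for illustration, not data from the problem.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

def simulated_person_hours():
    # Hypothetical model of one person's annual delay (illustrative numbers only).
    return max(0.0, random.gauss(20, 12))

n_cities, people_per_city = 200, 50

# One randomly chosen person per city.
individuals = [simulated_person_hours() for _ in range(n_cities)]

# The average over many people in each city.
city_averages = [
    statistics.mean([simulated_person_hours() for _ in range(people_per_city)])
    for _ in range(n_cities)
]

sd_individuals = statistics.stdev(individuals)
sd_averages = statistics.stdev(city_averages)
print(sd_individuals > sd_averages)  # averages are far less spread out
```

In this toy model every city shares the same distribution, so the city averages differ only through sampling variation; real cities also differ in their true averages, but the smoothing effect of averaging is the same.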