In: Statistics and Probability
Fuming because you are stuck in traffic? Roadway congestion is a costly item, both in time wasted and fuel wasted. Let x represent the average annual hours per person spent in traffic delays and let y represent the average annual gallons of fuel wasted per person in traffic delays. A random sample of eight cities showed the following data.
x (hr) | 29 | 5 | 20 | 35 | 22 | 25 | 19 | 5 |
y (gal) | 48 | 3 | 34 | 53 | 35 | 38 | 29 | 9 |
(a) Draw a scatter diagram for the data. Verify that Σx = 160, Σx² = 3986, Σy = 249, Σy² = 9869, and Σxy = 6258. Compute r.
The data in part (a) represent average annual hours lost per person and average annual gallons of fuel wasted per person in traffic delays. Suppose that instead of using average data for different cities, you selected one person at random from each city and measured the annual number of hours lost x for that person and the annual gallons of fuel wasted y for the same person.
x (hr) | 21 | 4 | 22 | 42 | 19 | 28 | 2 | 39 |
y (gal) | 64 | 8 | 11 | 50 | 22 | 33 | 4 | 70 |
(b) Compute x̄ and ȳ for both sets of data pairs and compare the averages. Compute the sample standard deviations sx and sy for both sets of data pairs and compare the standard deviations. In which set are the standard deviations for x and y larger?
The standard deviations for x and y are larger for the first set of data.
The standard deviations for x and y are larger for the second set of data.
The standard deviations for x and y are the same for both sets of data.
Look at the defining formula for r. Why do smaller standard deviations sx and sy tend to increase the value of r?
Dividing by smaller numbers results in a larger value.
Multiplying by smaller numbers results in a larger value.
Multiplying by smaller numbers results in a smaller value.
Dividing by smaller numbers results in a smaller value.
(c) Make a scatter diagram for the second set of data pairs. Verify that Σx = 177, Σx² = 5375, Σy = 262, Σy² = 13,270, and Σxy = 7798. Compute r.
(d) Compare r from part (a) with r from part (b). Do the data for averages have a higher correlation coefficient than the data for individual measurements?
Yes, the data for averages have a higher correlation coefficient than the data for individual measurements.
No, the data for averages do not have a higher correlation coefficient than the data for individual measurements.
List some reasons why you think hours lost per individual and fuel wasted per individual might vary more than the same quantities averaged over all the people in a city.
Here, we are given that roadway congestion is a costly item, both in time wasted and fuel wasted. Let x represent the average annual hours per person spent in traffic delays and let y represent the average annual gallons of fuel wasted per person in traffic delays. A random sample of eight cities showed the following data.
x (hr) | 29 | 5 | 20 | 35 | 22 | 25 | 19 | 5 |
y (gal) | 48 | 3 | 34 | 53 | 35 | 38 | 29 | 9 |
a) Here, we have to draw a scatter plot of the given data. We perform the following steps in MS-Excel:
1. Enter the values of x and y in two different columns.
2. Select the entered data and click the Insert tab.
3. Select the scatter plot from Charts and then select the last scatter plot type.
Following these steps produces the scatter diagram for data set 1.
To verify the given sums we construct the following table and total each column:
x | y | x² | y² | xy |
29 | 48 | 841 | 2304 | 1392 |
5 | 3 | 25 | 9 | 15 |
20 | 34 | 400 | 1156 | 680 |
35 | 53 | 1225 | 2809 | 1855 |
22 | 35 | 484 | 1225 | 770 |
25 | 38 | 625 | 1444 | 950 |
19 | 29 | 361 | 841 | 551 |
5 | 9 | 25 | 81 | 45 |
Σx = 160 | Σy = 249 | Σx² = 3986 | Σy² = 9869 | Σxy = 6258 |
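Now we compute r using the computational formula with n = 8:
r = [nΣxy - (Σx)(Σy)] / √{[nΣx² - (Σx)²][nΣy² - (Σy)²]}
= [8(6258) - (160)(249)] / √{[8(3986) - 160²][8(9869) - 249²]}
= 10224/10324.14 = 0.9903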
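The arithmetic above can be double-checked with a short script. This is only a sketch of the computational formula for r using the Python standard library; the function name `pearson_r` and the variable names are illustrative choices, not part of the original answer.

```python
from math import sqrt

# Data set 1: city averages, copied from the problem statement.
x = [29, 5, 20, 35, 22, 25, 19, 5]   # average annual hours lost per person
y = [48, 3, 34, 53, 35, 38, 29, 9]   # average annual gallons wasted per person


def pearson_r(xs, ys):
    """Sample correlation coefficient via the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(v * v for v in xs)
    sum_y2 = sum(v * v for v in ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Column totals in the order: sum x, sum x^2, sum y, sum y^2, sum xy.
sums = (
    sum(x),
    sum(v * v for v in x),
    sum(y),
    sum(v * v for v in y),
    sum(a * b for a, b in zip(x, y)),
)
print(sums)          # (160, 3986, 249, 9869, 6258)

r1 = pearson_r(x, y)
print(round(r1, 4))  # 0.9903
```

The printed totals match the sums the problem asks us to verify, and the rounded r agrees with the hand calculation.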
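b) Now, we have to calculate the mean and the sample standard deviation for both data sets, with n = 8 in each case.
For x: x̄ = Σx/n and sx = √{[Σx² - (Σx)²/n]/(n - 1)}
For y: ȳ = Σy/n and sy = √{[Σy² - (Σy)²/n]/(n - 1)}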
Now for data set 1-
Average of x = 160/8 = 20
Average of y = 249/8 = 31.125
sx = 10.5965
sy = 17.39817
For data set 2-
Average of x = 177/8 = 22.125
Average of y = 262/8 = 32.75
sx = 14.43644
sy = 25.88298
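As a cross-check on these values, here is a minimal sketch that uses `mean` and `stdev` from Python's standard `statistics` module (`stdev` uses the n - 1 divisor, matching the sample standard deviation). The dictionary layout and variable names are assumptions made for illustration.

```python
from statistics import mean, stdev

# Both data sets, copied from the problem statement.
data = {
    "Data 1": ([29, 5, 20, 35, 22, 25, 19, 5], [48, 3, 34, 53, 35, 38, 29, 9]),
    "Data 2": ([21, 4, 22, 42, 19, 28, 2, 39], [64, 8, 11, 50, 22, 33, 4, 70]),
}

# For each set, store (mean of x, mean of y, sample SD of x, sample SD of y).
summary = {
    name: (mean(xs), mean(ys), stdev(xs), stdev(ys))
    for name, (xs, ys) in data.items()
}

for name, (x_bar, y_bar, s_x, s_y) in summary.items():
    print(f"{name}: x-bar={x_bar:.3f}, y-bar={y_bar:.3f}, sx={s_x:.4f}, sy={s_y:.4f}")

# True when both standard deviations are larger in the second data set.
second_set_more_spread = (
    summary["Data 2"][2] > summary["Data 1"][2]
    and summary["Data 2"][3] > summary["Data 1"][3]
)
print(second_set_more_spread)
```

The loop prints one line per data set so the two rows can be compared side by side.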
From these values we see that the standard deviations for x and y are larger for the second set of data.
In the defining formula for r, each deviation from the mean is divided by sx or sy. Smaller standard deviations sx and sy therefore tend to increase the value of r, because dividing by smaller numbers results in a larger value.
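For reference, the defining formula is not reproduced in the answer; the standard textbook form, which is assumed here, is:

```latex
r = \frac{1}{n-1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right)
```

Both s_x and s_y appear in denominators, which is why smaller values of either one make each term of the sum larger in magnitude.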
c) We have to construct a scatter plot for data set 2.
We perform the following steps in MS-Excel:
1. Enter the values of x and y in two different columns.
2. Select the entered data and click the Insert tab.
3. Select the scatter plot from Charts and then select the last scatter plot type.
Following these steps produces the scatter diagram for data set 2.
To verify the given sums we construct the following table and total each column:
x | y | x² | y² | xy |
21 | 64 | 441 | 4096 | 1344 |
4 | 8 | 16 | 64 | 32 |
22 | 11 | 484 | 121 | 242 |
42 | 50 | 1764 | 2500 | 2100 |
19 | 22 | 361 | 484 | 418 |
28 | 33 | 784 | 1089 | 924 |
2 | 4 | 4 | 16 | 8 |
39 | 70 | 1521 | 4900 | 2730 |
Σx = 177 | Σy = 262 | Σx² = 5375 | Σy² = 13,270 | Σxy = 7798 |
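Now we compute r using the computational formula with n = 8:
r = [nΣxy - (Σx)(Σy)] / √{[nΣx² - (Σx)²][nΣy² - (Σy)²]}
= [8(7798) - (177)(262)] / √{[8(5375) - 177²][8(13,270) - 262²]}
= 16010/20924.847 = 0.7651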
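As an independent check, r can also be computed from the defining formula (the average product of standardized deviations, with divisor n - 1) rather than from the column sums. The sketch below does this for both data sets; the function name `r_defining` is a hypothetical helper introduced for illustration.

```python
from statistics import mean, stdev


def r_defining(xs, ys):
    """Sample correlation from the defining formula (sum of z-score products)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)   # sample SDs (n - 1 divisor)
    total = sum(((a - x_bar) / s_x) * ((b - y_bar) / s_y) for a, b in zip(xs, ys))
    return total / (n - 1)


# Data set 1: city averages. Data set 2: one randomly chosen person per city.
x1, y1 = [29, 5, 20, 35, 22, 25, 19, 5], [48, 3, 34, 53, 35, 38, 29, 9]
x2, y2 = [21, 4, 22, 42, 19, 28, 2, 39], [64, 8, 11, 50, 22, 33, 4, 70]

r_avg = r_defining(x1, y1)
r_ind = r_defining(x2, y2)
print(round(r_avg, 4))   # 0.9903
print(round(r_ind, 4))   # 0.7651
print(r_avg > r_ind)     # True
```

Both routes give the same rounded values, which is expected because the computational formula is an algebraic rearrangement of the defining one.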
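d) Comparing r for the two data sets (0.9903 for the city averages versus 0.7651 for the individual measurements), we conclude that yes, the data for averages have a higher correlation coefficient than the data for individual measurements.
Hours lost and fuel wasted vary more for individuals than for city averages because averaging smooths out the differences between people. Individuals may differ, for example, in commute distance, route, time of travel, and vehicle fuel efficiency, so a single person's values can lie far from the city mean. This is consistent with the larger standard deviations found for the individual measurements (data set 2) and the smaller standard deviations found for the city averages (data set 1).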