Use Excel to graph the following data sets (these graphs do not need to be adjustable) and then find the equation of the linear regression line and its R² value. Use the equation to answer any questions.
A. The data shows the results of a study that compared the average number of cigarettes an individual smoked per day with his or her age at death.
Daily average # of cigarettes | 12 | 15 | 22 | 30 | 35 | 38 | 42 | 46 | 55 | 60 |
Average age at death | 75 | 72 | 69 | 66 | 64 | 62 | 61 | 58 | 56 | 51 |
If a person smokes no cigarettes, what is the expected age at the time of death?
B. Population density is a measure of how crowded a population is. In the United States, the number of people per square mile is given. Use a base year of 0 for 1900 in your equation.
Year | 1900 | 1910 | 1920 | 1930 | 1940 | 1950 | 1960 | 1970 | 1980 | 1990 | 2000 |
Density (people/mi²) | 21.5 | 26.0 | 29.9 | 34.7 | 37.2 | 42.6 | 50.6 | 57.5 | 64.0 | 70.3 | 74.9 |
Predict the population density of the United States in 2020.
A. The scatter plot, regression equation and R² value are shown below:
The regression equation is y = -0.4573x + 79.634, where y is the average age at death and x is the daily average number of cigarettes.
If a person doesn't smoke, x = 0.
Expected average age at death: y = -0.4573 × 0 + 79.634 = 79.634 years, or about 79.6 years.
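The Excel trendline result can be cross-checked outside the spreadsheet. The following is a minimal sketch in plain Python, assuming an ordinary least-squares fit (which is what Excel's linear trendline uses); the names `fit_line`, `cigarettes` and `age_at_death` are illustrative and not part of the original exercise.

```python
# Least-squares fit for the smoking data (pure Python, no third-party packages).
cigarettes = [12, 15, 22, 30, 35, 38, 42, 46, 55, 60]    # x: daily average
age_at_death = [75, 72, 69, 66, 64, 62, 61, 58, 56, 51]  # y: years


def fit_line(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(cigarettes, age_at_death)
print(f"y = {slope:.4f}x + {intercept:.3f}, R^2 = {r_squared:.4f}")

# For a non-smoker x = 0, so the prediction is just the intercept.
print(f"Expected age at death for a non-smoker: {intercept:.3f} years")
```

The slope and intercept printed here should round to the coefficients quoted above, and the intercept is the predicted age at death for x = 0. Note that x = 0 lies below the smallest observed value (12 cigarettes per day), so this answer is an extrapolation of the fitted line.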
B. The table below gives the revised years, using 1900 as the base year (year 0):
Year | Revised year | Population Density |
1900 | 0 | 21.5 |
1910 | 10 | 26 |
1920 | 20 | 29.9 |
1930 | 30 | 34.7 |
1940 | 40 | 37.2 |
1950 | 50 | 42.6 |
1960 | 60 | 50.6 |
1970 | 70 | 57.5 |
1980 | 80 | 64 |
1990 | 90 | 70.3 |
2000 | 100 | 74.9 |
Below is the scatter plot along with the regression equation and R² value:
The regression equation for population density is y = 0.5505x + 18.768,
where y is the population density (people/mi²) and x is the revised year (years since 1900).
For the year 2020, revised year, x = 2020 - 1900 = 120
Expected population density in the year 2020: y = 0.5505 × 120 + 18.768 = 66.06 + 18.768 = 84.828 people/mi², or about 84.8 people per square mile.
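The same calculation can be reproduced in code as a check on the spreadsheet. This is a minimal sketch in plain Python, assuming an ordinary least-squares fit on years since 1900; the names `fit_line`, `years_since_1900` and `density` are illustrative only.

```python
# Least-squares fit for U.S. population density, with 1900 as year 0.
years = [1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000]
density = [21.5, 26.0, 29.9, 34.7, 37.2, 42.6, 50.6, 57.5, 64.0, 70.3, 74.9]

# Shift the years so that the base year 1900 becomes x = 0.
years_since_1900 = [year - 1900 for year in years]


def fit_line(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(years_since_1900, density)
print(f"y = {slope:.4f}x + {intercept:.3f}, R^2 = {r_squared:.4f}")

# Predict the density for 2020, i.e. x = 2020 - 1900 = 120.
x_2020 = 2020 - 1900
density_2020 = slope * x_2020 + intercept
print(f"Predicted density in 2020: {density_2020:.2f} people per square mile")
```

Because the code keeps the unrounded slope and intercept, its prediction comes out near 84.82 rather than 84.828; the small difference is due to rounding the coefficients to four and three decimal places before substituting x = 120.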