You will be performing an analysis on a dataset that contains data on fertility and life expectancy for 198 different countries. All data is from the year 2013. The fertility numbers are the average number of children per woman in each of the countries. The life expectancy numbers are the average life expectancy in each of the countries.
You will be turning in a paper that should include section headings, graphics and tables when appropriate, and complete sentences which explain all analysis that was done in addition to all conclusions and results. There is no specified length; however, it is important that you follow all steps below and grade yourself using the rubric provided, since it is the rubric that I will be using to grade your submissions. All work should be your own. Plagiarism will result in a project score of 0.
Steps (all statistical analysis to be done in Excel and/or StatCrunch):
Watch the TED talk by Hans Rosling titled “The best stats you’ve ever seen”. You will need to include comments on this in your paper. Here is a link: http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen?language=en
Create histograms of each of the variables (one histogram for fertility, one for life expectancy). Use the histograms to identify the shapes of the distribution. StatCrunch will be the easier tool to use for this particular task.
Calculate some descriptive statistics for each of the variables, including but not limited to the mean, median and standard deviation. Organize these numbers nicely in a table.
Using fertility as the predictor variable and life expectancy as the response variable, create a scatter diagram, come up with the least-squares regression line and calculate the linear correlation coefficient as well as the coefficient of determination. Make sure that you understand all interpretations and include them in your paper. Please carefully review the rubric below to see the full list of required interpretations.
Use the regression line to predict life expectancy for the United States given fertility and then compare this to the actual value in the United States.
Name some possible lurking variables that may be at work here.
Explain the difference between correlation and causation and why we cannot say that there is a cause and effect relationship in this situation.
Explain why we cannot use our regression model to predict the life expectancy of one particular individual.
Take a look at the website where this data was pulled from and comment on how the model might have been different if we used the data from 20, 40 or 60 years ago. Navigate to http://gapminder.org and click on “Gapminder World”. Use the x-axis and y-axis dropdown menus to ensure that ‘life expectancy (years)’ is selected on the y-axis and ‘children per woman (total fertility)’ is selected on the x-axis.
Put everything together into an organized paper and submit!
Country | 2013 Fertility | 2013 Life Expectancy |
Afghanistan | 4.9 | 56.2 |
Albania | 1.771 | 75.8 |
Algeria | 2.795 | 76.3 |
Angola | 5.863 | 60.4 |
Antigua and Barbuda | 2.089 | 75.2 |
Argentina | 2.175 | 76 |
Armenia | 1.74 | 73.8 |
Aruba | 1.673 | 75.455 |
Australia | 1.882 | 81.8 |
Austria | 1.471 | 80.8 |
Azerbaijan | 1.924 | 72.3 |
Bahamas | 1.888 | 72.5 |
Bahrain | 2.075 | 79 |
Bangladesh | 2.177 | 69.5 |
Barbados | 1.849 | 75.6 |
Belarus | 1.494 | 70.2 |
Belgium | 1.854 | 80.2 |
Belize | 2.676 | 70 |
Benin | 4.845 | 64.9 |
Bhutan | 2.232 | 69.4 |
Bolivia | 3.221 | 71.9 |
Bosnia and Herzegovina | 1.283 | 77.5 |
Botswana | 2.619 | 65.8 |
Brazil | 1.801 | 75 |
Brunei | 1.994 | 78.7 |
Bulgaria | 1.541 | 74.5 |
Burkina Faso | 5.605 | 62 |
Burundi | 6.033 | 59.8 |
Cambodia | 2.861 | 67.8 |
Cameroon | 4.78 | 58.7 |
Canada | 1.67 | 81.5 |
Cape Verde | 2.292 | 74.2 |
Chad | 6.263 | 57.1 |
Channel Islands | 1.459 | 80.324 |
Chile | 1.82 | 79.1 |
China | 1.668 | 76.5 |
Colombia | 2.286 | 75.6 |
Comoros | 4.714 | 63.7 |
Congo, Dem. Rep. | 5.933 | 57.5 |
Congo, Rep. | 4.969 | 61.5 |
Costa Rica | 1.795 | 79.8 |
Cote d'Ivoire | 4.866 | 58.9 |
Croatia | 1.501 | 77.8 |
Cuba | 1.449 | 78.3 |
Cyprus | 1.461 | 82.2 |
Czech Rep. | 1.566 | 78.2 |
Denmark | 1.88 | 79.9 |
Djibouti | 3.387 | 63.4 |
Dominican Rep. | 2.484 | 73.6 |
Ecuador | 2.559 | 74.8 |
Egypt | 2.77 | 70.9 |
El Salvador | 2.184 | 73.9 |
Equatorial Guinea | 4.845 | 58.8 |
Eritrea | 4.696 | 62.1 |
Estonia | 1.604 | 76.6 |
Ethiopia | 4.519 | 62.6 |
Fiji | 2.588 | 66.1 |
Finland | 1.853 | 80.6 |
France | 1.98 | 81.7 |
French Guiana | 3.058 | 77.121 |
French Polynesia | 2.058 | 76.257 |
Gabon | 4.087 | 59.1 |
Gambia | 5.751 | 64.3 |
Georgia | 1.817 | 72.9 |
Germany | 1.419 | 80.7 |
Ghana | 3.857 | 64.9 |
Greece | 1.529 | 79.8 |
Greenland | 2.077 | 71.5 |
Grenada | 2.17 | 71.5 |
Guadeloupe | 2.08 | 80.947 |
Guam | 2.405 | 78.854 |
Guatemala | 3.783 | 72.3 |
Guinea | 4.915 | 60.2 |
Guyana | 2.546 | 64 |
Haiti | 3.148 | 64.3 |
Honduras | 3.001 | 72 |
Hong Kong, China | 1.135 | 83.378 |
Hungary | 1.411 | 75.8 |
Iceland | 2.083 | 82.8 |
India | 2.479 | 66.2 |
Indonesia | 2.338 | 70.5 |
Iran | 1.92 | 78.3 |
Iraq | 4.026 | 71.3 |
Ireland | 1.997 | 80.4 |
Israel | 2.898 | 82.2 |
Italy | 1.487 | 82.1 |
Jamaica | 2.26 | 75.5 |
Japan | 1.419 | 83.3 |
Jordan | 3.244 | 78.1 |
Kazakhstan | 2.455 | 67.8 |
Kenya | 4.382 | 65.2 |
Kiribati | 2.952 | 62 |
Korea, Dem. Rep. | 1.988 | 71.2 |
Korea, Rep. | 1.321 | 80.5 |
Kuwait | 2.6 | 80.3 |
Kyrgyzstan | 3.075 | 68.6 |
Laos | 3.02 | 65.8 |
Latvia | 1.607 | 75.3 |
Lebanon | 1.495 | 78.3 |
Liberia | 4.792 | 63.1 |
Libya | 2.356 | 75.6 |
Lithuania | 1.519 | 75 |
Luxembourg | 1.671 | 81.1 |
Macao, China | 1.083 | 80.4 |
Macedonia, FYR | 1.431 | 76.6 |
Madagascar | 4.468 | 64.3 |
Malawi | 5.389 | 57.3 |
Malaysia | 1.964 | 74.7 |
Maldives | 2.256 | 79.3 |
Mali | 6.847 | 57.2 |
Malta | 1.356 | 82.1 |
Martinique | 1.827 | 81.41 |
Mauritania | 4.67 | 65.1 |
Mauritius | 1.501 | 73.3 |
Mayotte | 3.802 | 79.19 |
Mexico | 2.185 | 75.5 |
Micronesia, Fed. Sts. | 3.294 | 66.8 |
Moldova | 1.456 | 71.9 |
Mongolia | 2.436 | 64.7 |
Montenegro | 1.666 | 75.6 |
Morocco | 2.735 | 74.3 |
Mozambique | 5.188 | 56.2 |
Myanmar | 1.938 | 67.1 |
Namibia | 3.051 | 60.6 |
Nepal | 2.3 | 70.6 |
Netherlands | 1.774 | 80.6 |
Netherlands Antilles | 1.89 | 76.894 |
New Caledonia | 2.127 | 76.306 |
New Zealand | 2.052 | 80.6 |
Nicaragua | 2.498 | 76.4 |
Niger | 7.561 | 61.6 |
Nigeria | 5.976 | 60.1 |
Norway | 1.931 | 81.4 |
Oman | 2.853 | 75.5 |
Pakistan | 3.185 | 65.7 |
Panama | 2.466 | 77.8 |
Papua New Guinea | 3.781 | 59.8 |
Paraguay | 2.864 | 73.7 |
Peru | 2.417 | 77.1 |
Philippines | 3.043 | 70 |
Poland | 1.417 | 76.9 |
Portugal | 1.315 | 79.8 |
Puerto Rico | 1.636 | 78.864 |
Qatar | 2.019 | 81.8 |
Reunion | 2.232 | 79.646 |
Romania | 1.417 | 76 |
Russia | 1.595 | 71.3 |
Rwanda | 4.508 | 65.3 |
Saint Lucia | 1.912 | 74.5 |
Saint Vincent and the Grenadines | 1.997 | 72.7 |
Samoa | 4.147 | 71.8 |
Sao Tome and Principe | 4.075 | 68.4 |
Saudi Arabia | 2.644 | 77.9 |
Senegal | 4.934 | 65.7 |
Serbia | 1.365 | 77.7 |
Seychelles | 2.18 | 73.3 |
Sierra Leone | 4.705 | 57.7 |
Singapore | 1.282 | 81.9 |
Slovak Republic | 1.396 | 76.2 |
Slovenia | 1.509 | 80 |
Solomon Islands | 4.031 | 63.7 |
Somalia | 6.563 | 57.7 |
South Africa | 2.387 | 60.4 |
South Sudan | 4.92 | 57.2 |
Spain | 1.505 | 81.7 |
Sri Lanka | 2.339 | 76.1 |
Sudan | 4.42 | 68.9 |
Suriname | 2.268 | 70.1 |
Sweden | 1.928 | 81.8 |
Switzerland | 1.533 | 82.7 |
Syria | 2.964 | 72.4 |
Taiwan | 1.065 | 79.3 |
Tajikistan | 3.815 | 70.6 |
Tanzania | 5.214 | 62.2 |
Thailand | 1.399 | 74.9 |
Timor-Leste | 5.855 | 71.4 |
Togo | 4.639 | 63 |
Tonga | 3.767 | 70.3 |
Trinidad and Tobago | 1.797 | 71.2 |
Tunisia | 2.008 | 77.1 |
Turkey | 2.041 | 76.3 |
Turkmenistan | 2.326 | 67.5 |
Uganda | 5.867 | 59.8 |
Ukraine | 1.47 | 71.7 |
United Arab Emirates | 1.801 | 76.4 |
United Kingdom | 1.892 | 81 |
United States | 1.976 | 78.9 |
Uruguay | 2.046 | 76.9 |
Uzbekistan | 2.309 | 69.7 |
Vanuatu | 3.382 | 64.6 |
Venezuela | 2.39 | 75.4 |
Vietnam | 1.743 | 76.3 |
Virgin Islands (U.S.) | 2.487 | 80.152 |
West Bank and Gaza | 4.01 | 74.6 |
Western Sahara | 2.363 | 67.764 |
Yemen, Rep. | 4.075 | 67 |
Zambia | 5.687 | 56.7 |
Zimbabwe | 3.486 | 56 |
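Below is R code for this problem; it computes the quantities the assignment asks for.
A comment after each "#" says which question the following lines address, and the output is listed after the code. The histograms and the regression equation are given at the end.
Before running the code, copy the given data into Excel exactly as it is.
Then copy only the second and last columns (fertility and life expectancy), without selecting the label/title row.
Then run the code to get the required answers.
The R code is as follows:
# Read the two copied columns from the clipboard; V1 = fertility, V2 = life expectancy
a <- read.table("clipboard", header = FALSE)
attach(a)
# Descriptive statistics: minimum, quartiles, median, mean, maximum
summary(V1)
summary(V2)
# Variances (the standard deviation is the square root of each)
var(V1)
var(V2)
# Least-squares regression of life expectancy on fertility
summary(lm(V2~V1))
# Linear correlation coefficient
cor(V1,V2)
# Histograms of fertility and life expectancy
hist(V1)
hist(V2)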
And the output is as follows:
> summary(V1)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.065 1.798 2.277 2.783 3.461 7.561
> summary(V2)
Min. 1st Qu. Median Mean 3rd Qu. Max.
56.00 65.88 74.25 72.17 78.28 83.38
> var(V1)
[1] 1.9415
> var(V2)
[1] 56.29439
> summary(lm(V2~V1))
Call:
lm(formula = V2 ~ V1)
Residuals:
Min 1Q Median 3Q Max
-13.476 -3.074 0.452 3.223 12.468
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 84.1613 0.7176 117.28 <2e-16 ***
V1 -4.3090 0.2307 -18.68 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.511 on 196 degrees of freedom
Multiple R-squared: 0.6404, Adjusted R-squared: 0.6385
F-statistic: 349 on 1 and 196 DF, p-value: < 2.2e-16
> cor(V1,V2)
[1] -0.8002306
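The assignment also asks for the standard deviations and the coefficient of determination, which the output above gives only indirectly. As a small sketch using the numbers printed above (the variable names are illustrative, not part of the original script), the standard deviations are the square roots of the variances, and squaring the correlation coefficient should reproduce the Multiple R-squared:

```r
# Values copied from the R output above
var_fertility <- 1.9415
var_life_exp  <- 56.29439
r             <- -0.8002306

# Standard deviation = square root of the variance
sd_fertility <- sqrt(var_fertility)   # about 1.39 children per woman
sd_life_exp  <- sqrt(var_life_exp)    # about 7.50 years

# In simple linear regression, R-squared equals the squared correlation
r_squared <- r^2                      # about 0.6404

print(round(c(sd_fertility, sd_life_exp, r_squared), 4))
```

The squared correlation agrees with the Multiple R-squared of 0.6404 reported by summary(lm(V2~V1)).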
And the histograms are:
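The shapes can also be cross-checked against the summary statistics. As a rough rule of thumb (not a substitute for reading the histograms), a mean above the median suggests a right-skewed distribution and a mean below the median suggests a left-skewed one. A minimal sketch with the values from the summary() output; the helper name skew_hint is made up for this illustration:

```r
# Hypothetical helper: guess the skew direction from the mean and median
skew_hint <- function(mean_value, median_value) {
  if (mean_value > median_value) {
    "right-skewed"
  } else if (mean_value < median_value) {
    "left-skewed"
  } else {
    "roughly symmetric"
  }
}

# Mean and median copied from the summary() output
fertility_shape <- skew_hint(2.783, 2.277)
life_exp_shape  <- skew_hint(72.17, 74.25)

print(c(fertility = fertility_shape, life_expectancy = life_exp_shape))
```

With these numbers the hint is right-skewed for fertility and left-skewed for life expectancy.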
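We cannot use our regression model to predict the life expectancy of one particular individual because the data are country-level averages: the line estimates the average life expectancy of a country with a given fertility rate, not the lifespan of any one person.
Even for countries, the R-squared value is 0.6404 (adjusted 0.6385), so the model explains only about 64% of the variation in life expectancy.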
The regression equation is:
Life Expectancy = 84.1613 - 4.3090 * Fertility
For the United States, the fertility is 1.976 and the actual life expectancy is 78.9. Using the equation, the predicted life expectancy for the United States is 84.1613 - 4.3090 * 1.976 = 75.6467, which is lower than the actual value. The difference, 78.9 - 75.6467 = 3.2533 years, is called the residual (error).
The correlation coefficient of -0.80 shows a strong negative linear relationship between the two variables: countries with higher fertility tend to have lower life expectancy, by about 4.31 years for each additional child per woman. This is an association only; it does not show that higher fertility causes lower life expectancy.
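The United States prediction can be checked with a few lines of R. This sketch hard-codes the coefficients from the regression output rather than refitting the model, and the variable names are illustrative. The last step uses plus or minus two residual standard errors as a rough band, which is only an approximation to a proper prediction interval.

```r
# Coefficients and residual standard error from summary(lm(V2 ~ V1))
intercept <- 84.1613
slope     <- -4.3090
rse       <- 4.511

# United States values from the data table
us_fertility <- 1.976
us_actual    <- 78.9

# Predicted life expectancy and residual (actual minus predicted)
us_predicted <- intercept + slope * us_fertility
us_residual  <- us_actual - us_predicted

# Rough band of +/- 2 residual standard errors (approximation only)
rough_band <- us_predicted + c(-2, 2) * rse

print(round(c(predicted = us_predicted, residual = us_residual), 4))
print(round(rough_band, 2))
```

The actual value of 78.9 falls inside this rough band, so the United States is not an unusual point relative to the fitted line.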