For this discussion, we are going to look at data that should follow a linear relationship.
1. Begin by gathering data that interests you.
- You can look online, in the newspaper, conduct an experiment yourself… Be creative!
2. What variables did you choose to explore?
- Which is the explanatory variable and which is the response variable? Fully explain why.
3. Describe how you obtained your data.
(Since I'm only allowed about 3 questions per post, this would be part 1 of the question, so if the answer you get seems open-ended/incomplete, it's because it is. But that's fine)
1. The data that interests me, and that follows a roughly linear relationship, is the population of China over the ten years from 2008 to 2017. I collected the data online.
X: Year | Y: Population (millions) |
2008 | 1328.02 |
2009 | 1334.5 |
2010 | 1340.91 |
2011 | 1347.35 |
2012 | 1354.01 |
2013 | 1360.72 |
2014 | 1367.82 |
2015 | 1374.62 |
2016 | 1382.71 |
2017 | 1390.08 |
2. The explanatory variable is Year and the response variable is the population of China. The population rises steadily as the year increases, so year is the variable used to explain (predict) the population, not the other way around.
3. The data is available online.
I have also run a regression analysis in Excel to help understand the relationship between the two variables.
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.999434295 |
R Square | 0.99886891 |
Adjusted R Square | 0.998727524 |
Standard Error | 0.741821775 |
Observations | 10 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 3887.769644 | 3887.77 | 7064.824 | 4.47758E-13 |
Residual | 8 | 4.402396364 | 0.5503 | | |
Total | 9 | 3892.17204 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | -12457.18964 | 164.3648449 | -75.7899 | 1.02E-12 | -12836.21565 | -12078.16362 |
Year | 6.864727273 | 0.081671889 | 84.05251 | 4.48E-13 | 6.676391558 | 7.053062987 |
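If you want to check the Excel output without Excel, here is a minimal Python sketch that recomputes the main numbers from the ten data points using the ordinary least-squares formulas. The variable names are my own, and this is only a cross-check under the assumption that Excel fits a simple linear regression with an intercept; it is not how Excel computes the output internally.

```python
# Data copied from the table in this answer: year and population of China (millions).
years = [2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017]
population = [1328.02, 1334.5, 1340.91, 1347.35, 1354.01,
              1360.72, 1367.82, 1374.62, 1382.71, 1390.08]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(population) / n

# Sums of squares and cross-products around the means.
s_xx = sum((x - mean_x) ** 2 for x in years)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, population))

# Least-squares slope and intercept.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# ANOVA pieces: total, residual, and regression sums of squares.
fitted = [intercept + slope * x for x in years]
ss_total = sum((y - mean_y) ** 2 for y in population)
ss_residual = sum((y - f) ** 2 for y, f in zip(population, fitted))
ss_regression = ss_total - ss_residual

r_squared = ss_regression / ss_total
df_residual = n - 2  # two estimated parameters: intercept and slope
standard_error = (ss_residual / df_residual) ** 0.5
f_stat = ss_regression / (ss_residual / df_residual)

print(f"slope          = {slope:.6f}")
print(f"intercept      = {intercept:.5f}")
print(f"R Square       = {r_squared:.8f}")
print(f"Standard Error = {standard_error:.6f}")
print(f"F              = {f_stat:.3f}")
```

The printed values should agree with the Coefficients, R Square, Standard Error, and F entries in the summary output, up to rounding.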
The regression equation is given by
Population (millions) = -12457.18964 + 6.864727273 × Year
Thus, for each one-year increase, the predicted population rises by about 6.865 million. The coefficient of determination is 0.9989, which means that about 99.9% of the variation in population is explained by year.
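For anyone who wants to see where the slope, intercept, and coefficient of determination come from, here is a short worked version. The symbols $\bar{x}$, $\bar{y}$, $S_{xx}$, and $S_{xy}$ are my own notation (they do not appear in the Excel output), and their values are computed from the ten data points in the table; the two sums of squares in the last line are taken from the ANOVA section.

```latex
\begin{align*}
\bar{x} &= 2012.5, \qquad \bar{y} = 1358.074 \\
S_{xx} &= \sum (x_i - \bar{x})^2 = 82.5, \qquad
S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = 566.34 \\
b_1 &= \frac{S_{xy}}{S_{xx}} = \frac{566.34}{82.5} \approx 6.864727 \\
b_0 &= \bar{y} - b_1 \bar{x} = 1358.074 - 6.864727273 \times 2012.5 \approx -12457.19 \\
R^2 &= \frac{SS_{\text{Regression}}}{SS_{\text{Total}}}
     = \frac{3887.769644}{3892.17204} \approx 0.9989
\end{align*}
```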