As a researcher, you will regularly consider the relationships between variables. Linear correlation is the most basic relationship that may exist between two variables. This assignment will give you the opportunity to see how statistics can start to answer the research questions you will face in your field of study.
Correlation Project Directions:
Consider a possible linear relationship between two variables that you would like to explore.
Show all work to receive full credit. Provide complete sentence explanations for each of the above.
In air quality forecasting, it is important to identify the variables that control the concentration of the pollutant of concern and to develop a function F that relates the pollutant concentration to the correlated variables. We have data on AQI, PM10 and PM2.5.
There appears to be a linear relationship between AQI and PM10 and between AQI and PM2.5.
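The strength of such a linear relationship is usually summarized with the Pearson correlation coefficient r. The following is a minimal sketch of that calculation; the function name `pearson_r` and the sample readings are hypothetical illustrations, not values from the CPCB data.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical daily readings, for illustration only.
pm25 = [40.0, 60.0, 80.0, 100.0]
aqi = [95.0, 130.0, 180.0, 215.0]
r = pearson_r(pm25, aqi)
print(f"r = {r:.4f}")
```

A value of r close to +1 indicates a strong positive linear relationship, and a value close to 0 indicates little or no linear relationship.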
The data used in this study consist of daily observations of AQI, PM10 and PM2.5 for the period 2017-2019, taken from the Central Pollution Control Board (CPCB), Delhi.
We first filled in the missing values throughout the data set using the mean of the corresponding series.
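The mean-imputation step can be sketched as follows. This is only an illustration under stated assumptions: missing observations are represented as `None`, the helper name `fill_missing_with_mean` is hypothetical, and the readings are made up rather than taken from the CPCB data.

```python
from statistics import mean


def fill_missing_with_mean(series):
    """Replace None entries with the mean of the observed values."""
    observed = [x for x in series if x is not None]
    if not observed:
        raise ValueError("series has no observed values")
    series_mean = mean(observed)
    return [series_mean if x is None else x for x in series]


# Hypothetical PM10 readings with two missing days (not CPCB data).
pm10 = [120.0, None, 180.0, 150.0, None]
filled = fill_missing_with_mean(pm10)
print(filled)
```

Mean imputation keeps the series mean unchanged but reduces its variance, which is worth keeping in mind when interpreting the regression results.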
SUMMARY OUTPUT
Regression Statistics
| Multiple R | 0.779421 |
| R Square | 0.607498 |
| Adjusted R Square | 0.606888 |
| Standard Error | 48.52492 |
| Observations | 646 |
ANOVA
| | df | SS | MS | F | Significance F |
| Regression | 1 | 2347025 | 2347025 | 996.7541 | 6.6E-133 |
| Residual | 644 | 1516406 | 2354.668 | | |
| Total | 645 | 3863431 | | | |
| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| Intercept | 8.022893 | 5.343874 | 1.501325 | 0.133762 | -2.47063 | 18.51641 |
| PM10 | 1.072772 | 0.033979 | 31.57141 | 6.6E-133 | 1.006048 | 1.139495 |
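The result shows that about 61% of the variation in AQI (R Square = 0.607) is explained by the independent variable, PM10. The slope is positive (1.07) and statistically significant (p = 6.6E-133), so there is a positive linear relationship between these variables.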
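The statistics in the summary output are linked to one another, and most of them can be recomputed from the ANOVA rows. The sketch below copies the sums of squares, degrees of freedom, and slope from the PM10 output above and rederives R Square, Adjusted R Square, the standard error, F, and the t statistic; the variable names are my own.

```python
import math

# Values copied from the PM10 summary output above.
ss_regression = 2347025
ss_residual = 1516406
ss_total = 3863431
df_regression = 1
df_residual = 644
n_obs = 646
slope = 1.072772
slope_se = 0.033979

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)

# Adjusted R Square penalizes for the number of predictors (one here).
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual

# Mean squares, F statistic, and the regression standard error.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual
standard_error = math.sqrt(ms_residual)

# With a single predictor, the slope's t statistic squared equals F
# (up to rounding in the reported coefficients).
t_stat = slope / slope_se

print(f"R Square          = {r_squared:.6f}")
print(f"Multiple R        = {multiple_r:.6f}")
print(f"Adjusted R Square = {adj_r_squared:.6f}")
print(f"Standard Error    = {standard_error:.5f}")
print(f"F                 = {f_stat:.4f}")
print(f"t Stat (slope)    = {t_stat:.4f}")
```

Because there is only one independent variable, Multiple R is simply the absolute value of the correlation coefficient between AQI and PM10.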
SUMMARY OUTPUT
Regression Statistics
| Multiple R | 0.94475 |
| R Square | 0.892552 |
| Adjusted R Square | 0.892385 |
| Standard Error | 25.38887 |
| Observations | 646 |
ANOVA
| | df | SS | MS | F | Significance F |
| Regression | 1 | 3448312 | 3448312 | 5349.581 | 0 |
| Residual | 644 | 415119.1 | 644.5948 | | |
| Total | 645 | 3863431 | | | |
| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| Intercept | 11.36439 | 2.333399 | 4.870316 | 1.4E-06 | 6.782401 | 15.94638 |
| X Variable 1 | 2.057698 | 0.028133 | 73.14083 | 0 | 2.002454 | 2.112942 |
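The result shows that about 89% of the variation in AQI (R Square = 0.893) is explained by the independent variable (X Variable 1, presumably PM2.5). The slope is positive (2.06) and highly significant; the reported P-value of 0 means the value is too small for the software to display, not that it is exactly zero. There is a strong positive linear relationship between these variables.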
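Each fitted line can be used to estimate AQI from a pollutant reading. The sketch below plugs the reported intercepts and slopes into the equation AQI = intercept + slope × x. The helper `predict_aqi` and the input readings are hypothetical, and treating "X Variable 1" as PM2.5 is an assumption based on the context, not something the output states.

```python
def predict_aqi(intercept, slope, x):
    """Return the fitted AQI for predictor value x on a simple regression line."""
    return intercept + slope * x


# (intercept, slope) pairs copied from the two summary outputs above.
PM10_FIT = (8.022893, 1.072772)
X1_FIT = (11.36439, 2.057698)  # "X Variable 1", assumed here to be PM2.5

# Hypothetical readings, for illustration only.
aqi_from_pm10 = predict_aqi(*PM10_FIT, 200.0)
aqi_from_x1 = predict_aqi(*X1_FIT, 100.0)

print(f"Fitted AQI at PM10 = 200: {aqi_from_pm10:.1f}")
print(f"Fitted AQI at X Variable 1 = 100: {aqi_from_x1:.1f}")
```

The second model has the smaller standard error (25.4 versus 48.5), so its estimates should typically fall closer to the observed AQI values.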