The following data were recorded as part of a study on sustainable farming techniques that took place in Boone County, IA. (Davis, Adam S. et al. Increasing Cropping System Diversity Balances Productivity, Profitability and Environmental Health. PLOS ONE. October 10, 2012. DOI:10.1371/journal.pone.0047149) Means are for the April-November growing seasons.
(1): Construct a two-way scatter plot of “air temperature” against “total precipitation” and, on a separate graph, a two-way scatter plot of “air temperature” against “log of total precipitation”. Comparing the two graphs, explain which one you consider closer to a linear relationship.
(2): At the 0.05 level of significance, test the null hypothesis that the population correlation coefficient [ρ] between “air temperature” and “total precipitation” is equal to 0.
(3): Compute the equation of the linear regression relationship between “air temperature” and “total precipitation”.
Year | Mean air temperature (centigrade) [X] | Total precipitation (mm) [Y] |
2003 | 14.9 | 790 |
2004 | 15.0 | 697 |
2005 | 15.9 | 748 |
2006 | 15.6 | 777 |
2007 | 16.4 | 839 |
2008 | 15.2 | 1145 |
2009 | 14.8 | 755 |
2010 | 16.5 | 1165 |
2011 | 15.2 | 701 |
1.
Converting Y to its base-10 logarithm gives the data below, which are used for the second scatter plot.
Mean air temperature (centigrade) [X] | log10 of total precipitation [log Y] |
14.9 | 2.897627091 |
15 | 2.843232778 |
15.9 | 2.873901598 |
15.6 | 2.890421019 |
16.4 | 2.923761961 |
15.2 | 3.058805487 |
14.8 | 2.877946952 |
16.5 | 3.066325925 |
15.2 | 2.845718018 |
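The log-transformed column above can be reproduced with a few lines of Python. This is a minimal sketch, not the method used in the original answer: the variable names and the `plot_scatter` helper are illustrative, and the helper assumes matplotlib is installed (it is imported only when the helper is called).

```python
import math

# Data from the table in the question (2003-2011).
temps_c = [14.9, 15.0, 15.9, 15.6, 16.4, 15.2, 14.8, 16.5, 15.2]
precip_mm = [790, 697, 748, 777, 839, 1145, 755, 1165, 701]

# Base-10 logarithm of total precipitation, matching the table above.
log_precip = [math.log10(y) for y in precip_mm]


def plot_scatter(x, y, y_label):
    """Hypothetical helper: draw one scatter plot of y against x.

    Assumes matplotlib is available; the import is deferred so the
    rest of the script runs without it.
    """
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(x, y)
    ax.set_xlabel("Mean air temperature (centigrade)")
    ax.set_ylabel(y_label)
    return fig


# Print the transformed table in the same pipe-separated layout.
for x, ly in zip(temps_c, log_precip):
    print(f"{x} | {ly:.9f} |")
```

Calling `plot_scatter(temps_c, precip_mm, "Total precipitation (mm)")` and `plot_scatter(temps_c, log_precip, "log10 of total precipitation")` would give the two graphs asked for in part (1).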
2.
X Values
∑ = 139.5
Mean = 15.5
∑(X - Mx)² = SSx = 3.26
Y Values
∑ = 7617
Mean = 846.333
∑(Y - My)² = SSy = 260218
X and Y Combined
N = 9
∑(X - Mx)(Y - My) = 392.2
R Calculation
r = ∑((X - Mx)(Y - My)) / √((SSx)(SSy))
r = 392.2 / √((3.26)(260218)) = 0.4258
The sample size is n = 9, so the number of degrees of freedom is df = n − 2 = 9 − 2 = 7.
For a two-tailed test at significance level α = 0.05, the corresponding critical correlation value is rc = 0.666.
The null hypothesis H0: ρ = 0 is rejected if |r| > rc = 0.666.
Here |r| = 0.4258 < rc = 0.666, so we fail to reject the null hypothesis.
Hence the correlation between air temperature and total precipitation is not statistically significant at the 0.05 level.
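The correlation test can be checked numerically. The sketch below recomputes r from the raw data using the same sums of squares and products, then compares |r| with the critical value 0.666 quoted above (taken from the text, not computed). The function and variable names are illustrative. The t statistic, t = r·√(df / (1 − r²)), is the standard equivalent form of the same test and is included only as a cross-check.

```python
import math

# Data from the table in the question (2003-2011).
temps_c = [14.9, 15.0, 15.9, 15.6, 16.4, 15.2, 14.8, 16.5, 15.2]
precip_mm = [790, 697, 748, 777, 839, 1145, 755, 1165, 701]


def pearson_r(x, y):
    """Sample correlation coefficient r = SP / sqrt(SSx * SSy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp / math.sqrt(ss_x * ss_y)


r = pearson_r(temps_c, precip_mm)
df = len(temps_c) - 2

# Critical value for df = 7, alpha = 0.05, two-tailed, as quoted in the text.
R_CRITICAL = 0.666
reject_null = abs(r) > R_CRITICAL

# Equivalent t statistic for the same hypothesis (cross-check only).
t_stat = r * math.sqrt(df / (1 - r ** 2))

print(f"r = {r:.4f}, df = {df}, t = {t_stat:.3f}, reject H0: {reject_null}")
```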
3.
Sum of X = 139.5
Sum of Y = 7617
Mean X = 15.5
Mean Y = 846.3333
Sum of squares (SSX) = 3.26
Sum of products (SP) = 392.2
Regression Equation: ŷ = bX + a
b = SP/SSX = 392.2/3.26 = 120.3067
a = MY - bMX = 846.3333 - (120.3067 × 15.5) ≈ -1018.4213 (using unrounded b and MY)
ŷ = 120.3067X - 1018.4213
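The slope and intercept can be verified with a short least-squares sketch that follows the same b = SP/SSX and a = MY − b·MX steps. The names `fit_line`, `slope`, and `intercept` are illustrative and do not come from the original answer.

```python
# Data from the table in the question (2003-2011).
temps_c = [14.9, 15.0, 15.9, 15.6, 16.4, 15.2, 14.8, 16.5, 15.2]
precip_mm = [790, 697, 748, 777, 839, 1145, 755, 1165, 701]


def fit_line(x, y):
    """Least-squares line y = b*x + a; returns (b, a)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    b = sp / ss_x              # slope: SP / SSX
    a = mean_y - b * mean_x    # intercept: MY - b * MX
    return b, a


slope, intercept = fit_line(temps_c, precip_mm)
print(f"y-hat = {slope:.4f} X + ({intercept:.4f})")
```

Keeping full floating-point precision until the final print avoids the small discrepancies that appear when the intercept is computed from an already rounded slope.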