Hundreds of pounds per acre (x) | Bushels per acre (y)
1.0 | 25
2.5 | 32
3.0 | 35
3.0 | 32
3.4 | 35
4.0 | 39
4.0 | 41
4.5 | 40
b. Do the data appear to be positively or negatively correlated?
Sol:
Use the plot function in R to obtain a scatterplot of the data.
Rcode:
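# Enter the data: fertilizer (hundreds of pounds per acre) and yield (bushels per acre)
df2 <- read.table(header = TRUE, text = "
fertilzer_peracre bushels_produced
1.0 25
2.5 32
3.0 35
3.0 32
3.4 35
4.0 39
4.0 41
4.5 40
")
df2
# Fit the simple linear regression of yield on fertilizer
linreg <- lm(bushels_produced ~ fertilzer_peracre, data = df2)
summary(linreg)
# Scatterplot of yield against fertilizer
plot(x = df2$fertilzer_peracre, y = df2$bushels_produced,
     main = "Scatterplot of bushels_produced vs fertilzer_peracre",
     xlab = "fertilzer_peracre", ylab = "bushels_produced")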
Interpretation:
The scatterplot shows a strong positive linear relationship between the amount of fertilizer (x) and the number of bushels of soybeans produced (y): yield tends to increase as fertilizer increases, so the data appear to be positively correlated.
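The direction and strength seen in the scatterplot can also be checked numerically. The following is a minimal sketch, assuming the Pearson correlation coefficient from base R's cor function is an acceptable summary here; it is not part of the original solution, and the name r is introduced only for illustration.

```r
# Re-create the data so this block runs on its own
df2 <- read.table(header = TRUE, text = "
fertilzer_peracre bushels_produced
1.0 25
2.5 32
3.0 35
3.0 32
3.4 35
4.0 39
4.0 41
4.5 40
")

# Pearson correlation between fertilizer and yield (assumed summary measure)
r <- cor(df2$fertilzer_peracre, df2$bushels_produced)
r
```

For these data r comes out at roughly 0.97; a value close to +1 is consistent with a strong positive linear relationship.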
Use the lm function in R to fit a linear regression of the number of bushels of soybeans produced (y) on the amount of fertilizer (x).
Rcode:
linreg <- lm(bushels_produced ~ fertilzer_peracre, data = df2)
summary(linreg)
plot(x = df2$fertilzer_peracre, y = df2$bushels_produced,
     main = "Scatterplot of bushels_produced vs fertilzer_peracre",
     xlab = "fertilzer_peracre", ylab = "bushels_produced")
## add the fitted regression line
abline(coef(linreg)[1:2], col = "red")
## rounded coefficients for better output
cf <- round(coef(linreg), 4)
## check the sign to avoid having a plus followed by a minus for negative coefficients
eq <- paste0("bushels_produced = ", cf[1],
             ifelse(cf[2] >= 0, " + ", " - "), abs(cf[2]),
             " * fertilzer_peracre")
## printing of the equation
mtext(eq, 3, line = 0)
Solution-b:
The fitted regression equation is
bushels_produced = 20.094177 + 4.655377 * fertilzer_peracre
For x = fertilzer_peracre = 3.5:
bushels_produced = 20.094177 + 4.655377 * 3.5
= 36.388
The predicted number of bushels per acre, given 3.5 hundred pounds of fertilizer per acre, is 36.388.
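The hand calculation above can be cross-checked in R. This is a minimal sketch, assuming base R's predict function applied to the fitted lm object; the names new_x and pred are introduced here for illustration and do not appear in the original solution.

```r
# Re-create the data and the fit so this block runs on its own
df2 <- read.table(header = TRUE, text = "
fertilzer_peracre bushels_produced
1.0 25
2.5 32
3.0 35
3.0 32
3.4 35
4.0 39
4.0 41
4.5 40
")
linreg <- lm(bushels_produced ~ fertilzer_peracre, data = df2)

# New observation: 3.5 hundred pounds of fertilizer per acre
# (the column name must match the predictor used in the model)
new_x <- data.frame(fertilzer_peracre = 3.5)

# Point prediction from the fitted line
pred <- predict(linreg, newdata = new_x)
round(pred, 3)
```

Using predict avoids retyping rounded coefficients, so the result should agree with the manual value of 36.388 to three decimal places.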