The article “Estimating Population Abundance in Plant Species With Dormant Life-Stages: Fire and the Endangered Plant Grevillea caleyi R. Br.” (T. Auld and J. Scott, Ecological Management and Restoration, 2004:125-129) presents estimates of population sizes of a certain rare shrub in areas burnt by fire. The following table presents population counts and areas (in m^2) for several patches containing the plant:
Area | Population | Area | Population |
3739 | 3015 | 2521 | 707 |
5277 | 1847 | 213 | 113 |
400 | 17 | 11958 | 1392 |
345 | 142 | 1200 | 157 |
392 | 40 | 12000 | 711 |
7000 | 2878 | 10880 | 74 |
2259 | 223 | 841 | 1720 |
81 | 15 | 1500 | 300 |
33 | 18 | 228 | 31 |
1254 | 229 | 228 | 17 |
1320 | 351 | 10 | 4 |
1000 | 92 | | |
a) Create a scatter plot of the two variables. Does there seem to be a linear relationship? (Do this scatter plot by hand.)
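# Read the table (assumes data.csv has columns named Area and Population)
data <- read.csv("data.csv")
# install.packages("ggplot2")
library(ggplot2)
# Scatter plot of population against area
ggplot(data, aes(x = Area, y = Population)) +
  geom_point()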
b) Compute the least-squares line. (Show your work for the sums of squares, slope, and intercept.)
# Scatter plot with the fitted least-squares line overlaid
ggplot(data, aes(x = Area, y = Population)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
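The plot draws the fitted line but does not show the sums of squares, slope, or intercept that part b) asks for. A minimal sketch of that computation follows, assuming the 23 (area, population) pairs are typed in from the table rather than read from data.csv. The names area, population, Sxx, Sxy, a, and b are illustrative choices, not part of the original solution.

```r
# Data typed in from the table (23 patches): x = area in m^2, y = population
area <- c(3739, 2521, 5277, 213, 400, 11958, 345, 1200, 392, 12000, 7000, 10880,
          2259, 841, 81, 1500, 33, 228, 1254, 228, 1320, 10, 1000)
population <- c(3015, 707, 1847, 113, 17, 1392, 142, 157, 40, 711, 2878, 74,
                223, 1720, 15, 300, 18, 31, 229, 17, 351, 4, 92)

x_bar <- mean(area)
y_bar <- mean(population)

# Sum of squares of x and sum of cross-products, both about the means
Sxx <- sum((area - x_bar)^2)
Sxy <- sum((area - x_bar) * (population - y_bar))

# Least-squares slope and intercept for y-hat = a + b x
b <- Sxy / Sxx
a <- y_bar - b * x_bar

# Cross-check the hand formulas against R's built-in fit
fit <- lm(population ~ area)
print(c(intercept = a, slope = b))
print(coef(fit))
```

The two printed vectors should agree up to rounding, which confirms that the hand formulas and lm() describe the same line.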
c) The scatter plot in part a) is based on the model form ŷ = a + bx. Create a scatter plot of y versus ln x. What do you notice? (Note: you can find ln y and ln x using Excel, but show your values. Do this scatter plot by hand.)
# Scatter plot of population against ln(area), with the fitted line overlaid
ggplot(data, aes(x = log(Area), y = Population)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
We observe that the model ŷ = a + b ln x is a better fit for the data than the earlier model ŷ = a + bx.
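One way to check a comparison like this numerically, rather than by eye, is to compare the R-squared of the two fits. Both models have the same response (population), so their R-squared values are directly comparable. The sketch below assumes the data are typed in from the table. The names fit_x, fit_lnx, and r2 are illustrative, and R-squared alone does not settle whether either model is appropriate.

```r
# Data typed in from the table (23 patches): x = area in m^2, y = population
area <- c(3739, 2521, 5277, 213, 400, 11958, 345, 1200, 392, 12000, 7000, 10880,
          2259, 841, 81, 1500, 33, 228, 1254, 228, 1320, 10, 1000)
population <- c(3015, 707, 1847, 113, 17, 1392, 142, 157, 40, 711, 2878, 74,
                223, 1720, 15, 300, 18, 31, 229, 17, 351, 4, 92)

fit_x   <- lm(population ~ area)       # y-hat = a + b x
fit_lnx <- lm(population ~ log(area))  # y-hat = a + b ln x

# Proportion of variance in population explained by each model
r2 <- c(x = summary(fit_x)$r.squared, ln_x = summary(fit_lnx)$r.squared)
print(r2)
```

A residuals-versus-fitted plot for each model is a useful complement, since a curved or fan-shaped pattern points to a poor model form even when R-squared looks acceptable.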
d) In part c), the data are considered to be “transformed”. For the transformed data, compute the least-squares line for predicting ln y from ln x. (Show all your work.)
Solution:
# Fit ln(Population) on ln(Area) by least squares
model <- lm(log(Population) ~ log(Area), data = data)
# Coefficients, standard errors, and R-squared
summary(model)
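To show the work behind the fitted coefficients, the same sums-of-squares formulas can be applied to the logged values. The sketch below assumes the data are typed in from the table. The names u, v, Suu, Suv, b0, b1, and predict_population are illustrative and not part of the original solution.

```r
# Data typed in from the table (23 patches): x = area in m^2, y = population
area <- c(3739, 2521, 5277, 213, 400, 11958, 345, 1200, 392, 12000, 7000, 10880,
          2259, 841, 81, 1500, 33, 228, 1254, 228, 1320, 10, 1000)
population <- c(3015, 707, 1847, 113, 17, 1392, 142, 157, 40, 711, 2878, 74,
                223, 1720, 15, 300, 18, 31, 229, 17, 351, 4, 92)

u <- log(area)        # ln x
v <- log(population)  # ln y

# Sums of squares and cross-products of the logged values about their means
Suu <- sum((u - mean(u))^2)
Suv <- sum((u - mean(u)) * (v - mean(v)))

# Least-squares line: ln y-hat = b0 + b1 ln x
b1 <- Suv / Suu
b0 <- mean(v) - b1 * mean(u)

# Cross-check against lm() on the transformed variables
fit_log <- lm(log(population) ~ log(area))
print(c(intercept = b0, slope = b1))
print(coef(fit_log))

# Back on the original scale the fitted line is a power law: y-hat = exp(b0) * x^b1
predict_population <- function(x) exp(b0) * x^b1
```

Exponentiating both sides of the fitted line gives the power-law form, so predict_population(x) returns a prediction on the original population scale for an area x in m^2.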