We observe n1 = 78 women with skin cancer and n2 = 51 men with skin cancer in a small island community. The women with skin cancer had a total of ∑xi = 207 excisions to remove the cancers, and the men with skin cancer had a total of ∑yi = 212 excisions to remove the cancers.
A. Substitute the sample values into the likelihood function L(p) = (p/(1-p))^n (1-p)^(∑xi + ∑yi), where n = n1 + n2, and plot the likelihood function for p.
B. Calculate the two possible estimates (n1 + n2)/(∑xi + ∑yi) AND (1/2)((n1/∑xi) + (n2/∑yi)).
C. What does each of these estimates represent?
D. Which estimate is better supported by the data? Hint: You'll need to calculate the likelihood ratio and log-likelihood difference.
E. Is there a better supported value than either of these? Mark both estimates on your plot from part A.
(C) Both expressions are estimates of p. The first pools the women and men into a single sample of n1 + n2 patients, which assumes that p is the same for both groups. The second estimates p separately for each group (n1/∑xi for the women, n2/∑yi for the men) and takes the unweighted average of the two, which allows p to differ between the groups.
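(D) The two estimates are nearly identical (0.3079 and 0.3087), so the data support them almost equally. The first is slightly better supported: the log-likelihood difference between the first and second estimates is positive but only about 0.00065, which gives a likelihood ratio of about 1.0007. No other value is better supported than the first estimate, because the likelihood simplifies to p^n (1-p)^(∑xi + ∑yi - n), and setting the derivative of its logarithm to zero gives p = n/(∑xi + ∑yi). Both estimates are marked on the plot, where the two vertical lines almost coincide and cannot be told apart at this scale.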
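The comparison asked for in the hint to part D can be sketched in R as follows. This is a minimal sketch, assuming the likelihood from part A; the helper name loglik and the variables S, ll_diff and lr are introduced here for illustration and are not part of the original script. Working on the log scale avoids the extremely small values that L(p) itself takes.

```r
# Sample values from the question
n1 <- 78; n2 <- 51
Sx <- 207; Sy <- 212
n <- n1 + n2
S <- Sx + Sy

# Log of (p/(1-p))^n * (1-p)^S, simplified to n*log(p) + (S-n)*log(1-p).
# loglik is a hypothetical helper name used only in this sketch.
loglik <- function(p) n * log(p) + (S - n) * log(1 - p)

p_hat1 <- n / S                       # pooled estimate
p_hat2 <- (n1 / Sx + n2 / Sy) / 2     # average of the two group estimates

# Log-likelihood difference and likelihood ratio, first vs. second estimate
ll_diff <- loglik(p_hat1) - loglik(p_hat2)
lr <- exp(ll_diff)
ll_diff
lr
```

A positive ll_diff (equivalently, lr above 1) favors p_hat1; a value very close to zero means the data barely distinguish the two estimates.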
######## R code #######
n1=78
n2=51
n=n1+n2
Sx=207
Sy=212
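p=seq(0,1,0.001)   # finer grid so the peak is resolved
## Same likelihood written as p^n (1-p)^(Sx+Sy-n); the original form gives Inf*0 = NaN at p=1
f=p^n * (1-p)^(Sx+Sy-n)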
plot(p,f,type="l",ylab="L(p)",main="Likelihood function against different values of p" )
####### (B) ######
p_hat1=n/(Sx+Sy)
p_hat1
## 0.3078759
p_hat2=((n1/Sx)+(n2/Sy))/2
p_hat2
## 0.3086888
plot(p,f,type="l",ylab="L(p)",main="Likelihood function with estimates of p" )
abline(v=p_hat1,col="red")          ## pooled estimate
abline(v=p_hat2,col="blue",lty=2)   ## average of the two group estimates
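Whether any value of p is better supported than the two estimates can also be checked numerically. The sketch below is one possible check, not part of the original script: it maximizes the log-likelihood with R's optimize() and compares the result with the pooled estimate. The names loglik, S and opt, the search interval, and the tolerance are assumptions chosen for illustration.

```r
# Sample values from the question
n1 <- 78; n2 <- 51
Sx <- 207; Sy <- 212
n <- n1 + n2
S <- Sx + Sy

# Log-likelihood n*log(p) + (S-n)*log(1-p); loglik is a hypothetical helper name
loglik <- function(p) n * log(p) + (S - n) * log(1 - p)

# Search the interior of (0, 1); the endpoints are excluded because log(0) is -Inf
opt <- optimize(loglik, interval = c(0.001, 0.999), maximum = TRUE, tol = 1e-8)

opt$maximum   # numerical maximizer of the log-likelihood
n / S         # pooled estimate from part B, for comparison
```

If the two printed values agree to within the tolerance, the pooled estimate is the best supported value of p under this likelihood.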