a. The log-likelihood function is ℓ(λ) = (a1 + a2) log(λ) − λ(b1 + b2). Write this as a function of θ by substituting θ = log(λ).
b. Write down the likelihood equation for θ, using the log-likelihood from part a, and hence determine θ̂, the MLE of θ.
c. Show algebraically that θ̂ = log(λ̂). What property of MLEs is this?
d. Differentiate the LHS of the likelihood equation to obtain the expected information I(λ) = E{I_O(λ, Y)} = E(a1 + a2) / λ².
e. For the data below, use the expected information to calculate a standard error for θ̂, and hence a confidence interval for θ. In your answer include:
- The formula for the expected information.
- The formula for the variance.
- Substitute the data and evaluate the standard error.
- The 95% confidence interval
DATA:
Sample group    1 (low salt)    2 (high salt)
yi              245             239
ai              1150            850
bi              5980            3825
f. Using part d, calculate a confidence interval for λ.
g. Consider the two approaches to calculating confidence intervals in part f. Would it be possible (in other samples) for either of these two approaches to yield confidence intervals that include values outside the parameter space, i.e., negative values? Which approach is more desirable from this point of view?
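The calculations asked for in parts b, e and f can be sketched numerically. This is only a rough illustration under stated assumptions, not a model answer: b1 and b2 are treated as fixed constants, so the information for θ is e^θ(b1 + b2); the 95% interval uses the normal quantile 1.96; and the information for λ is evaluated by plugging in the observed a1 + a2 at λ̂. All variable names are invented for this sketch.

```python
import math

# Data from the table in part e (only ai and bi enter the log-likelihood).
a = [1150, 850]
b = [5980, 3825]
a_sum, b_sum = sum(a), sum(b)

# MLEs: lambda_hat solves (a1 + a2)/lambda - (b1 + b2) = 0,
# and theta_hat = log(lambda_hat) by invariance (part c).
lam_hat = a_sum / b_sum
theta_hat = math.log(lam_hat)

# Assumption: b1, b2 are fixed, so the second derivative of the
# log-likelihood in theta is -exp(theta) * (b1 + b2), giving
# information exp(theta) * (b1 + b2), which equals a1 + a2 at theta_hat.
info_theta = math.exp(theta_hat) * b_sum
se_theta = 1.0 / math.sqrt(info_theta)

z = 1.96  # approximate 97.5% standard normal quantile
ci_theta = (theta_hat - z * se_theta, theta_hat + z * se_theta)

# Approach 1 for lambda: exponentiate the endpoints of the theta interval.
ci_lam_from_theta = tuple(math.exp(x) for x in ci_theta)

# Approach 2 for lambda: Wald interval using the plug-in information
# (a1 + a2) / lam_hat**2, i.e. se = lam_hat / sqrt(a1 + a2).
se_lam = lam_hat / math.sqrt(a_sum)
ci_lam_wald = (lam_hat - z * se_lam, lam_hat + z * se_lam)

print("theta_hat:", theta_hat, "se:", se_theta, "CI:", ci_theta)
print("lambda CI (from theta):", ci_lam_from_theta)
print("lambda CI (Wald):", ci_lam_wald)
```

With these data θ̂ is about −1.590 with a standard error of about 0.0224. The exponentiated interval can never contain negative values, whereas the Wald interval for λ could in a sample where λ̂ is small relative to its standard error, which is the contrast part g asks about.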
The table shows the relationship between smoking status
(smoker/non-smoker) and survival (dead/alive), broken down
by age group.
Collapsing over the age groups, the table is rewritten as
Here, the death rate for smokers is lower than for
non-smokers. This is a good example of Simpson's paradox, as the
association reverses direction when the data are collapsed over the
variable “Age group”. Hence, it can be concluded that younger women
are more likely to smoke than older women.
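The reversal can be illustrated with a small sketch. The counts below are hypothetical and chosen only to reproduce the pattern described; they are not the figures from the tables referred to in this answer, and the two age groups and all names are likewise invented for illustration.

```python
# Hypothetical (deaths, total) counts per age group and smoking status.
# These are NOT the data from the answer's tables; they only mimic the pattern.
counts = {
    "young": {"smoker": (10, 200), "non-smoker": (3, 100)},
    "old": {"smoker": (30, 50), "non-smoker": (100, 200)},
}


def rate(deaths, total):
    """Death rate as a proportion."""
    return deaths / total


# Death rates within each age group.
within = {
    age: {status: rate(*dt) for status, dt in groups.items()}
    for age, groups in counts.items()
}

# Death rates after collapsing (summing) over age groups.
collapsed = {}
for status in ("smoker", "non-smoker"):
    deaths = sum(counts[age][status][0] for age in counts)
    total = sum(counts[age][status][1] for age in counts)
    collapsed[status] = rate(deaths, total)

for age, rates in within.items():
    print(age, rates)
print("collapsed", collapsed)
```

In this made-up example smokers have the higher death rate within each age group, yet the lower rate overall, because most of the smokers fall in the young, low-mortality group.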