In: Economics
(a) Discuss briefly the principles behind maximum likelihood.
(b) Describe briefly the three hypothesis testing procedures that are available under maximum likelihood estimation. Which is likely to be the easiest to calculate in practice, and why?
(c) OLS and maximum likelihood are used to estimate the parameters of a standard linear regression model. Will they give the same estimates? Explain your answer.
(a)
The principle of maximum likelihood is a strategy for obtaining the best estimates of the parameters that characterize a model. In doing so, you choose the parameter values under which the observed data are most probable, which is how the fitted model is pushed towards the "true" model.
I am borrowing this simple toy model from Nando de Freitas' lecture to illustrate the principle of maximum likelihood. Consider 3 data points, y1 = 1, y2 = 0.5, y3 = 1.5, which are independent and drawn from a Gaussian with unknown mean θ and variance 1. Suppose we have two candidates for θ: {1, 2.5}. Which would you pick? Which model (value of θ) would explain the data better?
In general, any data point drawn from a Gaussian with mean θ and variance 1 can be written as,
y_i ∼ N(θ, 1) = θ + N(0, 1)
θ, the mean, shifts the center of the standard normal distribution (μ = 0 and σ² = 1).
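To make this concrete, here is a minimal Python sketch (my own addition, assuming NumPy is available; it is not part of the original question) showing that a draw from N(θ, 1) is just θ plus standard-normal noise:

import numpy as np

rng = np.random.default_rng(42)
theta = 1.0

# y_i ~ N(theta, 1) is the same as theta plus standard-normal noise:
z = rng.normal(loc=0.0, scale=1.0, size=100_000)  # N(0, 1) draws
y = theta + z                                     # theta + N(0, 1)

print(y.mean())  # close to 1.0: theta shifts the centre of N(0, 1)
print(y.std())   # close to 1.0: the variance is unchanged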
The likelihood of the data (y1, y2, y3) having been drawn from N(θ, 1) can be defined as,
P(y1, y2, y3 | θ) = P(y1 | θ) P(y2 | θ) P(y3 | θ)
since y1, y2, y3 are independent.
Now we have two normal distributions, defined by θ = 1 and θ = 2.5. Let us draw both and plot the data points. In the figure below, notice the dotted lines that connect the bell curves to the data points. Consider the point y2 = 0.5 in the first distribution (N(μ = 1, σ² = 1)). The length of the dotted line gives the likelihood (density) of y2 = 0.5 being drawn from N(μ = 1, σ² = 1).
The likelihood of the data (y1, y2, y3) having been drawn from N(μ = 1, σ² = 1) is given by,
P(y1, y2, y3 | θ = 1) = P(y1 | θ = 1) P(y2 | θ = 1) P(y3 | θ = 1)
The individual probabilities in the equation above are equal to the heights of the corresponding dotted lines in the figure. The likelihood, given by the product of the individual probabilities of the data points given the model, is therefore essentially the product of the lengths of the dotted lines. Clearly the likelihood of the model θ = 1 is higher, so we pick the model (θ = 1) that maximizes the likelihood.
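As a quick numerical check (my own sketch, using SciPy, which the question does not require), we can compute the likelihood of the three data points under each candidate mean and confirm that θ = 1 wins:

import numpy as np
from scipy.stats import norm

y = np.array([1.0, 0.5, 1.5])  # the three observed data points

# Likelihood of the data under each candidate mean (variance fixed at 1):
# P(y1, y2, y3 | theta) = product of the individual normal densities.
for theta in (1.0, 2.5):
    likelihood = np.prod(norm.pdf(y, loc=theta, scale=1.0))
    print(f"theta = {theta}: likelihood = {likelihood:.5f}")

# theta = 1.0 gives the larger likelihood, so between the two candidates it
# is the maximum-likelihood choice. (The unrestricted MLE is the sample
# mean, y.mean() = 1.0, which happens to coincide with that candidate.)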
(b)
In statistics, maximum likelihood estimation (also abbreviated as MLE) refers to a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that, under the assumed statistical model, the observed data are most probable. MLE supports three main hypothesis testing procedures, namely the Wald, score, and likelihood ratio tests. These are discussed as:
- Wald test: the estimator θ̂ of θ is obtained by maximizing the log-likelihood over the entire parameter space Θ, θ̂ = argmax over θ in Θ of ℓ(θ | y), where y is the sample and ℓ(· | y) is the log-likelihood function. The test statistic is then built from a measure of how far θ̂ is from satisfying the null hypothesis.
- Score test: it uses the estimator θ̃ of θ obtained by maximizing the log-likelihood over the restricted parameter space Θ0 implied by the null, θ̃ = argmax over θ in Θ0 of ℓ(θ | y). The test statistic is built by comparing the vector of derivatives of the log-likelihood (the score), evaluated at θ̃, with its expected value of zero under the null hypothesis.
- Likelihood ratio test: it is based on both estimators of θ. The first, denoted θ̂, is obtained by maximizing the log-likelihood over the entire parameter space, and the second, denoted θ̃, is obtained by maximizing the log-likelihood over the restricted parameter space. The test statistic is then built by comparing the log-likelihood at θ̂ to that at θ̃.
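As a hedged illustration of how the three statistics are built (this example is my own; the data, the model y_i ∼ N(θ, 1), and the null value θ0 = 2.5 are assumptions carried over from part (a), not given in the question), the following Python sketch computes all three tests for H0: θ = θ0:

import numpy as np

y = np.array([1.0, 0.5, 1.5])  # illustrative data from part (a)
n = len(y)
theta0 = 2.5                   # null hypothesis value (assumed for illustration)
theta_hat = y.mean()           # unrestricted MLE of theta in the N(theta, 1) model

def loglik(theta):
    # Log-likelihood of the sample under N(theta, 1)
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * np.sum((y - theta) ** 2)

score = np.sum(y - theta0)     # derivative of the log-likelihood at theta0
info = n                       # Fisher information for this model

wald = (theta_hat - theta0) ** 2 * info        # needs only the unrestricted fit
lm = score ** 2 / info                         # needs only the restricted fit
lr = 2 * (loglik(theta_hat) - loglik(theta0))  # needs both fits

print(wald, lm, lr)  # each is asymptotically chi-squared with 1 d.o.f.

In this simple linear-Gaussian case the three statistics coincide exactly; in general they differ in finite samples but share the same asymptotic chi-squared distribution.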
In practice, it is usually most convenient to work with the natural logarithm of the likelihood function, called the log-likelihood. For testing fixed effects, and when the sample size is large, the approximate likelihood ratio test is generally reliable in practice.
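A small sketch of why the log-likelihood is preferred numerically (the simulated sample below is an arbitrary assumption for illustration): with many observations the raw likelihood, a product of small densities, underflows to zero in floating point, while the log-likelihood, a sum, stays well-behaved.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=1.0, size=2_000)  # simulated sample

densities = norm.pdf(y, loc=1.0, scale=1.0)
print(np.prod(densities))                           # 0.0 -- numerical underflow
print(np.sum(norm.logpdf(y, loc=1.0, scale=1.0)))   # finite and usable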
As per policy we have to answer only the first question, but I have solved two questions; please like.