Suppose we have three sets of random variables Wh, Xi, and Yj (for h = 1,...,k, i = 1,...,m, and j = 1,...,n), all of which are mutually independent. Assume that the three sets of random variables are all normally distributed with different means but the same standard deviation. The MLEs for the means are simply the group sample means, and the MLE for the variance is the mean of the squared deviations of the observations from their respective group means. Write a function to fit this model to three observed data vectors w, x, and y and return both the MLEs and the log-likelihood evaluated at the MLEs. Use the commands
data("iris")
w = iris$Sepal.Width[iris$Species=="setosa"]
x = iris$Sepal.Width[iris$Species=="versicolor"]
y = iris$Sepal.Width[iris$Species=="virginica"]
to make some data to analyze using your function. Compare the results from analyzing the data with the different-means model to the results obtained under the assumption that the means are all the same. Comment on your results.
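A minimal sketch of such a function, assuming the model stated above; the names fit_three_means and fit_common_mean are illustrative choices, not prescribed by the exercise:

fit_three_means <- function(w, x, y) {
  k <- length(w); m <- length(x); n <- length(y)
  N <- k + m + n
  # MLEs of the group means are the sample means
  mu <- c(mean(w), mean(x), mean(y))
  # MLE of the common variance: mean squared deviation from the group means
  sigma2 <- (sum((w - mu[1])^2) + sum((x - mu[2])^2) + sum((y - mu[3])^2)) / N
  # log-likelihood at the MLE; the residual term collapses to -N/2
  loglik <- -N / 2 * log(2 * pi * sigma2) - N / 2
  list(mu = mu, sigma2 = sigma2, loglik = loglik)
}

fit_common_mean <- function(w, x, y) {
  z <- c(w, x, y)
  N <- length(z)
  mu <- mean(z)
  # MLE of the variance when all groups share a single mean
  sigma2 <- mean((z - mu)^2)
  loglik <- -N / 2 * log(2 * pi * sigma2) - N / 2
  list(mu = mu, sigma2 = sigma2, loglik = loglik)
}

fit_three_means(w, x, y)
fit_common_mean(w, x, y)

Since the common-mean model is the different-means model with the means constrained to be equal, its maximized log-likelihood can never exceed that of the different-means fit; a large gap between the two suggests the group means genuinely differ.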
A statistic is a property of a sample, whereas a parameter is a property of a population. Often it is natural to estimate a parameter θ (such as the population mean µ) by the corresponding property of the sample (here the sample mean X̄). Note that θ may be a vector or a more complicated object. Unobserved quantities are treated mathematically as random variables. Potentially observable quantities are usually denoted by capital letters (Xi, X̄, Y, etc.). Once the data have been observed, the values taken by these random variables are known (Xi = xi, X̄ = x̄, etc.). Unobservable or hypothetical quantities are usually denoted by Greek letters (θ, µ, σ², etc.), and estimators are often denoted by putting a hat on the corresponding symbol (θ̂, µ̂, σ̂², etc.).

Nearly all statistics books use the above style of notation, so it will be adopted in these notes. However, sometimes I shall wish to distinguish carefully between knowns and unknowns, and shall denote all unknowns by capitals. Thus Θ represents an unknown parameter vector, and θ represents a particular assumed value of Θ. This is especially useful when considering probability distributions for parameters; one can then write fΘ(θ) and Pr(Θ = θ) by exact analogy with fX(x) and Pr(X = x). The set of possible values for a RV X is called its sample space ΩX. Similarly, the parameter space ΩΘ is the set of possible values for the parameter Θ.
Fix the size of the test to be α. Let A be a positive constant and C0 a subset of the sample space satisfying
1. Pr(X ∈ C0 | θ = θ0) = α,
2. X ∈ C0 ⟺ L(θ0; x)/L(θ1; x) = f(x|θ0)/f(x|θ1) ≤ A.
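As a hypothetical illustration of this rejection rule (not part of the original notes), suppose X1, ..., Xn are i.i.d. N(θ, 1) and we test θ = θ0 against θ = θ1 with θ1 > θ0:

# likelihood ratio L(theta0; x) / L(theta1; x) for i.i.d. N(theta, 1) data
lik_ratio <- function(x, theta0, theta1) {
  prod(dnorm(x, mean = theta0, sd = 1)) / prod(dnorm(x, mean = theta1, sd = 1))
}
# For theta1 > theta0 the ratio is decreasing in mean(x), so the region
# {lik_ratio <= A} is equivalent to {mean(x) >= c}; choosing c so that
# Pr(mean(X) >= c | theta0) = alpha fixes the size of the test:
n <- 20; theta0 <- 0; alpha <- 0.05
c_alpha <- theta0 + qnorm(1 - alpha) / sqrt(n)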
To every bounded Borel set, B, of R there corresponds a random variable X(B) with E|X(B)|² < ∞. (2) If B1, B2, ... are disjoint Borel sets whose union, B, is bounded, then X(B) = X(B1) + X(B2) + ···.
The random measures considered in this paper are assumed to be real and to satisfy EX(B) = 0 for every Borel set B. A random measure has independent components if, for every collection of disjoint Borel sets B1, ..., Bn, the random variables X(B1), ..., X(Bn) are mutually independent. If X has independent components, the set function V defined for every bounded Borel set B by V(B) = E|X(B)|² is a Borel measure. A random measure has stationary components if, for every collection of bounded Borel sets B1, ..., Bn, the joint distribution of the family X(τ + B1), ..., X(τ + Bn) is independent of τ. For random measures with independent components, stationarity is equivalent to requiring that X(B) and X(τ + B) be identically distributed for every B and every τ. In the stationary case V is a Haar measure and is equal to Lebesgue measure on the Borel sets to within a nonnegative multiplicative constant. The points t of R for which E|X({t})|² > 0 are called singular.
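A standard concrete example consistent with these definitions, though not taken from this excerpt, is Gaussian white noise: for a standard Wiener process W, set X(B) = ∫_B dW for each bounded Borel set B. Then EX(B) = 0, X is additive over disjoint sets, disjoint sets yield mutually independent Gaussian values, and V(B) = E|X(B)|² = λ(B), where λ is Lebesgue measure. Hence X has independent, stationary components and no singular points, since E|X({t})|² = 0 for every t.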