In: Advanced Math
True or false
A) You are trying to estimate a parameter of interest, and you have two estimators with the same variance, but the first estimator is unbiased and the second one is biased. In this case you should choose the unbiased estimator.
True or false? Please explain.
B) You are trying to estimate Q1 (the 25th percentile) of annual household income in the US based on a sample of household earnings. Can you suggest an estimator for Q1? Please explain why you believe this is a good estimator.
(a) A standard example is ridge regression. We have $\hat\beta_\lambda = (X^T X + \lambda I)^{-1} X^T y$, which is biased; but if $X$ is ill-conditioned then $\operatorname{Var}(\hat\beta_{\text{OLS}})$ may be monstrous, whereas $\operatorname{Var}(\hat\beta_\lambda)$ can be much more modest.
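To make this concrete, here is a minimal simulation sketch (my own illustration, not from the original answer; it assumes only numpy, and the nearly collinear design matrix and the penalty λ = 1 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ill-conditioned design: two nearly collinear columns.
n = 50
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=n)])
beta = np.array([1.0, 1.0])  # true coefficients (illustrative)
lam = 1.0                    # illustrative ridge penalty

ols, ridge = [], []
for _ in range(2000):  # repeated draws of y to see the sampling distributions
    y = X @ beta + rng.normal(size=n)
    ols.append(np.linalg.solve(X.T @ X, X.T @ y))
    ridge.append(np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y))
ols, ridge = np.array(ols), np.array(ridge)

# OLS is unbiased but wildly variable here; ridge is biased but far
# less variable, so its mean squared error ends up much smaller.
print("OLS   mean", ols.mean(axis=0), "  MSE", ((ols - beta) ** 2).sum(axis=1).mean())
print("ridge mean", ridge.mean(axis=0), "  MSE", ((ridge - beta) ** 2).sum(axis=1).mean())
```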
Another example is the kNN classifier. Think about k=1: we assign a new point to the class of its nearest neighbor. If we have a ton of data and only a few variables, we can probably recover the true decision boundary and our classifier is nearly unbiased; but in any realistic case k=1 will be far too flexible (i.e., it will have too much variance), and so the small bias is not worth it: the MSE is larger than that of more biased but less variable classifiers. Picture the sampling distributions of two estimators of $\theta$: the flatter one is unbiased, but also much more variable than a slightly biased, more concentrated competitor.
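As an illustration of that trade-off for kNN, here is a small simulation sketch (my own illustrative setup, assuming scikit-learn is available; the class distributions and the values k = 1 and k = 15 are arbitrary choices):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def sample(n):
    # Two overlapping one-dimensional classes; the optimal boundary is x = 0.
    y = rng.integers(0, 2, size=n)
    x = 2.0 * y - 1.0 + 1.5 * rng.normal(size=n)
    return x.reshape(-1, 1), y

X_test, y_test = sample(5000)  # large test set to measure error well

for k in (1, 15):
    errors = []
    for _ in range(50):  # refit on 50 independent small training sets
        X_train, y_train = sample(100)
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
        errors.append(1.0 - clf.score(X_test, y_test))
    # k=1 has low bias but high variance across training sets; a larger k
    # accepts a little bias for much lower variance and lower average error.
    print(f"k={k:2d}  mean error {np.mean(errors):.3f}  sd {np.std(errors):.4f}")
```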
Let us consider i.i.d. observations $X_1, \dots, X_n$ with mean $\mu$ and variance $\sigma^2$, and suppose we want to estimate $\mu$. We decide that a 'good' estimator is one that is unbiased. The estimator $T_1(X_1,\dots,X_n) = X_1$ is unbiased for $\mu$, even though we have $n$ data points. To make that idea more formal, we think we ought to be able to get an estimator that varies less around $\mu$ for a given sample size than $T_1$ does. This means that we want an estimator with a smaller variance.
We say that we still want only unbiased estimators, but among all unbiased estimators we'll choose the one with the smallest variance. This leads us to the concept of the uniformly minimum variance unbiased estimator (UMVUE), an object of much study in classical statistics. If we only want unbiased estimators, then choosing the one with the smallest variance is a good idea. In our example, consider
$$T_1(X_1,\dots,X_n) = X_1, \qquad T_2(X_1,\dots,X_n) = \frac{X_1 + X_2}{2}, \qquad T_n(X_1,\dots,X_n) = \bar X = \frac{1}{n}\sum_{i=1}^n X_i.$$
All three are unbiased, but they have different variances: $\sigma^2$, $\sigma^2/2$, and $\sigma^2/n$ respectively. For $n > 2$, $T_n = \bar X$ has the smallest variance of these, and it's unbiased, so this is our chosen estimator.
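A quick simulation sketch of these three estimators (a normal model and the particular μ, σ, and n below are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 30, 20000  # illustrative values

X = rng.normal(mu, sigma, size=(reps, n))
T1 = X[:, 0]                # just the first observation
T2 = X[:, :2].mean(axis=1)  # average of the first two observations
Tn = X.mean(axis=1)         # the sample mean

for name, T, theory in [("T1", T1, sigma**2),
                        ("T2", T2, sigma**2 / 2),
                        ("Tn", Tn, sigma**2 / n)]:
    # Each estimator averages to roughly mu (unbiased); only the variance differs.
    print(f"{name}: mean {T.mean():.3f}  var {T.var():.3f}  (theory {theory:.3f})")
```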
We can quantify this concept with the mean squared error (MSE), which is like the average squared distance between our estimator and the thing we're estimating. If $T$ is an estimator of $\theta$, then
$$\mathrm{MSE}(T) = E\big[(T - \theta)^2\big].$$
As I've mentioned earlier, it turns out that
$$\mathrm{MSE}(T) = \operatorname{Var}(T) + \operatorname{Bias}(T)^2,$$
where bias is defined to be $\operatorname{Bias}(T) = E(T) - \theta$. Thus we may decide that rather than UMVUEs we want an estimator that minimizes MSE.
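For completeness, here is the standard derivation of that decomposition (add and subtract $E(T)$; the cross term vanishes because $E[T - E(T)] = 0$):
$$
\begin{aligned}
\mathrm{MSE}(T) &= E\big[(T - \theta)^2\big]
  = E\Big[\big(T - E(T) + E(T) - \theta\big)^2\Big] \\
  &= E\big[(T - E(T))^2\big] + 2\big(E(T) - \theta\big)\,E\big[T - E(T)\big] + \big(E(T) - \theta\big)^2 \\
  &= \operatorname{Var}(T) + \operatorname{Bias}(T)^2.
\end{aligned}
$$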
Suppose that $T$ is unbiased. Then $\mathrm{MSE}(T) = \operatorname{Var}(T)$, so if we are only considering unbiased estimators, then minimizing MSE is the same as choosing the UMVUE. But, as I showed above, there are cases where we can get an even smaller MSE by allowing a non-zero bias. We want to minimize $\operatorname{Var}(T) + \operatorname{Bias}(T)^2$. We could require $\operatorname{Bias}(T) = 0$ and then pick the best $T$ among those that satisfy this, or we could allow both terms to vary. Allowing both to vary will likely give us a better MSE, since it includes the unbiased case.
One example of a family of estimators that trades bias against variance in this way is ridge regression, where you can think of each estimator as $T_\lambda = (X^T X + \lambda I)^{-1} X^T y$. We can then (perhaps using cross-validation) make a plot of MSE as a function of $\lambda$ and choose the best $T_\lambda$.
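Here is a sketch of that λ sweep on simulated data, where the true β is known so MSE can be computed directly (on real data one would replace the MSE line with cross-validated prediction error; the design matrix and the penalty grid are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=n)])  # ill-conditioned
beta = np.array([1.0, 1.0])

lambdas = np.logspace(-3, 2, 12)  # illustrative grid of penalties
mse = np.zeros_like(lambdas)
for _ in range(1000):  # average squared error over repeated samples
    y = X @ beta + rng.normal(size=n)
    for j, lam in enumerate(lambdas):
        b = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)
        mse[j] += ((b - beta) ** 2).sum()
mse /= 1000

for lam, m in zip(lambdas, mse):
    print(f"lambda = {lam:8.3f}   MSE = {m:8.3f}")
# The minimum is typically at an interior lambda > 0, i.e. a biased
# T_lambda beats the nearly unbiased estimators at the small-lambda end.
```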