Explain one way you can determine whether a sample of data is considered a "good sample" from a confidence interval.
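First, some background on confidence intervals.
A confidence interval of the form (a, b) constructed at the 100(1-α)% confidence level (for example, 95% when α = 0.05) means that if we repeated the sampling many times and built an interval from each sample, about 100(1-α)% of those intervals would contain the true value of the population parameter. Informally, we say we are 100(1-α)% confident that the true population parameter lies within (a, b).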
To understand this concept, note that calculating a population parameter exactly is usually very difficult and may take a considerable amount of time and money. Instead, we collect samples and use them to estimate the population parameter. Different samples give different mean values, so if we calculate the mean for many different samples we obtain a distribution of sample means (the sampling distribution). Most of these sample means lie within a particular range around the true population mean. There may be a few outliers, but the majority fall within that range. The values inside this range are therefore the most plausible candidates for the unknown population parameter. That is why it is called a confidence interval: it is a range that contains the true population parameter with a stated level of confidence.
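The repeated-sampling idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with a mean of 50 and a standard deviation of 10, the sample size is 40, and the interval uses the usual large-sample formula (sample mean ± 1.96 standard errors). None of these numbers or function names come from the question; they are made up for illustration.

```python
import random
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Approximate 95% interval for the mean: x_bar +/- z * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    std_err = statistics.stdev(sample) / n ** 0.5
    return x_bar - z * std_err, x_bar + z * std_err


def coverage(true_mean=50.0, true_sd=10.0, n=40, trials=2000, seed=0):
    """Fraction of simulated samples whose interval contains true_mean.

    The population values (true_mean, true_sd) are assumptions chosen
    for this illustration only.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = mean_confidence_interval(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / trials


# Prints the share of intervals that captured the true mean.
print(coverage())
```

The printed fraction should come out close to 0.95, which is what "95% confidence" refers to: the share of intervals, over many samples, that capture the true mean.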
Now, if the selected sample contains a lot of outliers, the estimate it gives for the population parameter will be an outlier as well. That sample's estimate of, say, the population mean would then lie outside the confidence interval by a very large margin. So one way to judge whether a sample of data is a "good sample" is to check whether its estimate falls inside the confidence interval: an estimate inside the interval is consistent with a good sample, while one far outside it suggests the sample is not representative.
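As a rough sketch of this check, the snippet below builds an interval from a reference sample and then tests whether a candidate sample's mean falls inside it. The answer above does not say where the interval comes from, so the snippet assumes it is built from a trusted reference sample. The data values and the helper name looks_like_good_sample are hypothetical, and 1.96 is a normal approximation that is only rough for a sample this small.

```python
import statistics


def mean_interval(sample, z=1.96):
    """Approximate 95% interval for the mean: x_bar +/- z * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    half_width = z * statistics.stdev(sample) / n ** 0.5
    return x_bar - half_width, x_bar + half_width


def looks_like_good_sample(candidate, interval):
    """True if the candidate sample's mean lies inside the given interval."""
    low, high = interval
    return low <= statistics.mean(candidate) <= high


# Hypothetical reference sample, assumed to be trustworthy (mean is 50).
reference = [48, 52, 50, 47, 53, 49, 51, 50, 46, 54]
interval = mean_interval(reference)

# A candidate sample without outliers, and one with two extreme values.
clean = [49, 51, 50, 48, 52, 50, 47, 53, 51, 49]
contaminated = [49, 51, 50, 48, 52, 50, 47, 53, 120, 135]

print(looks_like_good_sample(clean, interval))         # True
print(looks_like_good_sample(contaminated, interval))  # False
```

Here the clean sample has a mean of 50, which lies inside the interval. The two extreme values pull the contaminated sample's mean up to 65.5, well outside it.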