What does a sampling distribution have to do with a confidence interval?
First we need to establish the population parameter, that is, what we want to estimate:
Example: You want to know the average weight of a Californian. (This is the population parameter you are looking for)
Now, if you want the exact answer, you need to measure the weight of every single Californian and take the average. Unfortunately, we do not have the money and time to do this.
So here is what we can do: we measure a limited number of people, called a sample. Let’s say we took a sample of 100 Californians and got an average of 150 lbs.
Now what we have is a “sample mean or sample average” from a single sample. If we took another 10 such samples of size 100, we would have 10 more “sample means or sample averages”.
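To see how sample averages differ from one sample to the next, here is a minimal Python sketch. The population is made up for illustration (100,000 weights assumed to be roughly normal around 150 lbs with a spread of 30 lbs); none of these numbers come from real data, and `sample_mean` is just a helper name chosen here.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 weights, assumed roughly normal around 150 lbs
population = [random.gauss(150, 30) for _ in range(100_000)]


def sample_mean(pop, n):
    """Average of n values drawn without replacement from pop."""
    return statistics.mean(random.sample(pop, n))


# The first sample plus 10 more, each of size 100
sample_means = [sample_mean(population, 100) for _ in range(11)]

for i, m in enumerate(sample_means, start=1):
    print(f"sample {i:2d}: mean = {m:.1f} lbs")
```

Each run of the loop gives a slightly different average, even though every sample comes from the same population.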
If you create a distribution of such “sample averages” by taking as many samples as possible, you get what is called a sampling distribution. The mean of this sampling distribution will converge to the true population parameter. Below is an example:
Ex: Set of numbers {1, 2, 3}
The average of these three numbers is 2.
Let’s take samples of 2 from the set of three numbers and calculate each sample average:
1, 2: average = 1.5
2, 3: average = 2.5
3, 1: average = 2
So the sample averages are {1.5, 2.5, 2}. The average of these sample averages is (1.5 + 2.5 + 2) / 3 = 2, which is the true average.
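The small example above can be checked by listing every possible sample of size 2. This is only a sketch: it uses `itertools.combinations` to enumerate the samples and `Fraction` to keep the arithmetic exact, and the variable names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import combinations

values = [1, 2, 3]
true_mean = Fraction(sum(values), len(values))

# Every possible sample of size 2 (order ignored, no repeated elements)
samples = list(combinations(values, 2))
sample_means = [Fraction(sum(s), len(s)) for s in samples]

# Average of the sample averages
mean_of_sample_means = sum(sample_means) / len(sample_means)

for s, m in zip(samples, sample_means):
    print(s, float(m))
print("mean of sample means:", float(mean_of_sample_means))
print("true mean:", float(true_mean))
```

Because every possible sample is included, the match here is exact. With a large population you can only take some of the samples, so the match is approximate and improves as you take more.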
Another example: You have a sample of apples and you have an average value for it. This is just one sample average, and we call it a point estimate. With one sample average alone you cannot say how close you are to the true value. So we create an interval around this point estimate, built so that about 95% of the intervals made this way from repeated samples would contain the true value. This is what we call a 95% confidence interval (an interval around the point estimate).
And this is how they are linked: the sampling distribution tells us how much sample averages vary from sample to sample, and that spread determines how wide the interval around the point estimate has to be.
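To make the link concrete, here is a rough Python sketch of an interval around a point estimate. It assumes a made-up population of apple weights in grams and uses the common normal approximation (point estimate plus or minus 1.96 standard errors); that formula is a standard textbook choice, not something stated in the answer above, and `confidence_interval` is a helper name chosen here.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population of apple weights in grams (made-up numbers)
population = [random.gauss(180, 25) for _ in range(50_000)]
true_mean = statistics.mean(population)


def confidence_interval(sample, z=1.96):
    """Approximate 95% interval: point estimate +/- z * standard error."""
    n = len(sample)
    point_estimate = statistics.mean(sample)
    # Standard error: estimated spread of the sampling distribution of the mean
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = z * standard_error
    return point_estimate - margin, point_estimate + margin


# One sample gives one point estimate and one interval
low, high = confidence_interval(random.sample(population, 100))
print(f"one interval: ({low:.1f}, {high:.1f})")

# Repeat many times and count how often the interval contains the true mean
trials = 2000
hits = 0
for _ in range(trials):
    lo, hi = confidence_interval(random.sample(population, 100))
    if lo <= true_mean <= hi:
        hits += 1
coverage = hits / trials
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The standard error in the function is an estimate of the spread of the sampling distribution, which is why the interval's width depends on it. The fraction printed at the end should land near 0.95, though it will not be exactly that in any single run.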