Question 4
What is the sampling distribution of the sample mean? Provide examples.
Question 5
What is the central limit theorem? Provide examples.
Question 6
What is the standard error of the mean? Provide examples.
Question 4:
The sampling distribution of the sample mean is the theoretical distribution of the values that the sample mean takes over all possible samples of a specific size that can be drawn from a given population.
Example: Pumpkin Weights
The population is the weight of six pumpkins (in pounds) displayed in a carnival "guess the weight" game booth. You are asked to guess the average weight of the six pumpkins by taking a random sample without replacement from the population.
Pumpkin | A | B | C | D | E | F
Weight (in pounds) | 19 | 14 | 15 | 9 | 10 | 17
a. Calculate the population mean μ.
μ = (19 + 14 + 15 + 9 + 10 + 17) / 6 = 84 / 6 = 14 pounds
b. Obtain the sampling distribution of the sample mean for a sample size of 2 when one samples without replacement.
Sample | Weight | Ybar | Probability
A, B | 19, 14 | 16.5 | 1/15
A, C | 19, 15 | 17.0 | 1/15
A, D | 19, 9 | 14.0 | 1/15
A, E | 19, 10 | 14.5 | 1/15
A, F | 19, 17 | 18.0 | 1/15
B, C | 14, 15 | 14.5 | 1/15
B, D | 14, 9 | 11.5 | 1/15
B, E | 14, 10 | 12.0 | 1/15
B, F | 14, 17 | 15.5 | 1/15
C, D | 15, 9 | 12.0 | 1/15
C, E | 15, 10 | 12.5 | 1/15
C, F | 15, 17 | 16.0 | 1/15
D, E | 9, 10 | 9.5 | 1/15
D, F | 9, 17 | 13.0 | 1/15
E, F | 10, 17 | 13.5 | 1/15
Distribution of Ybar:
Ybar | Probability
9.5 | 1/15
11.5 | 1/15
12.0 | 2/15
12.5 | 1/15
13.0 | 1/15
13.5 | 1/15
14.0 | 1/15
14.5 | 2/15
15.5 | 1/15
16.0 | 1/15
16.5 | 1/15
17.0 | 1/15
18.0 | 1/15
( i ) One can thus see that the chance that the sample mean equals the population mean exactly is only 1 in 15, which is very small. (In some other examples, the sample mean may never take the same value as the population mean.) When the sample mean is used to estimate the population mean, some error is to be expected, because the sample mean is a random quantity.
( ii ) The mean of the sample mean when the sample size is 2:
Mean of sample mean
= (16.5 + 17.0 + 14.0 + 14.5 + 18.0 + 14.5 + 11.5 + 12.0 + 15.5 + 12.0 + 12.5 + 16.0 + 9.5 + 13.0 + 13.5) / 15
= 14 pounds
Thus, even though each sample may give you an answer involving some error, the expected value is right at the target: exactly the population mean. In other words, if one does the experiment over and over again, the overall average of the sample mean is exactly the population mean.
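The enumeration above can also be checked by brute force. The following is a minimal Python sketch, not part of the original answer, that lists every sample of size 2 drawn without replacement from the six pumpkin weights, builds the distribution of the sample mean, and computes its expected value. The variable names are chosen here for illustration, and exact fractions are used so that no rounding is involved.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Pumpkin weights in pounds, taken from the table in the example.
weights = {"A": 19, "B": 14, "C": 15, "D": 9, "E": 10, "F": 17}

population_mean = Fraction(sum(weights.values()), len(weights))

# Every unordered pair of distinct pumpkins is one equally likely sample.
samples = list(combinations(weights, 2))
sample_means = [Fraction(weights[a] + weights[b], 2) for a, b in samples]

# Sampling distribution: probability of each distinct value of the sample mean.
counts = Counter(sample_means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

# Expected value of the sample mean under that distribution.
mean_of_sample_means = sum(m * p for m, p in distribution.items())

for m, p in distribution.items():
    print(f"{float(m):5.1f}  {p}")
print("population mean:", population_mean)
print("mean of sample means:", mean_of_sample_means)
```

The two values 12.0 and 14.5 each arise from two different samples, which is why they carry probability 2/15 while every other value carries 1/15.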
Question 5:
The Central Limit Theorem
The central limit theorem (CLT) states that, for a sufficiently large sample size drawn from a population with finite variance, the sampling distribution of the sample mean is approximately normal. That distribution is centered at the population mean, and its variance is approximately the population variance divided by the sample size.
For a large sample size (rule of thumb: n ≥ 30), ȳ is approximately normally distributed, regardless of the distribution of the population one samples from. If the population has mean μ and standard deviation σ, then ȳ has mean μ and standard error σ/√n.
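A small simulation can make this statement concrete. The sketch below is an illustration added here, not part of the original answer. It assumes a hypothetical, strongly skewed population (exponential with rate 1, so its mean and standard deviation are both 1), draws many samples of size n = 30, and compares the spread of the sample means with σ/√n. The seed, the number of repetitions, and the choice of population are all arbitrary assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population: exponential with rate 1 (mu = 1, sigma = 1).
mu, sigma = 1.0, 1.0
n = 30                # size of each sample
num_samples = 20_000  # number of repeated samples

# Mean of each repeated sample of size n.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = sigma / n ** 0.5

# Share of sample means within 1.96 standard errors of mu; a normal
# distribution would put about 95% of them in that range.
within = sum(abs(m - mu) <= 1.96 * theoretical_se for m in sample_means) / num_samples

print(f"mean of sample means: {mean_of_means:.3f} (mu = {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {theoretical_se:.3f})")
print(f"share within 1.96 SE: {within:.3f}")
```

Even though the individual observations are far from normal, the sample means cluster around μ with a spread close to σ/√n, which is what the theorem predicts.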
Example: Speedboat Engines
The engines made by Ford for speedboats had an average power of 220 horsepower (HP) and standard deviation of 15 HP.
1. A potential buyer intends to take a sample of four engines and will not place an order if the sample mean is less than 215 HP. What is the probability that the buyer will not place an order?
We want to find P(ȳ < 215).
Answer: Because the sample size is small (n = 4, below the n ≥ 30 rule of thumb for the central limit theorem), we need to know whether the population itself is normally distributed. If the population is confirmed to be normal, we can proceed, since the sample mean of a normally distributed population is itself normally distributed for every sample size.
If the population follows a normal distribution, we can conclude that ȳ has a normal distribution with mean 220 HP and a standard error of σ/√n = 15/√4 = 7.5 HP.
P(ȳ < 215)
= P(Z < (215 - 220) / 7.5)
= P(Z < -0.67)
= 0.2514
If the customer just samples four engines, the probability that the customer will not place an order is 25.14%.
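The same calculation can be reproduced with Python's standard library. This is a minimal sketch under the assumption made in the answer, namely that engine power is normally distributed. It computes the probability twice: once with z rounded to two decimals, as in a table lookup, and once without rounding, which gives a slightly larger value.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 220.0, 15.0, 4   # values from the example
threshold = 215.0

standard_error = sigma / sqrt(n)        # 15 / 2 = 7.5 HP
z = (threshold - mu) / standard_error   # about -0.667

# Table lookup as in the text: z rounded to two decimals.
p_table = NormalDist().cdf(round(z, 2))
# Same probability without rounding z first.
p_exact = NormalDist(mu, standard_error).cdf(threshold)

print(f"SE = {standard_error}, z = {z:.4f}")
print(f"P(Z < {round(z, 2)}) = {p_table:.4f}")
print(f"unrounded: {p_exact:.4f}")
```

The small gap between the two results comes only from rounding z before the lookup; it does not change the conclusion that roughly one buyer in four would decline to order.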
Question 6:
The standard error of a sample statistic (such as the sample mean) is the actual or estimated standard deviation of the error in the process by which it was generated. In other words, it is the actual or estimated standard deviation of the sampling distribution of the sample statistic.
The standard error of the mean is the standard deviation of the sampling distribution of the sample mean. It is sometimes called the standard deviation of the mean. More generally, a standard error is the standard deviation of the sampling distribution of a statistic, most often the mean or the median.
Equivalently, the standard error of the mean is the standard deviation of the sample means obtained from repeated samples of the same size drawn from the population. The smaller the standard error, the more closely a sample mean is expected to reflect the population mean. The larger the sample size, the smaller the standard error.
For Example:
In an experiment, suppose we are to calculate the speed of sound along the directions of X, Y and Z axes in some material. We may find average speed of sound by calculating mean of the observed values. Also, note that there are external factors influencing the speed of sound and creating the errors.
These factors may include pressure changes and temperature variations in the lab, changes in velocity of wind, reaction time of stopwatch and random errors while collecting data.
Therefore, we should take several sets of measurements and compute the mean of each set, instead of relying on a single measurement. The distribution of these means is the sampling distribution of the mean, and its standard deviation is the standard error of the mean.
The standard error of the mean is commonly abbreviated as SEM. It is usually estimated as the sample standard deviation divided by the square root of the sample size, and it is generally denoted by SEx̄. The formula for the standard error of the mean is given below:
SEx̄ = s/√n
Where
s denotes the sample standard deviation
n is the sample size, that is, the number of observations.
From the formula, we may conclude that the standard error of the mean is inversely proportional to the square root of the sample size. A larger sample size gives a smaller SEM, and a smaller sample size gives a larger SEM.
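As an illustration of the formula, the following sketch computes the SEM for a short list of readings. The readings are hypothetical speed-of-sound values in m/s, made up here purely for the example, and `standard_error_of_mean` is a helper name introduced for this sketch rather than a library function.

```python
from math import sqrt
from statistics import stdev


def standard_error_of_mean(values):
    """Return s / sqrt(n), using the sample standard deviation s."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    return stdev(values) / sqrt(n)


# Hypothetical speed-of-sound readings in m/s (made up for illustration).
readings = [343.1, 342.8, 343.6, 342.9, 343.4, 343.0, 343.3, 342.7]

sem = standard_error_of_mean(readings)
print(f"n = {len(readings)}, SEM = {sem:.4f} m/s")

# With s held fixed, quadrupling n halves the SEM (1 / sqrt(n) scaling).
s = stdev(readings)
for n in (8, 32, 128):
    print(n, round(s / sqrt(n), 4))
```

The loop at the end holds s fixed to isolate the effect of n. In practice s would also change somewhat as more observations are collected.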