Now, imagine you are a statistician called in to analyze a quality improvement in a manufacturing plant. The plant manager wants to use the statistical analysis in a presentation to the company executives to show that quality improvements are being made in the plant. The before data is composed of a sample of size n = 300 items, and the after data is composed of a sample of size n = 300 items. While checking the integrity of the data, you discover the following information:
The before data is composed of:
180 items or 60% from crew A known for their poor quality,
60 items or 20% from crew B known for their average quality,
60 items or 20% from crew C known for their excellent quality; and
The after data is composed of:
60 items or 20% from crew A known for their poor quality,
60 items or 20% from crew B known for their average quality,
180 items or 60% from crew C known for their excellent quality.
The data composition raises a concern that the data are skewed, and you feel obligated to discuss this with the plant manager. From previous discussions, you have the impression that the plant manager needs to present the analysis to upper management and may insist on performing the analysis with the present data. So you want to have clearly in mind the ethical reasons why the analysis should not proceed with the data as is. Also, from past experience, you believe the plant manager will share your concern if you can propose a solution to the potential problem. You need to propose how to proceed with the analysis to test whether an improvement was made.
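To make the concern concrete, the sketch below shows how the shift in crew mix alone can change the pooled defect rate even when no crew's quality changes. The item counts are those listed above; the per-crew defect rates are hypothetical values chosen only for illustration, since the source gives none. Reweighting both samples to a common crew mix is shown as one possible remedy, and that too is an assumption rather than a method stated in the source.

```python
# Item counts per crew, taken from the before/after composition above.
before_counts = {"A": 180, "B": 60, "C": 60}
after_counts = {"A": 60, "B": 60, "C": 180}

# HYPOTHETICAL per-crew defect rates, assumed identical before and after
# (i.e. no real improvement). These numbers are not from the source.
assumed_defect_rate = {"A": 0.10, "B": 0.05, "C": 0.01}


def pooled_rate(counts, rates):
    """Overall defect rate when items are pooled across crews."""
    total = sum(counts.values())
    return sum(counts[c] * rates[c] for c in counts) / total


def standardized_rate(rates, weights):
    """Defect rate reweighted to a fixed crew mix (weights sum to 1)."""
    return sum(weights[c] * rates[c] for c in weights)


pooled_before = pooled_rate(before_counts, assumed_defect_rate)
pooled_after = pooled_rate(after_counts, assumed_defect_rate)

# Equal weighting of crews is one possible common mix (an assumption).
equal_mix = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
std_before = standardized_rate(assumed_defect_rate, equal_mix)
std_after = standardized_rate(assumed_defect_rate, equal_mix)

print(f"pooled before: {pooled_before:.3f}, pooled after: {pooled_after:.3f}")
print(f"standardized before: {std_before:.3f}, after: {std_after:.3f}")
```

Under these assumed rates the pooled defect rate appears to halve although no crew changed, whereas the standardized rates are identical. This is why comparing the two samples as is could mislead the executives.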
For this, we suggest a paired t-test to determine whether a quality improvement was made in the plant. The test checks whether the true mean difference between the before data and the after data is zero. The process is as follows:
Let x = the before data and y = the after data.
1. Calculate the difference (d_i = y_i - x_i) between the two observations in each pair, making sure you distinguish between positive and negative differences.
2. Calculate the mean difference, d-bar.
3. Calculate the standard deviation of the differences, sd, and use this to calculate the standard error of the mean difference, SE(d-bar) = sd/sqrt(n), where n = 300.
4. Calculate the t-statistic, which is given by T = d-bar/SE(d-bar). Under the null hypothesis, this statistic follows a t-distribution with n - 1 degrees of freedom.
5. Use tables of the t-distribution to compare the value of T to the t(n - 1) distribution. This gives the p-value for the paired t-test.
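Steps 1 to 4 can be sketched in a few lines of Python using only the standard library. The function name `paired_t_statistic` and the sample values are hypothetical, introduced purely for illustration; the p-value in step 5 would still come from t-tables or a statistics package.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (d_bar, sd, se, t, df) for paired observations."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    n = len(before)
    if n < 2:
        raise ValueError("need at least two pairs")
    # Step 1: signed differences, after minus before.
    diffs = [y - x for x, y in zip(before, after)]
    # Step 2: mean difference.
    d_bar = statistics.mean(diffs)
    # Step 3: sample standard deviation (n - 1 denominator) and standard error.
    sd = statistics.stdev(diffs)
    se = sd / math.sqrt(n)
    if se == 0:
        raise ValueError("all differences are identical; t is undefined")
    # Step 4: t-statistic with n - 1 degrees of freedom.
    t = d_bar / se
    return d_bar, sd, se, t, n - 1


# Hypothetical paired measurements, not taken from the source.
x = [10.0, 12.0, 9.0, 11.0]
y = [11.0, 14.0, 9.0, 12.0]
d_bar, sd, se, t, df = paired_t_statistic(x, y)
print(d_bar, sd, se, t, df)
```

Note that the function requires each before value to be matched with exactly one after value, which is what makes the test "paired".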
Computations were as follows:
Crew | Quality | Before data (items) | After data (items) | Difference (d)
A | Poor | 180 | 60 | -120
B | Average | 60 | 60 | 0
C | Excellent | 60 | 180 | 120
d-bar = 0
sd = 120
SE(d-bar) = 120 / sqrt(300) = 6.9282
So we have t = 0 / 6.9282 = 0, on 299 degrees of freedom.
Looking this up in tables gives p = 1. Therefore, we fail to reject the null hypothesis of zero mean difference: this analysis provides no evidence that, on average, the quality improvement efforts led to an improvement.
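The arithmetic above can be checked with a short script. This is only a sketch that reproduces the figures as given: the standard deviation is computed from the three crew-level differences, while the standard error keeps n = 300 to match the computation shown. The variable names are illustrative.

```python
import math
import statistics

# Crew-level differences (after - before) from the table: A, B, C.
differences = [60 - 180, 60 - 60, 180 - 60]

d_bar = statistics.mean(differences)   # mean difference
sd = statistics.stdev(differences)     # sample sd of the three differences

# The computation above divides by sqrt(300), so n = 300 is kept to match it.
n_source = 300
se = sd / math.sqrt(n_source)
t = d_bar / se

print(f"d-bar = {d_bar}, sd = {sd:.0f}, SE = {se:.4f}, t = {t:.1f}")
```

Because d-bar is 0, t is 0 whatever value of n is used in the standard error, so the p-value of 1 does not depend on that choice. Both samples contain 300 items, so the count differences necessarily sum to zero.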