Major car manufacturers make use of computer simulations to
experiment with possible plant layouts. The simulations are
designed to reflect stochastic effects such as different demand
patterns, chance variations in processing times and breakdowns.
Hence the results of running a simulation to calculate shifts’
production rates need to be analysed using appropriate statistical
methods. Often these computer simulations are quite complex with
quite long runtimes, and hence there is an interest in using as few
runs of the simulation as possible, whilst still being able to
detect meaningful differences in performance between different
layouts.
The performance measure on which layouts are being compared is
hourly production rate. Suppose that two possible plant layouts are
being considered and that each has been simulated for 12 shifts. The
observed production rates are tabulated below.
Plant layout | Production rates (cars per hour)
Layout 1     | 128 112 111 114 121 119 131 121 122 114 116 129
Layout 2     | 126 112 107 114 118 114 131 122 118 113 112 131
a) Suppose that the results from the two layouts are independent
samples. What conditions must be satisfied if we are to apply a
parametric (i.e. z or t) test to compare the production rates of
the two layouts?
b) Describe a way in which the observation rates from the two
layouts could be paired. What conditions must be satisfied if we
are to apply a parametric (i.e. z or t) test to compare the
production rates of the two layouts if the samples are
paired?
c) Supposing that the conditions you outlined in part (b) are
satisfied, conduct the appropriate parametric test to compare the
production rates of the two layouts. Clearly justify your choice of
test, state your null and alternative hypotheses and explain your
conclusion. Use a 5% significance level.
d) What is the approximate power of the test you have applied in
part (c) if the true difference in production rates is that Layout 1
is on average 2.0 cars per hour faster than Layout 2?
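a:
Conditions for independent samples t-test/z-test:
Each sample is a random sample from its population (here, the 12 simulated shifts for each layout).
The observations are independent of one another, both within each sample and between the two samples.
The production rates for each layout are (approximately) normally distributed.
For a z-test, the population variances must be known; if they are unknown and estimated from the samples, a t-test is used, and the pooled-variance form of the t-test additionally requires the two population variances to be equal.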
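b:
Conditions for paired samples t-test/z-test:
The pairs are a random sample, and each observation from one layout is matched with exactly one observation from the other layout.
The differences within pairs are independent of one another from pair to pair.
The differences are (approximately) normally distributed.
For a z-test, the population variance of the differences must be known; otherwise a t-test is used.
It must be kept in mind that, even if the populations are not normally distributed, when the sample size is at least 30 the sample mean is approximately normally distributed by the central limit theorem, so the above tests can still be applied. Here each sample has only 12 observations, so the normality conditions cannot be relaxed in this way.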
c:
It is assumed that the conditions of a paired samples t-test/z-test are satisfied. The population variance of the differences is unknown and has to be estimated by the sample variance of the differences, and the sample is small (n = 12 pairs). Hence, it is suitable to use the paired samples t-test rather than the z-test.
Pairing is appropriate when the two samples are related observation by observation, for example if shift i of each layout is simulated under the same conditions (such as the same demand pattern, the same machinery or the same operators).
Suppose μd denotes the true mean difference in hourly production rate between Layout 1 and Layout 2, that is, μd = μ Layout 1 – μ Layout 2. The null and alternative hypotheses are as follows:
H0: μd = 0, vs. H1: μd ≠ 0.
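The formula for the test statistic is t = (x̅d – μd)/se, where x̅d is the sample mean difference, μd is the mean difference hypothesized under H0 (here 0), and se = sd/√n is the standard error of the mean difference, with sd the sample standard deviation of the differences. Since there are n pairs of observations, the degrees of freedom is (n – 1).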
The calculations have been done using Excel. Enter the data for the two layouts in two columns. Go to Data > Data Analysis > t-Test: Paired Two Sample for Means > OK; In Variable 1 Range, enter $A$1:$A$13, in Variable 2 Range, enter $B$1:$B$13; Tick on Labels; Enter Alpha as 0.05; Enter Hypothesized Mean Difference as 0. Click OK.
The relevant values in the output are t Stat = 2.5 and df = 11.
Since the alternative hypothesis is two-tailed, the p-value is: P(T<=t) two-tail ≈ 0.0295.
Since p-value < α (level of significance = 0.05), the null hypothesis is rejected.
Conclusion:
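At the 5% significance level, there is sufficient evidence that the mean hourly production rates of the two layouts differ; in the sample, Layout 1 produced on average about 1.67 more cars per hour than Layout 2.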
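The calculation in part (c) can be cross-checked outside Excel. The following is a minimal Python sketch using only the standard library. The variable names are illustrative, and the critical value 2.201 is the tabulated two-tailed 5% value for 11 degrees of freedom, hard-coded here as an assumption rather than computed.

```python
import math
import statistics

layout1 = [128, 112, 111, 114, 121, 119, 131, 121, 122, 114, 116, 129]
layout2 = [126, 112, 107, 114, 118, 114, 131, 122, 118, 113, 112, 131]

# Paired differences (Layout 1 minus Layout 2), one per shift
diffs = [a - b for a, b in zip(layout1, layout2)]

n = len(diffs)
mean_d = statistics.mean(diffs)   # sample mean difference
sd_d = statistics.stdev(diffs)    # sample standard deviation (n - 1 divisor)
se_d = sd_d / math.sqrt(n)        # standard error of the mean difference
t_stat = (mean_d - 0) / se_d      # hypothesized mean difference is 0 under H0

# Assumed: tabulated t critical value, two-tailed, alpha = 0.05, df = 11
T_CRIT = 2.201
reject_h0 = abs(t_stat) > T_CRIT

print(f"mean difference = {mean_d:.4f}, sd = {sd_d:.4f}, se = {se_d:.4f}")
print(f"t = {t_stat:.4f} on {n - 1} df, reject H0: {reject_h0}")
```

Comparing |t| with the critical value leads to the same decision as comparing the p-value with α, so no t-distribution function is needed for this check.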
d:
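Power is the probability of rejecting H0 when H1 is true, that is, Power = PH1 (| t | > tcrit), where tcrit is the two-tailed critical value of the test in part (c).
Here, since Layout 1 is on average 2.0 cars per hour faster than Layout 2, and μd = μ Layout 1 – μ Layout 2, under H1, the value of μd = 2. Using Excel formulae =AVERAGE(C2:C13) and =STDEV.S(C2:C13) on the column of differences, the mean and standard deviation of the sample differences are respectively, x̅d = 1.67, sd = 2.3094, so the standard error is se = sd/√n = 2.3094/√12 ≈ 0.6667.
Using the Excel formula =T.INV.2T(0.05,11) [degrees of freedom is n – 1 = 12 – 1 = 11], tcrit ≈ 2.201. Taking sd as an estimate of the true standard deviation of the differences, under H1 the test statistic is centred at approximately:
δ
= μd/se
= 2/0.6667
≈ 3.0.
Using the normal approximation, Power ≈ P(Z > tcrit – δ) = P(Z > –0.799); the probability of rejecting in the lower tail is negligible. Using the Excel formula =NORM.S.DIST(0.799,TRUE), Power ≈ 0.79.
Hence, the approximate power is 0.79. An exact calculation based on the noncentral t distribution gives a slightly lower value, about 0.78.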
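As a cross-check on the power in part (d), the following sketch applies the normal approximation to the distribution of the test statistic under H1, using only the Python standard library. It assumes that the sample standard deviation of the differences (2.3094, the value returned by STDEV.S) can stand in for the true standard deviation, and it hard-codes the tabulated critical value 2.201; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 12              # number of paired shifts
sd_d = 2.3094       # sample standard deviation of the differences (from STDEV.S)
true_diff = 2.0     # true mean difference under H1, in cars per hour
T_CRIT = 2.201      # assumed: tabulated two-tailed 5% t critical value, df = 11

se_d = sd_d / math.sqrt(n)   # standard error of the mean difference
delta = true_diff / se_d     # where the test statistic is centred under H1

z = NormalDist()
# P(reject H0) = P(T > T_CRIT) + P(T < -T_CRIT), with T approximated as Normal(delta, 1)
power = (1 - z.cdf(T_CRIT - delta)) + z.cdf(-T_CRIT - delta)

print(f"delta = {delta:.3f}, approximate power = {power:.3f}")
```

This approximation ignores the extra variability from estimating the standard deviation, so it slightly overstates the power; a calculation based on the noncentral t distribution would be marginally lower.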