1. Parkruns are free 5km timed runs usually run on weekends in different cities, towns and suburbs around the world. It is suspected that they are competitive and hence faster than 5km family fun runs which require payment for registration. Parkrun finishers do not get medals whereas all family fun run finishers get medals.

The summary statistics below are from two random samples of 10 park runners and 10 5km family fun runners in Bellville. The races were run on the same route and on 2 separate days but with identical weather conditions. The summary statistics are for the recorded finishing times (in minutes) of the runners.

                             park runners    family fun runners
Sample mean:                 20.58           25.67
Sample standard deviation:   3.5             5.7

(a) Is there evidence that the mean finishing time differs between the park runners and family fun runners? Perform an appropriate statistical test.
(b) Construct a 95% confidence interval for the difference between the population means of the two groups. Compare your results to your conclusions in (a).
(c) What assumption(s) are necessary for performing the hypothesis tests and constructing the confidence interval above?
Hint: Use a 5% significance level for the F-test but report a p-value for the t-test.
PART A.
HYPOTHESIS TEST
given that,
mean(x)=20.58
standard deviation , s.d1=3.5
number(n1)=10
y(mean)=25.67
standard deviation, s.d2 =5.7
number(n2)=10
null, ho: u1 = u2
alternate, h1: u1 != u2 (mean finishing time differs between the park runners and family fun runners)
level of significance, α = 0.05
from the t-table with min (n1-1, n2-1) i.e 9 d.f, two tailed t α/2 =2.262
since our test is two-tailed
reject ho, if to < -2.262 or if to > 2.262
we use test statistic (t) = (x-y)/sqrt((s.d1^2/n1)+(s.d2^2/n2))
to = (20.58-25.67)/sqrt((12.25/10)+(32.49/10))
to =-2.4064
| to | =2.4064
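The arithmetic behind the test statistic can be checked with a few lines of Python. This is only a sketch using the summary statistics from the question; the variable names (mean_park, sd_fun, and so on) are chosen here for illustration and do not come from the original answer.

```python
import math

# Summary statistics as given in the question
mean_park, sd_park, n_park = 20.58, 3.5, 10
mean_fun, sd_fun, n_fun = 25.67, 5.7, 10

# Standard error of the difference in sample means (unpooled form)
std_error = math.sqrt(sd_park**2 / n_park + sd_fun**2 / n_fun)

# Test statistic: difference in sample means divided by its standard error
t_stat = (mean_park - mean_fun) / std_error

print(f"standard error = {std_error:.3f}")
print(f"t statistic    = {t_stat:.4f}")
```

The sign of the statistic is negative because the park runners' sample mean is the smaller of the two; only its absolute value matters for the two-tailed decision.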
critical value
the value of |t α| with min (n1-1, n2-1) i.e 9 d.f is 2.262
we got |to| = 2.40641 & | t α | = 2.262
make decision
hence value of | to | > | t α| and here we reject ho
p-value: two tailed ( double the one tail area ) with 9 d.f, p = 2 * P( t > 2.4064 )
= 0.039
since the p-value 0.039 < α = 0.05, here we reject ho
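The two-tailed p-value can be approximated without statistical tables by integrating the Student's t density numerically. The sketch below uses only the standard library and assumes 9 d.f., as in the answer; the helper names t_pdf and two_tailed_p are hypothetical and introduced only for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_obs, df, steps=20000):
    """Two-tailed p-value: 1 minus the area under the density on [-|t|, |t|]."""
    a = abs(t_obs)
    h = a / steps
    # Simpson's rule on [0, a] (steps must be even); the density is symmetric,
    # so the area on [-a, a] is twice the area on [0, a].
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_central_area = total * h / 3
    return 1.0 - 2.0 * half_central_area

p_value = two_tailed_p(-2.4064, df=9)
print(f"p-value = {p_value:.3f}")
print("reject ho" if p_value < 0.05 else "fail to reject ho")
```

As a sanity check, calling two_tailed_p(2.262, 9) should return a value very close to 0.05, since 2.262 is the tabled two-tailed 5% critical value for 9 d.f.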
answers
---------------
null, ho: u1 = u2
alternate, h1: u1 != u2
test statistic: -2.4064
critical value: -2.262 , 2.262
decision: reject ho
p-value: 0.039
we have evidence to support that the mean finishing time differs between the park runners and family fun runners
PART B.
CONFIDENCE INTERVAL
given that,
mean(x)=20.58
standard deviation , s.d1=3.5
number(n1)=10
y(mean)=25.67
standard deviation, s.d2 =5.7
number(n2)=10
i.
standard error = sqrt((s.d1^2/n1)+(s.d2^2/n2))
where,
s.d1, s.d2 = sample standard deviations of the two groups
n1, n2 = sample sizes
standard error = sqrt((12.25/10)+(32.49/10))
= 2.115
ii.
margin of error = t a/2 * (standard error)
where,
t a/2 = t -table value
level of significance, α = 0.05
from the t-table, two tailed, the
value of |t α/2| with min (n1-1, n2-1) i.e 9 d.f is 2.262
margin of error = 2.262 * 2.115
= 4.785
iii.
ci = (x1-x2) ± margin of error
confidence interval = [ (20.58-25.67) ± 4.785 ]
= [-9.875 , -0.305]
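The three steps above (standard error, margin of error, interval) can be sketched in Python as follows. The critical value 2.262 is taken from the text rather than computed, and the variable names are illustrative only.

```python
import math

# Summary statistics as given in the question
mean_park, sd_park, n_park = 20.58, 3.5, 10
mean_fun, sd_fun, n_fun = 25.67, 5.7, 10

# t-table value for 9 d.f., two-tailed 5% level (taken from the text)
t_crit = 2.262

# i. standard error of the difference in sample means
std_error = math.sqrt(sd_park**2 / n_park + sd_fun**2 / n_fun)

# ii. margin of error
margin = t_crit * std_error

# iii. confidence interval for u1 - u2
diff = mean_park - mean_fun
lower, upper = diff - margin, diff + margin
contains_zero = lower <= 0 <= upper

print(f"95% CI = [{lower:.3f}, {upper:.3f}]")
print("interval contains 0" if contains_zero else "interval excludes 0")
```

Checking whether the interval contains 0 is what links the interval back to the two-tailed test at the same 5% level.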
-----------------------------------------------------------------------------------------------
direct method
given that,
mean(x)=20.58
standard deviation , s.d1=3.5
sample size, n1=10
y(mean)=25.67
standard deviation, s.d2 =5.7
sample size,n2 =10
ci = (x1 - x2) ± t a/2 * sqrt ( sd1 ^2 / n1 + sd2 ^2 /n2 )
where,
x1,x2 = sample means
sd1,sd2 = sample standard deviations
n1,n2 = sample sizes
a = 1 - (confidence level/100)
ta/2 = t-table value with 9 d.f = 2.262
ci = confidence interval
ci = [ (20.58-25.67) ± 2.262 * sqrt((12.25/10)+(32.49/10)) ]
= [ (-5.09) ± 2.262 * 2.115 ]
= [-9.875 , -0.305]
PART C.
INTERPRETATIONS
we assume that the two samples are independent random samples and that the finishing times in each group are normally distributed
from hypothesis,
we have evidence to support that the mean finishing time differs between the park runners and family fun runners
from confidence,
the interval [-9.875 , -0.305] does not contain 0, which agrees with rejecting ho in part A
interpretations:
1. we are 95% confident that the interval [-9.875 , -0.305] contains the true difference between the population means (u1 - u2)
2. if a large number of samples are collected, and a confidence interval is created for each sample, about 95% of these intervals will contain the true difference between the population means
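The hint in the question mentions an F-test, which relates to a further possible assumption: equal population variances. The answer above does not carry out that test, so the following is only a sketch of how it could look. The F critical value 4.03 is assumed here as the upper 2.5% point of the F distribution with (9, 9) d.f. from a standard table, and the variable names are illustrative.

```python
import math

# Summary statistics as given in the question
mean_park, sd_park, n_park = 20.58, 3.5, 10
mean_fun, sd_fun, n_fun = 25.67, 5.7, 10

# Variance-ratio F statistic with the larger sample variance on top
f_stat = max(sd_park, sd_fun) ** 2 / min(sd_park, sd_fun) ** 2

# Assumed table value: upper 2.5% point of F with (9, 9) d.f.
f_crit = 4.03
equal_variances_plausible = f_stat < f_crit

# Pooled variance and pooled t statistic with n1 + n2 - 2 d.f.
df_pooled = n_park + n_fun - 2
pooled_var = ((n_park - 1) * sd_park**2 + (n_fun - 1) * sd_fun**2) / df_pooled
t_pooled = (mean_park - mean_fun) / math.sqrt(pooled_var * (1 / n_park + 1 / n_fun))

print(f"F = {f_stat:.3f}, equal variances plausible: {equal_variances_plausible}")
print(f"pooled t = {t_pooled:.4f} with {df_pooled} d.f.")
```

Because the two samples have the same size, the pooled t statistic equals the unpooled one; only the degrees of freedom change (18 instead of the conservative 9). The tabled two-tailed 5% critical value for 18 d.f. is 2.101, so the decision to reject ho would be the same under this variant.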