Brightly blarney but blissfully blasé bloodhound Sheerluck Hopeless has trained herself to recognize at a glance suspect heights and weights. From experience she has found that her estimation errors tend to follow approximately Normal distributions. Hopeless tests her estimation abilities on five London constables; her estimation errors, in cm and kg, are given in the table below:

Hopeless Errors
Heights (cm) | -0.6 | -0.2 | 0.2 | 0.5 | 0.6
Weights (kg) | -1.2 | -0.7 | -0.1 | 0.1 | 0.9

“My compliments, Hopeless!” Dr. Witless puzzled, “To find the probability a thing does NOT happen, how did you describe it?”
“Complementary, my dear Witless,” Hopeless rubbed her eyes, bowed her head, and exhaled an uncomplimentary sigh, “Complementary!”

a. Find a 95% confidence interval for the mean difference between Sheerluck’s errors in estimating suspect heights and weights. (5)
b. Test that there IS a difference between Sheerluck’s mean estimation errors for suspect heights and weights against the hypothesis of NO difference. (25)
c. Compare Sheerluck’s estimation errors for heights against weights:
i. Draw a simple plot of weight errors (y) against height errors (x). (5)
ii. Find the correlation coefficient between Sheerluck’s two types of errors. (5)
iii. Find the regression line for weight errors as a function of height errors. (10)
iv. Use the coefficient of variation to interpret how well this linear model explains the relationship between Sheerluck’s height and weight estimation errors
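a)
The height errors (sample 1: x̅1 = 0.10, s1 = 0.50, n1 = 5) and the weight errors (sample 2: x̅2 = -0.20, s2 = 0.80, n2 = 5) are treated as two independent samples with a pooled variance.
Degrees of freedom, DF = n1 + n2 - 2 = 8
t-critical value = t α/2 = 2.3060 (Excel formula =T.INV.2T(α, df); =T.INV(α/2, df) returns the negative of this value)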
pooled std dev, Sp = √([(n1 - 1)s1² + (n2 - 1)s2²]/(n1 + n2 - 2)) = 0.6671
std error, SE = Sp*√(1/n1 + 1/n2) = 0.4219
margin of error, E = t*SE = 2.3060 * 0.4219 = 0.9729
difference of means = x̅1 - x̅2 = 0.1000 - (-0.2000) = 0.3000
The 95% confidence interval is
Interval Lower Limit = (x̅1 - x̅2) - E = 0.3000 - 0.9729 = -0.6729
Interval Upper Limit = (x̅1 - x̅2) + E = 0.3000 + 0.9729 = 1.2729
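The interval above can be reproduced from the raw errors with a short script. This is a minimal sketch using only the Python standard library; it assumes the same pooled two-sample setup as the worked steps and takes the critical value 2.3060 from the text rather than computing it. The variable names are illustrative.

```python
import math
import statistics

heights = [-0.6, -0.2, 0.2, 0.5, 0.6]   # height errors (cm), from the table
weights = [-1.2, -0.7, -0.1, 0.1, 0.9]  # weight errors (kg), from the table

n1, n2 = len(heights), len(weights)
mean1, mean2 = statistics.mean(heights), statistics.mean(weights)
s1, s2 = statistics.stdev(heights), statistics.stdev(weights)  # sample std devs

# Pooled standard deviation and standard error of the difference in means
df = n1 + n2 - 2
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
se = sp * math.sqrt(1 / n1 + 1 / n2)

T_CRIT = 2.3060  # two-tailed 5% critical value for df = 8, taken from the text
diff = mean1 - mean2
margin = T_CRIT * se
ci_low, ci_high = diff - margin, diff + margin

print(f"Sp = {sp:.4f}, SE = {se:.4f}, E = {margin:.4f}")
print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f})")
```

Because the interval contains 0, it is consistent with no difference between the two mean errors.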
b)
Ho : µ1 - µ2 = 0
Ha : µ1 - µ2 ≠ 0
Level of Significance, α = 0.05
Sample #1 ----> height errors (cm)
mean of sample 1, x̅1 = 0.10
standard deviation of sample 1, s1 = 0.50
size of sample 1, n1 = 5
Sample #2 ----> weight errors (kg)
mean of sample 2, x̅2 = -0.20
standard deviation of sample 2, s2 = 0.80
size of sample 2, n2 = 5
difference in sample means = x̅1 - x̅2 = 0.10 - (-0.20) = 0.30
pooled std dev, Sp = √([(n1 - 1)s1² + (n2 - 1)s2²]/(n1 + n2 - 2)) = 0.6671
std error, SE = Sp*√(1/n1 + 1/n2) = 0.4219
t-statistic = ((x̅1 - x̅2) - µd)/SE = (0.3000 - 0) / 0.4219 = 0.711
Degrees of freedom, DF = n1 + n2 - 2 = 8
t-critical value, t* = 2.3060 (Excel formula =T.INV.2T(α, df); =T.INV(α/2, df) returns the negative of this value)
Decision: | t-stat | = 0.711 < critical value = 2.3060, so do not reject Ho
p-value = 0.497247 (Excel function: =T.DIST.2T(t stat, df))
Conclusion: p-value > α, so do not reject the null hypothesis.
There is not enough evidence to conclude that there IS a difference between Sheerluck’s mean estimation errors for suspect heights and weights.
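The test statistic and two-tailed p-value can be checked without Excel. The sketch below is one possible approach, not the method used above: it starts from the summary values in the text (difference 0.30, SE 0.4219, df 8) and approximates the Student t tail area with Simpson's rule, since the standard library has no t distribution. The helper names `t_pdf` and `two_tailed_p` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3      # P(0 < T < |t|)
    return 2 * (0.5 - central_area)   # area in both tails

diff = 0.30      # x̅1 - x̅2, from the text
se = 0.4219      # pooled standard error, from the text
df = 8
alpha = 0.05

t_stat = (diff - 0) / se
p_value = two_tailed_p(t_stat, df)
reject = p_value < alpha

print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject Ho: {reject}")
```

The p-value route and the critical-value route lead to the same decision, since both compare the same t-statistic against the same distribution.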
c)
1) [Scatter plot of weight errors (y) against height errors (x); figure not reproduced]
Sums used in the parts that follow:
| ΣX | ΣY | Σ(x-x̅)² | Σ(y-ȳ)² | Σ(x-x̅)(y-ȳ) |
total sum | 0.5 | -1.0 | 1.00 | 2.56 | 1.53 |
mean | 0.10 | -0.20 | SSxx | SSyy | SSxy |
sample size, n = 5
here, x̅ = Σx/n = 0.10, ȳ = Σy/n = -0.20
SSxx = Σ(x-x̅)² = 1.0000
SSxy = Σ(x-x̅)(y-ȳ) = 1.5300
estimated slope, ß1 = SSxy/SSxx = 1.5300 / 1.0000 = 1.5300
intercept, ß0 = ȳ - ß1*x̅ = -0.2000 - 1.5300*0.1000 = -0.3530
3)
so, the regression line is Ŷ = -0.3530 + 1.5300*x
SSE = (SSxx * SSyy - SSxy²)/SSxx = 0.219
std error, Se = √(SSE/(n-2)) = 0.270
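The least-squares quantities above follow directly from the five data pairs. A minimal sketch, assuming the errors are paired in the order they appear in the table, with variable names chosen here for illustration:

```python
import math

heights = [-0.6, -0.2, 0.2, 0.5, 0.6]   # x: height errors (cm)
weights = [-1.2, -0.7, -0.1, 0.1, 0.9]  # y: weight errors (kg)

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n

# Sums of squares and cross-products about the means
ss_xx = sum((x - x_bar) ** 2 for x in heights)
ss_yy = sum((y - y_bar) ** 2 for y in weights)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))

slope = ss_xy / ss_xx
intercept = y_bar - slope * x_bar

# Residual sum of squares and standard error of the estimate
sse = ss_yy - ss_xy ** 2 / ss_xx
s_e = math.sqrt(sse / (n - 2))

print(f"SSxx = {ss_xx:.4f}, SSyy = {ss_yy:.4f}, SSxy = {ss_xy:.4f}")
print(f"Y-hat = {intercept:.4f} + {slope:.4f} * x")
print(f"SSE = {sse:.4f}, Se = {s_e:.4f}")
```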
2)
correlation coefficient, r = SSxy/√(SSxx*SSyy) = 1.53/√(1.00*2.56) = 0.9563
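4)
Coefficient of variation = s.d./mean * 100:
Height = 0.5 / 0.1 * 100 = 500
Weight = 0.8 / (-0.2) * 100 = -400
These values describe the spread of each variable relative to its mean (and are unstable here because both means are close to zero); they do not measure the fit of the line. How well the linear model explains the relationship is measured by the coefficient of determination:
R² = SSxy²/(SSxx*SSyy) = 0.9144
About 91.4% of the variation in weight errors is explained by the linear relationship with height errors, so the relationship is strong.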
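As a quick check, the correlation coefficient and R² can be computed from the same sums of squares. This is a sketch under the same pairing assumption as the regression (errors matched in table order); the names are illustrative.

```python
import math

heights = [-0.6, -0.2, 0.2, 0.5, 0.6]   # x: height errors (cm)
weights = [-1.2, -0.7, -0.1, 0.1, 0.9]  # y: weight errors (kg)

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n

ss_xx = sum((x - x_bar) ** 2 for x in heights)
ss_yy = sum((y - y_bar) ** 2 for y in weights)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))

r = ss_xy / math.sqrt(ss_xx * ss_yy)  # Pearson correlation coefficient
r_squared = r ** 2                    # share of variation in y explained by x

print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
```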