Listed below are annual data for various years. The data are weights (metric tons) of imported lemons and car crash fatality rates per 100,000 population. Construct a scatterplot, find the value of the linear correlation coefficient r, and find the P-value using α = 0.05.
Is there sufficient evidence to conclude that there is a linear correlation between lemon imports and crash fatality rates? Do the results suggest that imported lemons cause car fatalities?
Lemon Imports | 229 | 264 | 357 | 483 | 532
---|---|---|---|---|---
Crash Fatality Rate | 15.9 | 15.7 | 15.5 | 15.2 | 14.9
What are the null and alternative hypotheses?
The linear correlation coefficient r is _
Test statistic t is?
P-value?
Because the P-value is less than the significance level 0.05, there is sufficient evidence to support the claim that there is a linear correlation between lemon imports and crash fatality rates for a significance level of α = 0.05.
Do the results suggest that imported lemons cause car fatalities?
A. The results suggest that an increase in imported lemons causes car fatality rates to remain the same.
B. The results suggest that imported lemons cause car fatalities.
C. The results do not suggest any cause-effect relationship between the two variables.
D. The results suggest that an increase in imported lemons causes an increase in car fatality rates.
We will use R to construct the scatterplot (any other software can be used, and a similar scatterplot can be drawn by hand).
We will also use R to find the test statistic t and the P-value. If a manual calculation is required, you can ask for it in the comment box.
First we enter the data into R:
> Lemon=c(229,264,357,483,532)
> Crash=c(15.9,15.7,15.5,15.2,14.9)
> Lemon
[1] 229 264 357 483 532
> Crash
[1] 15.9 15.7 15.5 15.2 14.9
Now we draw the scatterplot:
> plot(Lemon, Crash, xlab="Lemon Imports",
+ ylab="Crash Fatality Rate", col=12, pch=19, main="Scatter Plot")
To find the value of the linear correlation coefficient r
> cor(Lemon , Crash)
[1] -0.9861693
Thus the value of the linear correlation coefficient r is -0.9861693.
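For readers who want the manual calculation, the value returned by cor() can be cross-checked against the defining formula r = Sxy / sqrt(Sxx * Syy). The following is a minimal sketch; the names sxy, sxx, syy and r_manual are illustrative and not part of the original answer.

```r
# Data as entered above
Lemon <- c(229, 264, 357, 483, 532)
Crash <- c(15.9, 15.7, 15.5, 15.2, 14.9)

# Sums of squared deviations and cross-deviations from the means
sxy <- sum((Lemon - mean(Lemon)) * (Crash - mean(Crash)))
sxx <- sum((Lemon - mean(Lemon))^2)
syy <- sum((Crash - mean(Crash))^2)

# Pearson correlation from its definition
r_manual <- sxy / sqrt(sxx * syy)
r_manual

# Should agree with the built-in function up to rounding error
all.equal(r_manual, cor(Lemon, Crash))
```

Here Sxy = -207.8, Sxx = 70254 and Syy = 0.632, which gives the same r of about -0.9862.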
What are the null and alternative hypotheses?
H0: ρ = 0 (there is no linear correlation between lemon imports and crash fatality rates)
H1: ρ ≠ 0 (there is a linear correlation between lemon imports and crash fatality rates)
The linear correlation coefficient r is -0.9861693
Test statistic t is?
Test statistic t (T.S) = r * sqrt(n - 2) / sqrt(1 - r^2)
Here n = 5 and r = -0.9861693
Calculation (from R):
> r= -0.9861693
> n=5
> test_statistics=r*sqrt(n-2) / sqrt(1-r^2)
> test_statistics
[1] -10.3058
Hence the test statistic is t = -10.3058
P-value?
The test is two-sided, so the P-value is given by
P-value = P(T < -|T.S|) + P(T > |T.S|)
= P(T < -10.3058) + P(T > 10.3058)
= 2 * P(T < -10.3058) (due to symmetry)
Here T follows a t-distribution with n - 2 = 3 degrees of freedom.
2 * P(T < -10.3058) can be computed in R:
> 2*pt(-10.3058,df=3) # 2 * P(T < -10.3058)
[1] 0.001948487
Hence P-value = 0.001948487
Thus:
The linear correlation coefficient r = -0.9861693
Test statistic t = -10.3058
P-value = 0.001948487
Also from R-software
> cor.test(Lemon, Crash, method="pearson")
Pearson's product-moment correlation
data: Lemon and Crash
t = -10.306, df = 3, p-value = 0.001948
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.9991293 -0.7996469
sample estimates:
cor
-0.9861693
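If the individual numbers are needed rather than the printed report, they can be read from the object that cor.test() returns. This is a small sketch; the variable name res is an arbitrary choice, while estimate, statistic, parameter and p.value are standard components of the returned "htest" object.

```r
Lemon <- c(229, 264, 357, 483, 532)
Crash <- c(15.9, 15.7, 15.5, 15.2, 14.9)

# Store the test result instead of printing it
res <- cor.test(Lemon, Crash, method = "pearson")

r_hat  <- unname(res$estimate)   # sample correlation r
t_stat <- unname(res$statistic)  # test statistic t
df     <- unname(res$parameter)  # degrees of freedom, n - 2
p_val  <- res$p.value            # two-sided P-value

c(r = r_hat, t = t_stat, df = df, p = p_val)

# Decision at the 0.05 significance level: TRUE means reject H0
p_val < 0.05
```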
Conclusion :
Since P-value = 0.001948487 < 0.05, we reject the null hypothesis H0.
Because the P-value is less than the significance level 0.05, there is sufficient evidence to support the claim that there is a linear correlation between lemon imports and crash fatality rates for a significance level of α = 0.05.
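The same decision can be reached with the critical-value method instead of the P-value. The lines below are only a sketch of that equivalent check; the names alpha, t_stat and t_crit are illustrative.

```r
alpha  <- 0.05
n      <- 5
t_stat <- -10.3058                       # test statistic computed earlier

# Two-sided critical value of the t-distribution with n - 2 = 3 degrees of freedom
t_crit <- qt(1 - alpha / 2, df = n - 2)
t_crit

# Reject H0 when |t| exceeds the critical value
abs(t_stat) > t_crit
```

The critical value is about 3.182, and |t| = 10.3058 is well beyond it, which agrees with rejecting H0.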
Do the results suggest that imported lemons cause car fatalities?
Lemon imports and crash fatality rates are strongly negatively correlated, so as one variable increases the other tends to decrease. However, correlation alone does not imply causation. Both variables may simply be changing over the years for unrelated reasons, so the results give no basis for concluding that imported lemons affect car fatalities.
Thus the correct option is
C. The results do not suggest any cause-effect relationship between the two variables.