To study the ecological impact of malaria, researchers in California measured the effect of malaria on the distance an animal could run in 2 minutes. They used a sample of a local lizard species, Sceloporus occidentalis, collected in the field. They selected 15 lizards from those found to be infected with the malarial parasite Plasmodium mexicanum and 15 lizards found not to be infected. The distance each lizard could run in 2 minutes was recorded in a controlled environment.
Infected 16.4 29.4 37.1 23.0 24.1 24.5 16.4 29.1 36.7 28.7 30.2 21.8 37.1 20.3 28.3
Uninfected 22.2 34.8 42.1 32.9 26.4 30.6 32.9 37.5 18.4 27.5 45.5 34.0 45.5 24.5 28.7
a) Does this design use paired data or independent samples? Explain.
b) Compute a 95% confidence interval for the difference in mean distance run between the infected and uninfected lizards. Show all of your working.
c) Based on this confidence interval, can you conclude there is a difference in mean distance run between the infected and uninfected lizards?
d) Conduct a test of significance (at a significance level of 5%) on the difference between infected and uninfected lizards to decide whether the uninfected lizards can run a further distance on average. Follow all steps clearly and write a clear conclusion.
e) Do you have the same conclusion from the confidence interval calculation in c) as that concluded by the test done in d) above? Explain why or why not.
a) The researchers collected 2 samples of 15 lizards each: the first sample of 15 lizards from the population of infected lizards and the second sample of 15 lizards from the population of uninfected lizards. No lizard in one sample is matched with a lizard in the other. Hence the design uses independent samples.
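Let $\mu_1$ be the true mean distance run by infected lizards and $\mu_2$ be the true mean distance run by uninfected lizards.
The sample information is as follows.
Sample size of each of the 2 samples: $n_1 = n_2 = 15$
Sample mean of infected lizards: $\bar{x}_1 = 403.1/15 = 26.87$
Sample mean of uninfected lizards: $\bar{x}_2 = 483.5/15 = 32.23$
Sample variance of infected lizards: $s_1^2 = 46.44$
Sample variance of uninfected lizards: $s_2^2 = 65.08$
Since the sample sizes are less than 30, we use a t procedure, and we assume that the population variances of the distance run by infected and uninfected lizards are the same. That is, $\sigma_1^2 = \sigma_2^2 = \sigma^2$.
Since we do not know the population variance of the distance run, we estimate it using the pooled variance $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{14(46.44) + 14(65.08)}{28} = 55.76$
Now we estimate the standard error of the difference between the 2 sample means as $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{55.76 \left( \frac{2}{15} \right)} = 2.727$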
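The summary statistics above can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names (infected, uninfected, pooled_var, std_err) are chosen here for illustration and are not part of the original answer.

```python
from math import sqrt
from statistics import mean, variance

# Distances run in 2 minutes, copied from the problem statement.
infected = [16.4, 29.4, 37.1, 23.0, 24.1, 24.5, 16.4, 29.1,
            36.7, 28.7, 30.2, 21.8, 37.1, 20.3, 28.3]
uninfected = [22.2, 34.8, 42.1, 32.9, 26.4, 30.6, 32.9, 37.5,
              18.4, 27.5, 45.5, 34.0, 45.5, 24.5, 28.7]

n1, n2 = len(infected), len(uninfected)
xbar1, xbar2 = mean(infected), mean(uninfected)

# statistics.variance is the sample variance (divides by n - 1).
var1, var2 = variance(infected), variance(uninfected)

# Pooled variance under the equal-variance assumption.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# Standard error of the difference between the two sample means.
std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))

print(f"means: {xbar1:.2f}, {xbar2:.2f}")
print(f"variances: {var1:.2f}, {var2:.2f}")
print(f"pooled variance: {pooled_var:.2f}, standard error: {std_err:.3f}")
```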
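b) Since the sample sizes are small (less than 30) and we do not know the population standard deviations, we use the t distribution.
We want to find 2 limits L and U such that $P(L < \mu_1 - \mu_2 < U) = 0.95$, where $\mu_1 - \mu_2$ is the difference in true mean distance (infected minus uninfected).
The 95% confidence interval is $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \, s_{\bar{x}_1 - \bar{x}_2}$, where $\bar{x}_1 - \bar{x}_2 = 26.87 - 32.23 = -5.36$ is the difference in sample means and $s_{\bar{x}_1 - \bar{x}_2} = 2.727$ is its standard error.
The t statistic needed is at degrees of freedom = $n_1 + n_2 - 2 = 15 + 15 - 2 = 28$, with $\alpha/2 = 0.05/2 = 0.025$ in each tail.
It can be got from t tables and is $t_{0.025, 28} = 2.048$.
95% confidence interval is $-5.36 \pm 2.048 \times 2.727 = -5.36 \pm 5.58$, that is, $(-10.94, 0.22)$.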
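The interval arithmetic in part b) can be sketched in code as follows. The critical value 2.048 is hardcoded from a t table (28 degrees of freedom, 0.025 in the upper tail) rather than computed, and the names T_CRIT_TWO_SIDED, lower and upper are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, variance

infected = [16.4, 29.4, 37.1, 23.0, 24.1, 24.5, 16.4, 29.1,
            36.7, 28.7, 30.2, 21.8, 37.1, 20.3, 28.3]
uninfected = [22.2, 34.8, 42.1, 32.9, 26.4, 30.6, 32.9, 37.5,
              18.4, 27.5, 45.5, 34.0, 45.5, 24.5, 28.7]

n1, n2 = len(infected), len(uninfected)
diff = mean(infected) - mean(uninfected)  # infected minus uninfected

# Pooled standard error (equal-variance assumption, as in the answer).
pooled_var = ((n1 - 1) * variance(infected)
              + (n2 - 1) * variance(uninfected)) / (n1 + n2 - 2)
std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))

# Table value: t with 28 degrees of freedom, upper-tail area 0.025.
T_CRIT_TWO_SIDED = 2.048

margin = T_CRIT_TWO_SIDED * std_err
lower, upper = diff - margin, diff + margin

print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
print("interval contains 0:", lower < 0 < upper)
```

Running it prints an interval of roughly -10.94 to 0.22, which includes 0.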
c) The confidence interval contains 0. That means we cannot conclude, at the 95% confidence level, that there is a difference in mean distance run between the infected and uninfected lizards.
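d) Let $\mu_1$ be the true mean distance run by infected lizards and $\mu_2$ be the true mean distance run by uninfected lizards.
We want to test whether the uninfected lizards can run a further distance on average, that is, we want to test if $\mu_1 < \mu_2$, or equivalently $\mu_1 - \mu_2 < 0$.
The following are the hypotheses:
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 < 0$
The hypothesized difference in the means is $(\mu_1 - \mu_2)_0 = 0$.
The sample test statistic is the difference in sample means divided by its standard error: $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{26.87 - 32.23}{2.727} = \frac{-5.36}{2.727} = -1.97$
This is a one-tailed test (left tail). The rejection region is the area under the t curve to the left of the critical value, equal to the significance level $\alpha = 0.05$.
The critical value from t tables for $n_1 + n_2 - 2 = 28$ degrees of freedom is $-t_{0.05, 28} = -1.701$.
We reject the null hypothesis if the sample t statistic is less than -1.701.
Since the sample statistic is -1.97, which is less than -1.701, we reject the null hypothesis.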
We conclude that there is sufficient evidence to support the claim that the uninfected lizards can run a further distance on average.
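The test in part d) can be reproduced with the sketch below. It assumes the pooled (equal-variance) two-sample t statistic used in this answer, and it takes the one-tailed critical value 1.701 from a t table instead of computing it. The names t_stat, T_CRIT_ONE_SIDED and reject_null are illustrative.

```python
from math import sqrt
from statistics import mean, variance

infected = [16.4, 29.4, 37.1, 23.0, 24.1, 24.5, 16.4, 29.1,
            36.7, 28.7, 30.2, 21.8, 37.1, 20.3, 28.3]
uninfected = [22.2, 34.8, 42.1, 32.9, 26.4, 30.6, 32.9, 37.5,
              18.4, 27.5, 45.5, 34.0, 45.5, 24.5, 28.7]

n1, n2 = len(infected), len(uninfected)
pooled_var = ((n1 - 1) * variance(infected)
              + (n2 - 1) * variance(uninfected)) / (n1 + n2 - 2)
std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))

# Test statistic for H0: mu1 - mu2 = 0 against Ha: mu1 - mu2 < 0.
hypothesized_diff = 0.0
t_stat = (mean(infected) - mean(uninfected) - hypothesized_diff) / std_err

# Table value: t with 28 degrees of freedom, upper-tail area 0.05.
T_CRIT_ONE_SIDED = 1.701

# Left-tailed test: reject when t falls below the negative critical value.
reject_null = t_stat < -T_CRIT_ONE_SIDED

print(f"t = {t_stat:.2f}, reject H0 at 5%: {reject_null}")
```

The statistic comes out at about -1.97, so the script reports that the null hypothesis is rejected.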
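e) No. In part c) we could not conclude that there is a difference in mean distance run, whereas in part d) we concluded that the uninfected lizards can run a further distance on average.
This is because the confidence interval in part c) corresponds to a 2-tailed test of whether the 2 means differ. That is, the alternative hypothesis in part c) is that the true means are not equal, $H_a: \mu_1 - \mu_2 \neq 0$.
In part c), at the 95% confidence level, the rejection region in each tail has an area of alpha/2 = 0.05/2 = 0.025, so the critical values are $\pm 2.048$.
The left-tail rejection region is therefore smaller in part c) than in the one-tailed test in part d). In part d) the rejection region has an area of 0.05 on the left tail, with critical value -1.701, and hence there is a larger region in which to reject the null hypothesis.
The sample statistic of -1.97 lies between -2.048 and -1.701, so it falls in the rejection region of the one-tailed test but not in that of the 2-tailed test.
Hence we get different conclusions.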
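The same point can be seen through p-values. The sketch below is not part of the original answer: it integrates the t density numerically with Simpson's rule, using only the standard library, and takes the statistic -1.97 and 28 degrees of freedom from part d). The helper names t_pdf and upper_tail are hypothetical.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The density is symmetric about 0, so half the area lies above 0.
    return 0.5 - area_0_to_t

t_stat = -1.97  # test statistic reported in part d)
df = 28         # n1 + n2 - 2

p_one_sided = upper_tail(abs(t_stat), df)  # left-tail area, by symmetry
p_two_sided = 2 * p_one_sided

print(f"one-sided p = {p_one_sided:.4f} (compare with 0.05)")
print(f"two-sided p = {p_two_sided:.4f} (compare with 0.05)")
```

The one-sided p-value falls below 0.05 while the two-sided p-value is above it, which is the disagreement between parts c) and d) in another form.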