- Using R Randomization Test -
"When waiting to get someone's parking space, have you ever thought that the driver you are waiting for is taking longer than necessary? Ruback and Juieng (1997) ran a simple experiment to examine that question. They hung out in parking lots and recorded the time that it took for a car to leave a parking place. They broke the data down on the basis of whether or not someone in another car was waiting for the space.
The data are positively skewed, because a driver can safely leave a space only so quickly, but, as we all know, they can sometimes take a very long time. But because the data are skewed, we might feel distinctly uncomfortable using a parametric t test. So we will adopt a randomization test."
```{r}
# no_waiting records the time it took a driver to leave the parking
# spot if no one was waiting for the driver
no_waiting <- c(36.30, 42.07, 39.97, 39.33, 33.76, 33.91, 39.65,
84.92, 40.70, 39.65,
39.48, 35.38, 75.07, 36.46, 38.73, 33.88, 34.39, 60.52, 53.63,
50.62)
# waiting records the time it took a driver to leave if someone
# was waiting for the spot
waiting <- c(49.48, 43.30, 85.97, 46.92, 49.18, 79.30, 47.35,
46.52, 59.68, 42.89,
49.29, 68.69, 41.61, 46.81, 43.75, 46.55, 42.33, 71.48, 78.95,
42.06)
mean(waiting)
mean(no_waiting)
obs_dif <- mean(waiting) - mean(no_waiting)
```
Conduct a randomization test to test the hypothesis that there is no difference in average time for drivers who have a person waiting vs those who do not have a person waiting, against the alternative that drivers who have a person waiting will take *longer* than if they did not.
Be sure to calculate an empirical p-value and make the appropriate conclusion.
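Here, to test the difference in average time between the two groups of drivers, we use a Welch two-sample t-test. Note that this is a parametric test, not the randomization test the question asks for.
The hypotheses are H0: µ1 = µ2 vs. H1: µ1 ≠ µ2, where µ1 is the mean time when someone is waiting and µ2 is the mean time when no one is waiting. This alternative is two-sided, whereas the alternative stated in the question is one-sided (µ1 > µ2).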
Output of the test (R output):
> no_waiting <- c(36.30, 42.07, 39.97, 39.33, 33.76, 33.91, 39.65, 84.92, 40.70, 39.65,39.48, 35.38, 75.07, 36.46, 38.73, 33.88, 34.39, 60.52, 53.63, 50.62)
> no_waiting
[1] 36.30 42.07 39.97 39.33 33.76 33.91 39.65 84.92 40.70 39.65 39.48
[12] 35.38 75.07 36.46 38.73 33.88 34.39 60.52 53.63 50.62
> waiting <- c(49.48, 43.30, 85.97, 46.92, 49.18, 79.30, 47.35, 46.52, 59.68, 42.89,49.29, 68.69, 41.61, 46.81, 43.75, 46.55, 42.33, 71.48, 78.95, 42.06)
> waiting
[1] 49.48 43.30 85.97 46.92 49.18 79.30 47.35 46.52 59.68 42.89 49.29
[12] 68.69 41.61 46.81 43.75 46.55 42.33 71.48 78.95 42.06
> mean(waiting)
[1] 54.1055
> mean(no_waiting)
[1] 44.421
> obs_dif <- mean(waiting) - mean(no_waiting)
> obs_dif
[1] 9.6845
> t.test(waiting,no_waiting)
Welch Two Sample t-test
data: waiting and no_waiting
t = 2.1496, df = 37.984, p-value = 0.03802
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.5639702 18.8050298
sample estimates:
mean of x mean of y
54.1055 44.4210
This is the output for the given data set.
Conclusion:
Here, the p-value is 0.03802. At the 5% level of significance, the p-value (0.03802) is less than 0.05, so we reject the null hypothesis that the two mean times are equal.
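The exercise asks for a randomization test with a one-sided alternative, so a sketch of that procedure may be a useful complement to the t-test. What follows is a minimal sketch under stated assumptions, not necessarily the intended solution: it pools the 40 times, repeatedly reassigns 20 of them at random to a "waiting" group, and records the difference in means each time. The names `n_perm`, `pooled`, `n_wait`, `perm_difs`, and `p_value`, the seed, and the choice of 10,000 resamples are all assumptions made for illustration. The `+ 1` in the p-value counts the observed arrangement itself, which is one common convention.

```{r}
# Data from the exercise
no_waiting <- c(36.30, 42.07, 39.97, 39.33, 33.76, 33.91, 39.65,
                84.92, 40.70, 39.65, 39.48, 35.38, 75.07, 36.46,
                38.73, 33.88, 34.39, 60.52, 53.63, 50.62)
waiting <- c(49.48, 43.30, 85.97, 46.92, 49.18, 79.30, 47.35,
             46.52, 59.68, 42.89, 49.29, 68.69, 41.61, 46.81,
             43.75, 46.55, 42.33, 71.48, 78.95, 42.06)

# Observed difference in means (waiting minus no waiting)
obs_dif <- mean(waiting) - mean(no_waiting)

set.seed(1)       # assumed seed, only so the run is reproducible
n_perm <- 10000   # assumed number of random relabelings

pooled <- c(waiting, no_waiting)
n_wait <- length(waiting)

# Under H0 the group labels are exchangeable, so shuffle them:
# draw n_wait positions to act as the "waiting" group each time
perm_difs <- replicate(n_perm, {
  idx <- sample(length(pooled), n_wait)
  mean(pooled[idx]) - mean(pooled[-idx])
})

# One-sided empirical p-value: share of shuffled differences
# at least as large as the observed one
p_value <- (sum(perm_difs >= obs_dif) + 1) / (n_perm + 1)
p_value
```

If the empirical p-value falls below 0.05, the conclusion agrees with the t-test in direction: drivers appear to take longer to leave when someone is waiting. The exact value varies slightly with the seed and the number of resamples. For a parametric one-sided counterpart, `t.test(waiting, no_waiting, alternative = "greater")` could be used instead.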