Programming in R:
Some days Zach finds that he has nothing better to do than sit on a porch and watch cars pass him by. On average, he sees about 15 cars pass in a single day. Simulate an entire year of Zach’s car-watching pastime (assume every month is 30 days long), compute the mean for each month, and generate a histogram of the means. Does the distribution of the means look normal? Why or why not?
According to the question, Zach watches 15 cars per day. This answer treats the count as cumulative: 15 cars by the end of the first day, 15 + 15 = 30 cars by the end of the second day, and so on, so the total increases day by day. Note that the script below builds this running total directly in RStudio; it does not sample from an exponential (or any other) distribution.
First, we create a data frame with the following script:
# Day index and running total of cars (15 per day, accumulated)
Watching_count <- data.frame("Days" = 1:360, "average_cars" = 15 * (1:360))
# Assign each day to a 30-day month: days 1-30 -> 1, 31-60 -> 2, ..., 331-360 -> 12
Watching_count$month <- ceiling(Watching_count$Days / 30)
# Average the running total within each month
library(sqldf)
aggre_by_month <- sqldf("select month, sum(average_cars)/30 as mean_of_month from Watching_count group by month")
# Plot the 12 monthly means as vertical lines, one per month
plot(aggre_by_month$mean_of_month, type = "h", xlab = "month", ylab = "mean_of_month")
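The monthly means can also be computed without sqldf. The following is a minimal sketch in base R, assuming the same running-total data as the script above; the names days, running_total and monthly_means are illustrative and not part of the original answer.

```r
# Rebuild the same quantities as the script above
days <- 1:360
running_total <- 15 * days          # 15 cars per day, accumulated
month <- ceiling(days / 30)         # 30-day months, numbered 1 to 12

# Mean of the running total within each month, base R only
monthly_means <- tapply(running_total, month, mean)
print(monthly_means)
```

tapply returns a named vector with one entry per month, so it can be passed straight to plot() or shapiro.test().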
Histogram:
The plot shows the mean calculated for each month from the data frame above.
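Note that plot(..., type = "h") draws one vertical line per month rather than binning the values. If a binned histogram of the means is wanted, a minimal sketch using base R's hist() could look like this (monthly_means is an illustrative name, recomputed here so the snippet runs on its own):

```r
# Monthly means of the running total (same values as in the script above)
days <- 1:360
monthly_means <- tapply(15 * days, ceiling(days / 30), mean)

# Binned histogram of the 12 monthly means
h <- hist(monthly_means,
          main = "Histogram of monthly means",
          xlab = "Monthly mean")
```

Because these twelve means are evenly spaced (450 apart), the bars do not form a bell shape.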
Next, we check whether the means are normally distributed.
In R, we use the Shapiro-Wilk normality test:
shapiro.test(aggre_by_month$mean_of_month)
This test gives a p-value of 0.87.
The p-value is greater than 0.05, which implies that the distribution of the means is not significantly different from a normal distribution. In other words, we cannot reject normality and can assume that the data are normally distributed.
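The question asks for a simulation, while the script above is deterministic. The following is a minimal sketch of one way to simulate the year, under an assumption that is not part of the answer above: each day's count is an independent Poisson draw with mean 15. The seed and the names daily_cars and sim_means are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible

# Assumed model: 360 independent daily counts, Poisson with mean 15
daily_cars <- rpois(360, lambda = 15)
month <- rep(1:12, each = 30)       # twelve 30-day months

# Mean number of cars per day within each month
sim_means <- tapply(daily_cars, month, mean)

# Histogram of the 12 monthly means and a Shapiro-Wilk test on them
hist(sim_means,
     main = "Monthly means of simulated daily counts",
     xlab = "Mean cars per day")
shapiro.test(sim_means)
```

Under this assumption, each monthly mean averages 30 independent counts, so by the central limit theorem it is approximately normal with mean 15 and standard deviation sqrt(15/30), about 0.71. With only 12 means, the histogram will look rough and the test has little power.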