We have a random sample of size n (where n is large), and we wish to test H0: X ~ UNIF(0,1) vs Ha: X ~ EXP(1). How would you conduct the hypothesis test? Describe the procedure.
There are several approaches to this kind of test about a distribution.
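1. If the sample comes from uniform(0,1), then no observation can exceed 1. Hence, if any observation exceeds 1 (that is, the sample maximum is greater than 1), you can say with certainty that the sample was not drawn from uniform(0,1). If every observation lies between 0 and 1, you cannot say with certainty that the sample comes from the uniform, because an exp(1) sample can also fall entirely in that interval. For large n this is very unlikely, though: a single exp(1) observation is at most 1 with probability 1 - e^(-1) ≈ 0.632, so all n observations are at most 1 with probability about 0.632^n.
2. The check above gives a definite answer only when it rejects. If every observation lies between 0 and 1, you still cannot say that the sample comes from uniform(0,1). In that case, you can draw a histogram of the sample to get a preliminary idea of its distribution.
Two example histograms, one for a sample from the uniform and one for a sample from the exponential, can be drawn with the hist() commands in the Appendix. The uniform histogram is roughly flat on (0,1), while the exponential histogram is highest near 0 and decays, with a tail extending beyond 1.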
3. So far we have discussed crude ways of building a test criterion. For a formal test of whether the sample comes from the uniform distribution rather than the exponential, we can use the Kolmogorov-Smirnov test.
In statistics, the Kolmogorov–Smirnov test (K–S test or KS test) is a nonparametric test of the equality of continuous, one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test). It is named after Andrey Kolmogorov and Nikolai Smirnov.
The Kolmogorov–Smirnov statistic quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of two samples. The null distribution of this statistic is calculated under the null hypothesis that the sample is drawn from the reference distribution (in the one-sample case) or that the samples are drawn from the same distribution (in the two-sample case). In the one-sample case, the distribution considered under the null hypothesis may be continuous, purely discrete or mixed. In the two-sample case, the distribution considered under the null hypothesis is a continuous distribution but is otherwise unrestricted.
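To make the one-sample statistic concrete, here is a minimal R sketch that computes the distance by hand for the reference distribution uniform(0,1). The vector x is a hypothetical stand-in, simulated only so that the snippet runs, and the names F0, D_plus, D_minus and D are chosen here for illustration.

```r
set.seed(3)                  # fixed seed so the sketch is reproducible
x  <- sort(runif(200))       # hypothetical sample, sorted into order statistics
n  <- length(x)
F0 <- punif(x, 0, 1)         # reference CDF evaluated at each order statistic

# The empirical CDF jumps from (i - 1)/n to i/n at the i-th order statistic,
# so the largest gap to the reference CDF occurs at or just before a jump.
D_plus  <- max(seq_len(n) / n - F0)        # empirical CDF above reference CDF
D_minus <- max(F0 - (seq_len(n) - 1) / n)  # reference CDF above empirical CDF
D <- max(D_plus, D_minus)                  # two-sided K-S statistic
print(D)
```

The value of D should agree with the statistic reported by ks.test(x, "punif") for the same x.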
I will use R to conduct the test. The code and output are in the Appendix.
This test is very helpful for judging whether a sample comes from a particular distribution. If the p-value is less than 0.05, we reject, at the 5% significance level, the null hypothesis that the sample comes from the specified distribution. Otherwise, we fail to reject it. That does not prove that the sample comes from the specified distribution, only that the data are consistent with it.
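For the hypotheses in the question, the one-sample form of the test is the natural fit, since the null distribution uniform(0,1) is fully specified. A minimal sketch, assuming the observed data sit in a numeric vector (here a hypothetical x, simulated from exp(1) only so that the snippet runs):

```r
set.seed(1)                      # fixed seed so the sketch is reproducible
x <- rexp(1000, rate = 1)        # stand-in data; replace with the observed sample

# One-sample K-S test of H0: X ~ uniform(0,1).
# "punif" names the reference CDF; min and max are passed on to it.
res <- ks.test(x, "punif", min = 0, max = 1)

print(res$statistic)             # D: largest gap between empirical CDF and punif
print(res$p.value)               # a small p-value is evidence against H0
reject_h0 <- res$p.value < 0.05  # decision at the 5% significance level
```

The names res and reject_h0 are illustrative. With real data, only the line that creates x needs to change.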
Hope this explanation has helped you.
Appendix:
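hist(rexp(10000, 1))      # histogram of 10000 random draws from exponential(1)
hist(runif(10000, 0, 1))  # histogram of 10000 random draws from uniform(0,1)
# two-sample K-S test comparing a simulated exponential(1) sample with a simulated uniform(0,1) sample
ks.test(rexp(10000, 1), runif(10000, 0, 1))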
Two-sample Kolmogorov-Smirnov test
data: rexp(10000, 1) and runif(10000, 0, 1)
D = 0.3641, p-value < 2.2e-16
alternative hypothesis: two-sided
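The check from approach 1 can also be sketched in R. This is only an illustration: x is a hypothetical stand-in for the observed sample, and the names outside_support and p_all_inside are made up here.

```r
set.seed(2)                 # fixed seed so the sketch is reproducible
x <- rexp(50, rate = 1)     # hypothetical sample; replace with the observed data

# Any observation outside [0, 1] is impossible under uniform(0,1),
# so a single such value rules out H0 with certainty.
outside_support <- any(x < 0 | x > 1)

# Probability that an exp(1) sample of this size lies entirely in [0, 1],
# which is the chance that this check fails to reject when Ha is true.
n <- length(x)
p_all_inside <- (1 - exp(-1))^n

print(outside_support)
print(p_all_inside)
```

Because p_all_inside shrinks geometrically in n, this simple check already has very high power against exp(1) when the sample is large.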
Thanks !!