Recall that Benford's Law claims that numbers chosen from very
large data files tend to have "1" as the first nonzero digit
disproportionately often. In fact, research has shown that if you
randomly draw a number from a very large data file, the probability
of getting a number with "1" as the leading digit is about 0.301.
Now suppose you are the auditor for a very large corporation. The
revenue file contains millions of numbers in a large computer data
bank. You draw a random sample of n = 226 numbers from
this file and r = 87 have a first nonzero digit of 1. Let
p represent the population proportion of all numbers in
the computer file that have a leading digit of 1.
(i) Test the claim that p is more than 0.301. Use
α = 0.10.
(a) What is the level of significance?
State the null and alternate hypotheses.
H0: p = 0.301; H1: p > 0.301
H0: p = 0.301; H1: p ≠ 0.301
H0: p = 0.301; H1: p < 0.301
H0: p > 0.301; H1: p = 0.301
(b) What sampling distribution will you use?
The Student's t, since np > 5 and nq > 5.
The standard normal, since np < 5 and nq < 5.
The standard normal, since np > 5 and nq > 5.
The Student's t, since np < 5 and nq < 5.
What is the value of the sample test statistic? (Round your answer
to two decimal places.)
(c) Find the P-value of the test statistic. (Round your
answer to four decimal places.)
Sketch the sampling distribution and show the area corresponding to
the P-value.
(d) Based on your answers in parts (a) to (c), will you reject or
fail to reject the null hypothesis? Are the data statistically
significant at level α?
At the α = 0.10 level, we reject the null hypothesis and conclude the data are statistically significant.
At the α = 0.10 level, we reject the null hypothesis and conclude the data are not statistically significant.
At the α = 0.10 level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the α = 0.10 level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
(e) Interpret your conclusion in the context of the
application.
There is sufficient evidence at the 0.10 level to conclude that the true proportion of numbers with a leading 1 in the revenue file is greater than 0.301.
There is insufficient evidence at the 0.10 level to conclude that the true proportion of numbers with a leading 1 in the revenue file is greater than 0.301.
(ii) If p is in fact larger than 0.301, it would seem
there are too many numbers in the file with leading 1's. Could this
indicate that the books have been "cooked" by artificially lowering
numbers in the file? Comment from the point of view of the Internal
Revenue Service. Comment from the perspective of the Federal Bureau
of Investigation as it looks for "profit skimming" by unscrupulous
employees.
Yes. There does not seem to be too many entries with a leading digit 1.
No. There seems to be too many entries with a leading digit 1.
Yes. There seems to be too many entries with a leading digit 1.
No. There does not seem to be too many entries with a leading digit 1.
(iii) Comment on the following statement: If we reject the null
hypothesis at level of significance α, we have not proved
H0 to be false. We can say that the probability
is α that we made a mistake in rejecting
H0. Based on the outcome of the test, would you
recommend further investigation before accusing the company of
fraud?
We have not proved H0 to be false. Because our data lead us to reject the null hypothesis, more investigation is merited.
We have proved H0 to be false. Because our data lead us to reject the null hypothesis, more investigation is not merited.
We have not proved H0 to be false. Because our data lead us to accept the null hypothesis, more investigation is not merited.
We have not proved H0 to be false. Because our data lead us to reject the null hypothesis, more investigation is not merited.
a) α = 0.10
H0: p = 0.301
H1: p > 0.301 (right-tailed test)
b) The standard normal, since np > 5 and nq > 5 (here np = 226 × 0.301 ≈ 68.03 and nq = 226 × 0.699 ≈ 157.97).
Number of items of interest, x = 87
Sample size, n = 226
Sample proportion, p̂ = x/n = 0.3850
Standard error, SE = √( p(1-p)/n ) = 0.03051, with p = 0.301 from H0
z test statistic = ( p̂ - p )/SE = (0.3850 - 0.301)/0.03051 = 2.75
c) P-value = 0.0030 [Excel function: =NORMSDIST(-z)]
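The arithmetic in parts (b) and (c) can be checked with a short script. This is a minimal sketch using only Python's standard library; the function name `one_proportion_z_test` is a hypothetical helper introduced here for illustration, and the right-tail area is computed from the complementary error function instead of the Excel call.

```python
import math

def one_proportion_z_test(successes, n, p0):
    """Right-tailed one-sample z test for a proportion (hypothetical helper).

    Returns (z, p_value) using the normal approximation to the binomial.
    """
    p_hat = successes / n                      # sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)          # standard error under H0
    z = (p_hat - p0) / se                      # test statistic
    # Right-tail area P(Z > z) = 0.5 * erfc(z / sqrt(2))
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

z, p_value = one_proportion_z_test(87, 226, 0.301)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")  # z = 2.75, P-value = 0.0030
```

Because the P-value is well below α = 0.10, the script agrees with the decision to reject H0.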
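d) Decision: P-value < α, so at the α = 0.10 level we reject the null hypothesis and conclude the data are statistically significant.
e) There is sufficient evidence at the 0.10 level to conclude that the true proportion of numbers with a leading 1 in the revenue file is greater than 0.301.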
ii) Yes. There seems to be too many entries with a leading digit 1.
iii) We have not proved H0 to be false. Because our data lead us to reject the null hypothesis, more investigation is merited.