Question

Internal auditors sometimes check random samples of transactions within a database. Suppose that in a particular set of transactions, 2% contain an error of some kind. The auditor takes a random sample of 20 transactions for checking. Let X denote the number of transactions found to be in error in the sample.

(a) State the probability distribution of X (including the values of all parameters) and find the probability that 2 transactions are found to be in error.

(b) If three or more transactions are found to be in error then a larger sample is taken for checking. How often will this happen? (Use the appropriate template).

(c) What assumption is required for the validity of the above answers?

Solutions

Expert Solution

(a) We know that in this set of transactions, 2% contain an error of some kind. Let X be the random variable defined as: X = number of transactions found to be in error in a sample of 20.

Probability distribution of X: Binomial with number of trials n = 20 and success probability p = 0.02, where a "success" is a transaction found to be in error.

P[X=2] = Combin(20,2) * (0.02)^2 * (1-0.02)^18

= 190*0.0004 * 0.695135 = 0.05283

Probability that 2 transactions are found to be in error = 0.05283
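The arithmetic in (a) can be checked with a few lines of Python. This is only a sketch: the helper name `binom_pmf` is an assumption introduced here for illustration, and `math.comb` from the standard library plays the role of Excel's Combin.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Return P[X = k] for X ~ Binomial(n, p). Hypothetical helper name."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 0.02           # sample size and error rate from the question
p_two = binom_pmf(2, n, p)
print(f"P[X=2] = {p_two:.5f}")
```

The printed value should agree with the hand calculation above to five decimal places.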

(b) If three or more transactions are found to be in error then a larger sample is taken for checking. How often will this happen?

P[X>=3] = 1- P[X<3] = 1- { P[X=0] + P[X=1] + P[X=2] }

Using the formula from (a),

P[X=0] = Combin(20,0) * (0.02)^0 * (1-0.02)^20

P[X=1] = Combin(20,1) * (0.02)^1 * (1-0.02)^19

P[X=2] = Combin(20,2) * (0.02)^2 * (1-0.02)^18

Evaluate these in Excel and we get

P[X=0] = 0.667607972

P[X=1] = 0.27249305

P[X=2] = 0.05283

Therefore P[X=0] + P[X=1] + P[X=2] = 0.992931

Therefore P[X>=3] = 1 - 0.992931 = 0.007069

Hence three or more transactions are found to be in error only about 0.707% of the time, so a larger sample is needed in roughly 7 out of every 1,000 audits.
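As a cross-check on the Excel figures, the complement calculation for (b) could be reproduced as follows. This is a minimal sketch using only the Python standard library; the variable names are assumptions chosen for this example.

```python
from math import comb

n, p = 20, 0.02  # sample size and error rate from the question

# P[X=0], P[X=1], P[X=2] from the Binomial formula used in (a)
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(3)]

p_at_most_two = sum(pmf)              # P[X < 3]
p_three_or_more = 1 - p_at_most_two   # P[X >= 3]

for k, value in enumerate(pmf):
    print(f"P[X={k}] = {value:.6f}")
print(f"P[X>=3] = {p_three_or_more:.6f}")
```

Summing only three terms and subtracting from 1 is simpler than adding the 18 terms for X = 3 to X = 20 directly.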

(c) What assumption is required for the validity of the above answers?

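Assumptions: Each sampled transaction must be in error independently of the others, with the same probability 0.02. Because the 20 transactions are sampled without replacement, this holds only approximately, so we assume that the total population of transactions N is large relative to the sample. We can then use the Binomial distribution for the sample of 20 instead of the Hypergeometric distribution.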
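To see why the population size matters, the sketch below compares the Binomial value of P[X=2] with the exact Hypergeometric value for two population sizes. Both sizes (N = 10,000 and N = 100, each with 2% of transactions in error) are hypothetical and are not given in the question, and the helper names are assumptions for this example.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P[X = k] when sampling with replacement (Binomial)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P[X = k] when drawing n items without replacement from a
    population of N items, K of which are in error (Hypergeometric)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


n, p = 20, 0.02
print(f"Binomial         P[X=2] = {binom_pmf(2, n, p):.5f}")

# Hypothetical population sizes, each with exactly 2% in error.
for N in (10_000, 100):
    K = round(p * N)  # number of erroneous transactions in the population
    print(f"Hypergeometric N={N:>6}: P[X=2] = {hypergeom_pmf(2, N, K, n):.5f}")
```

With the large population the two distributions should nearly coincide, while with N = 100 the sample is a fifth of the population and the Binomial approximation is noticeably off.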

