Spam filters try to sort your incoming e-mails, deciding which are real messages and which are unwanted. One method used is a point system. The filter reads each incoming e-mail and assigns points according to the sender, the subject, key words in the message, and so on. The higher the point total the more likely it is that the message is unwanted. The filter has a cutoff value for the point total; any message rated lower than that cutoff passes through to your inbox, and the rest, suspected to be spam, are diverted to the junk mailbox.
We can think of the filter's decision as a hypothesis test. The null hypothesis is that the e-mail is a real message and should go in your inbox. A high point total provides evidence that the message may be spam. When there is sufficient evidence, the filter rejects the null, classifying the message as junk. This usually works pretty well, but of course, sometimes the filter makes a mistake. Complete parts (a) through (d) below.
(a) When the filter allows spam to slip through into your inbox, what kind of error is that?
A. This is a Type I error because H0 is true, and the filter rejected it.
B. This is a Type II error because H0 is false, but the filter failed to reject it.
C. This is a Type I error because H0 is true, but the filter failed to reject it.
D. This is a Type II error because H0 is false, and the filter rejected it.
(b) Which kind of error is it when a real message gets classified as junk?
A. This is a Type II error because H0 is false, but the filter failed to reject it.
B. This is a Type II error because H0 is false, and the filter rejected it.
C. This is a Type I error because H0 is true, and the filter rejected it.
D. This is a Type I error because H0 is true, but the filter failed to reject it.
(c) Some filters allow you to adjust the cutoff. Suppose your filter has a default cutoff of 50 points, but you reset it to 60. What impact does this change in the cutoff value have on the chance of each type of error?
A. Decreased Type I error, increased Type II error.
B. Increased Type I error, decreased Type II error.
C. Increased Type I error, increased Type II error.
D. Decreased Type I error, decreased Type II error.
(d) Is the above change in cutoff analogous to choosing a higher or lower value of α for a hypothesis test?
A. A higher α, because it takes stronger evidence to classify the e-mail as spam.
B. A lower α, because it takes stronger evidence to classify the e-mail as spam.
C. A higher α, because it takes less evidence to classify the e-mail as spam.
D. A lower α, because it takes less evidence to classify the e-mail as spam.
Lightbulbs
From past test records it is known that the mean lifetime of the Fillips lightbulbs produced is 2000 hours with a standard deviation (σ) of 120 hours. The manufacturer tests a random sample of 16 lightbulbs to assess the reliability of the production process, with the following results (in hours).
2010, 2010, 1529, 2450, 1628, 1976, 1379, 2068,
2537, 2687, 2128, 2156, 1987, 2020, 1879, 2356
Based on the above sample, can we say the average lifetime of Fillips lightbulbs is different from the past? Conduct the test at the significance level of 5%, using the critical value or p-value approach.
State any assumption you make in your calculations.
Notes:
Show all 6 steps.
To calculate the sample mean, you can use Excel or calculator; no working is required.
While you can get full marks for this question without a diagram, you are encouraged to draw one to help analyse the problem.
(a) Correct option: B. This is a Type II error because H0 is false, but the filter failed to reject it.
(b) Correct option: C. This is a Type I error because H0 is true, and the filter rejected it.
(c) Correct option: A. Decreased Type I error, increased Type II error.
(d) Correct option: B. A lower α, because it takes stronger evidence to classify the e-mail as spam.
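To see why raising the cutoff in part (c) trades one error for the other, here is a minimal Python sketch. The point totals in real_scores and spam_scores are hypothetical values invented only for illustration, and error_rates is a helper name made up for this example; neither comes from the question. The rule it applies is the one stated in the question: a message rated lower than the cutoff goes to the inbox, and everything else goes to junk.

```python
# Hypothetical point totals, invented purely for illustration.
real_scores = [12, 25, 33, 41, 48, 52, 55, 58]   # H0 true: real messages
spam_scores = [45, 53, 57, 62, 66, 71, 80, 93]   # H0 false: spam


def error_rates(cutoff):
    """Return (Type I rate, Type II rate) for a given cutoff."""
    # Type I error: a real message scores at or above the cutoff and is junked.
    type1 = sum(s >= cutoff for s in real_scores) / len(real_scores)
    # Type II error: a spam message scores below the cutoff and reaches the inbox.
    type2 = sum(s < cutoff for s in spam_scores) / len(spam_scores)
    return type1, type2


for cutoff in (50, 60):
    t1, t2 = error_rates(cutoff)
    print(f"cutoff={cutoff}: Type I rate={t1:.3f}, Type II rate={t2:.3f}")
```

With these made-up scores, moving the cutoff from 50 to 60 lowers the Type I rate and raises the Type II rate, which matches answer A in part (c).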
Lightbulbs
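Null hypothesis H0: μ = 2000 hours (the average lifetime of Fillips lightbulbs is not different from the past).
Alternative hypothesis H1: μ ≠ 2000 hours (the average lifetime of Fillips lightbulbs is different from the past).
Assumption: bulb lifetimes are approximately normally distributed and the population standard deviation from past records (σ = 120 hours) still applies, so a two-tailed z-test is used.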
Test statistic = (sample mean - population mean) / (population standard deviation / n^0.5)
sample mean = (2010 + 2010 + 1529 + 2450 + ... + 2020 + 1879 + 2356) / 16 = 32800 / 16
sample mean = 2050
population mean = 2000
population standard deviation = 120
n = 16
Test statistic = (2050 - 2000) / (120 / 16^0.5) = 50 / 30 = 1.67
We are given α = level of significance = 5% = 0.05.
Tabulated value (critical value, two-tailed) = z_(α/2) = z_(0.05/2) = z_0.025 = 1.96
Since the test statistic 1.67 < 1.96 (critical value), we do not reject H0. At the 5% significance level, the sample does not provide sufficient evidence that the average lifetime of Fillips lightbulbs is different from the past.
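The calculation above can be checked with a short Python sketch that uses only the standard library. It assumes, as the worked answer does, that σ = 120 is known so a z-test applies. The variable names and the normal_cdf helper are choices made for this example. It also computes the two-tailed p-value, which the question allows as an alternative to the critical value approach.

```python
import math

# Sample lifetimes (hours) from the question.
lifetimes = [2010, 2010, 1529, 2450, 1628, 1976, 1379, 2068,
             2537, 2687, 2128, 2156, 1987, 2020, 1879, 2356]

mu0 = 2000      # hypothesised population mean under H0
sigma = 120     # population standard deviation, assumed known
alpha = 0.05    # significance level
z_crit = 1.96   # two-tailed critical value for alpha = 0.05, from the z table

n = len(lifetimes)
sample_mean = sum(lifetimes) / n
z = (sample_mean - mu0) / (sigma / math.sqrt(n))


def normal_cdf(x):
    """Standard normal CDF computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Two-tailed p-value: probability of a statistic at least this extreme under H0.
p_value = 2.0 * (1.0 - normal_cdf(abs(z)))

reject_h0 = abs(z) > z_crit

print(f"n = {n}, sample mean = {sample_mean}")
print(f"z = {z:.2f}, critical value = {z_crit}")
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Both approaches agree here: the test statistic is below 1.96 and the p-value is above 0.05, so H0 is not rejected.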