Question

In: Statistics and Probability


If you want to know how important spam filters are to your online experience, try turning them off for a day. You’ll quickly see why these tools we tend to take for granted are so essential. Generally speaking, a filtering solution applied to your email system uses a set of rules to determine which incoming messages are spam and which are not. What the filters check can vary, but most do basically the same thing: scan header information for evidence of malice, look up senders on blacklists of known spammers, and filter content for patterns that point to junk mail.

Suppose that a particular spam filter uses a points-based system in which various aspects of an email add points to its total, with 100 points being the maximum and strongly indicating spam. More points for a particular email therefore means stronger evidence that it is spam. Once an email accumulates a sufficient number of points, the spam filter classifies it as spam and it does not reach your inbox. For each email it reviews, this process is similar to hypothesis testing in the following way:

H0: The email is a real message (not spam)
HA: The email is spam

Using the hypotheses above, answer the following questions using the language and terms we have covered related to hypothesis testing:

a. When the filter allows spam to slip through into your inbox, which kind of error is that? Explain in terms of the hypotheses above.

b. Which kind of error is it when a real (i.e., non-spam) email gets classified as spam and does not get to your inbox? Explain in terms of the hypotheses above.

c. Suppose that this particular spam filter classifies as spam any email getting 50 points or higher. However, you reset the filter to require 60 points or higher before classifying an email as spam. Is that analogous to choosing a higher or lower alpha level for a hypothesis test? Explain in terms of the hypotheses above.

d. What impact does this change in the spam cutoff value have on the chance of each type of error in hypothesis testing? Explain.

e. What does “power” mean in this context of the spam filter, and how is it related to one of the two types of errors? Explain in terms of the hypotheses above.
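
To make the setup in parts (a) through (e) concrete, here is a minimal Python sketch of a "cutoff points or higher" rule. The point totals are invented purely for illustration and are not part of the question, and the function name `error_rates` is hypothetical. It counts how often each kind of mistake would occur at the two cutoffs mentioned in part (c), treating "flag as spam" as rejecting H0.

```python
# Hypothetical point totals, invented for illustration only.
real_scores = [5, 12, 20, 33, 41, 48, 52, 55, 58, 63]   # H0 true: real messages
spam_scores = [45, 51, 54, 57, 59, 62, 70, 78, 85, 96]  # HA true: spam


def error_rates(cutoff, real, spam):
    """Return (type_i, type_ii, power) for a 'cutoff points or higher' rule."""
    # Type I error: a real message is flagged as spam (H0 rejected although true).
    type_i = sum(s >= cutoff for s in real) / len(real)
    # Type II error: spam reaches the inbox (H0 not rejected although false).
    type_ii = sum(s < cutoff for s in spam) / len(spam)
    # Power: the share of spam that is correctly flagged.
    power = 1 - type_ii
    return type_i, type_ii, power


for cutoff in (50, 60):
    t1, t2, power = error_rates(cutoff, real_scores, spam_scores)
    print(f"cutoff {cutoff}: Type I = {t1:.2f}, Type II = {t2:.2f}, power = {power:.2f}")
```

With these made-up scores, raising the cutoff from 50 to 60 lowers the Type I rate from 0.40 to 0.10 and raises the Type II rate from 0.10 to 0.50, so power falls from 0.90 to 0.50. The exact figures depend entirely on the invented data; only the direction of the trade-off carries over to the question.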
