Question

In: Statistics and Probability

A web-based algorithm classifies emails as spam or not spam, detecting spam at a success rate of 70%.

A. Find a 95% confidence interval for the number of spam emails expected within a sample of 100 emails.

B. Find a 95% confidence interval for the accuracy (standard deviation) of the detected number of spam emails within the 100 emails.

C. Determine the probability that at least 85 emails out of 100 are spam.

D. If the 100 emails above actually contained 85 spam emails, test the claim that the same algorithm can still detect at least 85 emails, at 95% confidence.

Solutions

Expert Solution

The success rate (probability) of detecting a spam email is p = 0.70.

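A) For a sample of n = 100:

mean = np = 100*0.70 = 70

standard deviation = sqrt(np(1-p)) = sqrt(100*0.70*0.30) = sqrt(21) ≈ 4.58

Using the normal approximation, the 95% confidence interval for the expected number of spam emails is

mean ± z * standard deviation, with z = 1.96 for 95% confidence:

70 ± 1.96*4.58 = (61.02, 78.98)

Rounding off to the nearest integer, the interval is (61, 79).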
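
The arithmetic in part A can be checked with a short Python sketch. It assumes the normal approximation to the binomial and the usual two-sided 95% critical value z = 1.96; the function name spam_count_interval is only illustrative and does not come from the original solution.

```python
import math


def spam_count_interval(n, p, z=1.96):
    """Return (mean, sd, lower, upper) for a binomial count.

    Assumes the normal approximation: mean +/- z * sd,
    where sd = sqrt(n * p * (1 - p)).
    """
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd, mean - z * sd, mean + z * sd


mean, sd, lower, upper = spam_count_interval(100, 0.70)
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"95% interval: ({lower:.2f}, {upper:.2f})")
print(f"rounded: ({round(lower)}, {round(upper)})")
```

Changing z (for example to 1.645 for 90% confidence) gives intervals at other confidence levels under the same assumption.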

B)

C) This probability can be calculated using the binomial distribution with n = 100 and p = 0.70.

The probability that at least 85 emails out of 100 are spam = P(X >= 85) = sum from k = 85 to 100 of C(100, k) * 0.70^k * 0.30^(100-k)
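
The binomial tail probability for part C can be evaluated numerically. The sketch below sums the exact binomial terms using only the Python standard library; the helper name binomial_tail is an assumption made for illustration, not part of the original solution.

```python
from math import comb


def binomial_tail(n, p, k_min):
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


prob = binomial_tail(100, 0.70, 85)
print(f"P(X >= 85) = {prob:.6f}")
```

Since 85 lies more than three standard deviations above the mean of 70, the result is expected to be very small.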

