Question

Suppose that 8% of emails are spam and 92% are normal (prior probabilities). The probabilities (likelihoods of evidence) of occurrence of various words in normal and spam emails are given in the following table:

word                 P(word|spam)    P(word|normal)
abandoned fund       0.5             0.01
deceased customer    0.6             0.05
Bank account         0.2             0.1

Consider the following email message “I am Mrs Sarah Boardman. I have decided to seek a confidential co-operation with you, During the course of our bank year auditing, I discovered an abandoned fund, sum total of US $3.5 Million in the bank account that belongs to a deceased customer who unfortunately lost his life and entire family in fatal gassy car accident. Reply me for more clarification if you are interested”.

Calculate the posterior probability that this message is spam.

Solutions

Expert Solution

The prior probability of a mail being spam is P(spam) = 0.08, and the prior probability of a mail being normal is P(normal) = 0.92.

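Define the following events:

"abandoned fund": the event that the word "abandoned fund" occurs in a mail.

"deceased customer": the event that the word "deceased customer" occurs in a mail.

"bank account": the event that the word "Bank account" occurs in a mail.

We know the following conditional probabilities from the table:

P(abandoned fund | spam) = 0.5, P(abandoned fund | normal) = 0.01

P(deceased customer | spam) = 0.6, P(deceased customer | normal) = 0.05

P(bank account | spam) = 0.2, P(bank account | normal) = 0.1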

The message contains all 3 of these words:

“I am Mrs Sarah Boardman. I have decided to seek a confidential co-operation with you, During the course of our bank year auditing, I discovered an abandoned fund, sum total of US $3.5 Million in the bank account that belongs to a deceased customer who unfortunately lost his life and entire family in fatal gassy car accident. Reply me for more clarification if you are interested”.
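The claim that all 3 words appear can be checked mechanically. The following is a minimal sketch, assuming that matching is case-insensitive (so "Bank account" in the table matches "bank account" in the message); the names PHRASES, MESSAGE and phrases_present are illustrative and not part of the original solution.

```python
# Words from the table, lower-cased; case-insensitive matching is an assumption.
PHRASES = ["abandoned fund", "deceased customer", "bank account"]

MESSAGE = (
    "I am Mrs Sarah Boardman. I have decided to seek a confidential "
    "co-operation with you, During the course of our bank year auditing, "
    "I discovered an abandoned fund, sum total of US $3.5 Million in the "
    "bank account that belongs to a deceased customer who unfortunately "
    "lost his life and entire family in fatal gassy car accident. "
    "Reply me for more clarification if you are interested"
)


def phrases_present(message, phrases):
    """Return a dict mapping each phrase to True if it occurs in the message."""
    text = message.lower()
    return {phrase: phrase in text for phrase in phrases}


found = phrases_present(MESSAGE, PHRASES)
for phrase, present in found.items():
    print(f"{phrase}: {present}")
```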

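Assuming the words occur independently of one another within each class (the naive Bayes assumption), the joint probability of finding the 3 words given that the mail is spam is

P(3 words | spam) = 0.5 × 0.6 × 0.2 = 0.06

The joint probability of finding the 3 words given that the mail is normal is

P(3 words | normal) = 0.01 × 0.05 × 0.1 = 0.00005

The unconditional joint probability is

P(3 words) = P(3 words | spam) P(spam) + P(3 words | normal) P(normal) = 0.06 × 0.08 + 0.00005 × 0.92 = 0.0048 + 0.000046 = 0.004846

Using Bayes' rule, the posterior probability that the message is spam is

P(spam | 3 words) = P(3 words | spam) P(spam) / P(3 words) = 0.0048 / 0.004846 ≈ 0.9905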

Answer: the posterior probability that this message is spam is approximately 0.9905.
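The arithmetic can be reproduced with a short script. This is a minimal sketch, assuming the words are conditionally independent given the class (the naive Bayes assumption); the names PRIORS, LIKELIHOODS and posterior are illustrative and not part of the original solution.

```python
from math import prod

# Prior probabilities from the question.
PRIORS = {"spam": 0.08, "normal": 0.92}

# P(word | class) from the table in the question.
LIKELIHOODS = {
    "abandoned fund": {"spam": 0.5, "normal": 0.01},
    "deceased customer": {"spam": 0.6, "normal": 0.05},
    "bank account": {"spam": 0.2, "normal": 0.1},
}


def posterior(observed, priors=PRIORS, likelihoods=LIKELIHOODS):
    """Posterior class probabilities given the observed words.

    Assumes the words are conditionally independent given the class.
    """
    # Numerator of Bayes' rule for each class: prior times joint likelihood.
    joint = {
        cls: priors[cls] * prod(likelihoods[word][cls] for word in observed)
        for cls in priors
    }
    # Denominator: unconditional probability of observing all the words.
    evidence = sum(joint.values())
    return {cls: value / evidence for cls, value in joint.items()}


result = posterior(list(LIKELIHOODS))
print(round(result["spam"], 4))  # 0.9905
```

Even with a prior of only 8%, the three words together are 1200 times more likely under spam than under normal mail (0.06 versus 0.00005), which is what pushes the posterior above 99%.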

