From the given information, the following table is formulated:
Email type | Contains the word ‘free’ | Does not contain the word ‘free’ | Total
Spam | 300 | | 1200
Not spam | 228 | |
Total | | | 5000
The blank cells are then filled in:
Email type | Contains the word ‘free’ | Does not contain the word ‘free’ | Total
Spam | 300 | 1200 - 300 = 900 | 1200
Not spam | 228 | 3800 - 228 = 3572 | 5000 - 1200 = 3800
Total | 300 + 228 = 528 | 900 + 3572 = 4472 | 5000
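The table-filling step can be reproduced with a few lines of Python. This is only a sketch of the arithmetic above. The variable names are illustrative and the four input counts are the ones given in the problem.

```python
# Counts stated in the problem.
total_emails = 5000
spam_total = 1200
spam_free = 300        # spam, subject line contains "free"
not_spam_free = 228    # not spam, subject line contains "free"

# Cells derived by subtraction and addition.
spam_no_free = spam_total - spam_free                 # 900
not_spam_total = total_emails - spam_total            # 3800
not_spam_no_free = not_spam_total - not_spam_free     # 3572
free_total = spam_free + not_spam_free                # 528
no_free_total = spam_no_free + not_spam_no_free       # 4472

# Consistency check: the two column totals must add up to the grand total.
assert free_total + no_free_total == total_emails

rows = [
    ("Spam", spam_free, spam_no_free, spam_total),
    ("Not spam", not_spam_free, not_spam_no_free, not_spam_total),
    ("Total", free_total, no_free_total, total_emails),
]
for label, with_free, without_free, row_total in rows:
    print(f"{label:<9}| {with_free:>4} | {without_free:>4} | {row_total:>4}")
```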
Probability that a random email is spam and does not contain the word ‘free’ in its subject line
= Number of spam emails that do not contain the word ‘free’ in the subject line / Total number of emails
From the above table,
Number of spam emails that do not contain the word ‘free’ in the subject line = 900
Total number of emails = 5000
Therefore, the probability that a random email is spam and does not contain the word ‘free’ in its subject line
= 900/5000 = 0.18
If the word ‘free’ does not appear in the subject line of an email, the probability that the email is not spam
= Number of non-spam emails that do not contain the word ‘free’ in the subject line / Number of emails that do not contain the word ‘free’ in the subject line
From the above table,
Number of non-spam emails that do not contain the word ‘free’ in the subject line = 3572
Number of emails that do not contain the word ‘free’ in the subject line = 4472
Therefore, if the word ‘free’ does not appear in the subject line of an email, the probability that the email is not spam
= 3572/4472 ≈ 0.7987
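As a quick check of this conditional probability, the same division can be done exactly in Python. This is a minimal sketch using the standard `fractions` module. The variable names are illustrative and the counts are read from the completed table.

```python
from fractions import Fraction

# Counts read from the completed table.
not_spam_no_free = 3572   # not spam, subject line lacks "free"
no_free_total = 4472      # all emails whose subject line lacks "free"

# Conditioning on "no 'free'" restricts the sample space to that column,
# so the denominator is the column total rather than all 5000 emails.
p_not_spam_given_no_free = Fraction(not_spam_no_free, no_free_total)

print(p_not_spam_given_no_free)                    # 893/1118
print(round(float(p_not_spam_given_no_free), 4))   # 0.7987
```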
Probability that a random email is neither spam nor contains the word ‘free’ in its subject line
= Number of non-spam emails that do not contain the word ‘free’ in the subject line / Total number of emails
From the above table,
Number of non-spam emails that do not contain the word ‘free’ in the subject line = 3572
Total number of emails = 5000
Therefore, the probability that a random email is neither spam nor contains the word ‘free’ in its subject line
= 3572/5000 = 0.7144
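This joint probability uses the same numerator as the conditional probability in the previous part. Only the denominator differs. The sketch below, with illustrative variable names and counts taken from the table, shows how the two are linked: dividing the joint probability by the probability of "no ‘free’" recovers the conditional one.

```python
from fractions import Fraction

# Counts read from the completed table.
total_emails = 5000
not_spam_no_free = 3572   # not spam, subject line lacks "free"
no_free_total = 4472      # all emails whose subject line lacks "free"

p_joint = Fraction(not_spam_no_free, total_emails)   # P(not spam and no "free")
p_no_free = Fraction(no_free_total, total_emails)    # P(no "free")
p_conditional = p_joint / p_no_free                  # P(not spam | no "free")

print(float(p_joint))                    # 0.7144
print(float(p_no_free))                  # 0.8944
print(round(float(p_conditional), 4))    # 0.7987
```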
Are an email being spam and its subject line containing the word ‘free’ independent events?
Answer: No.
Probability that an email is spam = Number of spam emails / Total number of emails = 1200/5000 = 0.24
Probability that an email contains the word ‘free’ in the subject line = Number of emails that contain ‘free’ in the subject line / Total number of emails = 528/5000 = 0.1056
If an email being spam and its subject line containing the word ‘free’ were independent events, then
Probability that an email is spam and contains the word ‘free’ in the subject line = Probability that an email is spam × Probability that an email contains the word ‘free’ in the subject line
= 0.24 × 0.1056 = 0.025344
From the table, however,
Probability that an email is spam and contains the word ‘free’ in the subject line = Number of spam emails that contain the word ‘free’ in the subject line / Total number of emails = 300/5000 = 0.06
As 0.06 ≠ 0.025344, i.e.
Probability that an email is spam and contains the word ‘free’ in the subject line ≠ Probability that an email is spam × Probability that an email contains the word ‘free’ in the subject line
Therefore,
an email being spam and its subject line containing the word ‘free’ are not independent events.
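The independence check compares the observed joint probability with the product of the two marginal probabilities. A minimal sketch in Python follows. The variable names are illustrative, the counts come from the table, and exact fractions are used so the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction

# Counts read from the completed table.
total_emails = 5000
spam_total = 1200
free_total = 528
spam_and_free = 300

p_spam = Fraction(spam_total, total_emails)            # 0.24
p_free = Fraction(free_total, total_emails)            # 0.1056
p_spam_and_free = Fraction(spam_and_free, total_emails)

# Under independence, the joint probability would equal this product.
product = p_spam * p_free
independent = (p_spam_and_free == product)

print(float(p_spam_and_free))   # 0.06
print(float(product))           # 0.025344
print(independent)              # False
```

Here the observed joint probability (0.06) is more than twice the value independence would require (0.025344), which agrees with the conclusion above.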