An article in Information Security Technical Report [“Malicious Software—Past, Present and Future” (2004, Vol. 9, pp. 6–18)] provided the following data on the top 10 malicious software instances for 2002. The clear leader in the number of registered incidences for the year 2002 was the Internet worm “Klez,” and it is still one of the most widespread threats. This virus was first detected on 26 October 2001, and it has held the top spot among malicious software for the longest period in the history of virology.
The 10 most widespread malicious programs for 2002
Place | Name | % Instances |
1 | I-Worm.Klez | 61.22% |
2 | I-Worm.Lentin | 20.52% |
3 | I-Worm.Tanatos | 2.09% |
4 | I-Worm.BadtransII | 1.31% |
5 | Macro.Word97.Thus | 1.19% |
6 | I-Worm.Hybris | 0.60% |
7 | I-Worm.Bridex | 0.32% |
8 | I-Worm.Magistr | 0.30% |
9 | Win95.CIH | 0.27% |
10 | I-Worm.Sircam | 0.24% |
(Source: Kaspersky Labs).
Suppose that 20 malicious software instances are reported. Assume that the malicious sources are independent. (a) What is the probability that at least one instance is “Klez”? (b) What is the probability that three or more instances are “Klez”? (c) What are the mean and standard deviation of the number of “Klez” instances among the 20 reported?
Here, 20 malicious software instances are reported, and the sources are independent.
All three questions concern "Klez", which accounts for 61.22% of instances, so each reported instance is "Klez" with probability p = 0.6122.
(a) Let x be the number of "Klez" instances among the 20 reported. Because the instances are independent and each is "Klez" with the same probability, x follows a binomial distribution with parameters
n = 20 and p = 0.6122,
so
$p(x) = \binom{20}{x} (0.6122)^{x} (1 - 0.6122)^{20 - x}$
Pr(at least one instance is Klez) = 1 - Pr(no instance is Klez) = 1 - Pr(x = 0)
$= 1 - \binom{20}{0} (0.6122)^{0} (1 - 0.6122)^{20}$
$= 1 - (1 - 0.6122)^{20} = 1 - 5.92 \times 10^{-9} \approx 1$
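The complement calculation in part (a) can be checked numerically. The following is a minimal sketch in plain Python; the variable names (`p_none`, `p_at_least_one`) are illustrative and not part of the original answer.

```python
# Part (a): P(at least one Klez) = 1 - P(no Klez), with X ~ Binomial(n, p).
n = 20        # number of reported instances (from the problem statement)
p = 0.6122    # Klez share taken from the table (61.22%)

p_none = (1 - p) ** n          # P(X = 0): all 20 instances are something else
p_at_least_one = 1 - p_none    # complement rule

print(f"P(X = 0)  = {p_none:.3e}")
print(f"P(X >= 1) = {p_at_least_one:.10f}")
```

Since P(X = 0) is on the order of 10^-9, the result is indistinguishable from 1 at ordinary reporting precision.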
(b) Pr(three or more instances are Klez) = Pr(x ≥ 3) = 1 - Pr(x = 0) - Pr(x = 1) - Pr(x = 2)
$= 1 - \binom{20}{0} (0.6122)^{0} (1 - 0.6122)^{20} - \binom{20}{1} (0.6122)^{1} (1 - 0.6122)^{19} - \binom{20}{2} (0.6122)^{2} (1 - 0.6122)^{18}$
$= 1 - 5.92 \times 10^{-9} - 1.87 \times 10^{-7} - 2.80 \times 10^{-6}$
$= 0.999997 \approx 1$
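As a cross-check on part (b), the three lower-tail terms can be summed directly. This sketch assumes Python 3.8 or later for `math.comb`; the helper `binom_pmf` is a hypothetical name introduced here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.6122   # values from the problem statement and table

# P(X <= 2) = P(X = 0) + P(X = 1) + P(X = 2)
lower_tail = sum(binom_pmf(k, n, p) for k in range(3))
p_three_or_more = 1 - lower_tail   # P(X >= 3) by the complement rule

for k in range(3):
    print(f"P(X = {k}) = {binom_pmf(k, n, p):.3e}")
print(f"P(X >= 3) = {p_three_or_more:.6f}")
```

Subtracting only three small terms from 1 avoids summing the 18 terms from x = 3 to x = 20.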
(c) Because the distribution is binomial,
mean number of instances = np = 20 × 0.6122 = 12.244
standard deviation of the number of instances = $\sqrt{np(1 - p)} = \sqrt{20 \times 0.6122 \times (1 - 0.6122)} \approx 2.179$
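The closed-form mean and standard deviation in part (c) can be compared against the same quantities computed directly from the probability mass function. This is a sketch for verification only, assuming Python 3.8 or later; the names `mean_check` and `var_check` are illustrative.

```python
from math import comb, sqrt

n, p = 20, 0.6122   # values from the problem statement and table

# Closed-form binomial moments.
mean = n * p
sd = sqrt(n * p * (1 - p))

# Cross-check: build the full pmf and compute the moments by definition.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
mean_check = sum(k * pk for k, pk in enumerate(pmf))
var_check = sum((k - mean_check) ** 2 * pk for k, pk in enumerate(pmf))

print(f"mean = {mean:.3f}  (from pmf: {mean_check:.3f})")
print(f"sd   = {sd:.3f}  (from pmf: {sqrt(var_check):.3f})")
```

The two routes should agree to within floating-point rounding, since np and np(1 - p) are exact identities for the binomial distribution.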