A disease is spreading through the country. Unlike previous outbreaks, this disease develops slowly in the body, and infected people are contagious during that phase. This has led to an unprecedented number of infections. The data show the number of infected people by day. The king is worried and has given you the task of predicting the number of cases for the next 7 days.
Day | Number of Cases |
1 | 85 |
2 | 121 |
3 | 166 |
4 | 228 |
5 | 282 |
6 | 401 |
7 | 525 |
8 | 674 |
9 | 1,231 |
10 | 1,695 |
11 | 2,277 |
12 | 3,146 |
13 | 5,232 |
14 | 6,391 |
15 | 7,988 |
16 | 9,942 |
17 | 11,826 |
18 | 14,769 |
19 | 18,077 |
20 | 21,571 |
21 | 25,496 |
22 | 29,909 |
23 | 35,480 |
24 | 42,058 |
25 | 50,105 |
26 | 57,786 |
27 | 65,719 |
28 | 73,232 |
29 | 80,110 |
30 | 87,956 |
31 | 95,923 |
a.) Draw a graph of the data in Excel. What does it look like, and which methods would be suitable for forecasting?
b.) A wise man told you that if a variable is increasing exponentially, then the logarithm of that variable has a linear trend. With that information, you decide to make a scatter plot of the logarithm of the number of cases, log(X). (You can get the same picture by changing the vertical axis to a log scale.) You are not convinced that fitting a trend line would give a good forecast. Why? Why might fitting a trend line to the logarithm not generate good forecasts in this case?
c.) The growth rate of a variable is defined as (X(t+1)-X(t))/X(t). Therefore, if you have a forecast g for the growth rate, then you can generate a forecast for the variable using X(t+1)=X(t) x (1+g). Calculate the daily growth rate and make a scatter plot of it. What does it look like?
d.) Forecast the number of cases for the next 7 days. Choose the method based on your judgement of the previous graphs.
1)
a.) Draw a graph of the data in Excel. What does it look like, and which methods would be suitable for forecasting?
2)
Day | Number of Cases | LN(Day) | LN(Number of cases) | Growth |
1 | 85 | 0 | 4.442651 | |
2 | 121 | 0.693147 | 4.795791 | 0.423529 |
3 | 166 | 1.098612 | 5.111988 | 0.371901 |
4 | 228 | 1.386294 | 5.429346 | 0.373494 |
5 | 282 | 1.609438 | 5.641907 | 0.236842 |
6 | 401 | 1.791759 | 5.993961 | 0.421986 |
7 | 525 | 1.94591 | 6.263398 | 0.309227 |
8 | 674 | 2.079442 | 6.51323 | 0.28381 |
9 | 1,231 | 2.197225 | 7.115582 | 0.826409 |
10 | 1,695 | 2.302585 | 7.435438 | 0.376929 |
11 | 2,277 | 2.397895 | 7.730614 | 0.343363 |
12 | 3,146 | 2.484907 | 8.053887 | 0.381643 |
13 | 5,232 | 2.564949 | 8.562549 | 0.663064 |
14 | 6,391 | 2.639057 | 8.762646 | 0.221521 |
15 | 7,988 | 2.70805 | 8.985696 | 0.249883 |
16 | 9,942 | 2.772589 | 9.204523 | 0.244617 |
17 | 11,826 | 2.833213 | 9.378056 | 0.189499 |
18 | 14,769 | 2.890372 | 9.600286 | 0.248858 |
19 | 18,077 | 2.944439 | 9.802396 | 0.223983 |
20 | 21,571 | 2.995732 | 9.979105 | 0.193284 |
21 | 25,496 | 3.044522 | 10.14628 | 0.181957 |
22 | 29,909 | 3.091042 | 10.30591 | 0.173086 |
23 | 35,480 | 3.135494 | 10.47672 | 0.186265 |
24 | 42,058 | 3.178054 | 10.6468 | 0.1854 |
25 | 50,105 | 3.218876 | 10.82188 | 0.191331 |
26 | 57,786 | 3.258097 | 10.9645 | 0.153298 |
27 | 65,719 | 3.295837 | 11.09314 | 0.137282 |
28 | 73,232 | 3.332205 | 11.20139 | 0.11432 |
29 | 80,110 | 3.367296 | 11.29116 | 0.093921 |
30 | 87,956 | 3.401197 | 11.38459 | 0.09794 |
31 | 95,923 | 3.433987 | 11.4713 | 0.090579 |
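Scatter of the logarithm of the number of cases, log(X).
We are not convinced that fitting a trend line to log(X) would give a good forecast, because a straight line cannot follow the curvature in the data: LN(Number of cases) rises steeply in the early days and flattens later, which means the growth rate is falling rather than constant.
A linear trend on the logarithm assumes a constant growth rate, so extrapolating it would overstate future cases. Hence fitting a trend line to the logarithm may not generate good forecasts in this case.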
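As a rough check of this reasoning, the sketch below fits a least-squares line to ln(X) and compares the average daily slope of ln(X) in the first ten days with the slope over the last six. It is only an illustration in plain Python; the helper name `fit_line` is made up for this example and is not part of the exercise.

```python
import math

# Daily case counts for days 1..31, copied from the table above.
cases = [85, 121, 166, 228, 282, 401, 525, 674, 1231, 1695, 2277, 3146,
         5232, 6391, 7988, 9942, 11826, 14769, 18077, 21571, 25496, 29909,
         35480, 42058, 50105, 57786, 65719, 73232, 80110, 87956, 95923]
days = list(range(1, len(cases) + 1))
log_cases = [math.log(x) for x in cases]


def fit_line(xs, ys):
    """Least-squares fit ys ~ a + b*xs; returns (a, b). Hypothetical helper."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b


a, b = fit_line(days, log_cases)

# Average daily change in ln(X): days 1-10 versus days 25-31.
early_slope = (log_cases[9] - log_cases[0]) / 9
late_slope = (log_cases[30] - log_cases[24]) / 6

# Extrapolate the straight line to day 32 and undo the logarithm.
day32_from_log_trend = math.exp(a + b * 32)

print(f"fitted slope of ln(X): {b:.3f}")
print(f"early slope: {early_slope:.3f}, late slope: {late_slope:.3f}")
print(f"day-32 forecast from the log trend: {day32_from_log_trend:.0f}")
```

If the late slope comes out clearly smaller than the early one, the log series is bending, and the straight-line extrapolation lands well above the last observed count of 95,923.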
c.) Daily growth rate
Day | Growth |
2 | 0.423529 |
3 | 0.371901 |
4 | 0.373494 |
5 | 0.236842 |
6 | 0.421986 |
7 | 0.309227 |
8 | 0.28381 |
9 | 0.826409 |
10 | 0.376929 |
11 | 0.343363 |
12 | 0.381643 |
13 | 0.663064 |
14 | 0.221521 |
15 | 0.249883 |
16 | 0.244617 |
17 | 0.189499 |
18 | 0.248858 |
19 | 0.223983 |
20 | 0.193284 |
21 | 0.181957 |
22 | 0.173086 |
23 | 0.186265 |
24 | 0.1854 |
25 | 0.191331 |
26 | 0.153298 |
27 | 0.137282 |
28 | 0.11432 |
29 | 0.093921 |
30 | 0.09794 |
31 | 0.090579 |
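The growth rate shows a declining trend (apart from spikes on days 9 and 13), so we forecast the growth rate and use it to project the number of cases.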
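The growth column can be reproduced directly from the case counts with the definition (X(t+1)-X(t))/X(t). The following is a minimal sketch in plain Python; the variable names are illustrative only.

```python
# Daily case counts for days 1..31, copied from the table above.
cases = [85, 121, 166, 228, 282, 401, 525, 674, 1231, 1695, 2277, 3146,
         5232, 6391, 7988, 9942, 11826, 14769, 18077, 21571, 25496, 29909,
         35480, 42058, 50105, 57786, 65719, 73232, 80110, 87956, 95923]

# growth[0] is the change from day 1 to day 2, which the table lists
# against day 2; growth[-1] is the change from day 30 to day 31.
growth = [(nxt - cur) / cur for cur, nxt in zip(cases, cases[1:])]

for day, g in enumerate(growth, start=2):
    print(f"{day:2d} | {g:.6f}")

# A crude look at the trend: average growth early versus late.
first_avg = sum(growth[:10]) / 10
last_avg = sum(growth[-10:]) / 10
print(f"average of first 10 rates: {first_avg:.3f}")
print(f"average of last 10 rates:  {last_avg:.3f}")
```

Comparing the two averages is only a quick numerical stand-in for looking at the scatter plot; it does not replace it.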
d.) Forecast the number of cases for the next 7 days. Choose the method based on your judgement of the previous graphs.
Answer: The growth rate has a declining trend, so we predict it by fitting a linear model, which gives
g(D) = 0.4802 - 0.012568*D
where D = day.
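One way such a line could be obtained is an ordinary least-squares regression of the daily growth rate on the day number (days 2 to 31). The sketch below assumes that is what was done; the answer does not say which tool or settings were used, so the coefficients it prints should be close to the ones quoted but need not match them digit for digit.

```python
# Daily case counts for days 1..31, copied from the table above.
cases = [85, 121, 166, 228, 282, 401, 525, 674, 1231, 1695, 2277, 3146,
         5232, 6391, 7988, 9942, 11826, 14769, 18077, 21571, 25496, 29909,
         35480, 42058, 50105, 57786, 65719, 73232, 80110, 87956, 95923]

# Growth rate listed against days 2..31, as in the growth table.
days = list(range(2, len(cases) + 1))
growth = [(nxt - cur) / cur for cur, nxt in zip(cases, cases[1:])]

# Ordinary least squares for growth ~ intercept + slope * day.
n = len(days)
mean_d = sum(days) / n
mean_g = sum(growth) / n
sxy = sum((d - mean_d) * (g - mean_g) for d, g in zip(days, growth))
sxx = sum((d - mean_d) ** 2 for d in days)
slope = sxy / sxx
intercept = mean_g - slope * mean_d

print(f"g(D) = {intercept:.4f} + ({slope:.6f}) * D")
```

A negative slope here is the numerical counterpart of the declining pattern in the growth scatter.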
We use this estimated growth rate to forecast the number of cases for the next 7 days using
X(t+1) = X(t) x (1+g).
Day | Estimated growth g(Day) | Forecast cases X(t+1)=X(t) x (1+g) |
32 | 0.078024 | 103407 |
33 | 0.065456 | 110176 |
34 | 0.052888 | 116003 |
35 | 0.04032 | 120680 |
36 | 0.027752 | 124029 |
37 | 0.015184 | 125913 |
38 | 0.002616 | 126242 |
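The recursion behind this table can be sketched as follows. The intercept 0.4802 is taken from the fitted line; the slope 0.012568 is an assumption inferred from the constant step between successive g values in the table. The function name `growth_rate` is illustrative only.

```python
def growth_rate(day, intercept=0.4802, slope=-0.012568):
    """Linear growth model g(D); slope inferred from the forecast table."""
    return intercept + slope * day


last_day, last_cases = 31, 95923  # last observed row of the data

forecast = []  # rows of (day, estimated growth, forecast cases)
x = last_cases
for day in range(last_day + 1, last_day + 8):
    g = growth_rate(day)
    x = x * (1 + g)  # X(t+1) = X(t) * (1 + g), carried forward unrounded
    forecast.append((day, g, x))

for day, g, x in forecast:
    print(f"{day} | {g:.6f} | {x:.0f}")
```

Note that this fitted line reaches zero shortly after day 38 (0.4802 / 0.012568 is about 38.2), so the recursion should not be extended much beyond the 7-day horizon without revisiting the growth model.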