The table below shows data on the new type of virus disease (COVID-19) from 11 March 2020, the day the first case occurred in our country, until 21 April 2020, when the virus peaked. Alongside these data, the number of patients recovering in the same time frame is given. Find a 2nd order polynomial equation (Ŷ = a0 + a1 x + a2 x^2) that fits these data. Then calculate the correlation coefficient. Using the parabola equation, find a regression curve for the number of patients recovering based on the number of cases. Also, estimate when the number of cases will end according to these data.
DATE | DAILY CASE NUMBER (Y) | DAILY HEALING PATIENTS (x) | x^2 |
11-Mar | 0 | 1 | 1 |
12-Mar | 0 | 0 | 0 |
13-Mar | 0 | 4 | 16 |
14-Mar | 0 | 1 | 1 |
15-Mar | 1 | 12 | 144 |
16-Mar | 1 | 29 | 841 |
17-Mar | 2 | 41 | 1681 |
18-Mar | 3 | 93 | 8649 |
19-Mar | 4 | 168 | 28224 |
20-Mar | 9 | 311 | 96721 |
21-Mar | 21 | 277 | 76729 |
22-Mar | 30 | 289 | 83521 |
23-Mar | 37 | 293 | 85849 |
24-Mar | 44 | 343 | 117649 |
25-Mar | 59 | 561 | 314721 |
26-Mar | 75 | 1196 | 1430416 |
27-Mar | 92 | 2069 | 4280761 |
28-Mar | 108 | 1704 | 2903616 |
29-Mar | 131 | 1815 | 3294225 |
30-Mar | 168 | 1610 | 2592100 |
31-Mar | 214 | 2704 | 7311616 |
1-Apr | 277 | 2148 | 4613904 |
2-Apr | 356 | 2456 | 6031936 |
3-Apr | 425 | 2786 | 7761796 |
4-Apr | 501 | 3013 | 9078169 |
5-Apr | 574 | 3135 | 9828225 |
6-Apr | 649 | 3148 | 9909904 |
7-Apr | 725 | 3892 | 15147664 |
8-Apr | 812 | 4117 | 16949689 |
9-Apr | 908 | 4056 | 16451136 |
10-Apr | 1006 | 4747 | 22534009 |
11-Apr | 1101 | 5138 | 26399044 |
12-Apr | 1198 | 4789 | 22934521 |
13-Apr | 1296 | 4093 | 16752649 |
14-Apr | 1403 | 4062 | 16499844 |
15-Apr | 1518 | 4281 | 18326961 |
16-Apr | 1643 | 4801 | 23049601 |
17-Apr | 1769 | 4353 | 18948609 |
18-Apr | 1890 | 3783 | 14311089 |
19-Apr | 2017 | 3977 | 15816529 |
20-Apr | 2140 | 4674 | 21846276 |
21-Apr | 2259 | 4611 | 21261321 |
I know how to do this with Excel, so please don't use Excel.
We used R to fit a 2nd order polynomial equation (Ŷ = a0 + a1 X + a2 X^2) to the COVID-19 data from 11 March 2020, the day the first case occurred in our country, until 21 April 2020, when the virus peaked. Y is the "Daily Case Number" and X is the "Daily Healing Patients". The model output is given below.
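For reference, the least-squares coefficients of a quadratic fit satisfy the normal equations below. This is the standard textbook formulation, not something stated in the original answer; the sums run over the n = 42 rows of the table, and the x^2 column of the table supplies one of the required sums.

```latex
\begin{aligned}
\sum Y     &= n\,a_0 + a_1 \sum x + a_2 \sum x^2 \\
\sum xY    &= a_0 \sum x + a_1 \sum x^2 + a_2 \sum x^3 \\
\sum x^2 Y &= a_0 \sum x^2 + a_1 \sum x^3 + a_2 \sum x^4
\end{aligned}
```

Solving this 3x3 linear system for a0, a1 and a2 gives the same coefficients that lm() reports, up to rounding.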
The fitted quadratic model is
Y = -1.246e+01 + (8.925e-03)*X + (7.039e-05)*X^2
The coefficients above are in scientific notation. Written in decimal form and rounded, the model is
Y = -12.46 + 0.0089*X + 0.00007*X^2
Setting Daily Healing Patients (X) = 363, the model gives
Daily Case Number (Y) = -12.46 + 0.0089*363 + 0.00007*363^2
Daily Case Number (Y) = 0 (approx.)
Thus, according to the fitted model, the number of cases reaches zero when Daily Healing Patients (X) reaches about 363.
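The value of X at which the model crosses zero can also be found directly instead of by trial. The following is a minimal sketch that applies the quadratic formula to the coefficients reported above; the variable names (a0, a1, a2, x_zero) are illustrative and not part of the original script.

```r
# Coefficients copied from the fitted model reported above
a0 <- -1.246e+01
a1 <- 8.925e-03
a2 <- 7.039e-05

# Quadratic formula for a2*X^2 + a1*X + a0 = 0
disc  <- a1^2 - 4 * a2 * a0
roots <- (-a1 + c(-1, 1) * sqrt(disc)) / (2 * a2)

# Keep the non-negative root, since X is a count of patients
x_zero <- roots[roots >= 0]
x_zero
```

With the unrounded coefficients the positive root is about 362, which agrees with the value of 363 obtained from the rounded coefficients.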
The fitted values given by the model are returned by fitted.values(Model) in the R code below.
The correlation coefficient between Y and the fitted values of Y is 0.886. This indicates a strong positive association between the observed and fitted values, so the quadratic model fits the data reasonably well.
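For clarity, the correlation between observed and fitted values can be written out as below. The notation (Y_i for observed values, Ŷ_i for fitted values, bars for means) is assumed here rather than taken from the original answer.

```latex
r = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(\hat{Y}_i - \bar{\hat{Y}})}
         {\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2 \; \sum_{i=1}^{n} (\hat{Y}_i - \bar{\hat{Y}})^2}},
\qquad
r^2 = R^2 = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}
```

For a least-squares fit with an intercept, the square of this correlation equals the coefficient of determination, so r = 0.886 corresponds to R^2 of roughly 0.785.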
####################### R code employed for data analysis ######################
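# Read the data; column names are those produced by read.csv from the original file
data <- read.csv("data.csv", sep = ",", header = TRUE)
head(data)
# Fit the quadratic model Y = a0 + a1*X + a2*X^2
Model <- lm(DAILY.CASE.NUMBER.Y ~ DAILY.HEALING.PATIENTS.X + DAILY.HEALING.PATIENTS.X_Square, data = data)
summary(Model)
fitted.values(Model)
# Evaluate the fitted model at X = 362 using the coefficients from summary(Model)
X <- 362
Y <- -1.246e+01 + (8.925e-03) * X + (7.039e-05) * X^2
Y <- round(Y, 2)
Y
# Evaluate the model with rounded coefficients at X = 363
X <- 363
Y <- -12.46 + 0.0089 * X + 0.00007 * X^2
Y
# Correlation between the fitted and observed values
cor(fitted.values(Model), data$DAILY.CASE.NUMBER.Y)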
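The script above depends on a data.csv file that is not included. As a sketch of a self-contained variant, the data can be typed in from the table and the squared term built inside the formula with I(). The vector names cases and healed are illustrative, and the results will match the reported coefficients only if the original CSV contained the same values as the table.

```r
# Data typed in from the table above (42 days, 11 March to 21 April 2020)
cases <- c(0, 0, 0, 0, 1, 1, 2, 3, 4, 9,
           21, 30, 37, 44, 59, 75, 92, 108, 131, 168,
           214, 277, 356, 425, 501, 574, 649, 725, 812, 908,
           1006, 1101, 1198, 1296, 1403, 1518, 1643, 1769, 1890, 2017,
           2140, 2259)
healed <- c(1, 0, 4, 1, 12, 29, 41, 93, 168, 311,
            277, 289, 293, 343, 561, 1196, 2069, 1704, 1815, 1610,
            2704, 2148, 2456, 2786, 3013, 3135, 3148, 3892, 4117, 4056,
            4747, 5138, 4789, 4093, 4062, 4281, 4801, 4353, 3783, 3977,
            4674, 4611)
stopifnot(length(cases) == 42, length(healed) == 42)

# Quadratic least-squares fit: cases = a0 + a1*healed + a2*healed^2
fit <- lm(cases ~ healed + I(healed^2))
coef(fit)

# Correlation between the fitted and observed values
r <- cor(fitted(fit), cases)
r
```

Using I(healed^2) avoids keeping a separate squared column in the data, which is the only design difference from the original script.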