Addressing COVID-19 is a pressing health and social concern. To
date, many epidemic projections and policies addressing COVID-19
have been designed without seroprevalence data to inform epidemic
parameters. A study published on April 11 stirred huge
public/media/government interest. It measured the seroprevalence of
antibodies to SARS-CoV-2, the virus that causes COVID-19, in
Santa Clara County, California.
The study found that the population prevalence of SARS-CoV-2 antibodies
in Santa Clara County implies that the infection is much more
widespread than indicated by the number of confirmed cases. Based
on different scenarios, the population prevalence of COVID-19 in
Santa Clara ranged from 2.49% (with 95% CI of 1.80% - 3.17%) to
4.16% (with 95% CI of 2.58% - 5.70%). These prevalence estimates
are 50 to 85 times more than the number of cases confirmed by the
routine tests. Population prevalence estimates can be used to
calibrate epidemic and mortality projections and more importantly
can be used as criteria for reopening the economy.
Canada is facing the same pressing health and social concern and
there is consensus that our confirmed COVID-19 cases are
significantly underestimated due to insufficient testing. Suppose you
are designated as a member of the important task force that will
conduct a similar study in the city of Vaughan, a city in the GTA hit
particularly hard by the COVID-19 outbreak. Vaughan’s current
confirmed infection rate (from the ongoing, insufficient testing) is
0.113%. Assume there is a consensus among epidemiologists that the
confirmed infection rate of 0.113% is underestimated many times over
and that the true infection rate might be around 3.5%.
Elaborate on your proposal as a member of the task force. You may
choose to discuss random sampling design, sample size (you are
concerned about time/money cost if the sample is too big; meanwhile,
you are concerned about the validity of the estimates if the sample
is too small), hypothesis setting, size of the test, or
other statistical issues you think to be important for such a
medical study. For every aspect of your decision, it is important
to communicate the rationale.
For the hypothesis test, you first select a random sample of, say, 50 people. You then fix your null hypothesis, which is the assumption about the population that you wish to reject.
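The sample size itself deserves a quick check before settling on a number. Below is a minimal sketch using the standard normal-approximation formula for estimating a proportion, n = z^2 p(1 - p) / E^2. The 3.5% rate comes from the question; the margin of error of 1 percentage point and the function name `sample_size_for_proportion` are assumptions made only for illustration.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p_expected, margin, confidence=0.95):
    """Smallest n for which a normal-approximation confidence interval
    for a proportion has half-width at most `margin`, assuming the true
    rate is close to `p_expected`."""
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p_expected * (1 - p_expected) / margin**2)

p_assumed = 0.035  # true infection rate suggested in the question
margin = 0.01      # ASSUMPTION: estimate should be within +/- 1 percentage point

n_needed = sample_size_for_proportion(p_assumed, margin)

# For comparison: how many positives a sample of 50 would be expected to contain
expected_positives_in_50 = 50 * p_assumed

print("sample size needed:", n_needed)
print("expected positives in a sample of 50:", expected_positives_in_50)
```

Under these assumed inputs the formula asks for roughly 1,300 people, whereas a sample of 50 would be expected to contain fewer than two infected people, so 50 is best read as a placeholder. A tighter margin raises the required sample size quickly, which is where the time/money trade-off comes in.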
Let's say we wish to show that the true infection rate is greater than the rate implied by the confirmed cases. We can frame the test in terms of the mean number of people infected or, equivalently, the proportion of people who are actually infected. Since the reference study reports 95% confidence intervals, we can take the level of significance to be 5%.
Null hypothesis: H0: m = m0 (the population mean equals the hypothesized value m0, here the value implied by the confirmed infection rate)
Alternative hypothesis: H1: m > m0
As the alternative shows, this is a right-tailed test, i.e. a one-tailed test. After collecting the sample, compute the sample mean and the sample standard deviation (s). You could also consider other variables that may play a role in a person's infection with coronavirus, e.g. how many times a day the person comes into contact with people outside the household, or how often they order food from outside. You could similarly run a hypothesis test for each of these variables to see whether it plays a role.
Next, calculate the test statistic, i.e. $z = \dfrac{\bar{x} - m_0}{s / \sqrt{n}}$
where $\bar{x}$ = sample mean
$m_0$ = hypothesized population mean under the null hypothesis
s = sample standard deviation
n = number of observations
When the test statistic value > critical value (from the z table),
you reject the null hypothesis,
and when the test statistic value < critical value,
you do not reject the null hypothesis.
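The decision rule above can be sketched in a few lines of Python. This is only an illustration: the sample (45 positives among 1,300 people) is hypothetical, made-up data, the function name `right_tailed_z_test` is an assumption, and each person is coded as 1 (infected) or 0 (not infected) so that the sample mean is the sample infection rate. The null value 0.00113 is the confirmed rate of 0.113% from the question.

```python
import math
from statistics import NormalDist, mean, stdev

def right_tailed_z_test(sample, m0, alpha=0.05):
    """One-sided z test of H0: m = m0 against H1: m > m0.

    Returns the test statistic, the critical value and the decision."""
    n = len(sample)
    x_bar = mean(sample)                    # sample mean
    s = stdev(sample)                       # sample standard deviation
    z = (x_bar - m0) / (s / math.sqrt(n))   # test statistic
    critical = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value
    return z, critical, z > critical

# HYPOTHETICAL data for illustration only: 45 positives out of 1,300 people
sample = [1] * 45 + [0] * 1255

z, critical, reject = right_tailed_z_test(sample, m0=0.00113)

print("test statistic:", round(z, 2))
print("critical value:", round(critical, 3))
print("reject H0" if reject else "do not reject H0")
```

With a 5% level of significance the right-tail critical value is about 1.645, which matches what the z table gives. Because infections are rare, the normal approximation behind this test is only reasonable when the sample contains a fair number of positives, which is another argument for a sample well above 50.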