[Lumley] Write an R function that takes inputs n1, n2, N1, N2, σ1², σ2² and computes the variance of the population total in a stratified sample. Choose some reasonable values of the population sizes and variances, and graph this function as n1 and n2 change, to find the optimum and to examine how sensitive the variance is to the precise values of n1 and n2.
Solution:
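For stratified sampling, the estimated variance of the population total is

$$\widehat{\mathrm{Var}}(\hat{\tau}) = N^2 \sum_{i=1}^{L} \left(\frac{N_i}{N}\right)^2 \left(\frac{N_i - n_i}{N_i}\right) \frac{s_i^2}{n_i} = \sum_{i=1}^{L} N_i \,(N_i - n_i)\, \frac{s_i^2}{n_i}$$
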
where,
L=number of strata
N=the sum of all stratum sizes
Ni=size of ith stratum
Yi bar =sample mean of ith stratum
ni =number of observations in ith stratum
si =sample standard deviation of ith stratum
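As a quick check of the formula, the sketch below evaluates it by hand for a single allocation. The values (N1 = 100, N2 = 150, s1 = 23, s2 = 21, n1 = 40, n2 = 30) are the first ones used in the solution's own run; the variable names N_h, n_h and s_h are chosen here purely for illustration.

```r
# Illustrative two-stratum example (variable names are assumptions of this sketch)
N_h <- c(100, 150)   # stratum population sizes
n_h <- c(40, 30)     # stratum sample sizes
s_h <- c(23, 21)     # stratum standard deviations
N <- sum(N_h)        # total population size

# Form 1: N^2 times the variance of the stratified mean (with fpc)
v1 <- N^2 * sum((N_h / N)^2 * ((N_h - n_h) / N_h) * s_h^2 / n_h)

# Form 2: the same expression after cancelling N^2 and one factor of N_h
v2 <- sum(N_h * (N_h - n_h) * s_h^2 / n_h)

print(c(v1, v2))     # the two forms should agree
```

Both forms give 343950 (79350 from stratum 1 plus 264600 from stratum 2), which matches the first value in the R output.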
We create the function var_ptotal as follows.
R code:
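# Variance of the estimated population total under stratified sampling.
# n1, n2: stratum sample sizes; N1, N2: stratum population sizes;
# s1, s2: stratum standard deviations (squared inside the function).
# The stratum means are not needed for the variance, so they are not arguments.
var_ptotal <- function(n1, n2, N1, N2, s1, s2) {
  N <- N1 + N2
  # variance of the stratified mean, with finite population correction
  var_ybar <- (N1 / N)^2 * ((N1 - n1) / N1) * s1^2 / n1 +
    (N2 / N)^2 * ((N2 - n2) / N2) * s2^2 / n2
  # variance of the estimated total
  N^2 * var_ybar
}

# The function is vectorized, so no loop is needed:
# n1 and n2 are paired element by element.
n1 <- 40:60
n2 <- 30:50
v <- var_ptotal(n1, n2, 100, 150, 23, 21)
print(v)
plot(n1 + n2, v, type = "b", main = "Variance plot",
     xlab = "total sample size n1 + n2", ylab = "variance")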
R output:
 [1] 343950.0 330055.0 316980.5 304655.1 293015.5 282005.6 271575.0 261678.9 252276.8 243332.3
[11] 234812.5 226687.7 218930.8 211517.1 204424.3 197631.8 191120.8 184874.0 178875.6
[20] 173111.0 167566.7
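From the plot above, the variance of the estimated total decreases as the sample sizes increase. Here n1 and n2 grow together, from (40, 30) to (60, 50), so the output runs from 343950.0 down to 167566.7.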
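The question also asks for the optimum and for the sensitivity to n1 and n2. One possible way to examine this, sketched here under the assumption of a fixed total sample size n1 + n2 = 90 (a value picked for illustration, not taken from the solution), is to vary n1, set n2 = 90 - n1, and compare the grid minimum with the Neyman allocation, in which n_i is proportional to N_i * s_i. The helper name strat_total_var and the 5% threshold are hypothetical choices.

```r
# Hypothetical helper: variance of the estimated total for two strata
strat_total_var <- function(n1, n2, N1, N2, s1, s2) {
  N1 * (N1 - n1) * s1^2 / n1 + N2 * (N2 - n2) * s2^2 / n2
}

N1 <- 100; N2 <- 150; s1 <- 23; s2 <- 21   # same values as in the solution
n_total <- 90                              # assumed fixed total sample size

n1 <- 1:(n_total - 1)                      # candidate sizes for stratum 1
v <- strat_total_var(n1, n_total - n1, N1, N2, s1, s2)

n1_grid <- n1[which.min(v)]                # optimum found on the grid
n1_neyman <- n_total * N1 * s1 / (N1 * s1 + N2 * s2)  # Neyman allocation

# Relative increase in variance compared with the grid optimum
rel_increase <- v / min(v) - 1

plot(n1, v, type = "l", log = "y",
     xlab = "n1 (n2 = n_total - n1)", ylab = "variance of total",
     main = "Variance at fixed total sample size")
abline(v = n1_neyman, lty = 2)             # dashed line at the Neyman allocation

print(c(grid = n1_grid, neyman = n1_neyman))
print(range(n1[rel_increase < 0.05]))      # n1 values within 5% of the minimum
```

With these values the Neyman allocation is n1 = 90 * 2300 / 5450, about 37.98, and the grid minimum falls at n1 = 38. The last line reports the range of n1 over which the variance stays within 5% of its minimum; a wide range would suggest the variance is not very sensitive to the exact split.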