Question

In: Statistics and Probability

Suppose we want to estimate S=E[X2] when X has the density that is proportional to q(x)...

Suppose we want to estimate S=E[X2] when X has the density that is proportional to q(x) = exp{-|x|3/3}.

1. Use rejection sampling to estimate E[X2]. Describe your algorithm first. Attach your R code and output. Be sure to count and report your acceptance ratio

Solutions

Expert Solution

For a random variable X with un-normalized density q(x) = exp(−|x|
3/3), consider
estimating σ
2 = E(X2
). By numerical integration, I found that the “true” value is
σ
2 = 0.776.
(a) We can do importance sampling using a standard normal proposal—the cubic
in the exponent of q means the tails are lighter than Gaussian which has a
quadratic in the exponent. That is, we sample Y ∼ g = N(0, 1), compute
importance weights w
?
(Y ) = q(Y )/g(Y ), and estimate σ
2 with
PN
i=1 w
?
(Yi)Y
2
P
i
N
i=1 w?
(Yi)
.
Based on a sample of size N = 1000, I got an estimate of 0.734 for σ
2
.
(b) We can employ the accept–reject strategy to sample from q, using the same
standard normal proposal as in (a). It’s not difficult to show (using calculus)
that
exp(−|x|
3
/3) ≤ A exp(−|x|
2
/2),
where A = exp(1/6) ≈ 1.18. Run accept–reject until a sample X1, . . . , XN of
size N is available, and estimate σ
2 with
1
N
Xn
i=1
X
2
i
.
Based on a sample of size N = 1000, I got an estimate of 0.728 for σ
2
.
(c) Using the sample X1, . . . , XN of size N = 1000 from part (b) above, I get 0.759
as an estimate of σ
2 based on the Philippe–Robert strategy.
(d) To compare the estimators in parts (b) and (c), I repeat the above experi-
ments 1000 times, each based on N = 1000, and compute the empirical stan-
dard deviation of the two estimators. In this case, I found that the standard
deviations for the accept–reject and Philippe–Robert methods are 0.029 and
0.009, respectively. Therefore, the Philippe–Robert method is significantly
more accurate than a basic accept–reject in this problem.

R code:-

rnorm.newton <- function(n, mean=0, sd=1) {
X <- numeric(n)
U <- runif(n)
for(i in 1:n) {
f <- function(x) pnorm(x) - U[i]
df <- function(x) dnorm(x)
X[i] <- newton(f, df, 0)$solution # newton function from my website
}
return(mean + sd * X)
}
rnorm.bm <- function(n=1, mean=0, sd=1) {
if(n %% 2 == 0) p <- n / 2 else p <- (n %/% 2) + 1
U <- runif(p)
V <- runif(p)
R <- sqrt(-2 * log(V))
Theta <- 2 * pi * U
X <- R * cos(Theta)
Y <- R * sin(Theta)
out <- mean + sd * c(X, Y)
return(out[1:n])
}
M <- 50000
o.nwtn <- system.time(nwtn <- rnorm.newton(M))[3]
o.bm <- system.time(bm <- rnorm.bm(M))[3]
o.rn <- system.time(rn <- rnorm(M))[3]
print(cbind(Newton=o.nwtn, BoxMuller=o.bm, rnorm=o.rn))
Problem 2.
rgamma.ar <- function(n, shape, scale=1) {
s <- shape
s.int <- floor(s)
b <- s / s.int
M <- gamma(s.int) / gamma(s) * b**s.int * (s * exp(-1))**(s - s.int)
f <- function(y) dgamma(y, shape=s)
Mg <- function(y) M * dgamma(y, shape=s.int, rate=1 / b)
acpt <- 0
total <- 0

X <- numeric(n)
while(acpt < n) {
total <- total + 1
Y <- sum(-b * log(runif(s.int)))
if(runif(1) <= f(Y) / Mg(Y)) {
acpt <- acpt + 1
X[acpt] <- Y
}
}
return(list(X=scale * X, rate.true=1 / M, rate.obs=acpt / total))
}
o <- rgamma.ar(1000, shape=5.5)
print(o[-1])
ylim <- c(0, 1.05 * dgamma(4.5, shape=5.5))
hist(o$X, freq=FALSE, col="gray", border="white", xlab="x", ylim=ylim, main="")
curve(dgamma(x, shape=5.5), add=TRUE)
Problem 3.
H.naive <- function(z, M) mean(rnorm(M) + rexp(M) <= z)
H.rb <- function(z, M) mean(pexp(z - rnorm(M)))
reps <- 5000
o.naive <- o.rb <- numeric(reps)
for(r in 1:reps) {
o.naive[r] <- H.naive(1, 5000)
o.rb[r] <- H.rb(1, 5000)
}
print(cbind(sd.naive=sd(o.naive), sd.rb=sd(o.rb)))
Problem 4.
qq <- function(x) exp(-abs(x)**3 / 3)
sigma2.is <- function(N) {
X <- rnorm(N)
w <- qq(X) / dnorm(X)
return(sum(w * X**2) / sum(w))
}
qq.ar <- function(N) {
X <- numeric(N)

r <- 1
A <- exp(1 / 6)
R <- function(x) qq(x) / A / exp(-x**2 / 2)
repeat {
Y <- rnorm(1)
if(runif(1) <= R(Y)) {
X[r] <- Y
r <- r + 1
}
if(r > N) break
}
return(X)
}
sigma2.pr <- function(N, X=NULL) {
if(is.null(X)) X <- sigma2.ar(N)
XX <- sort(X)
w <- (XX[-1] - XX[-N]) * qq(XX[-N])
return(sum(w * XX[-N]**2) / sum(w))
}
o.ar <- o.pr <- numeric(1000)
for(i in 1:1000) {
X <- qq.ar(1000)
o.ar[i] <- mean(X**2)
o.pr[i] <- sigma2.pr(1000, X)
}
print(cbind(AR=sd(o.ar), PR=sd(o.pr)))
6


Related Solutions

Suppose that random variable X 0 = (X1, X2) is such that E[X 0 ] =...
Suppose that random variable X 0 = (X1, X2) is such that E[X 0 ] = (µ1, µ2) and var[X] = σ11 σ12 σ12 σ22 . (a matrix) (i) Let Y = a + bX1 + cX2. Obtain an expression for the mean and variance of Y . (ii) Let Y = a + BX where a' = (a1, a2) B = b11 b12 0 b22 (a matrix). Obtain an expression for the mean and variance of Y . (ii)...
A random variable X has density function f(x) = 4x ( 1 + x2)-3 for x...
A random variable X has density function f(x) = 4x ( 1 + x2)-3 for x > 0. Determine the mode of X.
2.2.8. Suppose X1 and X2 have the joint pdf f(x1, x2) = " e−x1 e−x2 x1...
2.2.8. Suppose X1 and X2 have the joint pdf f(x1, x2) = " e−x1 e−x2 x1 > 0, x2 > 0 0 elsewhere . For constants w1 > 0 and w2 > 0, let W = w1X1 + w2X2. (a) Show that the pdf of W is fW (w) = " 1 w1− w2 (e−w/w1 − e−w/w2) w > 0 0 elsewhere . (b) Verify that fW (w) > 0 for w > 0. (c) Note that the pdf fW...
Suppose we want to estimate the mean salary μ of all college graduates. We take a...
Suppose we want to estimate the mean salary μ of all college graduates. We take a sample of 25 graduates and the sample average is $39,000 with a sample standard deviation of $10,000. We construct a 95% confidence interval for the true average salary. What is the upper bound of the confidence interval i.e. what is the upper confidence limit? $39,000 $43,128 $41,000 $42.422
suppose that we want to estimate what proportion of the adult population of the United States...
suppose that we want to estimate what proportion of the adult population of the United States has high blood pressure, and we want to be “99% sure” that the error of our estimate will not exceed 2 percentage points. How large a sample will we need if (a) we have no idea what the true proportion might be; (b) we found research that states the true proportion lies on the interval from 5% to 20%
Suppose we want to estimate the average weight of children in a particular school district. The...
Suppose we want to estimate the average weight of children in a particular school district. The district contains 15 elementary schools, 10 middle schools, and 2 high schools. For each of the following approaches, select the best description of the sampling design, and identify whether the sampling scheme ensures that the probability of being included is necessarily the same for every student. A. List all the students in the school district. Average the weights of every 50th student on the...
7. Suppose that next year we again will collect data and will want to estimate the...
7. Suppose that next year we again will collect data and will want to estimate the mean height of the entire female/male SCC student body. The minimum sample size we need to take depends on what confidence level and margin of error we specify. (n≥30, use section 6.1 as a guide for this question, use ? = (s) answer from 2c, remember round answers up!) a. Determine the minimum sample size needed to be 94% confident that the sample mean...
Check the requirements for confidence intervals using the following information: Suppose we want to estimate the...
Check the requirements for confidence intervals using the following information: Suppose we want to estimate the proportion of American teenagers who would rather own movies physically vs those who would rather own movies digitally. A survey of 45 American teenagers finds that 12 of them would rather own movies physically. With this information, you plan to create a 95% Confidence Interval for the true proportion of Americans who would rather own movies physically. Conditions: i. The method of sampling      ...
Suppose we want to estimate the proportion of teenagers (aged 13-18) who are lactose intolerant. If...
Suppose we want to estimate the proportion of teenagers (aged 13-18) who are lactose intolerant. If we want to estimate this proportion to within 5% at the 95% confidence level, how many randomly selected teenagers must we survey? If 15% of adults in a certain country work from home, what is the probability that fewer than 42 out of a random sample of 350 adults will work from home? (Round your answer to 3 decimal places)
Construct a 98​% confidence interval to estimate the population mean when x overbar125 and s​ =...
Construct a 98​% confidence interval to estimate the population mean when x overbar125 and s​ = 27 for the sample sizes below. ​a) n=20       ​b) n=50       ​c) n=80 ​a) The 98​% confidence interval for the population mean when nequals20 is from a lower limit of _ to an upper limit of _.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT