Question

Answer the following bootstrap question and show the R code:

A set of data X contains the following numbers:

119.7 104.1 92.8 85.4 108.6 93.4 67.1 88.4 101.0 97.2

95.4 77.2 100.0 114.2 150.3 102.3 105.8 107.5 0.9 94.1

We generated n = 20 observations Xi = 10Wi + 100, where Wi has a contaminated normal distribution with a contamination proportion of 20% and σc = 4.
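For readers who want to see how data of this kind can be produced, here is a minimal sketch. It assumes the usual definition of a contaminated normal: Wi is standard normal with probability 0.8 and normal with standard deviation σc = 4 with probability 0.2. The function name rcontam_x and the seed are illustrative choices, not part of the question, and the draws will not reproduce the 20 values listed above.

```r
# Sketch only: simulate X_i = 10 * W_i + 100 with W_i contaminated normal.
# rcontam_x is a hypothetical helper name; eps and sigma_c follow the question.
rcontam_x <- function(n, eps = 0.20, sigma_c = 4) {
  contaminated <- rbinom(n, size = 1, prob = eps)  # 1 = contaminated draw
  z <- rnorm(n)                                    # standard normal draws
  w <- ifelse(contaminated == 1, sigma_c * z, z)   # inflate sd for contaminated draws
  10 * w + 100
}

set.seed(1)  # arbitrary seed, chosen only for reproducibility
x_sim <- rcontam_x(20)
x_sim
```

The occasional draw with the inflated standard deviation is what produces outliers such as 0.9 and 150.3 in a sample like the one above.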

Suppose we are interested in testing:

H0: μ = 90 versus H1: μ ≠ 90 (that is, μ > 90 or μ < 90).

(a) Determine the bootstrap p-value for this situation.

(b) Use R to compute the p-value based on 3000 bootstrap resamples, and show the R code.

My R code always gives me a p-value of 0. What's wrong with it?

m <- 90
x <- c(119.7, 104.1, 92.8, 85.4, 108.6, 93.4, 67.1, 88.4, 101, 97.2, 95.4, 77.2, 100, 114.2, 150.3, 102.3, 105.8, 107.5, 0.9, 94.1 )

boottestonemean <- function(x,m,b){
n<-length(x)
m=90
b=3000
v <- mean(x)
z <- x - mean(x) + 90
counter <- 0
teststatall <- rep(0,b)
for(i in 1:b){xstar <- sample(z, n, replace=TRUE)
vstar <- mean(xstar)
if(vstar > v)
{counter <- counter+1}
teststatall[i]<-vstar}
pvalue <- counter/b
list(origtest=v,pvalue=pvalue,teststatall=teststatall)
}

Please use R.

Expert Solution

Run the code below:
set.seed(2)  # fix the seed so the first run is reproducible
m <- 90
x <- c(119.7, 104.1, 92.8, 85.4, 108.6, 93.4, 67.1, 88.4, 101, 97.2, 95.4, 77.2, 100, 114.2, 150.3, 102.3, 105.8, 107.5, 0.9, 94.1)

# Bootstrap test of H0: mean = m against the one-sided alternative mean > m.
# x: data, m: hypothesized mean, b: number of bootstrap resamples.
boottestonemean <- function(x, m, b) {
  n <- length(x)
  v <- mean(x)          # observed sample mean
  z <- x - mean(x) + m  # shift the data so that H0 holds
  counter <- 0
  teststatall <- rep(0, b)
  for (i in 1:b) {
    xstar <- sample(z, n, replace = TRUE)
    vstar <- mean(xstar)
    if (vstar > v) {
      counter <- counter + 1
    }
    teststatall[i] <- vstar
  }
  pvalue <- counter / b
  list(origtest = v, pvalue = pvalue, teststatall = teststatall)
}
R <- boottestonemean(x, 90, 3000)
R$pvalue ## bootstrap p-value

T <- (mean(x) - 90) / (sd(x) / 20^0.5) # t statistic, for comparison
T

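Running the code above gives the output below. The second call to boottestonemean is made without resetting the seed, so it draws new resamples and its p-value differs slightly.

> R$pvalue ## bootstrap p-value
[1] 0.2056667
> 
> T <- (mean(x) - 90) / (sd(x) / 20^0.5) # t statistic, for comparison
> T
[1] 0.8447507
> R <- boottestonemean(x, 90, 3000)
> R$pvalue ## bootstrap p-value
[1] 0.1896667
> 
> T <- (mean(x) - 90) / (sd(x) / 20^0.5) # t statistic, for comparison
> T
[1] 0.8447507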
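
The p-values above count only resampled means that exceed the observed mean, so they correspond to the one-sided alternative μ > 90. Since the alternative stated in the question is two-sided, a two-sided version may be closer to what is asked. The following is a minimal sketch under that assumption; the function name boot_two_sided and the choice to compare absolute deviations from the hypothesized mean are illustrative, not taken from the answer above.

```r
# Sketch only: two-sided bootstrap p-value for H0: mean = mu0.
# boot_two_sided is a hypothetical name; B is the number of resamples.
x <- c(119.7, 104.1, 92.8, 85.4, 108.6, 93.4, 67.1, 88.4, 101, 97.2,
       95.4, 77.2, 100, 114.2, 150.3, 102.3, 105.8, 107.5, 0.9, 94.1)

boot_two_sided <- function(x, mu0, B = 3000) {
  n <- length(x)
  obs_dev <- abs(mean(x) - mu0)  # observed distance from the hypothesized mean
  z <- x - mean(x) + mu0         # shift the data so that H0 holds
  boot_means <- replicate(B, mean(sample(z, n, replace = TRUE)))
  # proportion of resampled means at least as far from mu0 as the observed mean
  mean(abs(boot_means - mu0) >= obs_dev)
}

set.seed(2)  # arbitrary seed, chosen only for reproducibility
p_two <- boot_two_sided(x, mu0 = 90, B = 3000)
p_two
```

Because both tails are counted, this value would be expected to be larger than the one-sided p-value reported above.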

orchestra answered 2 years ago
