Question

Using R Studio

Now, set the seed to 348 with `set.seed()`. Then take a sample of size 10,000 from a normal distribution with a mean of 82 and a standard deviation of 11.

(a) Using sum() on a logical vector, how many draws are less than 60? Using mean() on a logical vector, what proportion of the total draws is that? How far is your answer from pnorm() in 1.1 above?

```{R}
set.seed(348)
x=rnorm(10000,82,11)
sum(ifelse(x<60,1,0))
mean(ifelse(x<60,1,0))
pnorm(60,82,11)
```

Using the sum() function, there are 128 draws that are less than 60, and using the mean() function, 0.0281 is the proportion of total draws. From these outputs we can say that the answer is quite close to the pnorm() value that has been calculated.

(b) What proportion of your sample is greater than 110 or less than 54?

(c) Why are your answers close to what you got above? Why are they not exactly the same?

(d) Using ggplot2, make a histogram of your sample. Set y=..density.. inside aes(). Overlay a normal distribution with stat_function(aes(samp), fun=dnorm, args=list(82,11)). Using geom_vline(xintercept=), add dashed vertical lines corresponding to the 2.5th and the 97.5th percentile of the sample

Expert Solution

(a) Using sum() on a logical vector, how many draws are less than 60? Using mean() on a logical vector, what proportion of the total draws is that? How far is your answer from pnorm() in 1.1 above?

R code with comments

# set the seed
set.seed(348)

# set the sample size
n <- 10000
# take a sample of size n from Normal(mean = 82, sd = 11)
x <- rnorm(n, mean = 82, sd = 11)

# a)
# number of draws that are less than 60
k <- sum(x < 60)
sprintf('The number of draws that are less than 60 is %d', k)
# what proportion of the total draws is that?
prop <- mean(x < 60)
sprintf('The proportion of the total draws that are less than 60 is %.4f', prop)
# how far is the answer from pnorm() in 1.1?
theo <- pnorm(60, mean = 82, sd = 11)
sprintf('The theoretical value from pnorm() is %.4f', theo)
sprintf('The absolute difference is %.4f', abs(prop - theo))

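The theoretical value from pnorm() is close to the sample proportion.

Note: x<60 is a logical vector (made of TRUE and FALSE values), whereas ifelse(x<60,1,0) is a numeric vector of 0s and 1s, not a logical vector. The question asks for sum() and mean() on a logical vector, so the comparison x<60 can be passed to them directly.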
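To make the note concrete, here is a minimal sketch on a small hand-made vector. The vector `v` and the names `is_low`, `low_count`, `low_prop`, and `recoded` are illustrative assumptions, not part of the original solution. It shows that sum() and mean() treat TRUE as 1 and FALSE as 0, so the ifelse() step adds nothing.

```r
# Small hand-made vector (illustrative values, not the simulated sample)
v <- c(55, 61, 59, 90, 72)

# The comparison itself returns a logical vector: TRUE FALSE TRUE FALSE FALSE
is_low <- v < 60
class(is_low)              # "logical"

# sum() counts the TRUE values; mean() gives the share of TRUE values
low_count <- sum(is_low)   # 2
low_prop  <- mean(is_low)  # 0.4

# ifelse() builds a numeric 0/1 vector; the totals match,
# but the result is no longer a logical vector
recoded <- ifelse(v < 60, 1, 0)
class(recoded)             # "numeric"
sum(recoded) == low_count  # TRUE
```

The same pattern applies unchanged to the simulated sample x from the code above.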

(b) What proportion of your sample is greater than 110 or less than 54?

R code

#b) proportion of the sample that is greater than 110 or less than 54
prop <- mean(x < 54 | x > 110)
sprintf('The proportion of the sample that is greater than 110 or less than 54 is %.4f', prop)

(c) Why are your answers close to what you got above? Why are they not exactly the same?

R code

#c) how far is the answer from pnorm()?
# P(X < 54) + P(X > 110) for X ~ Normal(mean = 82, sd = 11)
a <- pnorm(54, mean = 82, sd = 11) + (1 - pnorm(110, mean = 82, sd = 11))
sprintf('The theoretical value from pnorm() is %.4f', a)

The sample proportion from (b) is close to the theoretical proportion because the 10,000 draws come from the same Normal(82, 11) distribution that pnorm() describes, and with a sample this large the sample proportion tends to land near the true probability. The two are not exactly the same because every sample is subject to random variation: the proportion in part (b) is a sample statistic, while the value from pnorm() is the population parameter. A sample statistic estimates the population parameter, so it is close to it but not identical.
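How close "close" should be can be roughly quantified with the standard error of a sample proportion, sqrt(p(1-p)/n). The sketch below is an illustration under the assumption that the draws are independent; the names `p_theory`, `se`, and `range_95` are hypothetical and not part of the original solution.

```r
n <- 10000  # sample size used in the question

# Theoretical probability of a draw below 54 or above 110 under Normal(82, 11)
p_theory <- pnorm(54, mean = 82, sd = 11) +
  pnorm(110, mean = 82, sd = 11, lower.tail = FALSE)

# Standard error of a sample proportion based on n independent draws
se <- sqrt(p_theory * (1 - p_theory) / n)

# Rough 95% range for the sample proportion (normal approximation)
range_95 <- p_theory + c(-1.96, 1.96) * se

round(c(p_theory = p_theory, se = se), 4)
round(range_95, 4)
```

The theoretical probability is about 0.011 and the standard error is about 0.001, so a sample proportion within roughly 0.002 of the pnorm() value is what random variation alone would typically produce.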

(d) Using ggplot2, make a histogram of your sample. Set y=..density.. inside aes(). Overlay a normal distribution with stat_function(aes(samp), fun=dnorm, args=list(82,11)). Using geom_vline(xintercept=), add dashed vertical lines corresponding to the 2.5th and the 97.5th percentile of the sample

R code

#d)
library(ggplot2)
# make a histogram of the sample, scaled to density so the normal curve is comparable
# (the assignment asks for y=..density..; ggplot2 3.4.0 and later prefer after_stat(density))
p <- ggplot(data.frame(x), aes(x = x)) +
  geom_histogram(aes(y = ..density..), binwidth = 1)
# overlay the Normal(82, 11) density
p <- p + stat_function(aes(x), fun = dnorm, args = list(82, 11), color = "red")
# 2.5th and 97.5th percentiles of the sample
q <- quantile(x, c(0.025, 0.975))
p <- p + geom_vline(xintercept = q, color = "blue", linetype = "dashed")
# add the title and draw the plot
p + labs(title = "Histogram of the sample")
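As a sanity check on where the dashed lines should fall, the sample percentiles can be compared with the theoretical percentiles of Normal(82, 11). This is a small sketch using base R's qnorm(); the name `q_theory` is an illustrative assumption and not part of the original solution.

```r
# Theoretical 2.5th and 97.5th percentiles of Normal(mean = 82, sd = 11),
# i.e. the mean minus/plus about 1.96 standard deviations
q_theory <- qnorm(c(0.025, 0.975), mean = 82, sd = 11)
round(q_theory, 2)
```

These come out at about 60.44 and 103.56, so the sample values from quantile(x, c(0.025, 0.975)) should sit near them, for the same reason the sample proportions sit near the pnorm() values.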

milcah answered 1 year ago
