Question

Programming in R

test1

[1] 62.21030 57.60602 86.21137 84.73354 83.74019 69.68914 84.57337 68.31329 74.84393 77.75101 69.23417 66.95640
[13] 68.56414 71.97554 63.92802 74.36488 72.45757 72.37171 72.23253 86.86378 91.33591 60.92220 94.63742 78.92828
[25] 85.36320 65.42284 77.67914 74.72229 66.06849 66.18031

test2

[1] 70.92537 61.84501 79.35110 66.56921 85.24835 71.78693 77.12057 82.20876 71.54209 66.11271 62.46592 79.36359
[13] 73.91162 77.18452 71.46808 72.78128 82.57056 78.34531 59.93903 64.00577 72.96255 75.81221 69.76166 68.04771
[25] 64.12077 84.65762 64.87694 80.51515 78.21864 79.27847

The two vectors above (test1 and test2) represent exam scores from two different classes of size 30 (taught by two different teachers). If one class is scoring significantly higher than the other, it could be interpreted that one teacher is more effective than the other.

a. Find the mean test score for each class. Which class did better, based on means? Would you say they did significantly better?

b. Create comparative (side-by-side) boxplots in R for the two classes. Based on these, do you believe one teacher is more effective than the other? Significantly more effective?

c. Create comparative density plots for the two small classes on the same set of axes (you can use par(new=TRUE), and make sure xlim and ylim are the same). Based on these, do you believe one teacher is more effective than the other? Significantly more effective?

d. Perform a two-sample t-test (one-tailed) to see if there is a difference in the population means for the two classes. Based on these, do you believe one teacher is more effective than the other? Significantly more effective? Significant in what sense?

Expert Solution

Hello,

Let's enter the two classes' scores as vectors in R:

test1 <- c(62.21030, 57.60602, 86.21137, 84.73354, 83.74019, 69.68914, 84.57337, 68.31329, 74.84393, 77.75101, 69.23417, 66.95640, 68.56414, 71.97554, 63.92802, 74.36488, 72.45757, 72.37171, 72.23253, 86.86378, 91.33591, 60.92220, 94.63742, 78.92828, 85.36320, 65.42284, 77.67914, 74.72229, 66.06849, 66.18031)

test2 <- c(70.92537, 61.84501, 79.35110, 66.56921, 85.24835, 71.78693, 77.12057, 82.20876, 71.54209, 66.11271, 62.46592, 79.36359, 73.91162, 77.18452, 71.46808, 72.78128, 82.57056, 78.34531, 59.93903, 64.00577, 72.96255, 75.81221, 69.76166, 68.04771, 64.12077, 84.65762, 64.87694, 80.51515, 78.21864, 79.27847)

a. The mean test score for each class can be found with R's mean function:

mean(test1)
mean(test2)

which gives the output

[1] 74.32937
[1] 73.09992

Class 1 has the higher mean, but only by about 1.23 points. The means alone cannot tell us whether that gap is significant, because they say nothing about how much the scores vary within each class; that is what the t-test in part d addresses.
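To put the gap between the means in context, one rough check is to compare it with the spread of the scores. The sketch below is an addition rather than part of the original answer: it re-enters the scores as listed in the question so that it runs on its own, and `diff_means`, `se_diff` and `ratio` are illustrative names.

```r
# Scores as listed in the question
test1 <- c(62.21030, 57.60602, 86.21137, 84.73354, 83.74019, 69.68914,
           84.57337, 68.31329, 74.84393, 77.75101, 69.23417, 66.95640,
           68.56414, 71.97554, 63.92802, 74.36488, 72.45757, 72.37171,
           72.23253, 86.86378, 91.33591, 60.92220, 94.63742, 78.92828,
           85.36320, 65.42284, 77.67914, 74.72229, 66.06849, 66.18031)
test2 <- c(70.92537, 61.84501, 79.35110, 66.56921, 85.24835, 71.78693,
           77.12057, 82.20876, 71.54209, 66.11271, 62.46592, 79.36359,
           73.91162, 77.18452, 71.46808, 72.78128, 82.57056, 78.34531,
           59.93903, 64.00577, 72.96255, 75.81221, 69.76166, 68.04771,
           64.12077, 84.65762, 64.87694, 80.51515, 78.21864, 79.27847)

# Difference between the class means
diff_means <- mean(test1) - mean(test2)

# Spread within each class
sd(test1)
sd(test2)

# Standard error of the difference (unequal variances, as in Welch's test)
se_diff <- sqrt(var(test1) / length(test1) + var(test2) / length(test2))

# How many standard errors apart the two means are
ratio <- diff_means / se_diff
ratio
```

As a rule of thumb, a ratio well below 2 means the gap is small compared with the sampling noise, which is the case here.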

b. To create side-by-side boxplots for the two classes, we use the statement:

boxplot(test1, test2, names = c("Class 1", "Class 2"), ylab = "Exam score")

In the resulting plot, the medians (bold lines) are approximately equal for the two classes. Class 1's box and upper whisker extend higher, so it has more very high scores, but the two boxes overlap heavily. The plot gives no sign that one teacher is clearly more effective, and a boxplot on its own cannot establish statistical significance.

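c. To create comparative density plots on the same axes, we draw the first density, call par(new=TRUE), and then draw the second one with identical xlim and ylim:

plot(density(test1), main="test1 vs test2", xlim=c(40,110), ylim=c(0,0.06), col="blue")
par(new=TRUE)   # draw the next plot on top of the current one
plot(density(test2), xlim=c(40,110), ylim=c(0,0.06), col="red", axes=FALSE, ann=FALSE)
legend("topright", legend=c("test1","test2"), col=c("blue","red"), lty=1)

The two density curves overlap over most of the score range; class 1's curve has a longer right tail, reflecting its few very high scores. This gives the same outcome as before: class 1 looks slightly better, but nothing in the plot shows a significant difference.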
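An alternative to par(new=TRUE) is to add the second curve with lines(), which reuses the existing axes so the limits cannot get out of step. This is a sketch under the assumption that the scores are exactly those listed in the question; the colours, line types and title are arbitrary choices.

```r
# Scores as listed in the question
test1 <- c(62.21030, 57.60602, 86.21137, 84.73354, 83.74019, 69.68914,
           84.57337, 68.31329, 74.84393, 77.75101, 69.23417, 66.95640,
           68.56414, 71.97554, 63.92802, 74.36488, 72.45757, 72.37171,
           72.23253, 86.86378, 91.33591, 60.92220, 94.63742, 78.92828,
           85.36320, 65.42284, 77.67914, 74.72229, 66.06849, 66.18031)
test2 <- c(70.92537, 61.84501, 79.35110, 66.56921, 85.24835, 71.78693,
           77.12057, 82.20876, 71.54209, 66.11271, 62.46592, 79.36359,
           73.91162, 77.18452, 71.46808, 72.78128, 82.57056, 78.34531,
           59.93903, 64.00577, 72.96255, 75.81221, 69.76166, 68.04771,
           64.12077, 84.65762, 64.87694, 80.51515, 78.21864, 79.27847)

# Kernel density estimates for each class
d1 <- density(test1)
d2 <- density(test2)

# Axis limits that are wide enough for both curves
x_lim <- range(d1$x, d2$x)
y_lim <- c(0, max(d1$y, d2$y))

# Draw class 1, then overlay class 2 on the same axes
plot(d1, xlim = x_lim, ylim = y_lim, col = "blue",
     main = "Density of exam scores", xlab = "Score")
lines(d2, col = "red", lty = 2)
legend("topright", legend = c("test1", "test2"),
       col = c("blue", "red"), lty = c(1, 2))
```

Computing the limits from both density objects avoids having to guess values for xlim and ylim by hand.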

d. To perform a two-sample t-test we use the statement:

t.test(test1, test2, var.equal = FALSE)

which gives output as

Welch Two Sample t-test

data:  test1 and test2
t = 1.0998, df = 48.817, p-value = 0.2768
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.947693 10.073258
sample estimates:
mean of x mean of y 
 74.32937  70.76658

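The p-value (0.2768) is greater than 0.05, so we fail to reject the null hypothesis that the two population means are equal. This does not show that the means are the same; it only means the observed difference is well within what chance alone could produce for two classes of 30. We therefore have no evidence that one teacher is more effective than the other. "Significant" here means statistically significant at the 5% level, which is not the same as practically important.

Two caveats about the output above. First, it is a two-sided test, while the question asks for a one-tailed one. Adding alternative = "greater" tests whether class 1's mean is higher, and because t is positive this halves the p-value (0.2768 / 2 = 0.1384), which is still above 0.05. Second, the reported mean of y (70.76658) corresponds to the first test2 score being entered as 0.92537 rather than the 70.92537 listed in the question. With 70.92537 the mean of test2 is 73.09992, the gap between the classes shrinks to about 1.23 points, and the difference remains far from significant.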
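Since the question asks for a one-tailed test, here is a minimal sketch of how that could be run. It assumes the alternative of interest is that class 1's population mean is higher, and it uses the scores exactly as listed in the question; `res` is just an illustrative name.

```r
# Scores as listed in the question
test1 <- c(62.21030, 57.60602, 86.21137, 84.73354, 83.74019, 69.68914,
           84.57337, 68.31329, 74.84393, 77.75101, 69.23417, 66.95640,
           68.56414, 71.97554, 63.92802, 74.36488, 72.45757, 72.37171,
           72.23253, 86.86378, 91.33591, 60.92220, 94.63742, 78.92828,
           85.36320, 65.42284, 77.67914, 74.72229, 66.06849, 66.18031)
test2 <- c(70.92537, 61.84501, 79.35110, 66.56921, 85.24835, 71.78693,
           77.12057, 82.20876, 71.54209, 66.11271, 62.46592, 79.36359,
           73.91162, 77.18452, 71.46808, 72.78128, 82.57056, 78.34531,
           59.93903, 64.00577, 72.96255, 75.81221, 69.76166, 68.04771,
           64.12077, 84.65762, 64.87694, 80.51515, 78.21864, 79.27847)

# One-tailed Welch test: H1 is mean(test1) > mean(test2)
res <- t.test(test1, test2, alternative = "greater", var.equal = FALSE)

res$statistic   # t statistic
res$parameter   # Welch-adjusted degrees of freedom
res$p.value     # one-sided p-value

# TRUE only if the difference is significant at the 5% level
res$p.value < 0.05
```

With these scores the final line evaluates to FALSE, so the one-tailed test also fails to reject the null hypothesis.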

Hope this helps.

Have a great day!!

venereology answered 9 months ago
