Question

In: Statistics and Probability

Given the data on students' final grades in statistics (in percent), determine the following statistics.

43 45 48 51 53 54 57 59 60 60 60 60 61 70 70 71 71 72 72 72 75 76 76 79 81 81 83 85 87 88 88 89 89 91 92 93 96 98 98 99 100 101 101

Assume students are allowed to transfer the class only if they receive a grade of 70% or above. Use this fact to create a binomial distribution for students who are able to transfer the class and students who are not. Do this by finding the proportion of students who receive a grade of 70 or above (this will be the value p, and then q = 1 - p).

a. Determine the mean and standard deviation using the binomial distribution formulas.

b. Determine the range of usual values by finding the values that are significantly low and significantly high.

c. Use a normal continuous distribution to APPROXIMATE the binomial discrete probability distribution to determine the probability that at least 30 students score a 70 or more. (Be sure to use the boundary to get the more accurate/correct answer.) Show an approximation box to verify your boundaries.

d. Use a normal continuous distribution to APPROXIMATE the binomial discrete probability distribution to determine the probability that exactly 30 students score a 70 or more. (Be sure to use the boundary to get the more accurate/correct answer.) Show an approximation box to verify your boundaries.

Expert Solution

Excel Data

RStudio - RMarkdown Code

### Import the data in R

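```{r}
library(xlsx)  # provides read.xlsx()

# Read the scores from the first sheet; they are assumed to be in the first column
df <- read.xlsx(file = "data.xlsx", sheetIndex = 1)

# Flag each student as "pass" (score >= 70) or "fail" (score < 70)
df$flag <- ifelse(df[[1]] < 70, "fail", "pass")

# Counts of fail/pass, used below for n and p-hat
table(df$flag)
```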

---

### Initial calculations

$$\begin{aligned}
& n = 13+30 = 43 \\
& \hat{p} = \frac{30}{43} = `r 30/43` \approx 0.7 \\
& \hat{q} = 1 - \hat{p} = 1 - 0.7 = 0.3
\end{aligned}$$
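
As a cross-check that does not depend on the Excel file, the counts above can be reproduced by typing the 43 scores from the question directly into a vector. This is only a sketch; the names `scores`, `n_obs`, `n_pass`, `p_hat`, and `q_hat` are illustrative and not part of the original solution.

```{r}
# Scores copied from the question statement (43 values)
scores <- c(43, 45, 48, 51, 53, 54, 57, 59, 60, 60, 60, 60, 61,
            70, 70, 71, 71, 72, 72, 72, 75, 76, 76, 79, 81, 81,
            83, 85, 87, 88, 88, 89, 89, 91, 92, 93, 96, 98, 98,
            99, 100, 101, 101)

n_obs  <- length(scores)      # total number of students
n_pass <- sum(scores >= 70)   # students who are able to transfer
p_hat  <- n_pass / n_obs      # estimated success probability
q_hat  <- 1 - p_hat           # estimated failure probability

c(n = n_obs, pass = n_pass, p_hat = round(p_hat, 4), q_hat = round(q_hat, 4))
```

Keeping `p_hat` unrounded makes it easy to see how much the rounding to 0.7 affects later results.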

---

### Part a

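Using the properties of the binomial distribution:

Mean: 

$$\begin{aligned}
\mu = np = 43 \times `r 30/43` = 30
\end{aligned}$$

Variance (using the rounded values $p \approx 0.7$ and $q \approx 0.3$): 

$$\begin{aligned}
\sigma^2 = npq = 43 \times 0.7 \times 0.3 = `r 43*0.7*0.3`
\end{aligned}$$

Standard deviation:

$$\begin{aligned}
\sigma = \sqrt{npq} = \sqrt{`r 43*0.7*0.3`} = `r sqrt(43*0.7*0.3)` \approx 3
\end{aligned}$$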
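
Part a asks for the standard deviation as well as the mean, so the following sketch computes both with one small helper. The function name `binom_summary` is hypothetical. It is called once with the exact proportion 30/43 and once with the rounded value 0.7, so the effect of rounding is visible.

```{r}
# Hypothetical helper: mean, variance and standard deviation of Binomial(n, p)
binom_summary <- function(n, p) {
  q <- 1 - p
  c(mean = n * p, variance = n * p * q, sd = sqrt(n * p * q))
}

binom_summary(43, 30 / 43)  # exact proportion from the data
binom_summary(43, 0.7)      # rounded proportion used in this solution
```

Both calls give a standard deviation close to 3, which is the value carried into Part b.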

---

### Part b

We can use the $3\sigma$ rule to find the range for usual values.

$$\begin{aligned}
\sigma = \sqrt{\sigma^2} = \sqrt{`r 43*0.7*0.3`} = `r sqrt(43*0.7*0.3)` \approx 3
\end{aligned}$$


$$\begin{aligned}
& \mu \pm 3\sigma \\
& 30 \pm 3*3 \\
& 30 \pm 9 \\
& [21,39]
\end{aligned}$$

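Therefore, values below 21 are significantly low and values above 39 are significantly high. Note that many introductory textbooks define the usual range with the range rule of thumb, $\mu \pm 2\sigma$, which here would give $[24, 36]$ instead; use whichever convention your course specifies.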
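
The interval arithmetic can be sketched as a small function so that the multiplier is easy to change. The name `usual_range` and its argument `k` are hypothetical. The call with `k = 3` mirrors the rule used above, and the call with `k = 2` shows the narrower range rule of thumb found in many introductory texts, for comparison only.

```{r}
# Hypothetical helper: "usual" range mu +/- k * sigma for Binomial(n, p)
usual_range <- function(n, p, k = 3) {
  mu_b <- n * p
  sd_b <- sqrt(n * p * (1 - p))
  c(low = mu_b - k * sd_b, high = mu_b + k * sd_b)
}

usual_range(43, 0.7, k = 3)  # the 3-sigma rule used above
usual_range(43, 0.7, k = 2)  # the 2-sigma range rule of thumb, for comparison
```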

---

### Part c

#### Theorem - Normal Approximation to Binomial

If $X$ is a binomial random variable with mean $\mu=np$ and variance $\sigma^2=npq$, then the limiting form of the distribution of 

$Z = \frac{X-np}{\sqrt{npq}}$

as $n \rightarrow \infty$, is the standard normal distribution $n(z;0,1)$.

<br>

##### Conditions

1. n is sufficiently large and p is not very close to 0 or 1.

2. Approximation works well even when n is small if p is reasonably close to 0.5.

3. A general rule of thumb is to require both $np \ge 10$ and $n(1-p) \ge 10$.

$np = 43*0.7 = 30.1 > 10$

$n(1-p) = 43*0.3 = `r 43*0.3` > 10$
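
The rule-of-thumb check can also be written as a one-line predicate. This is a sketch only; the function name `normal_approx_ok` and the `threshold` argument are hypothetical. The second call uses a made-up value of p purely to show a case where the check fails.

```{r}
# Hypothetical helper: rule-of-thumb check for the normal approximation
normal_approx_ok <- function(n, p, threshold = 10) {
  (n * p >= threshold) && (n * (1 - p) >= threshold)
}

normal_approx_ok(43, 0.7)   # TRUE: 30.1 and 12.9 both exceed 10
normal_approx_ok(43, 0.95)  # FALSE: n * (1 - p) = 2.15 is too small
```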

<br>

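##### Continuity Correction

A continuity correction of 0.5 is applied to the random variable $X$ because we are approximating a discrete distribution with a continuous one. For a lower-tail probability,

$$\begin{aligned}
P(X \le k) &= \sum_{x=0}^{k}{b(x;n,p)} \\
&\approx \text{area under normal curve to the left of } k + 0.5 \\
&= P \left( Z \le \frac{k + 0.5 - np}{\sqrt{np(1-p)}} \right)
\end{aligned}$$

Part c asks for the probability that at least 30 students score 70 or more, which is the upper tail $P(X \ge 30)$. The boundary therefore sits half a unit to the left of $k$:

$$\begin{aligned}
P(X \ge k) &= \sum_{x=k}^{n}{b(x;n,p)} \\
&\approx \text{area under normal curve to the right of } k - 0.5 \\
&= P \left( Z \ge \frac{k - 0.5 - np}{\sqrt{np(1-p)}} \right)
\end{aligned}$$

With $k = 30$, the approximation box runs from 29.5 upward.

<br>

```{r Continuity Correction for normal approximation to binomial}
n <- 43
p <- 0.7
k <- 30
q <- 1 - p
mu <- n*p
sd <- sqrt(n*p*q)
# z-score of the continuity-corrected boundary k - 0.5
zs <- (k-0.5-mu)/sd
# Exact binomial: P(X >= k) = 1 - P(X <= k - 1)
prob_binom <- pbinom(q = k - 1, prob = p, size = n, lower.tail = FALSE)
# Normal approximation without continuity correction
prob_norm <- pnorm(q = k, mean = mu, sd = sd, lower.tail = FALSE)
# Normal approximation with continuity correction
prob_norm_cc <- pnorm(q = zs, lower.tail = FALSE)
```

<br>

The probability $P(X \ge `r k`)$ using the binomial distribution is `r prob_binom`.

The probability $P(X \ge `r k`)$ using the normal approximation without continuity correction is `r prob_norm`.

The probability $P(X \ge `r k`)$ using the normal approximation with continuity correction is `r prob_norm_cc`.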
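
The question asks for an "approximation box" showing the boundaries. One way to sketch this is a helper that returns the continuity-corrected boundaries together with the resulting normal probability. The function name `normal_approx_cc` and its `type` argument are hypothetical. The example evaluates the event "at least 30", for which the box starts at 29.5.

```{r}
# Hypothetical helper: continuity-corrected normal approximation for Binomial(n, p).
# `type` selects the event: "at_least" (X >= k), "at_most" (X <= k), "exactly" (X = k).
normal_approx_cc <- function(k, n, p, type = c("at_least", "at_most", "exactly")) {
  type <- match.arg(type)
  mu_b <- n * p
  sd_b <- sqrt(n * p * (1 - p))
  # Boundaries on the continuous scale (the "approximation box")
  bounds <- switch(type,
    at_least = c(lower = k - 0.5, upper = Inf),
    at_most  = c(lower = -Inf,    upper = k + 0.5),
    exactly  = c(lower = k - 0.5, upper = k + 0.5)
  )
  prob <- pnorm(bounds[["upper"]], mu_b, sd_b) - pnorm(bounds[["lower"]], mu_b, sd_b)
  list(bounds = bounds, prob = prob)
}

res <- normal_approx_cc(30, n = 43, p = 0.7, type = "at_least")
res$bounds                                             # box runs from 29.5 to Inf
res$prob                                               # normal approximation of P(X >= 30)
pbinom(29, size = 43, prob = 0.7, lower.tail = FALSE)  # exact binomial P(X >= 30)
```

Returning the boundaries alongside the probability makes it easy to confirm that the half-unit shift went in the intended direction for each kind of event.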

---

### Part d

$$\begin{aligned}
P(X = k) &= b(k;n,p) \\
&= b(30;43,0.7) \\
&\approx \text{area under normal curve between } k - 0.5 \text{ and } k + 0.5 \\
&= P \left( \frac{k - 0.5 - np}{\sqrt{np(1-p)}} < Z < \frac{k + 0.5 - np}{\sqrt{np(1-p)}} \right) \\
&= P \left( \frac{30 - 0.5 - 30.1}{\sqrt{9.03}} < Z < \frac{30 + 0.5 - 30.1}{\sqrt{9.03}} \right) \\
&= P \left( `r (29.5 - 30.1)/sqrt(9.03)` < Z < `r (30.5 - 30.1)/sqrt(9.03)` \right)
\end{aligned}$$


<br>

```{r }
k2 <- 30
z_l <- (k2-0.5-mu)/sd
z_u <- (k2+0.5-mu)/sd
prob_binom_2 <- dbinom(x = k2, prob = p, size = n)
prob_norm_cc_2 <- pnorm(q = z_u) - pnorm(q = z_l)
```

<br>

The probability $P(X = `r k2`)$ using the binomial distribution is `r prob_binom_2`.

The probability using the normal approximation with continuity correction is
$P(X = `r k2`) \approx P(`r k2 - 0.5` < X < `r k2 + 0.5`) = `r prob_norm_cc_2`$
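
To get a feel for how close the approximation is, the same comparison can be repeated for a few counts around the mean. This is an illustrative sketch with n = 43 and the rounded p = 0.7. The variable names (`n_d`, `p_d`, `k_vals`, `exact_pmf`, `approx_cc`) are hypothetical and chosen so they do not overwrite objects defined earlier.

```{r}
# Compare exact binomial point probabilities with their continuity-corrected
# normal approximations for a few counts around the mean (n = 43, p = 0.7).
n_d  <- 43
p_d  <- 0.7
mu_d <- n_d * p_d
sd_d <- sqrt(n_d * p_d * (1 - p_d))

k_vals    <- 28:32
exact_pmf <- dbinom(k_vals, size = n_d, prob = p_d)
approx_cc <- pnorm(k_vals + 0.5, mu_d, sd_d) - pnorm(k_vals - 0.5, mu_d, sd_d)

data.frame(k = k_vals,
           exact = round(exact_pmf, 4),
           approx = round(approx_cc, 4),
           abs_error = round(abs(exact_pmf - approx_cc), 4))
```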

---

### Plot for the normal approximation

```{r echo=FALSE}
library(ggplot2)

# Binomial parameters (rounded p, as in the calculations above)
n <- 43
p <- 0.7
q <- 1 - p
mu <- n*p
sd <- sqrt(n*p*q)

# Exact binomial pmf on 0..n
x <- seq(0, n, 1)
y <- dbinom(x = x, size = n, prob = p)
df <- data.frame(x = x, y = y)

col_binom <- "#F39C12"
col_norm <- "#257CE4"

# Bars: binomial pmf; curve: approximating normal density with mean mu and sd
ggplot(df, aes(x = x, y = y)) +
  theme_classic() +
  scale_x_continuous(
    name = 'X', expand = expansion(mult = c(0, 0))
  ) +
  scale_y_continuous(
    name = 'pmf - binom(x)', expand = expansion(mult = c(0, 0.05))
  ) +
  geom_bar(stat = "identity", color = col_binom, fill = col_binom) +
  geom_line(stat = "function", fun = dnorm, args = list(mean = mu, sd = sd),
            color = col_norm, size = 1
  )
```

orchestra answered 2 years ago
