Questions

Question 1. Given the following table answer the questions.

                       Score
Respondent        X     Y     x − x̄   (x − x̄)²   y − ȳ   (y − ȳ)²   (x − x̄)(y − ȳ)
A. Rose          18    92
B. Bush          36    65
C. Novicevic     24    91
D. Vitell        28    85
E. Walker        25    70

a. Calculate Pearson’s Product-Moment Correlation Coefficient (r). Show your work.

b. Based on the correlation coefficient which you calculated, in two words how would you describe the relationship between the two variables in the “test”?

Question 2. PHR Score

                       Score
Respondent        X     Y     x − x̄   (x − x̄)²   y − ȳ   (y − ȳ)²   (x − x̄)(y − ȳ)
A. Selber        96    92
B. Franklin      56    65
C. Nichols       84    91
D. Vitello       88    85
E. Grado         72    70

a. Calculate Pearson’s Product-Moment Correlation Coefficient (r). Show your work.

b. Based on the correlation coefficient which you calculated, in two words how would you describe the relationship between the two variables in the “test”?

Question 3. Which of the two “tests” above would you choose? Why?
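The deviation columns in the two tables correspond directly to the terms of the formula r = Σ(x − x̄)(y − ȳ) / sqrt(Σ(x − x̄)² · Σ(y − ȳ)²). A minimal sketch in R, assuming the scores are exactly as printed above; the helper name `pearson_r` is introduced here for illustration only:

```r
# Pearson's r from the deviation columns of the worksheet
pearson_r <- function(x, y) {
  dx <- x - mean(x)                      # deviations of X from its mean
  dy <- y - mean(y)                      # deviations of Y from its mean
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

y  <- c(92, 65, 91, 85, 70)              # Y column (identical in both questions)
x1 <- c(18, 36, 24, 28, 25)              # Question 1, X column
x2 <- c(96, 56, 84, 88, 72)              # Question 2, X column

r1 <- pearson_r(x1, y)
r2 <- pearson_r(x2, y)
print(round(c(question1 = r1, question2 = r2), 3))
```

The built-in `cor(x1, y)` should return the same values and can serve as a check on the hand calculation.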

In: Statistics and Probability


The types of browse favored by deer are shown in the following table. Using binoculars, volunteers observed the feeding habits of a random sample of 320 deer.

Type of Browse     Plant Composition in Study Area     Observed Number of Deer Feeding on This Plant
Sage brush         32%                                 108
Rabbit brush       38.7%                               113
Salt brush         12%                                 39
Service berry      9.3%                                32
Other              8%                                  28

Use a 5% level of significance to test the claim that the natural distribution of browse fits the deer feeding pattern.

(a) What is the level of significance? State the null and alternate hypotheses.

H0: The distributions are the same. H1: The distributions are the same.

H0: The distributions are the same. H1: The distributions are different.

H0: The distributions are different. H1: The distributions are different.

H0: The distributions are different. H1: The distributions are the same.

(b) Find the value of the chi-square statistic for the sample. (Round the expected frequencies to at least three decimal places. Round the test statistic to three decimal places.)

Are all the expected frequencies greater than 5? Yes / No

What sampling distribution will you use? Student's t / normal / binomial / uniform / chi-square

What are the degrees of freedom?

(c) Estimate the P-value of the sample test statistic.

P-value > 0.100

0.050 < P-value < 0.100

0.025 < P-value < 0.050

0.010 < P-value < 0.025

0.005 < P-value < 0.010

P-value < 0.005

(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis that the population fits the specified distribution of categories?

Since the P-value > α, we fail to reject the null hypothesis.

Since the P-value > α, we reject the null hypothesis.

Since the P-value ≤ α, we reject the null hypothesis.

Since the P-value ≤ α, we fail to reject the null hypothesis.

(e) Interpret your conclusion in the context of the application.

At the 5% level of significance, the evidence is sufficient to conclude that the natural distribution of browse does not fit the feeding pattern.

At the 5% level of significance, the evidence is insufficient to conclude that the natural distribution of browse does not fit the feeding pattern.
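The statistic asked for in part (b) is the goodness-of-fit sum Σ(O − E)²/E, with E = n · (plant composition share). A small sketch of that arithmetic in R, assuming the counts and percentages are exactly as listed in the table; the variable names are illustrative:

```r
# Chi-square goodness-of-fit, computed term by term
observed <- c(sage = 108, rabbit = 113, salt = 39, service = 32, other = 28)
p0       <- c(0.32, 0.387, 0.12, 0.093, 0.08)    # plant composition in the study area

n        <- sum(observed)                        # sample size (320 deer)
expected <- n * p0                               # expected counts under H0
chi_sq   <- sum((observed - expected)^2 / expected)
df       <- length(observed) - 1                 # categories minus one
p_value  <- pchisq(chi_sq, df, lower.tail = FALSE)

print(round(expected, 3))
print(round(c(chi_sq = chi_sq, df = df, p_value = p_value), 3))
```

Comparing `p_value` with α = 0.05 then gives the decision for part (d).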

In: Statistics and Probability


Suppose we wanted to predict the selling price of a house, using its size, in a certain area of a city. A random sample of six houses was selected from the area. The data are presented in the following table, with size given in hundreds of square feet and sale price in thousands of dollars:

Temperature (°F): Xi      16     28     13     22     25     19
Number of Calls: Yi       95    120     70    115    130     85

We are interested in fitting the following simple linear regression model: Y = Xβ + ε

a) Calculate X′X, (X′X)⁻¹, and X′Y, and then calculate the least squares estimates of β0 and β1.
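For the model Y = Xβ + ε the least squares estimate is β̂ = (X′X)⁻¹X′Y, where X has a column of ones followed by the Xi values. A minimal sketch in R, assuming the six (Xi, Yi) pairs are read from the table in the order printed:

```r
# Least squares via the normal equations
x <- c(16, 28, 13, 22, 25, 19)
y <- c(95, 120, 70, 115, 130, 85)

X       <- cbind(intercept = 1, x = x)   # design matrix: ones, then the predictor
XtX     <- t(X) %*% X                    # X'X
XtX_inv <- solve(XtX)                    # (X'X)^(-1)
XtY     <- t(X) %*% y                    # X'Y
beta_hat <- XtX_inv %*% XtY              # first row is beta0, second row is beta1

print(XtX)
print(XtX_inv)
print(XtY)
print(beta_hat)
```

`coef(lm(y ~ x))` fits the same model and should agree with `beta_hat`.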

In: Statistics and Probability


The type of household for the U.S. population and for a random sample of 411 households from a community in Montana are shown below.

Type of Household                     Percent of U.S. Households     Observed Number of Households in the Community
Married with children                 26%                            106
Married, no children                  29%                            101
Single parent                         9%                             31
One person                            25%                            102
Other (e.g., roommates, siblings)     11%                            71

Use a 5% level of significance to test the claim that the distribution of U.S. households fits the Dove Creek distribution.

(a) What is the level of significance? State the null and alternate hypotheses.

H0: The distributions are the same. H1: The distributions are different.

H0: The distributions are different. H1: The distributions are the same.

H0: The distributions are different. H1: The distributions are different.

H0: The distributions are the same. H1: The distributions are the same.

(b) Find the value of the chi-square statistic for the sample. (Round the expected frequencies to two decimal places. Round the test statistic to three decimal places.)

Are all the expected frequencies greater than 5? Yes / No

What sampling distribution will you use? normal / Student's t / binomial / chi-square / uniform

What are the degrees of freedom?

(c) Find or estimate the P-value of the sample test statistic. (Round your answer to three decimal places.)

(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis that the population fits the specified distribution of categories?

Since the P-value > α, we fail to reject the null hypothesis.

Since the P-value > α, we reject the null hypothesis.

Since the P-value ≤ α, we reject the null hypothesis.

Since the P-value ≤ α, we fail to reject the null hypothesis.

(e) Interpret your conclusion in the context of the application.

At the 5% level of significance, the evidence is sufficient to conclude that the community household distribution does not fit the general U.S. household distribution.

At the 5% level of significance, the evidence is insufficient to conclude that the community household distribution does not fit the general U.S. household distribution.
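Parts (b) and (c) can also be checked with R's built-in `chisq.test`, which takes the observed counts and the hypothesised proportions. A short sketch, assuming the table values above:

```r
# Goodness-of-fit test with the built-in function
observed <- c(106, 101, 31, 102, 71)             # community counts (n = 411)
p_us     <- c(0.26, 0.29, 0.09, 0.25, 0.11)      # U.S. household proportions

res <- chisq.test(observed, p = p_us)
print(round(res$expected, 2))                    # expected counts under H0
print(res$statistic)                             # chi-square statistic
print(res$parameter)                             # degrees of freedom
print(round(res$p.value, 3))                     # P-value
```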

In: Statistics and Probability


(1 point) The table below lists the weights of some college students in September and later in February of their freshman year.

September weight   62 54 71 65 53 58 74 49 66 56 69 74 61 56 71 51 70 52 70 78 59 60 73 60
February weight    67 53 67 71 56 56 77 54 67 55 64 70 56 59 74 56 69 53 65 72 59 57 77 54

At the 0.05 significance level, test the claim of no difference between September weights and February weights.

The test statistic is

The critical value is
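Because each student is measured twice, this is a matched-pairs design, so one reasonable approach is a paired t test on the differences. A sketch in R, assuming the differences are taken as September minus February (reversing the order only flips the sign of the statistic):

```r
# Paired t test on September vs. February weights
september <- c(62, 54, 71, 65, 53, 58, 74, 49, 66, 56, 69, 74,
               61, 56, 71, 51, 70, 52, 70, 78, 59, 60, 73, 60)
february  <- c(67, 53, 67, 71, 56, 56, 77, 54, 67, 55, 64, 70,
               56, 59, 74, 56, 69, 53, 65, 72, 59, 57, 77, 54)

res    <- t.test(september, february, paired = TRUE)
t_stat <- unname(res$statistic)                       # mean(d) / (sd(d) / sqrt(n))
crit   <- qt(1 - 0.05 / 2, df = length(september) - 1) # two-sided critical value

print(round(c(t_stat = t_stat, critical = crit), 3))
```

The claim of no difference is rejected only if `abs(t_stat)` exceeds `crit`.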

In: Statistics and Probability


Take a look at the four requirements for binomial probability distributions:

1. Fixed number of single observations (trials)

2. Each trial is independent

3. Each trial must have outcomes that fall into one of two categories (success, failure)

4. The probability of success remains the same for every trial.

Come up with an example scenario in which you would have a binomial probability distribution to work with.  
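As one hypothetical scenario (not part of the original question): an inspector checks n = 20 items, each independently defective with probability 0.05. The count of defective items then meets all four requirements, and its probabilities can be sketched in R as follows:

```r
# Hypothetical binomial scenario: defective items among 20 inspected
n <- 20       # fixed number of trials (assumed for illustration)
p <- 0.05     # same success probability on every independent trial (assumed)

probs         <- dbinom(0:n, size = n, prob = p)   # P(X = 0), ..., P(X = 20)
p_at_most_one <- pbinom(1, size = n, prob = p)     # P(X <= 1)
mean_defects  <- sum(0:n * probs)                  # equals n * p

print(round(c(p_at_most_one = p_at_most_one, mean_defects = mean_defects), 4))
```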

In: Statistics and Probability


Fit a Beta distribution to the following data. Start from initial guesses shape1=3, shape2=6.

0.18573722 0.41073334 0.56355831 0.06673358 0.43762574 0.45158744 0.27369696

0.27787527 0.27522373 0.22834730 0.28829185 0.21342879 0.24438748 0.37434856

0.53836814 0.34561632 0.28320219 0.22540641 0.23575321 0.38607398 0.28625720

0.29384326 0.44312820 0.25625404 0.15563416 0.46424265 0.21000100 0.36114007

0.22198265 0.56719777

What is the estimated shape2?
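One way to fit the distribution is maximum likelihood, minimising the negative Beta log-likelihood from the stated starting values. The sketch below uses base R's `optim` and optimises on the log scale so both shapes stay positive; this is an assumed approach, and the question may have intended a different fitting function (for example `MASS::fitdistr`), which could give slightly different digits.

```r
# Maximum-likelihood fit of Beta(shape1, shape2), started at (3, 6)
x <- c(0.18573722, 0.41073334, 0.56355831, 0.06673358, 0.43762574, 0.45158744,
       0.27369696, 0.27787527, 0.27522373, 0.22834730, 0.28829185, 0.21342879,
       0.24438748, 0.37434856, 0.53836814, 0.34561632, 0.28320219, 0.22540641,
       0.23575321, 0.38607398, 0.28625720, 0.29384326, 0.44312820, 0.25625404,
       0.15563416, 0.46424265, 0.21000100, 0.36114007, 0.22198265, 0.56719777)

# Negative log-likelihood in terms of log(shape1), log(shape2)
negloglik <- function(log_par) {
  -sum(dbeta(x, shape1 = exp(log_par[1]), shape2 = exp(log_par[2]), log = TRUE))
}

fit <- optim(par = log(c(3, 6)), fn = negloglik)   # Nelder-Mead by default
est <- exp(fit$par)
names(est) <- c("shape1", "shape2")
print(round(est, 3))
```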

In: Statistics and Probability


For experts in R

I have tried to solve this question (using the faithful data set), but I get an error every time. Anything you write will be helpful.

Modify the EM-algorithm functions to work for a general K-component Gaussian mixture. Use these functions to fit K = 1, 2, 3, 4 models to the Old Faithful data available in R. (You need to initialize the EM algorithm first.)

Which model seems to fit the data better? (Hint: use BIC to compare models.)

Here is what I tried:

## EM algorithm for univariate normal mixture

# The E-step
E.step <- function(x, pi, Mu, S2){
  K <- length(pi)
  n <- length(x)
  tau <- matrix(rep(NA, n * K), ncol = K)
  for (i in 1:n){
    for (k in 1:K){
      tau[i,k] <- pi[k] * dnorm(x[i], Mu[k], sqrt(S2[k]))
    }
    tau[i,] <- tau[i,] / sum(tau[i,])
  }
  return(tau)
}

#The M-step
M.step <- function(x, tau){
  n <- length(x)
  K <- dim(tau)[2]
  tau.sum <- apply(tau, 2, sum)
  pi <- tau.sum / n
  Mu <- as.vector(t(tau) %*% x / tau.sum)
  # S2 was never created inside the function; allocate it and update each component
  S2 <- numeric(K)
  for (k in 1:K){
    S2[k] <- sum(tau[,k] * (x - Mu[k])^2) / tau.sum[k]
  }
  return(list(pi = pi, Mu = Mu, S2 = S2))
}

## The log-likelihood function
logL <- function(x, pi, Mu, S2){
  n <- length(x)
  ll <- 0
  for (i in 1:n){
    ll <- ll + log(pi[1] * dnorm(x[i], Mu[1], sqrt(S2[1])) +
                   pi[2] * dnorm(x[i], Mu[2], sqrt(S2[2])))
  }
  return(ll)
}

## The algorithm
EM <- function(x, pi, Mu, S2, tol){
  t <- 0
  ll.old <- -Inf
  ll <- logL(x, pi, Mu, S2)
  repeat{
    t <- t + 1
    # stop when the relative improvement in the log-likelihood is below tol
    if ((ll - ll.old) / abs(ll) < tol) break
    ll.old <- ll
    tau <- E.step(x, pi, Mu, S2)
    M <- M.step(x, tau)
    pi <- M$pi
    Mu <- M$Mu
    S2 <- M$S2
    ll <- logL(x, M$pi, M$Mu, M$S2)
    cat("Iteration", t, "logL =", ll, "\n")
  }
  return(list(pi = M$pi, Mu = M$Mu, S2 = M$S2, tau = tau, logL = ll))
}

## generate data
set.seed(1)
pi <- c(0.3, 0.7)
Mu <- c(5, 10)
S2 <- c(1, 1)
n <- 1000
n1 <- rbinom(1, n, pi[1])
n2 <- n - n1
x1 <- rnorm(n1, Mu[1], sqrt(S2[1]))
x2 <- rnorm(n2, Mu[2], sqrt(S2[2]))
x <- c(x1, x2)
hist(x, freq = FALSE, ylim = c(0, 0.2))

# pick initial values
pi.init <- c(0.5, 0.5)
Mu.init <- c(3, 10)
S2.init <- c(0.4, 2)

#Run EM
A <- EM(x, pi.init, Mu.init, S2.init, tol = 10^-6)

#plot
t <- seq(0, 15, by = 0.01)
y <- pi[1] * dnorm(t, Mu[1], sqrt(S2[1])) +
     pi[2] * dnorm(t, Mu[2], sqrt(S2[2]))
y.est <- A$pi[1] * dnorm(t, A$Mu[1], sqrt(A$S2[1])) +
         A$pi[2] * dnorm(t, A$Mu[2], sqrt(A$S2[2]))
points(t, y, type = "l")
points(t, y.est, type = "l", col = 2, lty = 2)

# assign observations to components - clustering
d <- function(x) which(x == max(x))
apply(A$tau, 1, d)
apply(A$tau, 1, which.max)

# assess misclassification
table(apply(A$tau, 1, which.max), c(rep(1, n1), rep(2, n2)))
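A compact sketch of how the same E- and M-steps might be written for any K and compared by BIC is given below. Several choices are assumptions rather than part of the question: the function name `em_mixture`, the use of `faithful$waiting` as the univariate variable, the quantile-based initialisation, the small variance floor, and BIC defined as −2 logL + p log(n) with p = 3K − 1 free parameters (smaller is better under this sign convention).

```r
# EM for a univariate K-component Gaussian mixture, plus BIC
em_mixture <- function(x, K, tol = 1e-8, max_iter = 2000) {
  n <- length(x)
  # Assumed initialisation: equal weights, means at evenly spaced quantiles,
  # and the overall sample variance for every component
  w  <- rep(1 / K, K)
  mu <- as.numeric(quantile(x, probs = (1:K) / (K + 1)))
  s2 <- rep(var(x), K)
  ll_old <- -Inf
  for (iter in 1:max_iter) {
    # E-step: n x K matrix of weighted component densities
    dens <- sapply(1:K, function(k) w[k] * dnorm(x, mu[k], sqrt(s2[k])))
    dens <- matrix(dens, nrow = n)               # stays a matrix when K = 1
    ll   <- sum(log(rowSums(dens)))              # log-likelihood at current parameters
    if (abs(ll - ll_old) < tol * abs(ll)) break  # relative-change stopping rule
    ll_old <- ll
    tau <- dens / rowSums(dens)                  # posterior membership probabilities
    # M-step: weighted updates for all K components
    nk <- colSums(tau)
    w  <- nk / n
    mu <- colSums(tau * x) / nk
    s2 <- sapply(1:K, function(k) sum(tau[, k] * (x - mu[k])^2) / nk[k])
    s2 <- pmax(s2, 1e-6)                         # assumed floor against variance collapse
  }
  n_par <- 3 * K - 1                             # K means, K variances, K - 1 free weights
  list(w = w, mu = mu, s2 = s2, logL = ll, BIC = -2 * ll + n_par * log(n))
}

x   <- faithful$waiting                          # assumed choice of variable
bic <- sapply(1:4, function(K) em_mixture(x, K)$BIC)
names(bic) <- paste0("K=", 1:4)
print(round(bic, 1))
```

EM only finds a local maximum, so the fitted values (and the BIC ranking for larger K) can depend on the initialisation; trying a few different starting points is a sensible check.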

In: Statistics and Probability


D.48  Predicting Percent Body Fat

Data 10.1 introduces the dataset BodyFat. Computer output is shown for using this sample to create a multiple regression model to predict percent body fat using the other nine variables.

The regression equation is

Bodyfat = −23.7 + 0.0838 Age − 0.0833 Weight + 0.036 Height + 0.001 Neck − 0.139 Chest + 1.03 Abdomen + 0.226 Ankle + 0.148 Biceps − 2.20 Wrist

Predictor     Coef        SE Coef     T        P
Constant      −23.66      29.46       −0.80    0.424
Age           0.08378     0.05066     1.65     0.102
Weight        −0.08332    0.08471     −0.98    0.328
Height        0.0359      0.2658      0.14     0.893
Neck          0.0011      0.3801      0.00     0.998
Chest         −0.1387     0.1609      −0.86    0.391
Abdomen       1.0327      0.1459      7.08     0.000
Ankle         0.2259      0.5417      0.42     0.678
Biceps        0.1483      0.2295      0.65     0.520
Wrist         −2.2034     0.8129      −2.71    0.008

S = 4.13552     R-Sq = 75.7%     R-Sq(adj) = 73.3%

Analysis of Variance

Source            DF     SS          MS         F        P
Regression        9      4807.36     534.15     31.23    0.000
Residual Error    90     1539.23     17.10
Total             99     6346.59

(a)  Interpret the coefficients of Age and Abdomen in context. Age is measured in years and Abdomen is abdomen circumference in centimeters.

(b)  Use the p-value from the ANOVA test to determine whether the model is effective.

(c)  Interpret R² in context.

(d)  Which explanatory variable is most significant in the model? Which is least significant?

(e)  Which variables are significant at a 5% level?
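The summary figures in the output are linked by simple arithmetic: R² = SS(Regression)/SS(Total), F = MS(Regression)/MS(Residual Error), and each T is Coef/SE Coef with 90 residual degrees of freedom. A short sketch in R that recomputes them from the printed table, with the Wrist row as the example coefficient:

```r
# Recomputing R-Sq, R-Sq(adj), F and one coefficient p-value from the output
ss_reg <- 4807.36; df_reg <- 9
ss_res <- 1539.23; df_res <- 90
ss_tot <- ss_reg + ss_res                       # 6346.59, the Total row

r_sq     <- ss_reg / ss_tot
adj_r_sq <- 1 - (ss_res / df_res) / (ss_tot / (df_reg + df_res))
f_stat   <- (ss_reg / df_reg) / (ss_res / df_res)
f_p      <- pf(f_stat, df_reg, df_res, lower.tail = FALSE)

t_wrist <- -2.2034 / 0.8129                     # Coef / SE Coef for Wrist
p_wrist <- 2 * pt(-abs(t_wrist), df = df_res)   # two-sided p-value

print(round(c(r_sq = r_sq, adj_r_sq = adj_r_sq, f_stat = f_stat), 3))
print(round(c(t_wrist = t_wrist, p_wrist = p_wrist), 3))
```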

In: Statistics and Probability


Let x = age in years of a rural Quebec woman at the time of her first marriage. In the year 1941, the population variance of x was approximately σ² = 5.1. Suppose a recent study of age at first marriage for a random sample of 31 women in rural Quebec gave a sample variance s² = 2.4. Use a 5% level of significance to test the claim that the current variance is less than 5.1. Find a 90% confidence interval for the population variance.

(a) What is the level of significance?


State the null and alternate hypotheses.

H0: σ² = 5.1; H1: σ² > 5.1

H0: σ² = 5.1; H1: σ² ≠ 5.1

H0: σ² = 5.1; H1: σ² < 5.1

H0: σ² < 5.1; H1: σ² = 5.1


(b) Find the value of the chi-square statistic for the sample. (Round your answer to two decimal places.)


What are the degrees of freedom?


What assumptions are you making about the original distribution?

We assume an exponential population distribution.

We assume a binomial population distribution.   

We assume a uniform population distribution.

We assume a normal population distribution.


(c) Find or estimate the P-value of the sample test statistic.

P-value > 0.100

0.050 < P-value < 0.100   

0.025 < P-value < 0.050

0.010 < P-value < 0.025

0.005 < P-value < 0.010

P-value < 0.005


(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?

Since the P-value > α, we fail to reject the null hypothesis.

Since the P-value > α, we reject the null hypothesis.    

Since the P-value ≤ α, we reject the null hypothesis.

Since the P-value ≤ α, we fail to reject the null hypothesis.


(e) Interpret your conclusion in the context of the application.

At the 5% level of significance, there is insufficient evidence to conclude that the variance of age at first marriage is less than 5.1.

At the 5% level of significance, there is sufficient evidence to conclude that the variance of age at first marriage is less than 5.1.


(f) Find the requested confidence interval for the population variance. (Round your answers to two decimal places.)

lower limit
upper limit    


Interpret the results in the context of the application.

We are 90% confident that σ² lies within this interval.

We are 90% confident that σ² lies outside this interval.

We are 90% confident that σ² lies above this interval.

We are 90% confident that σ² lies below this interval.
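The test statistic here is χ² = (n − 1)s²/σ² with n − 1 degrees of freedom, and the confidence limits come from dividing (n − 1)s² by the upper and lower chi-square quantiles. A minimal sketch in R, assuming a left-tailed test for H1: σ² < 5.1:

```r
# Chi-square test and 90% confidence interval for a variance
n        <- 31
s2       <- 2.4      # sample variance
sigma2_0 <- 5.1      # variance under H0

df      <- n - 1
chi_sq  <- df * s2 / sigma2_0
p_value <- pchisq(chi_sq, df)                     # left-tail area

ci <- c(lower = df * s2 / qchisq(0.95, df),       # divide by the upper 5% quantile
        upper = df * s2 / qchisq(0.05, df))       # divide by the lower 5% quantile

print(round(c(chi_sq = chi_sq, df = df, p_value = p_value), 4))
print(round(ci, 2))
```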

In: Statistics and Probability


In a clinic, the systolic blood pressures in mmHg of a random sample of 10 patients with a certain metabolic disorder were collected. Assume blood pressure is normally distributed in the population. The mean of this sample was 103.2 mmHg with a (sample) standard deviation of 15.0 mmHg. Test the hypothesis that the mean blood pressure of this sample of patients differs from the known population mean systolic blood pressure of 121.2 mmHg.

Show your working including null hypothesis, alternative hypothesis, test statistic and p-value, and interpret your p-value. Also give a 95% confidence interval. What experimental design is this? What do we mean by “a random sample” in the question?

If the population standard deviation was actually known to be 15.0 mmHg exactly (from previous large studies), compute your p-value in this case.
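With σ unknown the statistic t = (x̄ − μ0)/(s/√n) is referred to a t distribution with n − 1 degrees of freedom; if σ is known, the same ratio is referred to the standard normal instead. A sketch of both calculations in R from the summary figures given, assuming a two-sided alternative:

```r
# One-sample test from summary statistics: t version and known-sigma (z) version
n    <- 10
xbar <- 103.2
s    <- 15.0
mu0  <- 121.2

se     <- s / sqrt(n)                              # standard error of the mean
t_stat <- (xbar - mu0) / se
p_t    <- 2 * pt(-abs(t_stat), df = n - 1)         # two-sided p-value, sigma unknown
ci     <- xbar + c(-1, 1) * qt(0.975, n - 1) * se  # 95% confidence interval
p_z    <- 2 * pnorm(-abs(t_stat))                  # two-sided p-value, sigma = 15 known

print(round(c(t_stat = t_stat, p_t = p_t, p_z = p_z), 5))
print(round(ci, 2))
```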

In: Statistics and Probability


A special diet is intended to reduce systolic blood pressure among patients diagnosed with stage 2 hypertension. If the diet is effective, the target is to have the average systolic blood pressure of this group be below 150. After six months on the diet, an SRS of 28 patients had an average systolic blood pressure of x̄ = 143 with standard deviation s = 21. Is this sufficient evidence that the diet is effective in meeting the target? Assume the distribution of the systolic blood pressure for patients in this group is approximately Normal with mean μ.

Given a P-value between 0.01 and 0.05, what conclusion should you draw at the 5% level of significance?

No conclusion can be drawn without knowing the exact P-value.

Accept the null hypothesis, because the P-value is less than the level of significance.

Reject the null hypothesis, because the P-value is less than the level of significance.

Fail to reject the null hypothesis, because the P-value is less than the level of significance.
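The stated P-value range can be reproduced from the summary statistics with a one-sided t test of H0: μ = 150 against H1: μ < 150. A brief sketch in R under that assumed setup:

```r
# One-sided, one-sample t test from summary statistics
n    <- 28
xbar <- 143
s    <- 21
mu0  <- 150

t_stat  <- (xbar - mu0) / (s / sqrt(n))
p_value <- pt(t_stat, df = n - 1)       # left-tail area, since H1 is mu < 150

print(round(c(t_stat = t_stat, p_value = p_value), 4))
```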

In: Statistics and Probability


The average amount of sugar in a generic brand of cereal is 660 mg, and the standard deviation is 35 mg. Assume the variable is normally distributed.

a.) If a single cereal is selected, find the probability that the sugar content will be more than 670 mg.

b.) If a sample of 10 cereals is selected, find the probability that the mean of the sample will be larger than 670 mg.

c.) Why is the probability for part (a) greater than for part (b)?
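Parts (a) and (b) differ only in the standard deviation used: σ for a single cereal and σ/√n for a sample mean. A minimal sketch in R using the figures given:

```r
# Normal tail probabilities for one observation and for a sample mean
mu    <- 660
sigma <- 35
n     <- 10

p_single <- pnorm(670, mean = mu, sd = sigma, lower.tail = FALSE)            # part (a)
p_mean   <- pnorm(670, mean = mu, sd = sigma / sqrt(n), lower.tail = FALSE)  # part (b)

print(round(c(p_single = p_single, p_mean = p_mean), 4))
```

The smaller standard error in (b) concentrates the sample mean closer to 660, which is why its tail probability is smaller.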

In: Statistics and Probability


The quality control manager at a light bulb factory needs to estimate the mean life of a large shipment of light bulbs. The standard deviation is 108 hours. A random sample of 64 light bulbs indicated a sample mean life of 410 hours. Complete parts (a) through (d) below.

a. Construct a 95% confidence interval estimate for the population mean life of light bulbs in this shipment.

The 95% confidence interval estimate is from a lower limit of ___ hours to an upper limit of ___ hours.
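Since the standard deviation is treated as known, the interval has the form x̄ ± z · σ/√n. A short sketch in R, assuming the 108 hours is the population standard deviation:

```r
# 95% z-interval for the mean with known sigma
sigma <- 108
n     <- 64
xbar  <- 410

z      <- qnorm(0.975)                  # two-sided 95% critical value
margin <- z * sigma / sqrt(n)           # margin of error
ci     <- c(lower = xbar - margin, upper = xbar + margin)

print(round(ci, 2))
```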

In: Statistics and Probability


Your instructor randomly chooses one of two coins, each with probability 0.5, and asks you to decide which coin he chose based on the outcome of 3 tosses. Tossing coin 1 yields a head with probability P(X1 = H) = .3 (and a tail with P(X1 = T) = .7). Tossing coin 2 yields a head with probability P(X2 = H) = .6 (and a tail with P(X2 = T) = .4). You earn $1 if you guess the coin correctly and $0 otherwise. Design the optimum decision rule and estimate your average earnings.
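With equal prior probabilities, the rule that maximises the chance of a correct guess picks the coin under which the observed number of heads is more likely. A sketch of that comparison in R, assuming the three tosses are independent so that only the head count matters:

```r
# Maximum-likelihood decision rule for 3 tosses, equal priors
heads <- 0:3
lik1  <- dbinom(heads, size = 3, prob = 0.3)   # P(k heads | coin 1)
lik2  <- dbinom(heads, size = 3, prob = 0.6)   # P(k heads | coin 2)

guess     <- ifelse(lik1 >= lik2, 1, 2)        # coin to guess for each head count
p_correct <- sum(0.5 * pmax(lik1, lik2))       # probability the guess is right

print(data.frame(heads = heads, coin1 = lik1, coin2 = lik2, guess = guess))
print(p_correct)                               # expected earnings in dollars
```

Under these assumptions the rule is to guess coin 1 for 0 or 1 heads and coin 2 for 2 or 3 heads, with expected earnings of about $0.716 per game.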

In: Statistics and Probability