Questions
The following data show the brand, price ($), and the overall score for six stereo headphones...

The following data show the brand, price ($), and the overall score for six stereo headphones that were tested by a certain magazine. The overall score is based on sound quality and effectiveness of ambient noise reduction. Scores range from 0 (lowest) to 100 (highest). The estimated regression equation for these data is

ŷ = 23.194 + 0.318x,

where x = price ($) and y = overall score.

Brand Price ($) Score
A 180 76
B 150 71
C 95 61
D 70 58
E 70 38
F 35 26

(a) Compute SST, SSR, and SSE. (Round your answers to three decimal places.)

SST =  

SSR =

SSE =

(b) Compute the coefficient of determination r². (Round your answer to three decimal places.)

r² =

(c) What is the value of the sample correlation coefficient? (Round your answer to three decimal places.)
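One way to set up parts (a) to (c) is sketched below in Python. It uses the rounded coefficients given in the question and obtains SSR from the identity SST = SSR + SSE; that identity is exact only for the unrounded least-squares fit, so treat the third decimal place as approximate. The variable names are illustrative.

```python
import math

# Data from the table in the question.
prices = [180, 150, 95, 70, 70, 35]
scores = [76, 71, 61, 58, 38, 26]

# Coefficients of the estimated regression equation given in the question.
b0, b1 = 23.194, 0.318

y_bar = sum(scores) / len(scores)
y_hat = [b0 + b1 * x for x in prices]

# SST: total variation of the scores around their mean.
sst = sum((y - y_bar) ** 2 for y in scores)
# SSE: variation left unexplained by the fitted line.
sse = sum((y - yh) ** 2 for y, yh in zip(scores, y_hat))
# SSR via SST = SSR + SSE (assumption: the rounded coefficients are close
# enough to the exact least-squares fit for this identity to be used).
ssr = sst - sse

r_squared = ssr / sst
# The sample correlation coefficient takes the sign of the slope b1.
r = math.copysign(math.sqrt(r_squared), b1)

print(f"SST = {sst:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"r^2 = {r_squared:.3f}, r = {r:.3f}")
```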

In: Statistics and Probability

10m Wind Speed Data Day 1 Day 2 Day 3 Day 4 Day 5 Day 6...

10m Wind Speed Data

Day 1   Day 2   Day 3   Day 4   Day 5   Day 6   Day 7
3.226   2.908   2.355   3.341   1.507   6.355   6.846
3.144   3.026   2.235   3.402   1.418   6.465   6.581
3.197   2.843   2.168   3.242   1.433   6.613   6.894
3.007   2.995   2.125   3.039   1.6     6.363   6.401
3.05    3.245   2.474   2.983   1.837   6.149   7.19
3.02    3.349   2.421   3.165   2.054   7.19    6.795
3.001   3.085   2.369   2.915   1.669   5.826   7.16
2.957   3.003   2.344   2.414   2.136   6.628   6.583
3.012   3.01    2.509   1.619   2.849   5.999   6.14
3.249   3.141   2.796   1.681   2.876   6.501   6.472
3.304   3.338   2.928   1.673   3.536   6.388   7.8
3.239   3.165   2.867   2.1     3.517   5.757   7.14
3.063   2.969   3.002   2.312   3.476   6.314   6.789
2.833   3.049   3.12    2.56    4.368   7.04    7.02
2.876   3.058   3.179   2.352   4.778   6.51    5.736
2.855   3.032   3.005   2.133   4.708   6.734   7.02
3.252   3.015   2.813   1.882   4.599   6.788   5.754
3.409   2.823   2.97    2.015   5.207   6.347   6.617
3.198   2.921   3.113   2.046   5.66    7.05    5.253
2.797   2.866   3.329   1.78    5.837   6.327   6.159

For the above wind data set, find the 3 M’s (mean, median, and mode) as well as the range of data for each day.

Based on the 3 M’s and the range of the data, which day has the best wind speeds? Consider variability, overall wind speed, and stability of wind speed as your deciding factors. Explain your reasoning.
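The per-day summaries could be computed along the following lines. This is a sketch in Python that assumes the readings are laid out as 20 rows of seven values (Day 1 to Day 7), as in the data above; the helper name `summarize` is illustrative. A day in which no reading repeats is reported as having no mode.

```python
import statistics

# Each row holds one reading per day, in the order Day 1 ... Day 7.
rows = [
    [3.226, 2.908, 2.355, 3.341, 1.507, 6.355, 6.846],
    [3.144, 3.026, 2.235, 3.402, 1.418, 6.465, 6.581],
    [3.197, 2.843, 2.168, 3.242, 1.433, 6.613, 6.894],
    [3.007, 2.995, 2.125, 3.039, 1.6, 6.363, 6.401],
    [3.05, 3.245, 2.474, 2.983, 1.837, 6.149, 7.19],
    [3.02, 3.349, 2.421, 3.165, 2.054, 7.19, 6.795],
    [3.001, 3.085, 2.369, 2.915, 1.669, 5.826, 7.16],
    [2.957, 3.003, 2.344, 2.414, 2.136, 6.628, 6.583],
    [3.012, 3.01, 2.509, 1.619, 2.849, 5.999, 6.14],
    [3.249, 3.141, 2.796, 1.681, 2.876, 6.501, 6.472],
    [3.304, 3.338, 2.928, 1.673, 3.536, 6.388, 7.8],
    [3.239, 3.165, 2.867, 2.1, 3.517, 5.757, 7.14],
    [3.063, 2.969, 3.002, 2.312, 3.476, 6.314, 6.789],
    [2.833, 3.049, 3.12, 2.56, 4.368, 7.04, 7.02],
    [2.876, 3.058, 3.179, 2.352, 4.778, 6.51, 5.736],
    [2.855, 3.032, 3.005, 2.133, 4.708, 6.734, 7.02],
    [3.252, 3.015, 2.813, 1.882, 4.599, 6.788, 5.754],
    [3.409, 2.823, 2.97, 2.015, 5.207, 6.347, 6.617],
    [3.198, 2.921, 3.113, 2.046, 5.66, 7.05, 5.253],
    [2.797, 2.866, 3.329, 1.78, 5.837, 6.327, 6.159],
]

# Transpose so that each entry is the full series for one day.
days = list(zip(*rows))


def summarize(values):
    """Return mean, median, mode(s), and range for one day's readings."""
    modes = statistics.multimode(values)
    # multimode returns every value when nothing repeats; treat that as "no mode".
    if len(modes) == len(values):
        modes = []
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": modes,
        "range": max(values) - min(values),
    }


summary = {f"Day {i}": summarize(day) for i, day in enumerate(days, start=1)}

for name, s in summary.items():
    print(f"{name}: mean={s['mean']:.3f}, median={s['median']:.3f}, "
          f"modes={s['modes']}, range={s['range']:.3f}")
```

A small range together with a mean close to the median points to stable wind speeds; which day counts as best still depends on how overall speed is weighed against variability.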

In: Statistics and Probability

A simple random sample of 800 elements generates a sample proportion of 0.70. Provide a 90%...

A simple random sample of 800 elements generates a sample proportion of 0.70.

Provide a 90% confidence interval for the population proportion. (round to two decimal places) [Answer , Answer]

Provide a 99% confidence interval for the population proportion. (round to two decimal places) [Answer , Answer]
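A minimal sketch of both intervals in Python, assuming the usual large-sample (Wald) interval p̂ ± z·sqrt(p̂(1 − p̂)/n) is what the question intends. The function name `proportion_ci` is illustrative.

```python
import math
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence):
    """Large-sample (Wald) confidence interval for a population proportion."""
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


for level in (0.90, 0.99):
    low, high = proportion_ci(0.70, 800, level)
    print(f"{level:.0%} CI: [{low:.2f}, {high:.2f}]")
```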

In: Statistics and Probability

2. A researcher believes that smoking affects a person’s sense of smell. To test this, he...

2. A researcher believes that smoking affects a person’s sense of smell. To test this, he takes a sample of 25 smokers and gives them a test of olfactory sensitivity. In this test, higher scores indicate greater sensitivity. For his sample, the mean score on the test is 14.8 with a standard deviation of 2.4. The researcher knows the mean score in the population is 16.2, but the population standard deviation is unknown.

(a) What are the null and alternative hypotheses in this study (stated mathematically)?

(b) Should the researcher use a one-tailed or a two-tailed test?

(c) Compute the appropriate test statistic for testing the hypothesis.

(d) Using α = 0.01, do you conclude that smoking affects a person’s sense of smell? Be sure to include a discussion of the critical value in your answer.

(e) What type of error might the researcher be making in part (d)?
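Parts (c) and (d) can be sketched in Python as follows. Because the population standard deviation is unknown, a one-sample t statistic with n − 1 degrees of freedom is used. The sketch assumes a two-tailed test, and the critical value 2.797 is the standard t-table entry for 24 degrees of freedom with 0.005 in each tail; it is entered by hand because the Python standard library has no t distribution.

```python
import math

n = 25
sample_mean = 14.8
sample_sd = 2.4
mu_0 = 16.2  # population mean under the null hypothesis

# One-sample t statistic: (sample mean - hypothesized mean) / standard error.
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error
df = n - 1

# Assumption: two-tailed test at alpha = 0.01, so 0.005 in each tail.
# Table value for df = 24, entered by hand.
t_critical = 2.797

reject_null = abs(t_stat) > t_critical
print(f"t = {t_stat:.3f}, df = {df}, critical value = ±{t_critical}")
print("Reject H0" if reject_null else "Fail to reject H0")
```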

In: Statistics and Probability

What is the forecast and MSE using regression? 2019 is the holdout sample and "car sales"...

What are the forecast and the MSE using regression? Use 2019 as the holdout sample; "car sales" is the independent variable.

Shipments
Month Car Sales Fasteners
Jan-17 17680000 335798
Feb-17 17650000 297853
Mar-17 17130000 318399
Apr-17 17230000 311730
May-17 17200000 363876
Jun-17 17200000 296832
Jul-17 17180000 297513
Aug-17 17020000 321144
Sep-17 18380000 317677
Oct-17 18200000 325487
Nov-17 17860000 272937
Dec-17 17700000 276282
Jan-18 17550000 335439
Feb-18 17560000 310514
Mar-18 17690000 407754
Apr-18 17770000 356169
May-18 17780000 345322
Jun-18 17700000 331997
Jul-18 17380000 343059
Aug-18 17360000 350277
Sep-18 17840000 265205
Oct-18 18000000 389332
Nov-18 17880000 310474
Dec-18 17890000 308429
Jan-19 17240000 385807
Feb-19 17030000 332529
Mar-19 17770000 407606
Apr-19 17050000 361946
May-19 17930000 453432
Jun-19 17710000 412892
Jul-19 17440000 447359
Aug-19 17510000 363769
Sep-19 17720000 361232
Oct-19 17050000 451421
Nov-19 17450000 363724
Dec-19 17160000 331619
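One possible setup is sketched below in Python. It assumes that the first numeric column is car sales (the independent variable x), that the second is fastener shipments (the variable being forecast), and that the 24 months of 2017 and 2018 are used to fit the line while the 12 months of 2019 are held out. The function name `fit_line` is illustrative.

```python
# (car_sales, fasteners) pairs in chronological order, Jan-17 to Dec-19.
data = [
    (17680000, 335798), (17650000, 297853), (17130000, 318399),
    (17230000, 311730), (17200000, 363876), (17200000, 296832),
    (17180000, 297513), (17020000, 321144), (18380000, 317677),
    (18200000, 325487), (17860000, 272937), (17700000, 276282),
    (17550000, 335439), (17560000, 310514), (17690000, 407754),
    (17770000, 356169), (17780000, 345322), (17700000, 331997),
    (17380000, 343059), (17360000, 350277), (17840000, 265205),
    (18000000, 389332), (17880000, 310474), (17890000, 308429),
    (17240000, 385807), (17030000, 332529), (17770000, 407606),
    (17050000, 361946), (17930000, 453432), (17710000, 412892),
    (17440000, 447359), (17510000, 363769), (17720000, 361232),
    (17050000, 451421), (17450000, 363724), (17160000, 331619),
]

train = data[:24]    # 2017 and 2018: used to fit the regression
holdout = data[24:]  # 2019: used only to evaluate the forecasts


def fit_line(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(train)

# Forecast each 2019 month from its car sales, then score against actuals.
forecasts = [intercept + slope * x for x, _ in holdout]
errors = [y - f for (_, y), f in zip(holdout, forecasts)]
mse = sum(e ** 2 for e in errors) / len(errors)

print(f"fasteners = {intercept:.1f} + {slope:.6f} * car_sales")
print(f"holdout MSE = {mse:.1f}")
```

Fitting on the earlier years only keeps the 2019 observations out of the parameter estimates, so the MSE reflects out-of-sample error.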

In: Statistics and Probability

An insurance company was conducting performance analysis of their claims handling processes and process cycle time...

An insurance company was conducting a performance analysis of its claims handling processes, and process cycle time was one of its concerns. The company collected sample data on process cycle time across a number of different claims handling processes over the past six months. However, the data followed a (non-normal) multimodal distribution instead of a normal distribution. Why? Explain what the reason(s) behind this could be.

The company then focused on the CTP insurance claims handling processes and collected sample data on the process cycle time of about 500 CTP insurance claims from the past six months. These data followed a normal distribution characterised by mean = 20.5 days and standard deviation = 5.25 days. Answer the following questions based on the above information.

(a) Given the sample data, how often would the cycle time of a CTP insurance claim process fall within the range [15.25, 20.75] days? Why?

(b) If the expected mean cycle time of CTP insurance claims is 20.2 days, did the company meet this target in the past six months? Conduct an appropriate statistical test to draw a conclusion.
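Both parts can be sketched in Python under stated assumptions: part (a) treats the fitted normal distribution as the model for an individual claim, and part (b) uses a two-sided one-sample z-test with n = 500 (the question says "about 500") and the sample standard deviation standing in for the population value. Whether a one-sided test is more appropriate is a judgment the question leaves open.

```python
import math
from statistics import NormalDist

mean, sd, n = 20.5, 5.25, 500

# (a) Probability that a single claim's cycle time lies in [15.25, 20.75].
cycle_time = NormalDist(mu=mean, sigma=sd)
prob_in_range = cycle_time.cdf(20.75) - cycle_time.cdf(15.25)

# (b) z-test of H0: mu = 20.2 against a two-sided alternative (assumption).
target = 20.2
standard_error = sd / math.sqrt(n)
z_stat = (mean - target) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"P(15.25 <= X <= 20.75) = {prob_in_range:.4f}")
print(f"z = {z_stat:.3f}, two-sided p-value = {p_value:.3f}")
```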

In: Statistics and Probability

In this chapter, you learn four steps that should be used to evaluate a regression model....

In this chapter, you learn four steps that should be used to evaluate a regression model. What is the first step and why is it important? Explain the other three steps, indicating what you learn from each of those three steps.

In: Statistics and Probability

2. Numerical “Proof” that for X, Y independent, Var(X+Y) = Var(X−Y) = σ_X² + σ_Y²: 2.1 As we did in Lab...

2. Numerical “Proof” that for X, Y independent, Var(X+Y) = Var(X−Y) = σ_X² + σ_Y²:

2.1 As we did in Lab 4, you will need to generate a sample of (x, y)-values by independently generating x-values and y-values. (You may choose a sample size of 50000.) State the two distributions you will use for generating x-values and y-values, and the corresponding population variances.

2.2 Compute the sample variances of the (X+Y)-values, and of the (X−Y)-values. What value are these two sample variances supposed to estimate?

2.3 Use the formula that explains the difference between the two sample variances and recompute them using the sample variances of the x- and y-values and their covariance.

Please include R code.

In: Statistics and Probability

A college admissions director wishes to estimate the mean age of all students currently enrolled. In...

A college admissions director wishes to estimate the mean age of all students currently enrolled. In a random sample of 22 students, the mean age is found to be 21.4 years. From past studies, the ages of enrolled students are normally distributed with a standard deviation of 10.2 years. Construct a 90% confidence interval for the mean age of all students currently enrolled.

b. The standard deviation of the sample mean:
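Since the population standard deviation is given, a z-based interval is the natural choice here. The Python sketch below computes the standard deviation of the sample mean (the standard error) and the 90% interval; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 22
sample_mean = 21.4
sigma = 10.2       # known population standard deviation
confidence = 0.90

# Standard deviation of the sample mean (standard error).
standard_error = sigma / math.sqrt(n)

# Two-sided critical value for the chosen confidence level.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * standard_error
lower, upper = sample_mean - margin, sample_mean + margin

print(f"standard error = {standard_error:.3f}")
print(f"90% CI: [{lower:.2f}, {upper:.2f}]")
```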

In: Statistics and Probability

1. For a certain population, systolic BP is normally distributed with μ=122 and σ=14. Hypertension is...

1. For a certain population, systolic BP is normally distributed with μ=122 and σ=14. Hypertension is defined as systolic BP over 150 mmHg.

a. What is the z-score for a systolic BP of 150, given this information? z =

b. What percentage of the population is hypertensive? You may use the empirical rule. =

2. Continuing from the previous question, where BP follows a normal distribution with μ=122 and σ=14, suppose we randomly sample n=194 individuals from this population. What is the probability that the sample mean of systolic BP obtained from this sample will be between 135 and 146 mmHg? It is recommended to first compute the standard error, then compute the z-scores for 135 and 146, then compute the difference between the cumulative probabilities associated with these two z-scores.

a. 0.823

b. 0.133

c. 0.841

d. 0.957
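The recommended steps (standard error, then z-scores, then the difference of cumulative probabilities) can be followed literally, as in this Python sketch. It uses exactly the numbers stated in the two questions and makes no claim about which listed option is intended.

```python
import math
from statistics import NormalDist

mu, sigma = 122, 14
std_normal = NormalDist()  # standard normal, mean 0 and sd 1

# Question 1a: z-score for a systolic BP of 150.
z_150 = (150 - mu) / sigma

# Question 1b: exact upper-tail probability. The empirical rule approximates
# this as (100% - 95%) / 2 = 2.5% for a value two sd above the mean.
p_hypertensive = 1 - std_normal.cdf(z_150)

# Question 2: sampling distribution of the mean for n = 194.
n = 194
standard_error = sigma / math.sqrt(n)
z_low = (135 - mu) / standard_error
z_high = (146 - mu) / standard_error
p_between = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"z(150) = {z_150:.2f}, P(BP > 150) = {p_hypertensive:.4f}")
print(f"SE = {standard_error:.4f}, z_low = {z_low:.2f}, z_high = {z_high:.2f}")
print(f"P(135 < sample mean < 146) = {p_between:.6f}")
```

With these parameters, both 135 and 146 lie many standard errors above the mean, so the computed probability for the sample mean is essentially zero.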

In: Statistics and Probability

1.1 Let X ∼ Poisson(4). The R command dpois(0:9, 4) gives the probabilities P(X = k) for k = 0, 1, ..., 9. 1.1.1 Use...

1.1 Let X ∼ Poisson(4). The R command dpois(0:9, 4) gives the probabilities P(X = k) for k = 0, 1, ..., 9.

1.1.1 Use the plot() function to plot these probabilities and to connect the points with lines.

1.1.2 Use the barplot() function to make a barplot of these probabilities.

1.2 The R command rpois(1000, 4) generates a sample of 1000 values from the Poisson(4) distribution. Use the barplot() function to plot the empirical probabilities for X = k resulting from this sample.

Please include the R code needed to generate the plots.

In: Statistics and Probability

Use a two-tailed independent sample t-test to answer this question. A new drug is being tested...

Use a two-tailed independent sample t-test to answer this question. A new drug is being tested to see if it reduces the number of backaches. Do these samples of number of monthly backaches from eight volunteers who are taking the drug and eight who are not taking the drug show a significant reduction in the mean number of monthly backaches (α= 0.05)? Identify the hypothesis statements, test statistic, p-value, and your decision. Also, construct and report the 95% confidence interval for each group. Analyze the confidence intervals - Do you come to the same conclusion as the t-test results? Why or why not?

New Drug Group: 4 4 3 5 4 5 4 6

Control Group: 5 6 6 6 8 6 6 7
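The test statistic and the two group intervals could be computed as in the Python sketch below. It assumes a pooled-variance (equal variances) t-test. The critical values 2.145 (14 degrees of freedom) and 2.365 (7 degrees of freedom) are standard t-table entries for 0.025 in each tail, entered by hand because the Python standard library has no t distribution; the p-value would likewise need a t table or statistical software.

```python
import math
from statistics import mean, variance

drug = [4, 4, 3, 5, 4, 5, 4, 6]
control = [5, 6, 6, 6, 8, 6, 6, 7]
n1, n2 = len(drug), len(control)

# Pooled variance (assumption: equal population variances).
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * variance(drug) + (n2 - 1) * variance(control)) / df
standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean(drug) - mean(control)) / standard_error

# Table values, entered by hand: two-tailed alpha = 0.05.
T_CRIT_DF14 = 2.145  # for the two-sample test, df = 14
T_CRIT_DF7 = 2.365   # for each group's own interval, df = 7

reject_null = abs(t_stat) > T_CRIT_DF14


def group_ci(values, t_crit):
    """95% confidence interval for one group's mean."""
    half_width = t_crit * math.sqrt(variance(values) / len(values))
    return mean(values) - half_width, mean(values) + half_width


print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
for name, values in (("New drug", drug), ("Control", control)):
    low, high = group_ci(values, T_CRIT_DF7)
    print(f"{name}: mean = {mean(values):.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```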

In: Statistics and Probability

Many female undergraduates at four-year colleges switch from STEM majors into disciplines that are not science-based,...

Many female undergraduates at four-year colleges switch from STEM majors into disciplines that are not science-based, thereby contributing to the underrepresentation of women in STEM fields. When female undergrads switch majors, are their reasons different from those of their male counterparts? This question was investigated in Science Education. A sample of 335 junior/senior undergraduates (172 females and 163 males) at two large research universities were identified as “switchers”; that is, they left a declared STEM major for a non-STEM major. Each student listed one or more factors that contributed to the switching decision.

(a) Of the 172 females in the sample, 74 listed lack or loss of interest in STEM (i.e., “turned off” by science) as a major factor, compared to 72 of the 163 males. Conduct a test (at α = .10) to determine whether the proportion of female switchers who give “lack of interest in STEM” as a major reason for switching differs from the corresponding proportion of males.

(b) Thirty-three of the 172 females in the sample indicated that they were discouraged or lost confidence because of low grades in STEM during their early years, compared to 44 of 163 males. Construct a 90% confidence interval for the difference between the proportions of female and male switchers who lost confidence due to low grades in STEM. Interpret the result.
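A sketch of both parts in Python, assuming the usual large-sample methods: a two-sided z-test with a pooled standard error for part (a), and an unpooled standard error for the confidence interval in part (b). The variable names are illustrative.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()
n_f, n_m = 172, 163

# (a) Two-proportion z-test for "lack of interest", alpha = 0.10, two-sided.
x_f, x_m = 74, 72
p_f, p_m = x_f / n_f, x_m / n_m
p_pooled = (x_f + x_m) / (n_f + n_m)
se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_f + 1 / n_m))
z_stat = (p_f - p_m) / se_pooled
p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))
z_critical = std_normal.inv_cdf(1 - 0.10 / 2)
reject_null = abs(z_stat) > z_critical

# (b) 90% confidence interval for p_female - p_male, "low grades".
y_f, y_m = 33, 44
q_f, q_m = y_f / n_f, y_m / n_m
se_unpooled = math.sqrt(q_f * (1 - q_f) / n_f + q_m * (1 - q_m) / n_m)
margin = std_normal.inv_cdf(0.95) * se_unpooled
ci_low, ci_high = (q_f - q_m) - margin, (q_f - q_m) + margin

print(f"(a) z = {z_stat:.3f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
print(f"(b) 90% CI for p_f - p_m: [{ci_low:.4f}, {ci_high:.4f}]")
```

The pooled estimate is used in the test because the null hypothesis assumes a common proportion; the interval does not assume that, so each group keeps its own estimate.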

In: Statistics and Probability

1. Independent random samples of n1 = 200 and n2 = 200 observations were randomly selected...

1. Independent random samples of n1 = 200 and n2 = 200 observations were randomly selected from binomial populations 1 and 2, respectively. Sample 1 had 116 successes, and sample 2 had 122 successes.

a) Calculate the standard error of the difference in the two sample proportions, (p̂1 − p̂2). Make sure to use the pooled estimate for the common value of p. (Round your answer to four decimal places.)

b) Critical value approach: Find the rejection region when α = 0.01. (Round your answer to two decimal places. If the test is one-tailed, enter NONE for the unused region.)

z <

z >

2. The meat department of a local supermarket chain packages ground beef in trays of two sizes. The smaller tray is intended to hold 1 kilogram (kg) of meat. A random sample of 30 packages in the smaller meat tray produced weight measurements with an average of 1.01 kg and a standard deviation of 20 grams.

p-value =

In: Statistics and Probability

Out of 600 people sampled, 300 preferred Candidate A. Based on this, estimate what proportion of...

Out of 600 people sampled, 300 preferred Candidate A. Based on this, estimate what proportion of the voting population ( p ) prefers Candidate A. Use a 99% confidence level, and give your answers as decimals, to three places.

In: Statistics and Probability