Questions
How is statistical software useful?

In: Statistics and Probability

Exxon is developing a new, faster pump design. They've narrowed the development to two design options and are wondering how the pump design might affect daily gas sales (Sales). In a test in 31 stations, they try out the new design A in some stations (code = 2), the new design B in other stations (code = 3), and for control they have stations that have not had a change (code = 1). From prior research, Exxon knows that three other factors are crucial in predicting gas sales for any particular station: advertising amount in the market (Ad), relative pricing (relprice), and the number of competing stations and their density (compet). Use a 95% confidence level when assessing any changes.

Output: regression results tables.

a) Run a multiple regression analysis to assess the effect of the new pump designs, and interpret the results.
b) Are the new designs better than the current design (i.e., do they lead to higher sales)?
c) Assume a 1% profit margin and an investment of $1M for 100 stations for the change to a new design. Will any change be profitable within the first year?

Data:

Store PumpDesign Sales Ad relprice compet
1 2 29100 25410 1.18 9.4
2 3 25620 26400 1.14 9.4
3 1 23850 25950 1.18 9.7
4 1 25200 27010 1.20 11.9
5 2 21420 27850 1.24 13.4
6 3 21300 25090 1.46 9.6
7 1 21900 25700 1.54 9.2
8 1 23700 26670 1.48 13.6
9 2 22080 28780 1.48 14.4
10 3 21960 28350 1.48 15.3
11 1 17580 28970 1.48 15.1
12 1 19440 27440 1.66 11.8
13 2 20940 25820 1.76 12.8
14 3 19110 26130 1.88 12.4
15 1 20310 25290 2.00 9.3
16 1 20460 25440 2.14 7.9
17 2 25020 26330 2.08 7.8
18 3 22380 28780 2.02 8.4
19 1 23940 30510 1.88 9.1
20 1 25860 32740 1.70 8.8
21 2 28980 35940 1.58 9.2
22 3 24480 37740 1.50 9.8
23 1 24600 38610 1.50 10.3
24 1 26460 39190 1.44 8.8
25 2 29880 40400 1.48 8.2
26 3 29670 41330 1.46 7.5
27 1 24390 43030 1.42 7.1
28 1 25980 43930 1.40 7.2
29 2 30450 45600 1.42 8.9
30 3 32130 45870 1.42 7.7
31 1 26850 47160 1.38 7.4
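
As a rough sketch of part (a), the regression can be set up with two dummy variables (design A and design B, with the unchanged stations as the baseline) plus the three control factors. The version below is only an illustration in plain Python: it solves the normal equations directly, rescales Ad to thousands of dollars to keep the arithmetic well conditioned, and does not compute standard errors, so the 95% significance judgment in part (b) still needs the regression tables from statistical software. The break-even figure for part (c) assumes that Sales is daily revenue in dollars per station and that the 1% margin applies to that revenue; both are assumptions, not facts stated in the question.

```python
# Columns: design code, sales, ad, relprice, compet (copied from the table above).
ROWS = [
    (2, 29100, 25410, 1.18, 9.4), (3, 25620, 26400, 1.14, 9.4),
    (1, 23850, 25950, 1.18, 9.7), (1, 25200, 27010, 1.20, 11.9),
    (2, 21420, 27850, 1.24, 13.4), (3, 21300, 25090, 1.46, 9.6),
    (1, 21900, 25700, 1.54, 9.2), (1, 23700, 26670, 1.48, 13.6),
    (2, 22080, 28780, 1.48, 14.4), (3, 21960, 28350, 1.48, 15.3),
    (1, 17580, 28970, 1.48, 15.1), (1, 19440, 27440, 1.66, 11.8),
    (2, 20940, 25820, 1.76, 12.8), (3, 19110, 26130, 1.88, 12.4),
    (1, 20310, 25290, 2.00, 9.3), (1, 20460, 25440, 2.14, 7.9),
    (2, 25020, 26330, 2.08, 7.8), (3, 22380, 28780, 2.02, 8.4),
    (1, 23940, 30510, 1.88, 9.1), (1, 25860, 32740, 1.70, 8.8),
    (2, 28980, 35940, 1.58, 9.2), (3, 24480, 37740, 1.50, 9.8),
    (1, 24600, 38610, 1.50, 10.3), (1, 26460, 39190, 1.44, 8.8),
    (2, 29880, 40400, 1.48, 8.2), (3, 29670, 41330, 1.46, 7.5),
    (1, 24390, 43030, 1.42, 7.1), (1, 25980, 43930, 1.40, 7.2),
    (2, 30450, 45600, 1.42, 8.9), (3, 32130, 45870, 1.42, 7.7),
    (1, 26850, 47160, 1.38, 7.4),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# Design matrix: intercept, dummy for design A (code 2), dummy for design B (code 3),
# ad in thousands of dollars (rescaling is a choice made here), relprice, compet.
X = [[1.0, float(d == 2), float(d == 3), ad / 1000.0, rp, comp]
     for d, _, ad, rp, comp in ROWS]
y = [float(sales) for _, sales, _, _, _ in ROWS]

# Ordinary least squares via the normal equations (X'X) beta = X'y.
k = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
beta = solve(xtx, xty)

fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

names = ["intercept", "design_A", "design_B", "ad_thousands", "relprice", "compet"]
for name, b in zip(names, beta):
    print(f"{name:>13}: {b:12.2f}")

# Part (c), under the stated assumptions: $1M / 100 stations = $10,000 per station,
# recovered at a 1% margin over 365 days.
break_even_uplift = 1_000_000 / 100 / (0.01 * 365)
print(f"daily sales uplift needed per station: {break_even_uplift:.2f}")
```

The `design_A` and `design_B` coefficients estimate the difference in daily sales relative to the unchanged stations, holding the other three factors fixed. Comparing them with `break_even_uplift` gives a first impression for part (c).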

In: Statistics and Probability

Use this scenario to answer questions below.

The Collins Research Crew (CRC) is interested in examining the number of vape/smoking stores (i.e., stores that sell vaping and cigarette/cigar smoking products) in low-income neighborhoods compared to other types of neighborhoods. CRC's research question is, "Do low-income neighborhoods have more vape/smoke shops than other types of neighborhoods?" Low-income neighborhoods were defined as those where the median household income is less than the U.S. federal poverty line. Non-low-income neighborhoods are those where the median household income is greater than the U.S. federal poverty line.

CRC employed a team of undergraduate researchers to go out and count the number of vape/smoke shops in a random selection of low-income and non-low-income neighborhoods. They define the population as all neighborhoods in King County.

They found a significant difference in the number of vape/smoke shops across neighborhoods. Specifically, low-income neighborhoods had a greater number of vape/smoke shops compared to non-low-income neighborhoods.

Match the null hypothesis, directional hypothesis, and non-directional hypothesis with their most appropriate statement.

Hypothesis types:

1. Null hypothesis

2. Directional hypothesis

3. Non-directional hypothesis

Statements:

A. There is a relationship between the average number of vape/smoke shops and neighborhood type.

B. There is no relationship between the number of vape/smoke shops and neighborhood type.

C. Low-income neighborhoods have more vape/smoke shops, on average, than non-low-income neighborhoods.

Given the research question asked in the scenario above, the best research hypothesis is a non-directional hypothesis.

True OR False

"Specifically, low-income neighborhoods had a greater number of vape/smoke shops compared to non-low-income neighborhoods." What is this sentence indicating?

A. Low-income people vape/smoke at higher levels than the average King County resident

B. Any difference is due to chance and not some systematic influence

C. Any difference is due to some systematic influence and not by chance

D. There is no difference in the number of vape/smoke shops

In: Statistics and Probability

Name the pros and cons of spreadsheets in statistics.

In: Statistics and Probability

Students of a large university spend an average of $7 a day on lunch. The standard deviation of the expenditure is $2. A simple random sample of 25 students is taken.

What is the probability that the sample mean will be at least $4?

Jason spent $15 on his lunch. Explain, in terms of standard deviation, why his expenditure is unusual.

Explain what information is given on a z table. For example, if a student calculated a z value of 2.77, what is the four-digit number on the z table that corresponds with that value? What exactly is that four-digit number telling us?

Explain why we use z formulas. Why don't we just leave the data alone? Why do we convert?

You must show your work.
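
The arithmetic behind these questions can be sketched as follows. This assumes the sample mean is treated as approximately normal with standard error σ/√n, and that the z table in question is a cumulative (area to the left) table. Some textbooks print the area between 0 and z instead, which would give a different four-digit entry.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 7.0, 2.0, 25
std_normal = NormalDist()  # mean 0, standard deviation 1

# Sampling distribution of the mean: standard error = sigma / sqrt(n).
se = sigma / sqrt(n)
z_mean = (4.0 - mu) / se                      # how far $4 is from $7, in standard errors
p_at_least_4 = 1.0 - std_normal.cdf(z_mean)   # P(sample mean >= 4)

# Jason's single purchase is compared with the population standard deviation.
z_jason = (15.0 - mu) / sigma

# Cumulative table entry for z = 2.77: area under the curve to the left of z.
table_value = std_normal.cdf(2.77)

print(f"standard error = {se:.2f}, z for $4 = {z_mean:.2f}")
print(f"P(sample mean >= 4) = {p_at_least_4:.6f}")
print(f"Jason's z-score = {z_jason:.1f}")
print(f"table entry for z = 2.77: {table_value:.4f}")
```

Because $4 lies 7.5 standard errors below the mean, the probability is essentially 1. Jason's $15 sits 4 standard deviations above the mean, well beyond the usual 2 or 3 standard deviation cutoff for unusual values.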

In: Statistics and Probability

A research project has been tracking the health and cognitive functions of the elderly population in Arizona. The table below shows the memory test scores from 10 elderly residents, tested first when they were 65 years old and again when they were 75 years old. The researcher wants to know if there is a significant decline in memory functions from age 65 to age 75 based on this sample. In other words, it is hypothesized that the memory score at age 75 is significantly lower than the memory score at age 65. So the null and alternative hypotheses should be directional. The alpha level was set at α = .05 for a one-tailed hypothesis test.

Memory score

Subject   Age 65   Age 75
1   62   60
2   95   88
3   55   56
4   90   89
5   98   90
6   73   75
7   73   70
8   71   75
9   82   80
10   66   62

d. Calculate the difference score by subtracting each “Age 65” score from the associated “Age 75” score for each subject. Fill in the column in the table below for “difference score.” (1 point total: deduct .5 for each error up to 1 point.)

Hint: The difference score is calculated as (age 75 minus age 65), so a negative number indicates a decline in memory performance, which is the researcher’s hypothesis.

Subject   Difference score (Age 75 – Age 65)
1
2
3
4
5
6
7
8
9
10

e. Calculate the mean from the sample of difference scores

f. Estimate the standard deviation of the population of difference scores

g. Calculate the standard error (standard deviation of the sampling distribution)

h. Calculate the t statistic for the sample of difference scores

i. Figure out the degrees of freedom, and then determine the critical t value(s) based on the type of test and the preset alpha level.

j. Compare the t statistic with the critical t value. Is the calculated t statistic more extreme or less extreme than the critical t value? Then make a decision about the hypothesis test, stating explicitly “reject” or “fail to reject” accordingly. (2 points total: 1 for each answer)

k. Interpret the result in 1-2 sentences to answer the research question

l. Calculate the standardized effect size of this hypothesis test
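
Steps d through l follow the usual paired-samples t procedure, which can be sketched as below. The critical value is hard-coded from a standard t table (one-tailed, α = .05, df = 9) because the Python standard library has no t distribution, and Cohen's d (mean difference divided by the standard deviation of the differences) is assumed to be the intended standardized effect size.

```python
from math import sqrt
from statistics import mean, stdev

age65 = [62, 95, 55, 90, 98, 73, 73, 71, 82, 66]
age75 = [60, 88, 56, 89, 90, 75, 70, 75, 80, 62]

# d. Difference scores (age 75 minus age 65); negative means decline.
diffs = [late - early for early, late in zip(age65, age75)]

d_bar = mean(diffs)            # e. mean difference
s_d = stdev(diffs)             # f. estimated population SD (n - 1 denominator)
se = s_d / sqrt(len(diffs))    # g. standard error
t_stat = d_bar / se            # h. t statistic
df = len(diffs) - 1            # i. degrees of freedom

# i. Critical value from a standard t table: one-tailed, alpha = .05, df = 9.
T_CRIT = -1.833

# j. Reject only if t falls in the lower tail beyond the critical value.
reject = t_stat < T_CRIT

# l. Cohen's d for paired data (assumed effect-size measure).
cohens_d = d_bar / s_d

print("differences:", diffs)
print(f"mean = {d_bar}, sd = {s_d:.3f}, se = {se:.3f}")
print(f"t({df}) = {t_stat:.3f}, critical t = {T_CRIT}, reject H0: {reject}")
print(f"Cohen's d = {cohens_d:.3f}")
```

With these data the t statistic is less extreme than the critical value, so the sketch reports "fail to reject" even though the mean difference is negative.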

In: Statistics and Probability

Please answer the following questions:

Check all outliers of this data of book costs(dollars).

19 95 30
18 21 75
38 49 53
6 54 143


  • 19
  • 95
  • 30
  • 18
  • 21
  • 75
  • 38
  • 49
  • 53
  • 6
  • 54
  • 143
  • none
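
One common way to check for outliers is the 1.5 × IQR fence rule, sketched below. The quartiles here are taken as the medians of the lower and upper halves of the sorted data, which is an assumption: other quartile conventions shift the fences and can change which values are flagged.

```python
from statistics import median

costs = [19, 95, 30, 18, 21, 75, 38, 49, 53, 6, 54, 143]

data = sorted(costs)
half = len(data) // 2

# Quartiles as medians of the lower and upper halves (one of several conventions).
q1 = median(data[:half])
q3 = median(data[-half:])
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are flagged as outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"fences: {low_fence} to {high_fence}")
print("outliers:", outliers)
```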

In the following data set of candy bag weights(lbs), determine the z-score of 20.

46 75 78
57 14 37
20 46 82


z-score =
[three decimal places]
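
A z-score is the distance from the mean in standard deviation units. The sketch below assumes the nine weights are a sample, so it uses the sample standard deviation (n − 1 denominator). If the data were meant as a whole population, `statistics.pstdev` would be the function to use, and the result would differ.

```python
from statistics import mean, stdev

weights = [46, 75, 78, 57, 14, 37, 20, 46, 82]

x_bar = mean(weights)
s = stdev(weights)        # sample standard deviation (assumption)
z = (20 - x_bar) / s      # z-score of the value 20

print(f"mean = {x_bar:.3f}, s = {s:.3f}, z = {z:.3f}")
```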

A distribution of book costs (dollars) has the following 5-number summary. What percentage of the data is between 51 and 65?


Min Q1 Median Q3 Max
25 51 65 93 118


Percentage is %
[do not include the % sign]

In: Statistics and Probability

A data set is given below. (a) Draw a scatter diagram. Comment on the type of relation that appears to exist between x and y. (b) Given that x̄ = 3.6667, sx = 2.0656, ȳ = 4.2000, sy = 1.4805, and r = −0.9287, determine the least-squares regression line. (c) Graph the least-squares regression line on the scatter diagram drawn in part (a).

x   y
1   5.2
2   5.8
3   5.4
4   3.8
6   2.4
6   2.6

​(a) Choose the correct graph below.

Graph B

There appears to be a linear, negative relationship.

​(b)

ŷ = __?__ x + (__?__)

(Round to three decimal places as needed.)
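
For part (b), the slope and intercept follow from the summary statistics through the standard formulas b1 = r · sy / sx and b0 = ȳ − b1 · x̄. A minimal sketch, treating the value 4.2000 in the question as the mean of y (it equals the average of the six y values listed):

```python
# Summary statistics as given in the question.
x_bar, s_x = 3.6667, 2.0656
y_bar, s_y = 4.2000, 1.4805
r = -0.9287

# Least-squares slope and intercept from summary statistics.
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar

print(f"y-hat = {slope:.3f} x + ({intercept:.3f})")
```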

In: Statistics and Probability

In this exercise, we examine the effect of combining investments with positively correlated risks, negatively correlated risks, and uncorrelated risks. A firm is considering a portfolio of assets. The portfolio is comprised of two assets, which we will call "A" and "B." Let X denote the annual rate of return from asset A in the following year, and let Y denote the annual rate of return from asset B in the following year. Suppose that
E(X) = 0.15 and E(Y) = 0.20,
SD(X) = 0.05 and SD(Y) = 0.06,
and CORR(X, Y) = 0.30.
(a) What is the expected return of investing 50% of the portfolio in asset A and 50% of the portfolio in asset B? What is the standard deviation of this return?
(b) Replace CORR(X, Y) = 0.30 by CORR(X, Y) = 0.60 and answer the questions in part (a). Do the same for CORR(X, Y) = −0.60, −0.30, and 0.0.
(c) (Spreadsheet Exercise). Use a spreadsheet to perform the following analysis. Suppose that the fraction of the portfolio that is invested in asset B is f, and so the fraction of the portfolio that is invested in asset A is (1 − f). Letting f vary from f = 0.0 to f = 1.0 in increments of 5% (that is, f = 0.0, 0.05, 0.10, 0.15, . . . ), compute the mean and the standard deviation of the annual rate of return of the portfolio (using the original data for the problem). Notice that the expected return of the portfolio varies (linearly) from 0.15 to 0.20, and the standard deviation of the return varies (non-linearly) from 0.05 to 0.06. Construct a chart plotting the standard deviation as a function of the expected return.
(d) (Spreadsheet Exercise). Perform the same analysis as in part (c) with CORR(X, Y) = 0.30 replaced by CORR(X, Y) = 0.60, 0.0, −0.30, and −0.60.
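
The calculations in parts (a) through (d) all rest on two formulas: the portfolio mean is the weighted average of the asset means, and the portfolio variance is (1 − f)² SD(X)² + f² SD(Y)² + 2 f (1 − f) CORR(X, Y) SD(X) SD(Y). A sketch in Python rather than a spreadsheet follows. The function name `portfolio` is made up for illustration, and the negative correlation values are an assumption based on the exercise's opening sentence, which mentions negatively correlated risks.

```python
from math import sqrt

E_X, E_Y = 0.15, 0.20
SD_X, SD_Y = 0.05, 0.06


def portfolio(f, corr):
    """Return (mean, sd) of the return with fraction f in asset B and 1 - f in asset A."""
    w_a, w_b = 1.0 - f, f
    mean = w_a * E_X + w_b * E_Y
    var = (w_a * SD_X) ** 2 + (w_b * SD_Y) ** 2 + 2 * w_a * w_b * corr * SD_X * SD_Y
    return mean, sqrt(var)


# Parts (a) and (b): a 50/50 split under several correlations.
for corr in (0.60, 0.30, 0.0, -0.30, -0.60):
    mean, sd = portfolio(0.5, corr)
    print(f"corr = {corr:+.2f}: mean = {mean:.4f}, sd = {sd:.4f}")

# Part (c): f from 0.0 to 1.0 in steps of 0.05 with the original correlation.
frontier = [portfolio(i / 20, 0.30) for i in range(21)]
for i, (mean, sd) in enumerate(frontier):
    print(f"f = {i / 20:.2f}: mean = {mean:.4f}, sd = {sd:.4f}")
```

Plotting the `sd` values against the `mean` values in `frontier` gives the chart requested in part (c). Rerunning the last loop with a different correlation covers part (d).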

In: Statistics and Probability

For the accompanying data set, (a) draw a scatter diagram of the data, (b) compute the correlation coefficient, and (c) determine whether there is a linear relation between x and y.

Data set

x   y
7   3
6   2
6   6
7   9
9   5

Critical Values for Correlation Coefficient

n   Critical value
3   0.997
4   0.950
5   0.878
6   0.811
7   0.754
8   0.707
9   0.666
10   0.632
11   0.602
12   0.576
13   0.553
14   0.532
15   0.514
16   0.497
17   0.482
18   0.468
19   0.456
20   0.444
21   0.433
22   0.423
23   0.413
24   0.404
25   0.396
26   0.388
27   0.381
28   0.374
29   0.367
30   0.361

Compute the correlation coefficient.

The correlation coefficient is r = __?__.

(Round to three decimal places as needed.)
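
The correlation coefficient can be computed from the sums of squares and cross-products, then compared with the critical value for n = 5 from the table above. A minimal sketch, assuming the x and y values are paired in the order listed:

```python
from math import sqrt

x = [7, 6, 6, 7, 9]
y = [3, 2, 6, 9, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)

r = s_xy / sqrt(s_xx * s_yy)

# Critical value for n = 5, taken from the table in the question.
CRITICAL_N5 = 0.878
linear_relation = abs(r) > CRITICAL_N5

print(f"r = {r:.3f}, linear relation: {linear_relation}")
```

Because |r| is well below the critical value, the sketch reports no linear relation for these five points.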

In: Statistics and Probability

Test the hypothesis using the P-value approach. Be sure to verify the requirements of the test.

H0: p = 0.55 versus H1: p < 0.55

n = 150, x = 72, α = 0.01

Is np0(1 − p0) ≥ 10?

No

Yes
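
The requirement check and the left-tailed P-value can be sketched as follows, assuming the usual one-proportion z test with the standard error computed under H0.

```python
from math import sqrt
from statistics import NormalDist

p0, n, x, alpha = 0.55, 150, 72, 0.01

# Requirement for the normal approximation: n * p0 * (1 - p0) >= 10.
requirement = n * p0 * (1 - p0)

# Test statistic, with the standard error evaluated at the hypothesized p0.
p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Left-tailed test (H1: p < 0.55): P-value is the area to the left of z.
p_value = NormalDist().cdf(z)
reject = p_value < alpha

print(f"n*p0*(1-p0) = {requirement:.3f}")
print(f"z = {z:.3f}, P-value = {p_value:.4f}, reject H0: {reject}")
```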

In: Statistics and Probability

Please find the percentage for all the questions below.

Scores for professional golfers on 18-hole courses are bell-shaped with a mean of 72 strokes and a standard deviation of 5 strokes. Using the Empirical Rule, what is the approximate percentage of golfer scores between 62 and 82 strokes?

Percentage is:

A child's piano practice times are normally distributed with a mean of 22 minutes and a standard deviation of 6 minutes. Using the Empirical Rule, what is the approximate percentage of practice times running between 16 and 28 minutes?

Percentage is:

Weights of Old English Sheepdogs are normally distributed with a mean of 59 pounds and a standard deviation of 6 pounds. Using the Empirical Rule, what is the approximate percentage of sheepdogs weighing between 41 and 77 pounds?

Percentage is:
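
Each of the three questions reduces to counting how many standard deviations the interval endpoints lie from the mean and reading off the Empirical Rule (68%, 95%, 99.7%). A small sketch, with a helper name (`empirical_pct`) chosen only for illustration:

```python
# Empirical Rule: approximate percentage within k standard deviations of the mean.
EMPIRICAL = {1: 68.0, 2: 95.0, 3: 99.7}


def empirical_pct(mean, sd, low, high):
    """Percentage between low and high when both lie k SDs from the mean (k = 1, 2, 3)."""
    k_low = (mean - low) / sd
    k_high = (high - mean) / sd
    if k_low != k_high or k_low not in EMPIRICAL:
        raise ValueError("interval is not symmetric at 1, 2, or 3 standard deviations")
    return EMPIRICAL[k_low]


print(empirical_pct(72, 5, 62, 82))   # golf scores
print(empirical_pct(22, 6, 16, 28))   # piano practice times
print(empirical_pct(59, 6, 41, 77))   # sheepdog weights
```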

In: Statistics and Probability

The data below are the survival times after treatment (in days) of some advanced colon cancer patients who were treated with ascorbate.

248, 377, 189, 1843, 180, 537, 519, 455, 406, 365, 942, 776, 372, 163, 101, 20, 283

(a)

i) What is the point estimate for the average or mean of this data?

ii) Report an appropriate 96% confidence interval estimate for the mean survival time after treatment of all advanced colon cancer patients who might be treated with ascorbate. Assume the survival times are normally distributed and take Standard Deviation to be 427.17.

iii) Interpret the confidence interval in words and in context.

(b) Based on the data at hand, would 600 days be considered a reasonable guess as to the average survival time?

(c) What was the margin of error (ME) of the confidence interval?

(d) If we wanted to be 99% confident that our estimate was within 60 days of the population mean survival time, how many patients should we observe?

(e) If the sample included more patients, would the 96% confidence interval have been narrower or wider?
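
The parts above can be sketched numerically. This version treats 427.17 as a known standard deviation and therefore uses a z interval, which is an assumption: if 427.17 is instead the sample standard deviation, a t interval with 16 degrees of freedom would be somewhat wider.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean

times = [248, 377, 189, 1843, 180, 537, 519, 455, 406,
         365, 942, 776, 372, 163, 101, 20, 283]
sigma = 427.17            # treated as known (assumption)

n = len(times)
x_bar = mean(times)       # (a)(i) point estimate of the mean

# (a)(ii) 96% z interval: critical value leaves 2% in each tail.
z96 = NormalDist().inv_cdf(1 - 0.04 / 2)
margin = z96 * sigma / sqrt(n)          # (c) margin of error
ci = (x_bar - margin, x_bar + margin)

# (b) 600 days is a reasonable guess if it lies inside the interval.
plausible_600 = ci[0] <= 600 <= ci[1]

# (d) sample size for 99% confidence and a margin of 60 days, rounded up.
z99 = NormalDist().inv_cdf(1 - 0.01 / 2)
n_needed = ceil((z99 * sigma / 60) ** 2)

print(f"n = {n}, mean = {x_bar:.2f}")
print(f"96% CI = ({ci[0]:.2f}, {ci[1]:.2f}), margin = {margin:.2f}")
print(f"600 days plausible: {plausible_600}, n needed = {n_needed}")
```

For part (e), the margin is proportional to 1/√n, so a larger sample gives a narrower interval at the same confidence level.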

In: Statistics and Probability

One of our dealerships has the following average unit sales per team member. We are concerned that the dealership variance and the population variance are different. We would like to be 95% confident about our findings. Historically the population variance has been about 390,000. Use the following data as a basis for making inferences about the population variance. Prepare the information about a Confidence Interval and Hypothesis Testing.

Team Member   Average Dollar Sale (x)   x̄   x − x̄   (x − x̄)²
1 25000
2 25500
3 23750
4 25250
5 24250
6 24750
7 25750
8 24500
9 25375
10 24625

Find:

1. N = ?

2. Sample variance?

3. Confidence coefficient?

4. Level of significance?

5. Chi-square value (Lower tail)?

6. Chi-square value (Upper tail)?

7. Point estimate?

8. Lower limit?

9. Upper limit?

10. Sample mean?

11. Hypothesized value?

12. Test statistic?

13. P-value (two tail 4 decimals)?

14. Conclusion
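
Items 1 through 14 can be sketched with the chi-square procedure for a single variance. The two critical values are hard-coded from a standard chi-square table (df = 9, 2.5% in each tail) because the Python standard library has no chi-square distribution. For the same reason the two-tailed P-value in item 13 is not computed here and would need statistical software or a table.

```python
from statistics import mean, variance

sales = [25000, 25500, 23750, 25250, 24250, 24750, 25750, 24500, 25375, 24625]
sigma0_sq = 390_000       # hypothesized (historical) population variance

n = len(sales)
x_bar = mean(sales)
s2 = variance(sales)      # sample variance, the point estimate of the variance
df = n - 1

# Test statistic for H0: variance = 390,000.
chi2_stat = df * s2 / sigma0_sq

# Critical values from a standard chi-square table, df = 9, 95% confidence.
CHI2_LOWER = 2.700        # 0.025 quantile
CHI2_UPPER = 19.023       # 0.975 quantile

# 95% confidence interval for the population variance.
ci = (df * s2 / CHI2_UPPER, df * s2 / CHI2_LOWER)

# Two-tailed decision: reject only if the statistic falls outside the critical values.
reject = chi2_stat < CHI2_LOWER or chi2_stat > CHI2_UPPER

print(f"n = {n}, mean = {x_bar}, sample variance = {s2:.2f}")
print(f"chi-square = {chi2_stat:.4f}, reject H0: {reject}")
print(f"95% CI for variance = ({ci[0]:.0f}, {ci[1]:.0f})")
```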

We want to compare the variances in sales at two of our dealerships, A and B. Use the following table to develop a comparison about the two population variances. Prepare the information for Hypothesis Testing.

15. Mean of first set?

16. Mean of second set?

17. Variance of first set?

18. Variance of second set?

19. Observations of the first set?

20. Observations of the second set?

21. Degrees of freedom first set?

22. Degrees of freedom second set?

23. Calculated test statistic? (4 decimal places)

24. P-value? (4 decimal places)

25. Critical value? (4 decimal places)

In: Statistics and Probability

Washington state (WA) had more cases of hep C reported in 2010 (105,800) than Oregon (OR) (79,800). Why is the reported prevalence rate higher in OR (3.05 per 100) than in WA (2.30 per 100)?

Assume the OR population at that time was 2,964,621 and the WA population was 5,143,186.
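
A prevalence rate divides the case count by the population, so a state with fewer cases can still have a higher rate if its population is much smaller. The sketch below applies rate = 100 × cases / population to the figures in the question. With these populations the rates come out near 2.06 (WA) and 2.69 (OR) per 100, which do not match the quoted 2.30 and 3.05 exactly. The quoted figures presumably used a different denominator, but the ordering is the same.

```python
cases = {"WA": 105_800, "OR": 79_800}
population = {"WA": 5_143_186, "OR": 2_964_621}

# Prevalence per 100 residents = 100 * cases / population.
rate_per_100 = {state: 100 * cases[state] / population[state] for state in cases}

for state, rate in rate_per_100.items():
    print(f"{state}: {cases[state]:,} cases / {population[state]:,} people = {rate:.2f} per 100")
```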

In: Statistics and Probability