Questions
Analyses of drinking water samples for 100 homes in each of two different sections of a...

Analyses of drinking water samples for 100 homes in each of two different sections of a city gave the following means and standard deviations of lead levels (in parts per million).

  

                      Section 1   Section 2
Sample size           100         100
Mean (ppm)            34.5        36.2
Standard deviation    5.8         6.0

(a) Calculate the test statistic and its p-value to test for a difference in the two population means. (Use Section 1 − Section 2. Round your test statistic to two decimal places and your p-value to four decimal places.)

z =

p-value =

Use the p-value to evaluate the statistical significance of the results at the 5% level.

a. H0 is not rejected. There is sufficient evidence to indicate a difference in the mean lead levels for the two sections of the city.

b. H0 is rejected. There is sufficient evidence to indicate a difference in the mean lead levels for the two sections of the city.   

c. H0 is rejected. There is insufficient evidence to indicate a difference in the mean lead levels for the two sections of the city.

d. H0 is not rejected. There is insufficient evidence to indicate a difference in the mean lead levels for the two sections of the city.

(b) Calculate a 95% confidence interval to estimate the difference in the mean lead levels in parts per million for the two sections of the city. (Use Section 1 − Section 2. Round your answers to two decimal places.)

parts per million______ to________parts per million

(c) Suppose that the city environmental engineers will be concerned only if they detect a difference of more than 5 parts per million in the two sections of the city. Based on your confidence interval in part (b), is the statistical significance in part (a) of practical significance to the city engineers? Explain.

a. Since all of the probable values of μ1 − μ2 given by the interval are less than −5, it is likely that the difference will be more than 5 ppm, and hence the statistical significance of the difference is of practical importance to the engineers.

b. Since all of the probable values of μ1 − μ2 given by the interval are greater than 5, it is likely that the difference will be more than 5 ppm, and hence the statistical significance of the difference is of practical importance to the engineers.

c. Since all of the probable values of μ1 − μ2 given by the interval are between −5 and 5, it is not likely that the difference will be more than 5 ppm, and hence the statistical significance of the difference is not of practical importance to the engineers.
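One way to check parts (a) and (b) numerically is the large-sample two-sample z procedure. The sketch below assumes a two-sided alternative and uses the sample standard deviations in place of the population values, which is the usual approximation for samples of 100; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the table (lead levels in ppm).
n1, mean1, sd1 = 100, 34.5, 5.8   # Section 1
n2, mean2, sd2 = 100, 36.2, 6.0   # Section 2

diff = mean1 - mean2                      # Section 1 - Section 2
se = sqrt(sd1**2 / n1 + sd2**2 / n2)      # standard error of the difference
z = diff / se
p_value = 2 * NormalDist().cdf(-abs(z))   # two-sided p-value (assumed alternative)

z_crit = NormalDist().inv_cdf(0.975)      # about 1.96 for a 95% interval
ci_low, ci_high = diff - z_crit * se, diff + z_crit * se

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"95% CI: {ci_low:.2f} to {ci_high:.2f} ppm")
```

Comparing the interval endpoints with ±5 ppm then answers part (c) directly.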

In: Math

A random variable is normally distributed. It has a mean of 245 and a standard deviation...

A random variable is normally distributed. It has a mean of 245 and a standard deviation of 21.

e.) For a sample of size 35, state the mean of the sample mean and the standard deviation of the sample mean.

f.) For a sample of size 35, find the probability that the sample mean is more than 241.

g.) Compare your answers in part c and f. Why is one smaller than the other?
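For parts (e) and (f), the sample mean of n = 35 draws is itself normal with the same mean and a standard deviation of σ/√n. A minimal sketch of that calculation, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 245, 21, 35

mean_of_sample_mean = mu          # the sample mean is centered on the population mean
se = sigma / sqrt(n)              # standard deviation of the sample mean

# P(sample mean > 241) under the sampling distribution
p_above = 1 - NormalDist(mean_of_sample_mean, se).cdf(241)

print(f"standard error = {se:.4f}")
print(f"P(sample mean > 241) = {p_above:.4f}")
```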

In: Math

Consider the results from a completely randomized design showing commuting times in three states. Use an...

Consider the results from a completely randomized design showing commuting times in three states. Use an appropriate Excel ANOVA tool to test for any significant differences in commuting times between the three states. Use α = 0.05.

Illinois Ohio Texas
26.8 27.5 10.1
17.6 28.9 18.8
27 19.1 31.4
20 36.9 44.2
50.7 40.8 24.6
24.4 9.5 29.5
36.8 37.4 38.1
42.2 38.9 30.3
26.3 46.2 11.7
14 35.8 35.8
28.5 20.7 22.4
36.9 37.8 17
25.6 49.7 15.4
25.9 44.3 15.4
29.5 12.1 6.8
29.7 43.7 14.8
30.5 35.9 59.3
20 30.2 5.3
23.2 8.5 0.6
20.7 34.6 20.7
6.2 37.9 18.6
44.2 50.9 24.9
28.2 24.2 9.3
28.8 39.1 11.9
16.6 20.4 19.6
20.2 12.4 31
13.1 28 25.9
16.9 28.4 52.6
32.4 19.4 38.3
19.6 42.5 34
12.8 27.2 24.9
30.2 22.6 32.1
65.1 50.8 43
25.5 34.1 31.1
17.5 27.1 16.8
11.1 38.9 34.1
48.8 28.7 40.4
38.9 54.2 29.4
23.1 30.6 9.8
21.6 15.9 19.5
22.3 15.1 9.6
27.3 30.1 21.6
30.7 32.2 26.5
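The question asks for Excel, but the F statistic that a single-factor ANOVA reports can be sketched in a few lines as a cross-check. The helper name `one_way_f` is illustrative, and the demo uses only the first three rows of the table for brevity; the p-value is left out because the F distribution is not in the Python standard library.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# First three rows only (Illinois, Ohio, Texas); paste all 43 rows for the real test.
illinois = [26.8, 17.6, 27.0]
ohio = [27.5, 28.9, 19.1]
texas = [10.1, 18.8, 31.4]
print(one_way_f([illinois, ohio, texas]))
```

With the full data the degrees of freedom would be 2 and 126, and the F statistic is compared with the critical value at α = 0.05.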

In: Math

We are considering a launch of a new type of raisin into the packaged raisin market....

We are considering a launch of a new type of raisin into the packaged raisin market. To do so, we collected product ratings on a 1-10 Likert scale from consumers, using the following attributes and corresponding levels:

Attribute                          Level 1   Level 2   Level 3   Level 4
Raisin Chewiness                   low       medium    high      n/a
Raisin Color                       white     grey      brown     black
Packaging Size                     small     large     n/a       n/a
Free Gift                          no        yes       n/a       n/a
Raisin Aroma                       none      medium    heavy     n/a
Price Compared to Market Leader    lower     same      higher    n/a

Please base your answer to the following questions on this data. Note that each attribute is coded numerically. For instance, for Chewiness (Low =1, Medium =2, High =3) and similarly for the other attributes reading left to right in the table above.

For each of the attributes, we code the levels into multiple dummy variables to include in our regression. The variables we used are as follows:

          Chew1   Chew2
Level 1   1       0
Level 2   0       1
Level 3   0       0

          SizeLarge
Level 1   0
Level 2   1

          Aroma1   Aroma2
Level 1   1        0
Level 2   0        1
Level 3   0        0

          Color1   Color2   Color3
Level 1   1        0        0
Level 2   0        1        0
Level 3   0        0        1
Level 4   0        0        0

          GiftDummy
Level 1   0
Level 2   1

          Price1   Price2
Level 1   1        0
Level 2   0        1
Level 3   0        0

Note that for each category, the number of variables is equal to the number of levels – 1.

For example, for chewiness, we only need 2 dummy variables to show 3 levels:

  • Chew1 = 1 and Chew2 = 0 indicates level 1 of chewiness.
  • Chew1 = 0 and Chew2 = 1 indicates level 2 of chewiness.
  • If neither Chew1 nor Chew2 is 1, the only remaining possibility is level 3 of chewiness.
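The coding scheme can be sketched as a small helper. The function name `dummy_code` and its `baseline` parameter are illustrative, not part of the original analysis; the baseline is the level whose dummies are all zero (the last level for chewiness, aroma, color, and price, and the first level for package size and free gift).

```python
def dummy_code(level, n_levels, baseline):
    """Return the n_levels - 1 indicator values for `level`.

    The baseline level gets no indicator of its own, so it is
    represented by all zeros.
    """
    if not 1 <= level <= n_levels:
        raise ValueError("level out of range")
    return [1 if level == k else 0
            for k in range(1, n_levels + 1) if k != baseline]

print(dummy_code(1, 3, baseline=3))  # chewiness level 1 -> [Chew1, Chew2]
print(dummy_code(3, 3, baseline=3))  # chewiness level 3 (baseline)
print(dummy_code(2, 2, baseline=1))  # large package -> [SizeLarge]
```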

Regression Results from creating dummy variables

Coefficient   Beta      Std. error   t-value   p-value
intercept     5.2991    0.3240       16.353    0.0000
chew 1        -0.8659   0.2437       -3.5535   0.0006
chew 2        -0.3461   0.2438       -1.4195   0.1593
color 1       0.1211    0.2871       0.4218    0.6742
color 2       0.2145    0.2802       0.7657    0.4459
color 3       0.4799    0.2696       1.7801    0.0785
Size large    0.8992    0.2000       4.4969    0.0000
gift dummy    0.0916    0.2099       0.4365    0.6635
aroma 1       0.5468    0.2563       2.1334    0.0357
aroma 2       0.9715    0.2327       4.1742    0.0001
Price 1       0.6548    0.2157       3.0362    0.0032
Price 2       0.3237    0.2895       1.1180    0.2666

Residual standard error: 0.9418 on 88 degrees of freedom

Multiple R-Squared: 0.4276

F-statistic: 5.976 on 11 and 88 degrees of freedom, the p-value is 3.412e-007

  1. Write down the model that was estimated in the regression, with the name of the variables and their coefficients in the model.
  2. What Likert rating score would you predict for a raisin product that has Low Chewiness, Grey Raisins, Large Package Size, Free Gift, Medium Aroma, and the Same Price as the Market Leader? You may round your answer to the nearest integer.
  3. What product has the highest predicted rating score?
  4. Would you necessarily introduce this product, the one from previous part, if you were the decision maker?  Why or why not?
  5. Suppose that the predicted market share of product j is proportional to Rj, where Rj is the predicted Likert rating of product j; that is, the market share of product j is Rj divided by the sum of the ratings of all products in the market. What would the predicted market share be if the product described in part 2 were introduced into a market consisting of current products i, ii, and iii? The market currently contains the following three products:
    i. High Chewiness, Grey Raisins, Small Package Size, No Free Gift, Medium Aroma, and Same Price as the Market Leader’s (Likert = 6.81)
    ii. Low Chewiness, Brown Raisins, Small Package Size, Free Gift, Medium Aroma, and Same Price as the Market Leader’s (Likert = 6.30)
    iii. Medium Chewiness, Black Raisins, Large Package Size, No Free Gift, No Aroma, and Lower Price than the Market Leader’s (Likert = 7.05)
  6. Do the product attributes (as a whole) provide significant predictive power for the rating scores? Justify your answer.
  7. Which product attributes, if any, have no statistically significant explanatory power for rating scores? State clearly how you arrived at your answer.
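A predicted rating is the intercept plus the coefficients of whichever dummy variables equal 1. The sketch below is a minimal illustration: it reuses the dummy names from the coding tables (Chew1, Color2, and so on) as dictionary keys, and it assumes market share is each rating divided by the sum of ratings. The three existing products serve as a check, since their stated Likert scores should be reproduced.

```python
COEF = {
    "intercept": 5.2991,
    "Chew1": -0.8659, "Chew2": -0.3461,
    "Color1": 0.1211, "Color2": 0.2145, "Color3": 0.4799,
    "SizeLarge": 0.8992, "GiftDummy": 0.0916,
    "Aroma1": 0.5468, "Aroma2": 0.9715,
    "Price1": 0.6548, "Price2": 0.3237,
}

def predict(active_dummies):
    """Intercept plus the coefficients of the dummies set to 1."""
    return COEF["intercept"] + sum(COEF[name] for name in active_dummies)

# Low chewiness, grey, large, free gift, medium aroma, same price.
new_product = {"Chew1", "Color2", "SizeLarge", "GiftDummy", "Aroma2", "Price2"}

# Existing products i, ii, iii from the question.
existing = [
    {"Color2", "Aroma2", "Price2"},
    {"Chew1", "Color3", "GiftDummy", "Aroma2", "Price2"},
    {"Chew2", "SizeLarge", "Aroma1", "Price1"},
]

r_new = predict(new_product)
r_existing = [predict(p) for p in existing]
share_new = r_new / (r_new + sum(r_existing))

print(f"predicted rating = {r_new:.2f}")
print(f"predicted share  = {share_new:.3f}")
```

The same `predict` helper can be looped over every attribute combination to explore which product scores highest.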

In: Math

3. Independent random samples of n1 = 16 and n2 = 13 observations were selected from...

3. Independent random samples of n1 = 16 and n2 = 13 observations were selected from two normal populations with equal variances. The sample means and variances are shown below:

                   Population 1   Population 2
Sample size        16             13
Sample mean        34.6           32.2
Sample variance    4.0            4.84

a) Suppose you wish to test whether there is a difference between the population means at a significance level of α = 0.05. State the null and alternative hypotheses that you use for the test.

b) Find the value of the test statistic.

c) Find the critical value.

d) Conduct the test and state your conclusions.
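Because the populations are stated to have equal variances, the pooled two-sample t statistic applies. A minimal sketch follows; the critical value is taken from a standard t table rather than computed, since the t distribution is not in the Python standard library.

```python
from math import sqrt

n1, mean1, var1 = 16, 34.6, 4.0
n2, mean2, var2 = 13, 32.2, 4.84

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
se = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se

T_CRIT = 2.052   # two-sided, alpha = 0.05, 27 df (table value)
reject_h0 = abs(t_stat) > T_CRIT

print(f"pooled variance = {pooled_var:.4f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if reject_h0 else "fail to reject H0")
```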

In: Math

Paired Samples t-test (30pts) Suppose you are interested in deciding if a particular diet is effective...

Paired Samples t-test (30pts)

Suppose you are interested in deciding if a particular diet is effective in changing people’s weight. You decide to run a “within subject” experiment. You select 6 people and weigh each of them. Two weeks later, you weigh them again. For each person you compute how much weight they lost over this period. This is what you find:

Non-diet (subjects 1-6): 0 7 3 2 -10 -1

You then put them on the diet, weigh them again after two weeks, and compute how much they lost over this period.

Diet (subjects 1-6): 1 6 4 3 -8 2

a) State the null and alternative hypotheses (2pts)

b) Compute the mean and standard deviation of the difference distribution (4pts)

c) How many degrees of freedom do you have? (3pts)

d) Assume the null hypothesis is true and compute the t-statistic (2pts)

e) Compute the P-value (2pts)

f) At alpha = 0.05, would you accept or reject the null hypothesis? (3pts)

Please show work, thank you!
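Parts (b) through (d) can be checked with a short calculation on the per-subject differences. The sketch assumes the difference is taken as diet minus non-diet and that the test is two-sided; the p-value is omitted because the t distribution is not in the Python standard library, so the statistic is compared with a table value instead.

```python
from math import sqrt
from statistics import mean, stdev

non_diet = [0, 7, 3, 2, -10, -1]
diet = [1, 6, 4, 3, -8, 2]

# Assumed direction: diet minus non-diet, one difference per subject.
diffs = [d - nd for d, nd in zip(diet, non_diet)]

n = len(diffs)
mean_diff = mean(diffs)
sd_diff = stdev(diffs)            # sample standard deviation (n - 1 denominator)
df = n - 1
t_stat = mean_diff / (sd_diff / sqrt(n))

T_CRIT = 2.571                    # two-sided, alpha = 0.05, 5 df (table value)
print(f"mean = {mean_diff:.4f}, sd = {sd_diff:.4f}, df = {df}, t = {t_stat:.2f}")
print("reject H0" if abs(t_stat) > T_CRIT else "fail to reject H0")
```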

In: Math

Describe the difference between a one tailed and a two tailed test. What is the difference...

Describe the difference between a one tailed and a two tailed test. What is the difference between a z test and a t test, and how do you determine which one to use? Also, discuss when a two sample test would be used, and provide an example.

In: Math

Diagrams of the normal distribution are almost mandatory Suppose that battery lives are normally distributed with...

Diagrams of the normal distribution are almost mandatory

Suppose that battery lives are normally distributed with a mean of 12.85 hours and a standard deviation of 1.93 hours. What is the minimum sample size that would be required so that the probability of obtaining a sample mean above 13.5 hours is less than 1%?
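Since the standard error shrinks as 1/√n, the tail probability above 13.5 hours falls as n grows, and the smallest qualifying n can be found by direct search. A minimal sketch with illustrative names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 12.85, 1.93
threshold, alpha = 13.5, 0.01

def tail_prob(n):
    """P(sample mean > threshold) for a sample of size n."""
    return 1 - NormalDist(mu, sigma / sqrt(n)).cdf(threshold)

# Increase n until the tail probability drops below alpha.
n = 1
while tail_prob(n) >= alpha:
    n += 1

print(n, tail_prob(n))
```

Solving (13.5 − 12.85) / (1.93/√n) > z for the 99th-percentile z value gives the same answer algebraically.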

In: Math

-Formulate both null and alternative hypotheses for the client, and explain why the hypotheses need to...

-Formulate both null and alternative hypotheses for the client, and explain why the hypotheses need to be directional or non-directional.

-Determine what statistical test should be used to analyze the data.

-Summarize all information used to determine the correct statistical test (e.g., number of groups, type of data collected, independent or repeated measures)

-Provide a sample size and critical values in relation to the hypothesis.

-State what statistical test should be used (be specific since you have all of the information you need to determine the critical value(s)).

-Discuss what the statistical analysis will do in answering the hypotheses and question(s) for the client. Also discuss any potential problems to watch out for, including an appropriate sample size to meet the assumptions of the statistical test.

Client Scenario: Jackson Hole Mind and Body Works

My name is Jane, and I am a licensed counselor who owns a business in Jackson Hole, Wyoming. This past year, I started a couple of laughing yoga classes that combine the anxiety-removing benefits of yoga with the emotional release of laughter. It has been a lot of fun and many clients love it, but a competitor has been criticizing my new approach as a sham and quackery. I am confident that my laughing yoga classes are beneficial, but I would like you to perform a study that examines the impact of my approaches from a scientific perspective. I know that science needs to be objective, so I would like you to set up a study for me as a nonbiased researcher. My thoughts are that I could give you the email addresses of clients who are willing to be in the study, and you would ask each of them questions, and then analyze the results.

I would like you to examine two types of therapy I conduct, my laughing yoga therapy and my more traditional cognitive behavioral therapy. I expect that 40 participants will be available from my laughing yoga classes, and that 50 participants will be available from my traditional cognitive behavioral therapy sessions. It would be nice if you asked questions related to clients’ current healthy living practices and use of positive emotions. You can determine the exact questions to ask clients. I was thinking that I would offer each participant in the study a couple free smoothie drinks at a local juice bar for participating in the research.

In: Math

Write an R function max.streak(p) that gives the length of the maximum "streak" of either all...

Write an R function max.streak(p) that gives the length of the maximum "streak" of either all heads or all tails in 100 flips of a (possibly biased) coin whose probability of showing heads is p.

Use your function to determine the expected length (rounded to the nearest integer) of the maximum streak seen in 100 flips of a coin when the probability of seeing "heads" is 0.70.

As a check of your work, note that the expected length of the maximum streak seen in 100 flips of a fair coin should be very close to 7.
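The question asks for R; the simulation logic is sketched here in Python as an analogue, not as the requested R solution. The names `longest_run`, `max_streak`, and `expected_max_streak`, the trial count, and the seed are all illustrative choices. The expected length is estimated by averaging the maximum streak over many simulated sequences.

```python
import random

def longest_run(flips):
    """Length of the longest run of equal consecutive values (non-empty input)."""
    best = current = 1
    for prev, cur in zip(flips, flips[1:]):
        current = current + 1 if cur == prev else 1
        best = max(best, current)
    return best

def max_streak(p, n_flips=100, rng=random):
    """Simulate n_flips coin flips with P(heads) = p; return the longest streak."""
    flips = [rng.random() < p for _ in range(n_flips)]
    return longest_run(flips)

def expected_max_streak(p, trials=2000, seed=0):
    """Monte Carlo estimate of the expected maximum streak."""
    rng = random.Random(seed)
    return sum(max_streak(p, rng=rng) for _ in range(trials)) / trials

print(expected_max_streak(0.5))   # sanity check against the fair-coin hint
print(expected_max_streak(0.7))
```

In R, the run lengths of a flip vector are available from `rle()`, which would play the role of `longest_run` here.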

In: Math

a. An experiment was performed on a certain metal to determine if the strength is a...

a. An experiment was performed on a certain metal to determine if the strength is a function of heating time (hours). Results based on 25 metal sheets are given below. Use the simple linear regression model.
∑X = 50
∑X² = 200
∑Y = 75
∑Y² = 1600
∑XY = 400

Find the estimated y intercept and slope. Write the equation of the least squares regression line and explain the coefficients. Estimate Y when X is equal to 4 hours. Also determine the standard error, the Mean Square Error, the coefficient of determination and the coefficient of correlation. Check the relation between correlation coefficient and Coefficient of Determination. Test the significance of the slope.
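All of the requested quantities follow from the five sums through the corrected sums of squares Sxx, Sxy, and Syy. The sketch below walks through that arithmetic; variable names are illustrative, and the slope t statistic would be compared with a t table value on n − 2 = 23 degrees of freedom.

```python
from math import sqrt

n = 25
sum_x, sum_x2, sum_y, sum_y2, sum_xy = 50, 200, 75, 1600, 400

# Corrected sums of squares and cross-products.
sxx = sum_x2 - sum_x**2 / n
syy = sum_y2 - sum_y**2 / n
sxy = sum_xy - sum_x * sum_y / n

slope = sxy / sxx
intercept = sum_y / n - slope * sum_x / n
y_at_4 = intercept + slope * 4

sse = syy - slope * sxy            # residual sum of squares
mse = sse / (n - 2)                # mean square error
std_error = sqrt(mse)              # standard error of the estimate
r_squared = sxy**2 / (sxx * syy)   # coefficient of determination
r = sqrt(r_squared) if slope >= 0 else -sqrt(r_squared)
t_slope = slope / (std_error / sqrt(sxx))

print(f"y-hat = {intercept:.2f} + {slope:.2f} x, prediction at x = 4: {y_at_4:.2f}")
print(f"MSE = {mse:.4f}, s = {std_error:.4f}, r^2 = {r_squared:.4f}, r = {r:.4f}")
print(f"t for slope = {t_slope:.3f} on {n - 2} df")
```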

b. Consumer Reports provided extensive testing and ratings for more than 100 HDTVs. An overall score, based primarily on picture quality, was developed for each model. In general, a higher overall score indicates better performance. The following (hypothetical) data show the price and overall score for the ten 42-inch plasma televisions (Consumer Report data slightly changed here):

Brand        Price (X)   Score (Y)
Dell         3800        50
Hisense      2800        45
Hitachi      2700        35
JVC          3000        40
LG           3500        45
Maxent       2000        28
Panasonic    4000        57
Phillips     3200        48
Proview      2000        22
Samsung      3000        30

Use the above data to develop an estimated regression equation. Compute the coefficient of determination and the correlation coefficient and show their relation. Interpret the explanatory power of the model. Estimate the overall score for a 42-inch plasma television with a price of $3600 and perform a significance test for the slope.
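With raw data, the same least squares formulas apply after computing deviations from the means. A minimal sketch using the ten televisions; variable names are illustrative, and the slope t statistic is left for comparison with a t table on 8 degrees of freedom.

```python
from math import sqrt

price = [3800, 2800, 2700, 3000, 3500, 2000, 4000, 3200, 2000, 3000]
score = [50, 45, 35, 40, 45, 28, 57, 48, 22, 30]

n = len(price)
mean_x, mean_y = sum(price) / n, sum(score) / n

sxx = sum((x - mean_x) ** 2 for x in price)
syy = sum((y - mean_y) ** 2 for y in score)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(price, score))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r_squared = sxy**2 / (sxx * syy)
r = sqrt(r_squared) if slope >= 0 else -sqrt(r_squared)

pred_3600 = intercept + slope * 3600

mse = (syy - slope * sxy) / (n - 2)
t_slope = slope / sqrt(mse / sxx)

print(f"score-hat = {intercept:.4f} + {slope:.6f} * price")
print(f"r^2 = {r_squared:.4f}, r = {r:.4f}, prediction at $3600 = {pred_3600:.2f}")
print(f"t for slope = {t_slope:.3f} on {n - 2} df")
```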

In: Math

Number of People Making Contribution Ethnic Group $1-50 $51-100 $101-150 $151-200 Over $200 Row Total A...

Number of People Making Contribution
Ethnic Group    $1-50   $51-100   $101-150   $151-200   Over $200   Row Total
A               82      64        45         38         22          251
B               91      54        67         30         22          264
C               74      68        59         35         30          266
D               98      87        71         54         30          340
Column Total    345     273       242        157        104         1121

(b) Find the value of the chi-square statistic for the sample. (Round the expected frequencies to at least three decimal places. Round the test statistic to three decimal places.)
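The chi-square statistic sums (observed − expected)² / expected over all cells, where each expected count is row total × column total / grand total. A minimal sketch of that computation, with illustrative variable names:

```python
observed = [
    [82, 64, 45, 38, 22],   # A
    [91, 54, 67, 30, 22],   # B
    [74, 68, 59, 35, 30],   # C
    [98, 87, 71, 54, 30],   # D
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence for every cell.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_sq = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(f"chi-square = {chi_sq:.3f} on {df} df")
```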

In: Math

The population of Nevada, P(t), in millions of people, is a function of t, the number...

1. The population of Nevada, P(t), in millions of people, is a function of t, the number of years since 2010. Explain the meaning of the statement P(8) = 3. Use units and everyday language. (1 point)
2. Find the slope-intercept form of the equation of the line through the points (8, 25) and (-2, -13). (2 points)
3. At 8am, Charles leaves his house in Spartanburg, SC and drives at an average speed of 65 miles per hour toward Orlando, FL. At 11:45am, he stops for lunch in Savannah, GA, which is 276.25 miles from Orlando.
a. Find a linear formula that represents Charles’ distance, D, in miles from Orlando as a function of t, time in hours since 8am. (2 points)
b. Find and interpret the horizontal intercept. Remember to write your intercept as a point! (2 points)
c. Find and interpret the vertical intercept. Remember to write your intercept as a point! (2 points)
4. The temperature in °F of freshly prepared soup is given by T(t) = 72 + 118e^(−0.018t), where t represents time in minutes since 6pm when the soup was removed from the stove.
a. Determine the value of T(30) and interpret your answer in everyday language. (2 points)
b. Find and interpret the vertical intercept. Remember to write your intercept as a point! (2 points)
5. Decide whether the following function is linear. Explain how you know without finding the equation of the line.
x       9      12     16    23     34
f(x)    26.6   36.2   49    74.9   110.1
6. Attendance at a local fair can be modeled by A(t) = −30t² + 309t + 20 people, where t represents the number of hours since 10am.
a. Find the average rate of change of the attendance from t = 3 to t = 8. Give units. (2 points)
b. Interpret your answer from (a) in everyday language.
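The numerical parts of problems 2, 3, 4, and 6 can be checked with a few helper functions. The sketch below assumes Charles drives at a constant 65 mph, so his starting distance is the Savannah distance plus the miles covered in 3.75 hours; the helper name `line_through` is illustrative.

```python
from math import exp

def line_through(p, q):
    """Slope and y-intercept of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

# Problem 2: line through (8, 25) and (-2, -13).
slope, intercept = line_through((8, 25), (-2, -13))

# Problem 3: 8am to 11:45am is 3.75 hours at 65 mph.
start_distance = 276.25 + 65 * 3.75

def D(t):
    """Miles from Orlando t hours after 8am."""
    return start_distance - 65 * t

# Problem 4: soup temperature t minutes after 6pm.
def T(t):
    return 72 + 118 * exp(-0.018 * t)

# Problem 6: fair attendance t hours after 10am.
def A(t):
    return -30 * t**2 + 309 * t + 20

avg_rate = (A(8) - A(3)) / (8 - 3)   # people per hour

print(f"y = {slope:.1f}x + ({intercept:.1f})")
print(f"D(0) = {D(0)}, D(8) = {D(8)}")
print(f"T(30) = {T(30):.2f} degrees F")
print(f"average rate of change = {avg_rate} people per hour")
```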

In: Math

NO HANDWRITTEN ANSWERS PLEASE The most common abuse of correlation in studies is to confuse the...

NO HANDWRITTEN ANSWERS PLEASE

The most common abuse of correlation in studies is to confuse the concepts of correlation with those of causation.

Good SAT scores do not cause good college grades, for example. Rather, there are other variables, such as good study habits and motivation, that contribute to both. Find an example of an article that confuses correlation and causation.

Discuss other variables that could contribute to the relationship between the variables.

In: Math

Bass - Samples: The bass in Clear Lake have weights that are normally distributed with a...

Bass - Samples: The bass in Clear Lake have weights that are normally distributed with a mean of 1.9 pounds and a standard deviation of 0.8 pounds.

(a) If you catch 3 random bass from Clear Lake, find the probability that the mean weight is less than 1.0 pound. Round your answer to 4 decimal places.


(b) If you catch 3 random bass from Clear Lake, find the probability that the mean weight is more than 3 pounds. Round your answer to 4 decimal places.
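Because individual weights are normal, the mean of 3 fish is exactly normal with standard deviation 0.8/√3, even for such a small sample. A minimal sketch of both probabilities, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.9, 0.8, 3

# Sampling distribution of the mean weight of 3 bass.
sample_mean_dist = NormalDist(mu, sigma / sqrt(n))

p_below_1 = sample_mean_dist.cdf(1.0)        # part (a)
p_above_3 = 1 - sample_mean_dist.cdf(3.0)    # part (b)

print(f"P(mean < 1.0) = {p_below_1:.4f}")
print(f"P(mean > 3.0) = {p_above_3:.4f}")
```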

In: Math