Analyses of drinking water samples for 100 homes in each of two different sections of a city gave the following means and standard deviations of lead levels (in parts per million).
Statistic | Section 1 | Section 2 |
sample size | 100 | 100 |
mean | 34.5 | 36.2 |
standard deviation | 5.8 | 6.0 |
(a) Calculate the test statistic and its p-value to test for a difference in the two population means. (Use Section 1 − Section 2. Round your test statistic to two decimal places and your p-value to four decimal places.)
z =
p-value =
Use the p-value to evaluate the statistical significance of the results at the 5% level.
a. H0 is not rejected. There is sufficient evidence to indicate a difference in the mean lead levels for the two sections of the city.
b. H0 is rejected. There is sufficient evidence to indicate a difference in the mean lead levels for the two sections of the city.
c. H0 is rejected. There is insufficient evidence to indicate a difference in the mean lead levels for the two sections of the city.
d. H0 is not rejected. There is insufficient evidence to indicate a difference in the mean lead levels for the two sections of the city.
(b) Calculate a 95% confidence interval to estimate the difference in the mean lead levels in parts per million for the two sections of the city. (Use Section 1 − Section 2. Round your answers to two decimal places.)
______ parts per million to ______ parts per million
(c) Suppose that the city environmental engineers will be concerned only if they detect a difference of more than 5 parts per million in the two sections of the city. Based on your confidence interval in part (b), is the statistical significance in part (a) of practical significance to the city engineers? Explain.
a. Since all of the probable values of μ1 − μ2 given by the interval are less than −5, it is likely that the difference will be more than 5 ppm, and hence the statistical significance of the difference is of practical importance to the engineers.
b. Since all of the probable values of μ1 − μ2 given by the interval are greater than 5, it is likely that the difference will be more than 5 ppm, and hence the statistical significance of the difference is of practical importance to the engineers.
c. Since all of the probable values of μ1 − μ2 given by the interval are between −5 and 5, it is not likely that the difference will be more than 5 ppm, and hence the statistical significance of the difference is not of practical importance to the engineers.
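The calculation behind parts (a) and (b) can be sketched in Python as follows. This is a minimal illustration, assuming a large-sample two-sided z test on the summary statistics in the table and the usual 1.96 critical value for a 95% interval; the helper `normal_cdf` is a name introduced here, not part of the question.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, computed from the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

# Summary statistics from the table (Section 1, Section 2)
n1, mean1, sd1 = 100, 34.5, 5.8
n2, mean2, sd2 = 100, 36.2, 6.0

diff = mean1 - mean2                          # Section 1 - Section 2
se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)     # standard error of the difference
z = diff / se
p_value = 2 * normal_cdf(-abs(z))             # two-sided p-value

z_crit = 1.96  # assumed critical value for a two-sided 95% interval
ci_low, ci_high = diff - z_crit * se, diff + z_crit * se

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"95% CI: {ci_low:.2f} to {ci_high:.2f} parts per million")
```

Comparing the printed interval with the ±5 ppm threshold is what part (c) asks for.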
In: Math
A random variable is normally distributed. It has a mean of 245 and a standard deviation of 21.
e.) For a sample of size 35, state the mean of the sample mean and the standard deviation of the sample mean.
f.) For a sample of size 35, find the probability that the sample mean is more than 241.
g.) Compare your answers in part c and f. Why is one smaller than the other?
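Parts e and f can be sketched in Python as below. This is a minimal illustration, assuming the sample mean of n = 35 draws is normal with mean μ and standard deviation σ/√n; the helper `normal_cdf` is a name introduced here for illustration.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, computed from the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

mu, sigma, n = 245, 21, 35

# Part e: sampling distribution of the sample mean
mean_of_sample_mean = mu
sd_of_sample_mean = sigma / math.sqrt(n)

# Part f: P(sample mean > 241)
z = (241 - mu) / sd_of_sample_mean
prob_above_241 = 1 - normal_cdf(z)

print(f"mean = {mean_of_sample_mean}, sd = {sd_of_sample_mean:.4f}")
print(f"P(sample mean > 241) = {prob_above_241:.4f}")
```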
In: Math
Consider the results from a completely randomized design showing commuting times in three states. Use an appropriate Excel ANOVA tool to test for any significant differences in commuting times between the three states. Use α = 0.05.
Illinois | Ohio | Texas |
26.8 | 27.5 | 10.1 |
17.6 | 28.9 | 18.8 |
27 | 19.1 | 31.4 |
20 | 36.9 | 44.2 |
50.7 | 40.8 | 24.6 |
24.4 | 9.5 | 29.5 |
36.8 | 37.4 | 38.1 |
42.2 | 38.9 | 30.3 |
26.3 | 46.2 | 11.7 |
14 | 35.8 | 35.8 |
28.5 | 20.7 | 22.4 |
36.9 | 37.8 | 17 |
25.6 | 49.7 | 15.4 |
25.9 | 44.3 | 15.4 |
29.5 | 12.1 | 6.8 |
29.7 | 43.7 | 14.8 |
30.5 | 35.9 | 59.3 |
20 | 30.2 | 5.3 |
23.2 | 8.5 | 0.6 |
20.7 | 34.6 | 20.7 |
6.2 | 37.9 | 18.6 |
44.2 | 50.9 | 24.9 |
28.2 | 24.2 | 9.3 |
28.8 | 39.1 | 11.9 |
16.6 | 20.4 | 19.6 |
20.2 | 12.4 | 31 |
13.1 | 28 | 25.9 |
16.9 | 28.4 | 52.6 |
32.4 | 19.4 | 38.3 |
19.6 | 42.5 | 34 |
12.8 | 27.2 | 24.9 |
30.2 | 22.6 | 32.1 |
65.1 | 50.8 | 43 |
25.5 | 34.1 | 31.1 |
17.5 | 27.1 | 16.8 |
11.1 | 38.9 | 34.1 |
48.8 | 28.7 | 40.4 |
38.9 | 54.2 | 29.4 |
23.1 | 30.6 | 9.8 |
21.6 | 15.9 | 19.5 |
22.3 | 15.1 | 9.6 |
27.3 | 30.1 | 21.6 |
30.7 | 32.2 | 26.5 |
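The question asks for an Excel ANOVA tool. As a cross-check of what a single-factor ANOVA computes, the F statistic can be sketched in Python. This is an illustration only: `one_way_anova` is a name introduced here, and the `toy` data are made-up numbers chosen so the arithmetic is easy to follow, not the commuting times above.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical toy data, one list per group
toy = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova(toy)
print(f_stat, df_b, df_w)
```

Passing the three state columns as three lists would give the F statistic for this question; with 43 rows per state, the degrees of freedom would be 2 and 126. The p-value still needs an F distribution table or a statistics package.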
In: Math
We are considering a launch of a new type of raisin into the packaged raisin market. To do so, we collected product ratings on a 1-10 Likert-scale from consumers utilizing the following attributes and corresponding levels,
Attribute | Level 1 | Level 2 | Level 3 | Level 4 |
Raisin Chewiness | low | medium | high | n/a |
Raisin Color | white | grey | brown | black |
Packaging Size | small | large | n/a | n/a |
Free Gift | no | yes | n/a | n/a |
Raisin Aroma | none | medium | heavy | n/a |
Price Compared to Market Leader | lower | same | higher | n/a |
Please base your answer to the following questions on this data. Note that each attribute is coded numerically. For instance, for Chewiness (Low =1, Medium =2, High =3) and similarly for the other attributes reading left to right in the table above.
For each of the attributes, we code the levels into multiple dummy variables to include in our regression. The variables we used are as follows:
Note that for each category, the number of variables is equal to the number of levels – 1.
For example, for chewiness, we only need 2 dummy variables to show 3 levels:
Regression Results from creating dummy variables
Coefficients | beta | std. error | t-value | p-value |
intercept | 5.2991 | 0.3240 | 16.353 | 0.0000 |
chew 1 | -0.8659 | 0.2437 | -3.5535 | 0.0006 |
chew 2 | -0.3461 | 0.2438 | -1.4195 | 0.1593 |
color 1 | 0.1211 | 0.2871 | 0.4218 | 0.6742 |
color 2 | 0.2145 | 0.2802 | 0.7657 | 0.4459 |
color 3 | 0.4799 | 0.2696 | 1.7801 | 0.0785 |
Size large | 0.8992 | 0.2000 | 4.4969 | 0.0000 |
gift dummy | 0.0916 | 0.2099 | 0.4365 | 0.6635 |
aroma 1 | 0.5468 | 0.2563 | 2.1334 | 0.0357 |
aroma 2 | 0.9715 | 0.2327 | 4.1742 | 0.0001 |
Price 1 | 0.6548 | 0.2157 | 3.0362 | 0.0032 |
Price 2 | 0.3237 | 0.2895 | 1.1180 | 0.2666 |
Residual standard error: 0.9418 on 88 degrees of freedom
Multiple R-Squared: 0.4276
F-statistic: 5.976 on 11 and 88 degrees of freedom, the p-value is 3.412e-007
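The "levels − 1" rule for dummy variables can be checked against the regression output with a short Python sketch. The baseline choice here (level 1 coded as all zeros) is an assumption for illustration, since the original variable definitions are not shown, and `dummy_code` is a name introduced here.

```python
# Number of levels per attribute, read from the attribute table
levels = {
    "chewiness": 3,
    "color": 4,
    "size": 2,
    "gift": 2,
    "aroma": 3,
    "price": 3,
}

def dummy_code(level, n_levels):
    """Return n_levels - 1 indicators; level 1 is the assumed baseline (all zeros)."""
    return [1 if level == k else 0 for k in range(2, n_levels + 1)]

# One dummy fewer than the number of levels, summed over all attributes
total_dummies = sum(n - 1 for n in levels.values())
print(total_dummies)

# Chewiness: 3 levels are represented by 2 dummy variables
for level in (1, 2, 3):
    print(level, dummy_code(level, 3))
```

The total matches the 11 non-intercept coefficients in the table and the 11 numerator degrees of freedom reported for the F-statistic.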
In: Math
3. Independent random samples of n1 = 16 and n2 = 13 observations were selected from two normal populations with equal variances. The sample means and variances are shown below:
Statistic | Population 1 | Population 2 |
Sample size | 16 | 13 |
Sample mean | 34.6 | 32.2 |
Sample variance | 4.0 | 4.84 |
a) Suppose you wish to test if there is a difference between the population means with a significance level of α = 0.05. State the null and alternative hypotheses that you use for the test.
b) Find the value of the test statistic.
c) Find the critical value.
d) Conduct the test and state your conclusions.
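The pooled-variance t test for this question can be sketched in Python as follows. This is a minimal illustration, assuming a two-sided test of H0: μ1 = μ2 against Ha: μ1 ≠ μ2; the critical value 2.052 is the standard table entry for t with 27 degrees of freedom at an upper-tail area of 0.025, entered here by hand rather than computed.

```python
import math

n1, mean1, var1 = 16, 34.6, 4.0
n2, mean2, var2 = 13, 32.2, 4.84

df = n1 + n2 - 2
# Pooled variance weights each sample variance by its degrees of freedom
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se

t_crit = 2.052  # table value for t(0.025, 27), assumed from a t table
reject_h0 = abs(t_stat) > t_crit

print(f"df = {df}, pooled variance = {pooled_var:.4f}")
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```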
In: Math
Paired Samples t-test (30pts) Suppose you are interested in deciding if a particular diet is effective in changing people’s weight. You decide to run a “within subject” experiment. You select 6 people and weigh each of them. Two weeks later, you weigh them again. For each person you compute how much weight they lost over this period. This is what you find:
Non-diet (subjects 1-6): 0 7 3 2 -10 -1
You then put them on the diet, weigh them again after two weeks, and compute how much they lost over this period:
Diet (subjects 1-6): 1 6 4 3 -8 2
a) State the null and alternative hypotheses (2pts)
b) Compute the mean and standard deviation of the difference distribution (4pts)
c) How many degrees of freedom do you have? (3pts)
d) Assume the Null Hypothesis is True and compute the t-statistic (2-pts)
e) Compute the P-value (2pts)
f) At an alpha = 0.05 would you accept or reject the null hypothesis? (3pts)
Please show work, thank you!
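Parts b through d can be sketched in Python as below. This is a minimal illustration, assuming the differences are taken as diet minus non-diet for each subject (the opposite order flips the sign of the mean and of t) and that the sample standard deviation with n − 1 in the denominator is intended.

```python
from math import sqrt
from statistics import mean, stdev

non_diet = [0, 7, 3, 2, -10, -1]
diet = [1, 6, 4, 3, -8, 2]

# Per-subject difference in weight lost (diet minus non-diet, an assumed order)
diffs = [d - nd for d, nd in zip(diet, non_diet)]

mean_diff = mean(diffs)
sd_diff = stdev(diffs)          # sample standard deviation (n - 1)
df = len(diffs) - 1
t_stat = mean_diff / (sd_diff / sqrt(len(diffs)))

print(diffs)
print(f"mean = {mean_diff:.4f}, sd = {sd_diff:.4f}, df = {df}, t = {t_stat:.2f}")
```

The p-value for part e comes from a t distribution with `df` degrees of freedom, which needs a table or a statistics package.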
In: Math
Describe the difference between a one tailed and a two tailed test. What is the difference between a z test and a t test, and how do you determine which one to use? Also, discuss when a two sample test would be used, and provide an example.
In: Math
-Formulate both null and alternative hypotheses for the client, and explain why the hypotheses need to be directional or non-directional.
-Determine what statistical test should be used to analyze the data.
-Summarize all information used to determine the correct statistical test (e.g., number of groups, type of data collected, independent or repeated measures)
-Provide a sample size and critical values in relation to the hypothesis.
-State what statistical test should be used (be specific since you have all of the information you need to determine the critical value(s)).
-Discuss what the statistical analysis will do in answering the hypotheses and question(s) for the client. Also discuss any potential problems to watch out for, including an appropriate sample size to meet the assumptions of the statistical test.
Client Scenario: Jackson Hole Mind and Body Works
My name is Jane, and I am a licensed counselor who owns a business in Jackson Hole, Wyoming. This past year, I started a couple of laughing yoga classes that combine the anxiety-removing benefits of yoga with the emotional release of laughter. It has been a lot of fun and many clients love it, but a competitor has been criticizing my new approach as a sham and quackery. I am confident that my laughing yoga classes are beneficial, but I would like you to perform a study that examines the impact of my approaches from a scientific perspective. I know that science needs to be objective, so I would like you to set up a study for me as an unbiased researcher. My thoughts are that I could give you the email addresses of clients who are willing to be in the study, and you would ask each of them questions, and then analyze the results.
I would like you to examine two types of therapy I conduct, my laughing yoga therapy and my more traditional cognitive behavioral therapy. I expect that 40 participants will be available from my laughing yoga classes, and that 50 participants will be available from my traditional cognitive behavioral therapy sessions. It would be nice if you asked questions related to clients’ current healthy living practices and use of positive emotions. You can determine the exact questions to ask clients. I was thinking that I would offer each participant in the study a couple free smoothie drinks at a local juice bar for participating in the research.
In: Math
Write an R function max.streak(p) that gives the length of the maximum "streak" of either all heads or all tails in 100 flips of a (possibly biased) coin whose probability of showing heads is p.
Use your function to determine the expected length (rounded to the nearest integer) of the maximum streak seen in 100 flips of a coin when the probability of seeing "heads" is 0.70.
As a check of your work, note that the expected length of the maximum streak seen in 100 flips of a fair coin should be very close to 7.
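The question asks for an R function. The simulation logic itself can be sketched in Python, which may help when translating to R. This is an illustration only: `longest_run`, `max_streak`, and `expected_max_streak` are names introduced here, and the trial count and seed are arbitrary choices.

```python
import random

def longest_run(flips):
    """Length of the longest run of identical values in a non-empty sequence."""
    best = current = 1
    for prev, nxt in zip(flips, flips[1:]):
        current = current + 1 if nxt == prev else 1
        best = max(best, current)
    return best

def max_streak(p, n_flips=100, rng=random):
    """Simulate n_flips coin flips with P(heads) = p; return the longest streak."""
    flips = [rng.random() < p for _ in range(n_flips)]
    return longest_run(flips)

def expected_max_streak(p, trials=5000, seed=0):
    """Monte Carlo estimate of the expected maximum streak length."""
    rng = random.Random(seed)
    return sum(max_streak(p, rng=rng) for _ in range(trials)) / trials

print(round(expected_max_streak(0.5)))   # fair-coin check from the question
print(round(expected_max_streak(0.7)))   # the value the question asks for
```

Averaging over many simulated sequences is one way to estimate the expectation; more trials give a tighter estimate at the cost of run time.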
In: Math
a. An experiment was performed on a certain
metal to determine if the strength is a function of heating time
(hours). Results based on 25 metal sheets are given below. Use the
simple linear regression model.
∑X = 50
∑X² = 200
∑Y = 75
∑Y² = 1600
∑XY = 400
Find the estimated y intercept and slope. Write the equation of the
least squares regression line and explain the coefficients.
Estimate Y when X is equal to 4 hours. Also determine the standard
error, the Mean Square Error, the coefficient of determination and
the coefficient of correlation. Check the relation between
correlation coefficient and Coefficient of Determination. Test the
significance of the slope.
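Every quantity part a asks for follows from the five sums and n = 25. A Python sketch under the usual simple linear regression formulas is given below; the variable names are introduced here for illustration, and the sums are read as ∑X² = 200 and ∑Y² = 1600.

```python
import math

n = 25
sum_x, sum_x2 = 50, 200
sum_y, sum_y2 = 75, 1600
sum_xy = 400

# Corrected sums of squares and cross-products
sxx = sum_x2 - sum_x**2 / n
syy = sum_y2 - sum_y**2 / n
sxy = sum_xy - sum_x * sum_y / n

slope = sxy / sxx
intercept = sum_y / n - slope * sum_x / n
y_hat_4 = intercept + slope * 4           # estimate at X = 4 hours

sse = syy - slope * sxy                   # residual sum of squares
mse = sse / (n - 2)                       # Mean Square Error
std_error = math.sqrt(mse)                # standard error of the estimate
r2 = 1 - sse / syy                        # coefficient of determination
r = math.copysign(math.sqrt(r2), slope)   # correlation takes the slope's sign
t_slope = slope / (std_error / math.sqrt(sxx))   # t statistic, n - 2 df

print(f"y = {intercept:.2f} + {slope:.2f}x, y_hat(4) = {y_hat_4:.2f}")
print(f"MSE = {mse:.4f}, s = {std_error:.4f}, r2 = {r2:.4f}, r = {r:.4f}")
print(f"t for slope = {t_slope:.3f}")
```

The line `r = copysign(sqrt(r2), slope)` is the relation the question asks to check: the squared correlation coefficient equals the coefficient of determination.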
b. Consumer Reports provided extensive testing and ratings for more than 100 HDTVs. An overall score, based primarily on picture quality, was developed for each model. In general, a higher overall score indicates better performance. The following (hypothetical) data show the price and overall score for the ten 42-inch plasma televisions (Consumer Report data slightly changed here):
Brand | Price (X) | Score (Y) |
Dell | 3800 | 50 |
Hisense | 2800 | 45 |
Hitachi | 2700 | 35 |
JVC | 3000 | 40 |
LG | 3500 | 45 |
Maxent | 2000 | 28 |
Panasonic | 4000 | 57 |
Phillips | 3200 | 48 |
Proview | 2000 | 22 |
Samsung | 3000 | 30 |
Use the above data to develop an estimated regression equation. Compute the coefficient of determination and the correlation coefficient and show their relation. Interpret the explanatory power of the model. Estimate the overall score for a 42-inch plasma television with a price of $3600 and perform a significance test for the slope.
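For part b, the least squares fit can be computed directly from the ten price and score pairs. The Python sketch below is a minimal illustration using the standard formulas; the variable names are introduced here, and the p-value for the slope still needs a t table with n − 2 = 8 degrees of freedom.

```python
import math

price = [3800, 2800, 2700, 3000, 3500, 2000, 4000, 3200, 2000, 3000]
score = [50, 45, 35, 40, 45, 28, 57, 48, 22, 30]

n = len(price)
x_bar = sum(price) / n
y_bar = sum(score) / n

# Sums of squares and cross-products around the means
sxx = sum((x - x_bar) ** 2 for x in price)
syy = sum((y - y_bar) ** 2 for y in score)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(price, score))

slope = sxy / sxx
intercept = y_bar - slope * x_bar
r2 = sxy**2 / (sxx * syy)                  # coefficient of determination
r = math.copysign(math.sqrt(r2), slope)    # correlation coefficient

predicted_3600 = intercept + slope * 3600

mse = (syy - slope * sxy) / (n - 2)
t_slope = slope / math.sqrt(mse / sxx)     # compare with a t table, 8 df

print(f"score = {intercept:.4f} + {slope:.6f} * price")
print(f"r2 = {r2:.4f}, r = {r:.4f}, predicted score at $3600 = {predicted_3600:.2f}")
print(f"t for slope = {t_slope:.3f}")
```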
In: Math
Number of People Making Contribution
Ethnic Group | $1-50 | $51-100 | $101-150 | $151-200 | Over $200 | Row Total |
A | 82 | 64 | 45 | 38 | 22 | 251 |
B | 91 | 54 | 67 | 30 | 22 | 264 |
C | 74 | 68 | 59 | 35 | 30 | 266 |
D | 98 | 87 | 71 | 54 | 30 | 340 |
Column Total | 345 | 273 | 242 | 157 | 104 | 1121 |
(b) Find the value of the chi-square statistic for the sample. (Round the expected frequencies to at least three decimal places. Round the test statistic to three decimal places.)
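The chi-square statistic for the table can be sketched in Python as follows. This is a minimal illustration, assuming the usual test of independence, where each expected count is (row total × column total) / grand total; the variable names are introduced here.

```python
# Observed counts: rows are ethnic groups A-D, columns are contribution brackets
observed = [
    [82, 64, 45, 38, 22],
    [91, 54, 67, 30, 22],
    [74, 68, 59, 35, 30],
    [98, 87, 71, 54, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence for each cell
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(f"chi-square = {chi_square:.3f}, df = {df}")
```

The computed row and column totals can be compared with the totals printed in the table as a check on data entry.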
In: Math
1. The population of Nevada, P(t), in millions of people, is a
function of t, the number of years since 2010. Explain the meaning
of the statement P(8) = 3. Use units and everyday language. (1 point)
2. Find the slope-intercept form of the equation of the line
through the points (8, 25) and (-2, -13). (2 points)
3. At 8am, Charles leaves his house in Spartanburg, SC and drives
at an average speed of 65 miles per hour toward Orlando, FL. At
11:45am, he stops for lunch in Savannah, GA, which is 276.25 miles
from Orlando. a. Find a linear formula that represents Charles’
distance, D, in miles from Orlando as a function of t, time in
hours since 8am. (2 points)
b. Find and interpret the horizontal intercept. Remember to write
your intercept as a point! (2 points)
c. Find and interpret the vertical intercept. Remember to write
your intercept as a point! (2 points)
4. The temperature in °F of freshly prepared soup is given by T(t)
= 72 + 118e^(−0.018t), where t represents time in minutes since 6pm
when the soup was removed from the stove. a. Determine the value of
T(30) and interpret your answer in everyday language. (2 points)
b. Find and interpret the vertical intercept. Remember to write
your intercept as a point! (2 points)
5. Decide whether the following function is linear. Explain how you
know without finding the equation of the line.
x | 9 | 12 | 16 | 23 | 34 |
f(x) | 26.6 | 36.2 | 49 | 74.9 | 110.1 |
6. Attendance at a local fair can be modeled by A(t) = −30t² + 309t
+ 20 people, where t represents the number of hours since 10am. a.
Find the average rate of change of the attendance from t = 3 to t =
8. Give units. (2 points)
b. Interpret your answer from (a) in everyday language.
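The arithmetic in questions 2 through 6 can be checked with a short Python sketch. This is an illustration only: it reads the soup formula as T(t) = 72 + 118e^(−0.018t) and the attendance formula as A(t) = −30t² + 309t + 20, and the function and variable names are introduced here.

```python
import math

# Question 2: line through (8, 25) and (-2, -13)
x1, y1, x2, y2 = 8, 25, -2, -13
slope = (y2 - y1) / (x2 - x1)
intercept = y1 - slope * x1

# Question 3: 11:45am is t = 3.75 hours, when Charles is 276.25 miles from Orlando
speed = 65
start_distance = 276.25 + speed * 3.75     # vertical intercept, D(0)
def distance(t):
    return start_distance - speed * t
arrival_time = start_distance / speed      # horizontal intercept, D(t) = 0

# Question 4: soup temperature in degrees F, t in minutes since 6pm
def soup_temp(t):
    return 72 + 118 * math.exp(-0.018 * t)

# Question 5: a linear function has the same slope between every pair of points
xs = [9, 12, 16, 23, 34]
fs = [26.6, 36.2, 49, 74.9, 110.1]
pair_slopes = [(fs[i + 1] - fs[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

# Question 6: average rate of change of attendance from t = 3 to t = 8
def attendance(t):
    return -30 * t**2 + 309 * t + 20
avg_rate = (attendance(8) - attendance(3)) / (8 - 3)   # people per hour

print(f"y = {slope:.1f}x + ({intercept:.1f})")
print(f"D(t) = {start_distance} - {speed}t, reaches 0 at t = {arrival_time}")
print(f"T(0) = {soup_temp(0):.1f}, T(30) = {soup_temp(30):.2f}")
print([round(s, 2) for s in pair_slopes])
print(f"average rate of change = {avg_rate} people per hour")
```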
In: Math
NO HANDWRITTEN ANSWERS PLEASE
The most common abuse of correlation in studies is to confuse the concepts of correlation with those of causation.
Good SAT scores do not cause good college grades, for example. Rather, there are other variables, such as good study habits and motivation, that contribute to both. Find an example of an article that confuses correlation and causation.
Discuss other variables that could contribute to the relationship between the variables.
In: Math
Bass - Samples: The bass in Clear Lake have weights that are normally distributed with a mean of 1.9 pounds and a standard deviation of 0.8 pounds.
(a) If you catch 3 random bass from Clear Lake, find the probability that the mean weight is less than 1.0 pound. Round your answer to 4 decimal places.
(b) If you catch 3 random bass from Clear Lake, find the probability that the mean weight is more than 3 pounds. Round your answer to 4 decimal places.
In: Math