What is the normal curve and why is it so important to the world of inferential statistics?
In: Statistics and Probability
You are a nurse working in a distress center and would like to study the self-esteem of domestic violence victims. You would like to know if self-esteem is associated with education level. You know from the existing literature that self-esteem scores are generally normally distributed with homogeneous variance across education groups. The table below shows the data that you collected for a random sample of clients that recently visited your center. Let the probability of committing a type I error be 0.05. Can you conclude that there is a difference in self-esteem across the education groups? If an overall significant difference is found, which pairs of individual sample means are significantly different?
Less than High School Diploma | High School Diploma | Some College | Bachelor’s Degree and Above |
17 | 22 | 24 | 26 |
15 | 23 | 25 | 27 |
14 | 24 | 26 | 28 |
16 | 25 | 24 | 29 |
17 | 26 | 28 | 30 |
26 | 27 | 29 | 31 |
15 | 28 | 27 | 32 |
18 | 20 | 26 | 33 |
19 | 18 | 25 | 34 |
21 | 20 | 23 | 35 |
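One way to approach the overall test is a one-way ANOVA. The sketch below computes the F statistic by hand from the four columns of the table; the variable names are illustrative, and the code stops at the F statistic rather than producing a p-value. The result would then be compared against the critical value from an F table for (3, 36) degrees of freedom at the 0.05 level (about 2.87). Pairwise comparisons (for example Tukey's HSD) would be a separate follow-up step if the overall test is significant.

```python
# Sketch of a one-way ANOVA F statistic, computed by hand.
# Data are the four education-group columns from the table above.
groups = {
    "Less than High School Diploma": [17, 15, 14, 16, 17, 26, 15, 18, 19, 21],
    "High School Diploma": [22, 23, 24, 25, 26, 27, 28, 20, 18, 20],
    "Some College": [24, 25, 26, 24, 28, 29, 27, 26, 25, 23],
    "Bachelor's Degree and Above": [26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
}


def mean(values):
    return sum(values) / len(values)


all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Between-group sum of squares: group size times squared distance
# of each group mean from the grand mean.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# Within-group sum of squares: squared distance of each score
# from its own group mean.
ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"df = ({df_between}, {df_within})")
print(f"F = {f_stat:.2f}")
```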
In: Statistics and Probability
A nursing professor was curious as to whether the students in a very large class
she was teaching who turned in their tests first scored differently from the overall
mean on the test. The overall mean score on the test was 75 with a standard
deviation of 10; the scores were approximately normally distributed. The mean
score for the first 20 tests was 78. Did the students turning in their tests first score
significantly differently from the mean?
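Since the population standard deviation is given, one reasonable setup is a two-sided one-sample z-test. The sketch below assumes a 0.05 significance level, which the question does not state, and uses illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the problem statement.
population_mean = 75
population_sd = 10
sample_size = 20
sample_mean = 78

# Assumed significance level (not given in the question).
alpha = 0.05

# Standard error of the mean and the z statistic.
standard_error = population_sd / sqrt(sample_size)
z = (sample_mean - population_mean) / standard_error

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```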
In: Statistics and Probability
A national chain of women’s clothing stores with locations in the large shopping malls thinks that it can do a better job of planning more renovations and expansions if it understands what variables impact sales. It plans a small pilot study on stores in 25 different mall locations. The data it collects consist of monthly sales, store size (sq. ft), number of linear feet of window display, number of competitors located in mall, size of the mall (sq. ft), and distance to nearest competitor (ft).
1. Test the individual regression coefficients. At the 0.05 level of significance, what are your conclusions?
2. If you were going to drop just one variable from the model, which one would you choose? Why?
The store planners for the women’s clothing chain want to find the best model that they can for understanding what store characteristics impact monthly sales.
3. Use stepwise regression to find the best model for the data.
4. Analyze the model you have identified to determine whether it has any problems.
5. Write a memo reporting your findings to your boss. Identify the strengths and weaknesses of the model you have chosen.
Sales | Size | Windows | Competitors | Mall Size | Nearest Competitor |
4453 | 3860 | 39 | 12 | 943700 | 227 |
4770 | 4150 | 41 | 15 | 532500 | 142 |
4821 | 3880 | 39 | 15 | 390500 | 263 |
4912 | 4000 | 39 | 13 | 545500 | 219 |
4774 | 4140 | 40 | 10 | 329600 | 232 |
4638 | 4370 | 48 | 14 | 802600 | 257 |
4076 | 3570 | 37 | 16 | 463300 | 241 |
3967 | 3870 | 39 | 16 | 855200 | 220 |
4000 | 4020 | 44 | 21 | 443000 | 188 |
4379 | 3990 | 38 | 16 | 613400 | 209 |
5761 | 4930 | 50 | 15 | 420300 | 220 |
3561 | 3540 | 34 | 15 | 626700 | 167 |
4145 | 3950 | 36 | 14 | 601500 | 187 |
4406 | 3770 | 36 | 12 | 593000 | 199 |
4972 | 3940 | 38 | 11 | 347100 | 204 |
4414 | 3590 | 35 | 10 | 355900 | 146 |
4363 | 4090 | 38 | 13 | 490100 | 206 |
4499 | 4580 | 45 | 16 | 649200 | 144 |
3573 | 3580 | 35 | 18 | 685900 | 178 |
5287 | 4380 | 42 | 15 | 106200 | 149 |
5339 | 4330 | 40 | 10 | 354900 | 231 |
4656 | 4060 | 37 | 11 | 598700 | 225 |
3943 | 3380 | 34 | 16 | 381800 | 163 |
5121 | 4760 | 44 | 17 | 597900 | 224 |
4557 | 3800 | 36 | 14 | 745300 | 195 |
In: Statistics and Probability
What does a researcher look for when manipulating a variable?
In: Statistics and Probability
In this problem, we explore the effect on the standard deviation of multiplying each data value in a data set by the same constant. Consider the data set 14, 16, 13, 7, 8.
(a) Use the defining formula, the computation formula, or a
calculator to compute s. (Round your answer to one decimal
place.)
s =
(b) Multiply each data value by 7 to obtain the new data set 98,
112, 91, 49, 56. Compute s. (Round your answer to one
decimal place.)
s =
(c) Compare the results of parts (a) and (b). In general, how does
the standard deviation change if each data value is multiplied by a
constant c?
Multiplying each data value by the same constant c results in the standard deviation being |c| times as large.
Multiplying each data value by the same constant c results in the standard deviation increasing by c units.
Multiplying each data value by the same constant c results in the standard deviation remaining the same.
Multiplying each data value by the same constant c results in the standard deviation being |c| times smaller.
(d) You recorded the weekly distances you bicycled in miles and
computed the standard deviation to be s = 2.9 miles. Your
friend wants to know the standard deviation in kilometers. Do you
need to redo all the calculations?
Yes
No
Given 1 mile ≈ 1.6 kilometers, what is the standard deviation in
kilometers? (Enter your answer to two decimal places.)
s = km
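The scaling behaviour in parts (a) through (d) can be checked numerically. The sketch below uses the sample standard deviation from Python's standard library; the variable names are illustrative, and the mile-to-kilometer factor of 1.6 is the approximation given in the problem.

```python
from statistics import stdev

# Part (a): sample standard deviation of the original data.
data = [14, 16, 13, 7, 8]
s_original = stdev(data)

# Part (b): multiply every value by 7 and recompute.
c = 7
scaled = [c * x for x in data]
s_scaled = stdev(scaled)

# Part (c): the ratio of the two standard deviations should equal |c|.
ratio = s_scaled / s_original

# Part (d): rescale s directly instead of redoing the calculation.
s_miles = 2.9
km_per_mile = 1.6  # approximation given in the problem
s_km = s_miles * km_per_mile

print(f"s (original) = {s_original:.1f}")
print(f"s (scaled)   = {s_scaled:.1f}")
print(f"ratio        = {ratio:.1f}")
print(f"s in km      = {s_km:.2f}")
```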
In: Statistics and Probability
{4 marks} Here, we will quickly investigate the importance of understanding conditional probabilities when talking with medical patients. This problem is based on a true investigation by Hoffrage and Gigerenzer in 1996. The investigators asked practicing physicians to consider the following scenario:
“The probability that a randomly chosen woman age 40-50 has breast cancer is 1%. If a woman has breast cancer, the probability that she will have a positive mammogram is 80%. However, if a woman does not have breast cancer, the probability she will have a positive mammogram is 10%. Imagine that you are consulted by a woman, age 40-50, who has a positive mammogram but no other symptoms. What is the probability that she actually has breast cancer?”
Twenty-four physicians were asked to respond. The average probability estimate was 70%. Using your knowledge of Bayes’ Rule, determine if the physicians were close in their estimate. Comment on where the error in their judgement may have occurred, and why this may cause problems in their practice.
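The Bayes' Rule calculation for the scenario can be laid out as a short sketch. The three input probabilities come from the quoted scenario; the variable names are illustrative.

```python
# Probabilities stated in the scenario.
p_cancer = 0.01                 # P(cancer)
p_pos_given_cancer = 0.80       # P(positive | cancer)
p_pos_given_no_cancer = 0.10    # P(positive | no cancer)

# Total probability of a positive mammogram.
p_positive = (
    p_pos_given_cancer * p_cancer
    + p_pos_given_no_cancer * (1 - p_cancer)
)

# Bayes' Rule: P(cancer | positive).
p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_positive

print(f"P(positive)          = {p_positive:.3f}")
print(f"P(cancer | positive) = {p_cancer_given_pos:.4f}")
```

The posterior can then be compared with the physicians' average estimate of 70%. A common explanation for a gap of this kind is confusing P(positive | cancer) with P(cancer | positive).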
In: Statistics and Probability
The probability of getting a single pair in a poker hand of 5 cards is approximately 0.42. Find the approximate probability that out of 1000 poker hands there will be at least 450 with a single pair.
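One common approach here is the normal approximation to the binomial distribution. The sketch below applies a continuity correction, which is an assumption about the intended method; without it the answer differs slightly. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 1000   # number of poker hands
p = 0.42   # approximate probability of a single pair

# Mean and standard deviation of the binomial count.
mean = n * p
sd = sqrt(n * p * (1 - p))

# P(X >= 450) with a continuity correction: use 449.5 as the cutoff.
approx = NormalDist(mean, sd)
p_at_least_450 = 1 - approx.cdf(449.5)

print(f"mean = {mean:.0f}, sd = {sd:.3f}")
print(f"P(X >= 450) is approximately {p_at_least_450:.4f}")
```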
In: Statistics and Probability
> convert(8.3,"inches")
[1] 0.21082
> convert(8.3,"feet")
[1] 2.52984
> convert(8.3,"foot")
Error in convert(8.3, "foot") : the unit was not "inches" or "feet"
Use your function to convert the values of Girth and Height to meters. Using cor.test() function, calculate the correlation (and the corresponding p-value) between Girth and Height in meters.
In: Statistics and Probability
4. (5 pts) Acme Outdoors Co. is introducing a new line of sport all-terrain vehicles (ATVs). Acme is considering marketing
program proposals from two competing companies: A & B. A's marketing plan is expected to generate high sales with a 75%
probability and only a 25% likelihood of low sales.
An alternative marketing plan from company B could result in an initial 60% likelihood of high ATV sales and 40% probability
of low sales. But company B's offer also contains a provision for an optional follow-on promotion if a low response is returned.
If an optional follow-on ATV promotion is conducted by company B, there is a 70% chance of ultimately realizing high sales.
Draw an appropriate decision tree diagram with branches labeled and probabilities assigned. Do not solve!
In: Statistics and Probability
What is "sampling bias"? Explain using proper terminology and craft your example to explain how it can affect the outcome of a statistical study. This should be several paragraphs long.
In: Statistics and Probability
Define p-values. Explain the two methods of interpreting p-values.
In: Statistics and Probability
6.5.4
According to the WHO MONICA Project the mean blood pressure for people in China is 128 mmHg with a standard deviation of 23 mmHg (Kuulasmaa, Hense & Tolonen, 1998). Blood pressure is normally distributed.
6.5.6
The mean cholesterol levels of women age 45-59 in Ghana, Nigeria, and Seychelles is 5.1 mmol/l and the standard deviation is 1.0 mmol/l (Lawes, Hoorn, Law & Rodgers, 2004). Assume that cholesterol levels are normally distributed.
In: Statistics and Probability
A simple random sample of size n = 400 individuals who are currently employed is asked if they work at home at least once per week. Of the 400 employed individuals surveyed, 40 responded that they did work at home at least once per week. Construct a 99% confidence interval for the population proportion of employed individuals who work at home at least once per week.
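A minimal sketch of the interval, assuming the standard large-sample (Wald) formula for a proportion is the intended method. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 400          # sample size
successes = 40   # respondents who work at home at least once per week
confidence = 0.99

p_hat = successes / n

# Critical z value for a two-sided 99% interval.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Wald interval: p_hat plus or minus z * sqrt(p_hat * (1 - p_hat) / n).
margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.3f}, z = {z_crit:.3f}")
print(f"99% CI: ({lower:.4f}, {upper:.4f})")
```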
In: Statistics and Probability
A population has a mean of 300 and a standard deviation of 90. Suppose a sample of size 125 is selected and is used to estimate the population mean μ. Use the z-table.
What is the probability that the sample mean will be within +/- 3 of the population mean (to 4 decimals)? (Round z value in intermediate calculations to 2 decimal places.)
What is the probability that the sample mean will be within +/- 11 of the population mean (to 4 decimals)? (Round z value in intermediate calculations to 2 decimal places.)
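Both parts follow from the sampling distribution of the sample mean. The sketch below rounds z to two decimals, as the problem asks, and uses the standard normal CDF in place of a printed z-table, so the last digit may differ slightly from a table lookup. The helper name `prob_within` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

population_sd = 90
sample_size = 125

# Standard error of the sample mean.
standard_error = population_sd / sqrt(sample_size)


def prob_within(distance):
    """P(|sample mean - population mean| <= distance), z rounded to 2 decimals."""
    z = round(distance / standard_error, 2)
    return 2 * NormalDist().cdf(z) - 1


p_within_3 = prob_within(3)
p_within_11 = prob_within(11)

print(f"standard error = {standard_error:.4f}")
print(f"P(within 3)    = {p_within_3:.4f}")
print(f"P(within 11)   = {p_within_11:.4f}")
```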
In: Statistics and Probability