A drug company tested three formulations of a pain relief medicine for migraine headache sufferers. For the experiment, 27 volunteers were selected and 9 were randomly assigned to each of the three drug formulations. The subjects were instructed to take the drug during their next migraine headache episode and to report their pain on a scale of 1 to 10 (10 being the most pain).
Drug A | Drug B | Drug C |
4 | 6 | 6 |
5 | 8 | 7 |
4 | 4 | 6 |
3 | 5 | 6 |
2 | 4 | 7 |
4 | 6 | 5 |
3 | 5 | 6 |
4 | 8 | 5 |
4 | 6 | 5 |
1) Fill in the blanks
response variable =
factor =
n = number of observations =
p = number of treatments =
overall mean =
 | Drug A | Drug B | Drug C |
sample mean | | | |
Total variation in the data (SSTO) =
Variation caused by the drugs (SST) =
Variation caused by the samples (SSE) =
2) Fill in the ANOVA table (alpha=0.05)
Source of Variation | SS | df | MS | F | p-value | F-critical |
Treatments | | | | | | |
Samples | | | | | | |
Total | | | | | | |
3) Based on your ANOVA table, write down your conclusion about the null hypothesis (H0: µA=µB=µC).
If you reject the null, what does that mean? Write it in one sentence.
4) Build Tukey simultaneous 95% confidence intervals for the following:
Tukey Simultaneous C.I. | µA-µB | µA-µC | µB-µC |
Point Estimate | | | |
Standard error | | | |
df = p and n-p -----> | q alpha |
Margin of error | | | |
Lower Limit | | | |
Upper Limit | | | |
5) Build individual 95% confidence intervals for the following:
Individual C.I. | µA-µB | µA-µC | µB-µC | µA |
Point Estimate | | | | |
Standard error | | | | |
df = n-p -----> | t alpha/2 |
Margin of error | | | | |
Lower Limit | | | | |
Upper Limit | | | | |
6) Based on your answers to parts 4 and 5, write down your conclusions about which drug beats which.
7) Create the MegaStat output and fill in the blanks using the post-hoc analysis in the MegaStat output:
What is the p-value for comparing drugs A and B?
What is the p-value for comparing drugs C and B?
Did we find evidence that the effects of drugs B and C on pain relief are different from each other?
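For parts 1 and 2, the sums of squares and the F statistic can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are my own, and the p-value and F-critical columns still need an F table or MegaStat.

```python
# One-way ANOVA sums of squares for the three drug formulations.
drug_a = [4, 5, 4, 3, 2, 4, 3, 4, 4]
drug_b = [6, 8, 4, 5, 4, 6, 5, 8, 6]
drug_c = [6, 7, 6, 6, 7, 5, 6, 5, 5]
groups = [drug_a, drug_b, drug_c]

all_obs = [x for g in groups for x in g]
n = len(all_obs)               # number of observations
p = len(groups)                # number of treatments
grand_mean = sum(all_obs) / n  # overall mean

# Total, treatment (between-drug), and error (within-drug) variation.
ssto = sum((x - grand_mean) ** 2 for x in all_obs)
sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

mst = sst / (p - 1)   # mean square for treatments, df = p - 1
mse = sse / (n - p)   # mean square for error, df = n - p
f_stat = mst / mse

print(f"n = {n}, p = {p}, overall mean = {grand_mean:.4f}")
print(f"SSTO = {ssto:.4f}, SST = {sst:.4f}, SSE = {sse:.4f}")
print(f"MST = {mst:.4f}, MSE = {mse:.4f}, F = {f_stat:.3f}")
```

A quick sanity check is that SST + SSE should reproduce SSTO.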
In: Math
The combined math and verbal scores for females taking the SAT-I test are normally distributed with a mean of 998 and a standard deviation of 202 (based on data from the College Board). If a college includes a minimum score of 1100 among its requirements, what percentage of females do not satisfy that requirement?
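One way to check a normal-table answer here is the standard library's `NormalDist`. A minimal sketch, assuming "do not satisfy" means scoring below 1100:

```python
from statistics import NormalDist

# SAT-I scores for females: normal with mean 998 and standard deviation 202.
sat = NormalDist(mu=998, sigma=202)

z = (1100 - 998) / 202        # z score of the cutoff
p_below = sat.cdf(1100)       # proportion scoring below the minimum

print(f"z = {z:.4f}")
print(f"P(score < 1100) = {p_below:.4f}")
```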
In: Math
The Excel file Salary reports the monthly salaries for 93 randomly and independently selected employees of a bank; there are 32 salaries of male employees and 61 salaries of female employees.
Let µm = the mean monthly salary for all male bank employees, and µf = the mean monthly salary for all female bank employees. Your objective is to find some evidence that µm > µf, that is, that the female employees are discriminated against.
Provide descriptive statistical summaries of the data sets for male and female employees. What are your primary observations concerning the two data sets? Calculate the 99% confidence intervals for µm and µf, and interpret them. Do these intervals overlap?
Formulate a hypothesis test for supporting µm > µf. What is the distribution of the test statistic? What is the value of the test statistic? What is the p-value of the test? What is your conclusion and its interpretation when the test is conducted at the 0.01 significance level?
Do your findings support a discrimination suit against the employer?
Instructions
For Task 1, apply “Descriptive Statistics” in Data Analysis of Excel (see instructions on pages 354-356). Summarize the obtained relevant statistics. Recall that in the row “Confidence Level (99.0%)” of the Descriptive Statistics output you actually see the margin of error of the confidence interval for the corresponding population mean.
To complete Task 2, formulate the null and alternative hypotheses, and apply “t-Test; Two-sample Assuming Unequal Variances” in Data Analysis of Excel with α = 0.01 (see instructions on pages 446-447). Note. In Excel, e.g., 2.71E-06 is 2.71(10^-6), which is practically zero.
Feel free to express your opinion.
Use Microsoft Word to write a managerial report with your name shown on the first page. The report should include all your Excel outputs (copy and paste them), so do not attach any separate Excel files. Hint. You may assume that you are an intern working for a branch of the bank and your boss, who has very limited knowledge of business statistics, asked you to conduct a statistical analysis comparing the salaries of male and female employees. Your report may look like a letter written to your boss in which you present your findings.
Male Salary | Female Salary |
4620 | 3900 |
5040 | 4020 |
5100 | 4290 |
5100 | 4380 |
5220 | 4380 |
5400 | 4380 |
5400 | 4380 |
5400 | 4380 |
5400 | 4440 |
5400 | 4500 |
5700 | 4500 |
6000 | 4620 |
6000 | 4800 |
6000 | 4800 |
6000 | 4800 |
6000 | 4800 |
6000 | 4800 |
6000 | 4800 |
6000 | 4800 |
6000 | 4800 |
6000 | 4800 |
6000 | 4800 |
6000 | 4980 |
6000 | 5100 |
6300 | 5100 |
6600 | 5100 |
6600 | 5100 |
6600 | 5100 |
6840 | 5100 |
6900 | 5160 |
6900 | 5220 |
8100 | 5220 |
5280 | |
5280 | |
5280 | |
5400 | |
5400 | |
5400 | |
5400 | |
5400 | |
5400 | |
5400 | |
5400 | |
5400 | |
5400 | |
5400 | |
5400 | |
5520 | |
5520 | |
5580 | |
5640 | |
5700 | |
5700 | |
5700 | |
5700 | |
5700 | |
6000 | |
6000 | |
6120 | |
6300 | |
6300 |
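The assignment asks for Excel, but the unequal-variances (Welch) t statistic behind "t-Test: Two-Sample Assuming Unequal Variances" can be cross-checked in Python. This is a sketch with the standard library only; it assumes the 29 single-value rows at the end of the table are female salaries (that is what makes the counts 32 and 61), and it leaves the p-value to Excel or a t table.

```python
from math import sqrt
from statistics import mean, variance

# Salaries transcribed from the table; repeated values are written as products.
male = ([4620, 5040, 5100, 5100, 5220] + [5400] * 5 + [5700] + [6000] * 13
        + [6300] + [6600] * 3 + [6840, 6900, 6900, 8100])
female = ([3900, 4020, 4290] + [4380] * 5 + [4440, 4500, 4500, 4620]
          + [4800] * 10 + [4980] + [5100] * 6 + [5160, 5220, 5220]
          + [5280] * 3 + [5400] * 12 + [5520, 5520, 5580, 5640]
          + [5700] * 5 + [6000, 6000, 6120, 6300, 6300])

n_m, n_f = len(male), len(female)
mean_m, mean_f = mean(male), mean(female)
var_m, var_f = variance(male), variance(female)  # sample variances (n - 1)

# Welch test statistic for H0: mu_m = mu_f against Ha: mu_m > mu_f.
se = sqrt(var_m / n_m + var_f / n_f)
t_stat = (mean_m - mean_f) / se

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (var_m / n_m + var_f / n_f) ** 2 / (
    (var_m / n_m) ** 2 / (n_m - 1) + (var_f / n_f) ** 2 / (n_f - 1))

print(f"male:   n = {n_m}, mean = {mean_m:.2f}, sd = {sqrt(var_m):.2f}")
print(f"female: n = {n_f}, mean = {mean_f:.2f}, sd = {sqrt(var_f):.2f}")
print(f"t = {t_stat:.3f}, approximate df = {df:.1f}")
```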
In: Math
If you were given a choice:
Wide confidence interval with a small confidence level
Wide confidence interval with a large confidence level
Narrow confidence interval with a small confidence level
Narrow confidence interval with a large confidence level
Which would you choose? Why? Provide at least one hypothetical example.
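To see how the four options trade off, it may help to look at how the margin of error of a z-interval for a mean moves with the confidence level and the sample size. The sketch below uses a made-up population standard deviation of 10 and made-up sample sizes; only the direction of the changes matters.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """Half-width of a z confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return z * sigma / sqrt(n)

sigma = 10.0  # hypothetical population standard deviation
for n in (25, 400):
    for confidence in (0.80, 0.99):
        moe = margin_of_error(confidence, sigma, n)
        print(f"n = {n:3d}, confidence = {confidence:.0%}: +/- {moe:.2f}")
```

For a fixed sample, raising the confidence level widens the interval; a narrow interval with a large confidence level is only reachable by collecting more data.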
In: Math
Assume that different groups of couples use a particular method of gender selection and each couple gives birth to one baby. This method is designed to increase the likelihood that each baby will be a girl, but assume that the method has no effect, so the probability of a girl is 0.5. Assume that the groups consist of 26 couples. Complete parts (a) through (c) below.
a. Find the mean and the standard deviation for the numbers of girls in groups of 26 births. The value of the mean is µ = ____. (Type an integer or a decimal. Do not round.) The value of the standard deviation is σ = ____. (Round to one decimal place as needed.)
b. Use the range rule of thumb to find the values separating results that are significantly low or significantly high. Values of ____ girls or fewer are significantly low. (Round to one decimal place as needed.) Values of ____ girls or greater are significantly high. (Round to one decimal place as needed.)
c. Is the result of 23 girls a result that is significantly high? What does it suggest about the effectiveness of the method? The result (is / is not) significantly high, because 23 girls is (less than / equal to / greater than) ____ girls. A result of 23 girls would suggest that the method (is effective / is not effective). (Round to one decimal place as needed.)
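Parts (a) through (c) follow from the binomial mean and standard deviation plus the range rule of thumb (mean plus or minus two standard deviations). A short sketch, with variable names of my own choosing:

```python
from math import sqrt

n, p = 26, 0.5                      # 26 births, P(girl) = 0.5

mu = n * p                          # binomial mean
sigma = sqrt(n * p * (1 - p))       # binomial standard deviation

# Range rule of thumb: values beyond mu +/- 2*sigma are significant.
low_cutoff = mu - 2 * sigma
high_cutoff = mu + 2 * sigma
is_significantly_high = 23 >= high_cutoff

print(f"mu = {mu}, sigma = {sigma:.1f}")
print(f"significantly low: {low_cutoff:.1f} or fewer")
print(f"significantly high: {high_cutoff:.1f} or greater")
print(f"23 girls significantly high? {is_significantly_high}")
```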
In: Math
According to a 2009 Reader's Digest article, people throw away about 9% of what they buy at the grocery store. Assume this is the true proportion and you plan to randomly survey 122 grocery shoppers to investigate their behavior. What is the probability that the sample proportion does not exceed 0.10?
Standard Deviation of Sample Proportion:
Answer format: .####
z score: Answer format: .####
Probability: Answer format: .####
Note: You should keep the standard deviation of p-hat, the z score, and the probability to 4 decimal places in your calculations.
Use a TI-84 to get the probability.
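The three requested quantities come from the normal approximation to the sampling distribution of the sample proportion. A minimal sketch in Python (in place of the TI-84's normalcdf); it carries full precision through, so the last digit may differ slightly from an answer that rounds each intermediate value to 4 places first.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.09, 122          # assumed true proportion and sample size
cutoff = 0.10             # we want P(p-hat <= 0.10)

sd_phat = sqrt(p * (1 - p) / n)      # standard deviation of p-hat
z = (cutoff - p) / sd_phat           # z score of the cutoff
prob = NormalDist().cdf(z)           # P(Z <= z)

print(f"standard deviation of p-hat = {sd_phat:.4f}")
print(f"z score = {z:.4f}")
print(f"probability = {prob:.4f}")
```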
In: Math
You own a company that raises cattle to sell for beef. Your company needs to forecast sales for the next year to purchase raw materials and plan production. You have a pretty good qualitative grasp of the key causal variables that influence sales quantity but lack quantitative estimates of each variable’s impact on sales. So, you collect historical data on monthly per capita beef consumption (the dependent variable) and the causal variables you have identified (the price of beef and related meats, and household income). Using regression analysis, you estimate this relationship. For sales quantity, Q, your data represent pounds per capita; each price, P, is the unit price in dollars; income, Y, is the average household income in $1000s (e.g., Y = 10 implies an average income of $10,000). You generate the following regression equation:
Q = 1.24 - 0.23 PB + 0.24 PP + 1.18 PC + 0.24 Y
(0.34) (-0.14) (0.11) (0.42) (0.09)
where the standard errors are in parentheses. PB is the price of beef, PP is the price of pork, PC is the price of chicken, and Y is household income. The R-square value for this regression estimation is 0.83. You should use a critical value of t = 1.96 in the following questions.
a. What does the regression equation tell you? Why is it used in economics?
b. Are the above regression coefficients significant? Explain.
c. Interpret the R-square value of the regression. What does it imply?
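For part (b), each t statistic is the coefficient divided by its standard error, compared in absolute value with 1.96. A sketch of that arithmetic; note one assumption: the standard error for PB is printed as (-0.14), and since a standard error cannot be negative the code uses 0.14.

```python
# (coefficient, standard error) pairs read off the regression equation.
# Assumption: the printed "(-0.14)" for PB is treated as a standard error of 0.14.
estimates = {
    "Intercept": (1.24, 0.34),
    "PB": (-0.23, 0.14),
    "PP": (0.24, 0.11),
    "PC": (1.18, 0.42),
    "Y": (0.24, 0.09),
}
T_CRIT = 1.96  # critical value given in the problem

results = {}
for name, (coef, se) in estimates.items():
    t_stat = coef / se
    results[name] = (t_stat, abs(t_stat) > T_CRIT)
    verdict = "significant" if results[name][1] else "not significant"
    print(f"{name:9s} t = {t_stat:6.2f}  {verdict}")
```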
In: Math
a.
In general, high school and college students are the most pathologically sleep-deprived segment of the population. Their alertness during the day is on par with that of untreated narcoleptics and those with untreated sleep apnea. Not surprisingly, teens are also 71 percent more likely to drive drowsy and/or fall asleep at the wheel compared to other age groups. (Males under the age of twenty-six are particularly at risk.)
The accompanying data set represents the number of hours 25 college students at a small college in the northeastern United States slept and is from a random sample. Enter this data into C1 of Minitab Express.
6 9 7 7 6 7 7 5 8 6 6 6 8 8 8 5 4 6 7 8 5 8 7 6 7
For the analyses that follow, we shall use
· 90%, 95%, and 99% as the confidence levels for the confidence interval.
· 5% as the level of significance (α) for the hypothesis test.
· 7 hours sleep as the null hypothesis (according to The Sleep Foundation).
l. Using a 5% level of significance, α = 0.05, make a statistical DECISION regarding the plausibility of the hypotheses; that is, would you reject or fail to reject the null hypothesis? Justify your answer.
a. Describe what the p-value measures in the context of this study. This is also referred to as “interpreting the p-value.”
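The decision in the 5% test can be cross-checked outside Minitab Express with a one-sample t statistic. This sketch assumes a two-sided alternative (mean sleep differs from 7 hours), which the excerpt does not state, and it hard-codes the critical value 2.064 for 24 degrees of freedom from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

hours = [6, 9, 7, 7, 6, 7, 7, 5, 8, 6, 6, 6, 8,
         8, 8, 5, 4, 6, 7, 8, 5, 8, 7, 6, 7]
mu0 = 7                      # null-hypothesis mean (hours of sleep)

n = len(hours)
xbar = mean(hours)
s = stdev(hours)             # sample standard deviation
t_stat = (xbar - mu0) / (s / sqrt(n))

T_CRIT = 2.064               # t table value, 24 df, two-sided alpha = 0.05 (assumed test form)
reject_h0 = abs(t_stat) > T_CRIT

print(f"n = {n}, mean = {xbar:.2f}, s = {s:.4f}")
print(f"t = {t_stat:.3f}, reject H0 at 5%? {reject_h0}")
```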
In: Math
only 2 questions
((PLSS with steps and clear hand written PLSSS and thank you sooooo much for helping me))
Depression | Geographic location | Gender |
3 | Florida | Female |
7 | Florida | Male |
7 | Florida | Female |
3 | Florida | Female |
8 | Florida | Female |
8 | Florida | Male |
8 | Florida | Male |
5 | Florida | Female |
5 | Florida | Male |
2 | Florida | Female |
6 | Florida | Female |
2 | Florida | Female |
6 | Florida | Female |
6 | Florida | Male |
9 | Florida | Female |
7 | Florida | Male |
5 | Florida | Male |
4 | Florida | Male |
7 | Florida | Female |
3 | Florida | Female |
8 | New York | Female |
11 | New York | Male |
9 | New York | Male |
7 | New York | Male |
8 | New York | Female |
7 | New York | Male |
8 | New York | Female |
4 | New York | Male |
13 | New York | Female |
10 | New York | Male |
6 | New York | Female |
8 | New York | Female |
12 | New York | Female |
8 | New York | Male |
6 | New York | Male |
8 | New York | Male |
5 | New York | Male |
7 | New York | Female |
7 | New York | Male |
8 | New York | Male |
10 | North Carolina | Male |
7 | North Carolina | Female |
3 | North Carolina | Male |
5 | North Carolina | Male |
11 | North Carolina | Female |
8 | North Carolina | Female |
4 | North Carolina | Male |
3 | North Carolina | Male |
7 | North Carolina | Female |
8 | North Carolina | Male |
8 | North Carolina | Female |
7 | North Carolina | Female |
3 | North Carolina | Female |
9 | North Carolina | Female |
8 | North Carolina | Female |
12 | North Carolina | Female |
6 | North Carolina | Male |
3 | North Carolina | Male |
8 | North Carolina | Male |
11 | North Carolina | Female |
As part of a long-term study of individuals 65 years of age or older, sociologists and physicians at the Wentworth Medical Center in upstate New York investigated the relationship between geographic location, gender and depression. A sample of 60 individuals, all in reasonably good health, was selected; 20 individuals were residents of Florida, 20 were residents of New York, and 20 were residents of North Carolina. Each of the individuals sampled was given a standardized test to measure depression. The data collected follow; higher test scores indicate higher levels of depression.
........
h) Is there a significant difference in the mean depression score due to geographic location? Use a 0.05 level of significance.
i) Give point estimates for the proportion of individuals according to their gender.
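For (h) and (i), here is a rough cross-check of hand calculations: a one-way ANOVA F statistic across the three locations and the sample proportions by gender. The helper name `one_way_f` and the string encoding of gender are my own; the F statistic still has to be compared with the F critical value for 2 and 57 degrees of freedom from a table.

```python
scores = {
    "Florida": [3, 7, 7, 3, 8, 8, 8, 5, 5, 2, 6, 2, 6, 6, 9, 7, 5, 4, 7, 3],
    "New York": [8, 11, 9, 7, 8, 7, 8, 4, 13, 10, 6, 8, 12, 8, 6, 8, 5, 7, 7, 8],
    "North Carolina": [10, 7, 3, 5, 11, 8, 4, 3, 7, 8, 8, 7, 3, 9, 8, 12, 6, 3, 8, 11],
}
# Gender column in table order (F = Female, M = Male), one string per location.
genders = ("FMFFFMMFMFFFFMFMMMFF"    # Florida
           "FMMMFMFMFMFFFMMMMFMM"    # New York
           "MFMMFFMMFMFFFFFFMMMF")   # North Carolina

def one_way_f(groups):
    """F statistic for a one-way ANOVA on a list of samples."""
    all_obs = [x for g in groups for x in g]
    n, k = len(all_obs), len(groups)
    grand = sum(all_obs) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

means = {loc: sum(v) / len(v) for loc, v in scores.items()}
f_stat = one_way_f(list(scores.values()))
prop_female = genders.count("F") / len(genders)
prop_male = genders.count("M") / len(genders)

print("group means:", means)
print(f"F = {f_stat:.3f} with df = (2, 57)")
print(f"proportion female = {prop_female:.4f}, male = {prop_male:.4f}")
```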
In: Math
I was given this problem:
PART A:
Consider the following model of wage determination:
wage = β0 + β1·educ + β2·exper + β3·married + ε
where: wage = hourly earnings in dollars
educ = years of education
exper = years of experience
married = dummy equal to 1 if married, 0 otherwise
Using data from the file ps2.dta, which contains wage data for a number of workers from across the United States, estimate the model shown above by OLS using the regress command in Stata. As always, be sure to include your Stata output (show the regression command used and the complete regression output).
Why are we unable to determine which of the included
variables is the most important determinant of wages by simply
looking at the size (and perhaps significance) of the estimated
coefficients (even if we were confident that these estimates
reflected unbiased causal impacts)?
My answer to PART A:
. regress wage educ exper married
Source | SS df MS Number of obs = 526
-------------+---------------------------------- F(3, 522) = 54.97
Model | 1719.00074 3 573.000246 Prob > F = 0.0000
Residual | 5441.41355 522 10.4241639 R-squared = 0.2401
-------------+---------------------------------- Adj R-squared = 0.2357
Total | 7160.41429 525 13.6388844 Root MSE = 3.2286
------------------------------------------------------------------------------
wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .6128507 .0542332 11.30 0.000 .5063084 .7193929
exper | .0568845 .0116387 4.89 0.000 .0340201 .079749
married | .9894464 .309198 3.20 0.001 .3820212 1.596872
       _cons |  -3.372934   .7599027    -4.44   0.000    -4.865777   -1.880091
We are unable to determine which of the independent variables is the strongest predictor of wage because the predictors use different units of measurement.
Is this answer correct?
PART B:
Estimate the model again in Stata, but now include the “beta” option and explain how the additional information provided helps to provide insight into this issue discussed in part (c). As part of your answer, provide a clear interpretation of the new Stata output corresponding to the educ variable.
My answer to PART B:
The “, beta” option shows us the standardized coefficients and enables us to compare the independent variables’ relationships to the dependent variable; the higher the absolute value of the beta coefficient for an independent variable, the stronger a predictor it is of the dependent variable. The beta coefficient shows how many standard deviations the dependent variable changes for a one-standard-deviation change in the independent variable, holding the other variables fixed. From the Stata output, we are able to see that educ has the highest beta coefficient (0.46: a one-standard-deviation increase in education is associated with a 0.46 standard deviation increase in wage), meaning that education is the strongest predictor of wage. Whether or not someone is married is the weakest predictor of wage.
regress wage educ exper married, beta
Source | SS df MS Number of obs = 526
-------------+---------------------------------- F(3, 522) = 54.97
Model | 1719.00074 3 573.000246 Prob > F = 0.0000
Residual | 5441.41355 522 10.4241639 R-squared = 0.2401
-------------+---------------------------------- Adj R-squared = 0.2357
Total | 7160.41429 525 13.6388844 Root MSE = 3.2286
------------------------------------------------------------------------------
wage | Coef. Std. Err. t P>|t| Beta
-------------+----------------------------------------------------------------
educ | .6128507 .0542332 11.30 0.000 .4595065
exper | .0568845 .0116387 4.89 0.000 .2090517
married | .9894464 .309198 3.20 0.001 .1308998
_cons | -3.372934 .7599027 -4.44 0.000 .
Is my answer correct?
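One way to see what the Beta column does is the identity beta = coefficient × sd(x) / sd(y). The sketch below works backwards from the pasted output: it takes sd(wage) as the square root of the Total MS (13.6388844) and recovers the standard deviation each predictor would need to have. The implied values are derived here, not read from ps2.dta.

```python
from math import sqrt

# Total MS in the ANOVA header is the sample variance of wage.
sd_wage = sqrt(13.6388844)

# (raw coefficient, standardized beta) pairs copied from the Stata output.
output = {
    "educ": (0.6128507, 0.4595065),
    "exper": (0.0568845, 0.2090517),
    "married": (0.9894464, 0.1308998),
}

# beta = coef * sd_x / sd_y, so sd_x = beta * sd_y / coef.
implied_sd = {name: beta * sd_wage / coef for name, (coef, beta) in output.items()}
strongest = max(output, key=lambda name: abs(output[name][1]))

print(f"sd(wage) = {sd_wage:.4f}")
for name, sd_x in implied_sd.items():
    coef, beta = output[name]
    print(f"{name:8s} coef = {coef:.4f}, beta = {beta:.4f}, implied sd = {sd_x:.3f}")
print("largest |beta|:", strongest)
```

This makes the unit issue from Part A concrete: married has the largest raw coefficient only because a one-unit change in a 0/1 dummy is a large change relative to its spread.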
In: Math
Calculate the sample standard deviation and sample variance for the following frequency distribution of heart rates for a sample of American adults. If necessary, round to one more decimal place than the largest number of decimal places given in the data.
Heart Rates in Beats per Minute
Class | Frequency |
61 - 66 | 12 |
67 - 72 | 3 |
73 - 78 | 9 |
79 - 84 | 11 |
85 - 90 | 13 |
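For grouped data the usual approach is to let each class midpoint stand in for the observations in that class. A minimal sketch of that calculation (the midpoint convention is the standard textbook assumption, not something stated in the problem):

```python
from math import sqrt

# (class limits, frequency) pairs from the heart-rate table.
classes = [((61, 66), 12), ((67, 72), 3), ((73, 78), 9),
           ((79, 84), 11), ((85, 90), 13)]

midpoints = [(lo + hi) / 2 for (lo, hi), _ in classes]
freqs = [f for _, f in classes]

n = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
# Sample variance: weighted squared deviations divided by n - 1.
variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / (n - 1)
std_dev = sqrt(variance)

print(f"n = {n}, mean = {mean:.2f}")
print(f"sample variance = {variance:.1f}, sample standard deviation = {std_dev:.1f}")
```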
In: Math
Ten measurements of impact energy on specimens of A238 steel at 60 ºC are as follows: 64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, and 64.3 J.
a. Use the Student’s t distribution to find a 95% confidence interval for the impact energy of A238 steel at 60 ºC.
b. Use the Student’s t distribution to find a 98% confidence interval for the impact energy of A238 steel at 60 ºC.
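Both intervals have the form mean ± t × s/√n with 9 degrees of freedom. The sketch below hard-codes the two critical values from a standard t table (2.262 for 95% and 2.821 for 98%), since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

energy = [64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, 64.3]  # joules

n = len(energy)
xbar = mean(energy)
se = stdev(energy) / sqrt(n)     # standard error of the mean

# Two-sided critical values for n - 1 = 9 df, taken from a t table.
t_crit = {0.95: 2.262, 0.98: 2.821}

intervals = {}
for level, t in t_crit.items():
    intervals[level] = (xbar - t * se, xbar + t * se)
    lo, hi = intervals[level]
    print(f"{level:.0%} CI: ({lo:.3f}, {hi:.3f}) J")
```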
In: Math
Life expectancy in the US varies depending on where an individual lives, reflecting social and health inequality by region. You are interested in comparing mean life expectancies in counties in California, specifically San Mateo County and San Francisco County. Given the data below, answer the following questions.
 | Mean life expectancy at birth for males in 2014 | Sample standard deviation | Sample size (n) |
San Mateo County | 81.13 years | 8.25 | 101 |
SF County | 79.34 years | 9.47 | 105 |
1. Calculate the standard error of the mean difference in male life expectancy between the two counties, assuming unequal variances.
2. Calculate a 99% confidence interval for the mean difference in male life expectancy between the two counties. Use the conservative approximation for degrees of freedom.
3. Based on your confidence interval, would you expect the mean difference in male life expectancy to be statistically significant at the α = 0.01 level? Explain.
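The three parts chain together: the unequal-variance standard error, then the interval with conservative degrees of freedom (the smaller of n1 - 1 and n2 - 1), then a check of whether the interval contains 0. In this sketch the critical value 2.626 for 100 degrees of freedom is taken from a standard t table.

```python
from math import sqrt

# Summary statistics from the table (San Mateo first, then San Francisco).
mean_sm, sd_sm, n_sm = 81.13, 8.25, 101
mean_sf, sd_sf, n_sf = 79.34, 9.47, 105

diff = mean_sm - mean_sf
se = sqrt(sd_sm ** 2 / n_sm + sd_sf ** 2 / n_sf)   # unequal-variance standard error

df = min(n_sm, n_sf) - 1     # conservative degrees of freedom
T_CRIT = 2.626               # t table value for 99% confidence, 100 df

lower, upper = diff - T_CRIT * se, diff + T_CRIT * se
contains_zero = lower < 0 < upper

print(f"difference = {diff:.2f} years, SE = {se:.4f}, df = {df}")
print(f"99% CI: ({lower:.2f}, {upper:.2f})")
print(f"interval contains 0? {contains_zero}")
```

If the interval contains 0, the difference would not be significant at the 0.01 level in a two-sided test.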
In: Math
B.38
Average Size of a Performing Group in the Rock and Roll Hall of Fame
From its founding through 2015, the Rock and Roll Hall of Fame has inducted 303 groups or individuals, and 206 of the inductees have been performers while the rest have been related to the world of music in some way other than as a performer. The full dataset is available at RockandRoll on StatKey. Some of the 206 performer inductees have been solo artists while some are groups with a large number of members. We are interested in the average number of members across all groups or individuals inducted as performers.
(a) What is the mean size of the performer inductee groups (including individuals)? Use the correct notation with your answer.
(b) Use technology to create a graph of all 206 values. Describe the shape, and identify the two groups with the largest number of people.
(c) Use technology to generate a sampling distribution for the mean size of the group using samples of size n = 10. Give the shape and center of the sampling distribution and give the standard error.
(d) What does one dot on the sampling distribution represent?
In: Math