For this assignment, please submit the answers to the following questions, as well as an Excel spreadsheet which documents the work you did.
Do poets die young? According to William Butler Yeats, “She is the Gaelic muse, for she gives inspiration to those she persecutes. The Gaelic poets die young, for she is restless, and will not let them remain long on earth.” One study designed to investigate this issue examined the age at death for writers from different cultures and genders. The three categories of writers examined were novelists, poets, and nonfiction writers. The ages at death for female writers in these categories from North America are given in the dataset on Blackboard (data file: Female Writers.xls). Most of the writers are from the United States, but Canadian and Mexican writers are also included.
a) Use Excel to build a boxplot of the three associated data sets. If the population mean death ages were the same for the three populations, would you expect to see a boxplot like this? Please elaborate.
b) Write down both the null and alternative hypotheses for the one-way ANOVA.
c) Run the one-way ANOVA test in Excel; make sure you save this in your spreadsheet somewhere where it can be found.
d) You will see a value for F-stat. Report this value, and explain how it is related to MSA and MSW.
e) In general, what is a p-value? In your table, what p-value is reported, and what exactly does it mean?
f) At the .05 level of significance, is there evidence of a difference in mean age of death among the various types of female writers? Explain your decision.
Novels | Poems | Nonfiction |
57 | 88 | 74 |
90 | 69 | 86 |
67 | 78 | 87 |
56 | 68 | 68 |
90 | 72 | 76 |
72 | 60 | 73 |
56 | 50 | 63 |
90 | 47 | 78 |
80 | 74 | 83 |
74 | 36 | 86 |
73 | 87 | 40 |
86 | 55 | 75 |
53 | 68 | 90 |
72 | 75 | 47 |
86 | 78 | 91 |
82 | 85 | 94 |
74 | 69 | 61 |
60 | 38 | 83 |
79 | 58 | 75 |
80 | 51 | 89 |
79 | 72 | 77 |
77 | 58 | 86 |
64 | 84 | 66 |
72 | 30 | 97 |
88 | 79 | |
75 | 90 | |
79 | 66 | |
74 | 45 | |
85 | 70 | |
71 | 48 | |
78 | 31 | |
57 | 43 | |
54 | ||
50 | ||
59 | ||
72 | ||
60 | ||
77 | ||
50 | ||
49 | ||
73 | ||
39 | ||
73 | ||
61 | ||
90 | ||
77 | ||
57 | ||
72 | ||
82 | ||
54 | ||
62 | ||
74 | ||
65 | ||
83 | ||
86 | ||
73 | ||
79 | ||
63 | ||
72 | ||
85 | ||
91 | ||
77 | ||
66 | ||
75 | ||
90 | ||
35 | ||
86 |
In: Statistics and Probability
If n = 460 and X = 368, construct a 99% confidence interval for the population proportion, p.
Give your answers to three decimal places.
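A minimal sketch of the interval calculation, assuming the usual normal-approximation (Wald) interval is what the question intends. The function name `proportion_ci` is illustrative and not part of the question.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation (Wald) interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_ci(368, 460, 0.99)
print(f"{lower:.3f} to {upper:.3f}")
```

The sample proportion is 368/460 = 0.8, so the interval is centered there; a course that uses a different method (for example a Wilson interval) would give slightly different bounds.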
In: Statistics and Probability
Note: Each bound should be rounded to three decimal places.
Q: A random sample of n = 100 observations produced a mean of x̄ = 35 with a standard deviation of s = 5.
(a) Find a 95% confidence interval for μ Lower-bound: Upper-bound:
(b) Find a 90% confidence interval for μ Lower-bound: Upper-bound:
(c) Find a 99% confidence interval for μ Lower-bound: Upper-bound:
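The three intervals can be sketched as follows, assuming the large-sample z interval is acceptable because n = 100. If the course expects a t interval with 99 degrees of freedom, the bounds would be slightly wider. The helper `mean_ci` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(mean, sd, n, confidence):
    """Large-sample z interval for a population mean (assumption: z, not t)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin


for level in (0.95, 0.90, 0.99):
    low, high = mean_ci(35, 5, 100, level)
    print(f"{level:.0%}: lower = {low:.3f}, upper = {high:.3f}")
```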
In: Statistics and Probability
The manager of a computer software company wishes to study the number of hours senior executives by type of industry spend at their desktop computers. The manager selected a sample of five executives from each of three industries. At the .05 significance level, can she conclude there is a difference in the mean number of hours spent per week by industry?
Banking | Retail | Insurance |
32 | 28 | 30 |
30 | 28 | 28 |
30 | 26 | 26 |
32 | 28 | 28 |
30 | 30 | 30 |
PLEASE SHOW ANSWER WITHOUT USING MINI TAB OR SOFTWARE
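The hand calculation for a one-way ANOVA can be checked with a short script that only mirrors the arithmetic (sums of squares, mean squares, F). It assumes the values are listed row by row under the three headers, and the critical value 3.89 for F(2, 12) at the .05 level is taken from a standard F table rather than computed.

```python
groups = {
    "Banking": [32, 30, 30, 32, 30],
    "Retail": [28, 28, 26, 28, 30],
    "Insurance": [30, 28, 26, 28, 30],
}


def one_way_anova(samples):
    """Return SS between, SS within, their degrees of freedom, and F."""
    all_values = [x for s in samples for x in s]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(s) / len(s) for s in samples]
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


ss_between, ss_within, df_between, df_within, f_stat = one_way_anova(list(groups.values()))
F_CRITICAL = 3.89  # table value for F(2, 12) at alpha = .05 (assumed from a table)
print(f"SSB = {ss_between:.4f}, SSW = {ss_within:.4f}, F = {f_stat:.4f}")
print("Reject H0" if f_stat > F_CRITICAL else "Fail to reject H0")
```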
In: Statistics and Probability
The National Football League (NFL) records a variety of performance data for individuals and teams. To investigate the importance of passing on the percentage of games won by a team, the following data show the conference (Conf), average number of passing yards per attempt (Yds/Att), the number of interceptions thrown per attempt (Int/Att), and the percentage of games won (Win%) for a random sample of 16 NFL teams for the 2011 season (NFL web site, February 12, 2012). Click on the datafile logo to reference the data.
Let x1 represent Yds/Att.
In: Statistics and Probability
We give JMP output of regression analysis. Above output we give the regression model and the number of observations, n, used to perform the regression analysis under consideration. Using the model, sample size n, and output:
Model: y = β0 + β1x1 + β2x2 + β3x3 + ε
Sample size: n = 30
Summary of Fit | |
RSquare | 0.987331 |
RSquare Adj | 0.985869 |
Root Mean Square Error | 0.240749 |
Mean of Response | 8.382667 |
Observations (or Sum Wgts) | 30 |
Analysis of Variance | ||||
Source | df | Sum of Squares | Mean Square | F Ratio |
Model | 3 | 117.438830 | 39.14630 | 675.4012 |
Error | 26 | 1.506960 | 0.05800 | Prob > F |
C. Total | 29 | 118.945790 | | <.0001* |
(1) Report the total variation, unexplained variation, and explained variation as shown on the output. (Round your answers to 4 decimal places.)
(2) Report R² and adjusted R² as shown on the output. (Round your answers to 4 decimal places.)
(3) Report SSE, s², and s as shown on the output. (Round your answers to 4 decimal places.)
(4) Calculate the F(model) statistic by using the explained variation, the unexplained variation, and other relevant quantities. (Round your answer to 2 decimal places.)
(5) Use the F(model) statistic and the appropriate critical value to test the significance of the linear regression model under consideration by setting α equal to .05.
(6) Find the p-value related to F(model) on the output. Using the p-value, test the significance of the linear regression model by setting α = .10, .05, .01, and .001. What do you conclude?
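As a cross-check on parts (2) through (4), the quantities can be recomputed from the sums of squares in the output. This is a sketch using the standard definitions, with k = 3 predictors read from the model statement; the variable names are illustrative.

```python
from math import sqrt

n, k = 30, 3                  # observations and number of predictors
ss_model = 117.438830         # explained variation
ss_error = 1.506960           # unexplained variation (SSE)
ss_total = 118.945790         # total variation

ms_model = ss_model / k
ms_error = ss_error / (n - k - 1)     # s squared
f_model = ms_model / ms_error
r2 = ss_model / ss_total
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
s = sqrt(ms_error)

print(f"F(model) = {f_model:.2f}")
print(f"R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
print(f"s2 = {ms_error:.4f}, s = {s:.4f}")
```

Small differences in the last digits relative to the JMP output come from rounding in the printed sums of squares.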
In: Statistics and Probability
Bank of America's Consumer Spending Survey collected data on annual credit card charges in seven different categories of expenditures: transportation, groceries, dining out, household expenses, home furnishings, apparel, and entertainment. Using data from a sample of 42 credit card accounts, assume that each account was used to identify the annual credit card charges for groceries (population 1) and the annual credit card charges for dining out (population 2). Using the difference data, the sample mean difference was d̄ = $850, and the sample standard deviation was $1123.
a. Formulate the null and alternative hypotheses to test for no difference between the population mean credit card charges for groceries and the population mean credit card charges for dining out.
b. Use a .05 level of significance. Can you conclude that the population means differ? What is the p-value? (to 6 decimals)
c. Which category, groceries or dining out, has a higher population mean annual credit card charge? What is the point estimate of the difference between the population means? Round to the nearest whole number. What is the 95% confidence interval estimate of the difference between the population means? Round to the nearest whole number.
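A sketch of the matched-sample t test, under two assumptions that are not in the question: the two-sided p-value is obtained by numerically integrating the t density (Simpson's rule) to stay within the standard library, and the critical value 2.0195 for 41 degrees of freedom is taken from a t table.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 1 - 2 * area)


n, d_bar, s_d = 42, 850, 1123
se = s_d / sqrt(n)
t_stat = d_bar / se
p_value = two_sided_p(t_stat, n - 1)

T_CRIT = 2.0195                   # t table value for 41 df, 95% (assumed)
lower, upper = d_bar - T_CRIT * se, d_bar + T_CRIT * se
print(f"t = {t_stat:.4f}, p = {p_value:.6f}")
print(f"95% CI: {lower:.0f} to {upper:.0f}")
```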
In: Statistics and Probability
A bucket contains exactly 3 marbles: one red, one blue, and one green.
A person arbitrarily pulls out each marble one at a time.
What is the probability that the last marble removed is non-red?
Answer in the form of a fully reduced fraction.
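One way to confirm the answer is to enumerate every removal order, assuming each of the 3! orders is equally likely:

```python
from fractions import Fraction
from itertools import permutations

# All possible orders in which the three marbles can be removed.
orders = list(permutations(["red", "blue", "green"]))
# Orders in which the last marble removed is not red.
favorable = [order for order in orders if order[-1] != "red"]

probability = Fraction(len(favorable), len(orders))
print(probability)  # 2/3
```

By symmetry each marble is equally likely to be last, which gives the same 2/3 without enumeration.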
In: Statistics and Probability
Find the MAD for the 3-month and the 12-month moving average forecast.
Year Month Rate(%)
2009 Jan 7.9
2009 Feb 8.5
2009 Mar 8.7
2009 Apr 9.1
2009 May 9.4
2009 Jun 9.4
2009 Jul 9.7
2009 Aug 9.5
2009 Sep 9.9
2009 Oct 9.9
2009 Nov 9.9
2009 Dec 9.7
2010 Jan 9.7
2010 Feb 9.6
2010 Mar 9.8
2010 Apr 9.7
2010 May 9.5
2010 Jun 9.4
2010 Jul 9.4
2010 Aug 9.4
2010 Sep 9.4
2010 Oct 9.6
2010 Nov 9.7
2010 Dec 9.4
2011 Jan 9.2
2011 Feb 8.9
2011 Mar 8.7
2011 Apr 9.1
2011 May 8.9
2011 Jun 9.2
2011 Jul 8.8
2011 Aug 9.1
2011 Sep 9.1
2011 Oct 8.8
2011 Nov 8.5
2011 Dec 8.4
2012 Jan 8.3
2012 Feb 8.3
2012 Mar 8.3
2012 Apr 8.2
2012 May 8.1
2012 Jun 8.1
2012 Jul 8.3
2012 Aug 8.3
2012 Sep 7.9
2012 Oct 7.9
2012 Nov 7.6
2012 Dec 7.7
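A minimal sketch of the MAD calculation, assuming each forecast is the plain average of the preceding `window` months and that MAD is averaged over every month for which such a forecast exists. The function name is hypothetical.

```python
rates = [
    7.9, 8.5, 8.7, 9.1, 9.4, 9.4, 9.7, 9.5, 9.9, 9.9, 9.9, 9.7,  # 2009
    9.7, 9.6, 9.8, 9.7, 9.5, 9.4, 9.4, 9.4, 9.4, 9.6, 9.7, 9.4,  # 2010
    9.2, 8.9, 8.7, 9.1, 8.9, 9.2, 8.8, 9.1, 9.1, 8.8, 8.5, 8.4,  # 2011
    8.3, 8.3, 8.3, 8.2, 8.1, 8.1, 8.3, 8.3, 7.9, 7.9, 7.6, 7.7,  # 2012
]


def moving_average_mad(series, window):
    """Mean absolute deviation of a simple moving-average forecast."""
    errors = []
    for t in range(window, len(series)):
        forecast = sum(series[t - window:t]) / window  # mean of prior months
        errors.append(abs(series[t] - forecast))
    return sum(errors) / len(errors)


for window in (3, 12):
    print(f"{window}-month MAD = {moving_average_mad(rates, window):.4f}")
```

Note that the two MADs are averaged over different numbers of months (45 versus 36); some textbooks compare them over a common period instead.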
In: Statistics and Probability
Annual starting salaries for college graduates with degrees in business administration are generally expected to be between $30,000 and $50,000. Assume that a 95% confidence interval estimate of the population mean annual starting salary is desired. How large a sample should be taken if the desired margin of error is:
a. $400? Remove all commas from your answer before submitting.
b. $230? Remove all commas from your answer before submitting.
c. $140? Remove all commas from your answer before submitting.
d. Would you recommend trying to obtain the $140 margin of error? Explain.
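The sample sizes can be sketched as below. The key assumption, not stated in the question, is the common planning value σ ≈ range / 4 = (50,000 − 30,000) / 4 = 5,000; `required_n` is an illustrative name.

```python
from math import ceil
from statistics import NormalDist


def required_n(margin, sigma, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)


sigma_planning = (50_000 - 30_000) / 4   # assumed planning value: range / 4

for margin in (400, 230, 140):
    print(f"E = ${margin}: n = {required_n(margin, sigma_planning)}")
```

Because n grows with 1/E², cutting the margin of error from $400 to $140 multiplies the required sample by roughly eight, which is the trade-off part (d) asks about.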
In: Statistics and Probability
This question refers to a sweepstakes promotion in which respondents were asked to select what color car they would like to receive if they had the winning number. For a random sample of respondents the choices were 24 blue (B), 34 green (G), 66 red (R), and 36 white (W). Test at the 0.05 level the claim that the population prefers each colour equally. The computed value of the chi-square test statistic is
Select one:
a. 7.815
b. 24.600
c. 0
d. 22.412
e. 9.488
This question refers to a sweepstakes promotion in which respondents were asked to select what color car they would like to receive if they had the winning number. For a random sample of respondents the choices were 24 blue (B), 34 green(G), 66 red(R), and 36 white(W). Test at the 0.05 level the claim that the population prefers each colour equally. The critical value chi_square_c is
Select one:
a. 9.488
b. 24.600
c. 22.412
d. 0
e. 7.815
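A short sketch of the goodness-of-fit statistic, assuming equal expected counts under the null hypothesis (160 respondents across 4 colours). The critical value is read from the answer options (a chi-square table with 3 degrees of freedom), not computed.

```python
observed = {"blue": 24, "green": 34, "red": 66, "white": 36}

total = sum(observed.values())
expected = total / len(observed)      # equal preference under H0

chi_square = sum((count - expected) ** 2 / expected for count in observed.values())
df = len(observed) - 1

CRITICAL_VALUE = 7.815                # table value, alpha = 0.05, df = 3
print(f"chi-square = {chi_square:.3f}, df = {df}")
print("Reject H0" if chi_square > CRITICAL_VALUE else "Fail to reject H0")
```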
In: Statistics and Probability
For a new product, sales volume in the first year is estimated to be 50,000 units and is projected to grow at a rate of 7% per year. The selling price is $100 and will increase by $10 each year. Per-unit variable costs are $22 and annual fixed costs are $1,000,000. Per-unit costs are expected to increase 4% per year. Fixed costs are expected to increase 10% per year.
Develop a spreadsheet model to predict the net present value of profit over a three-year period, assuming a 4% discount rate.
Please include Excel worksheet with all the details.
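The spreadsheet logic can be sketched in a few lines before building it in Excel. Two modelling assumptions here are not stated in the question: the first-year figures apply unchanged in year 1 (growth starts in year 2), and each year's profit is discounted from the end of that year, which matches how Excel's NPV function treats its cash flows.

```python
def yearly_profits(years=3):
    """Profit per year under the stated growth assumptions."""
    profits = []
    for t in range(years):                    # t = 0 is the first year
        units = 50_000 * 1.07 ** t            # volume grows 7% per year
        price = 100 + 10 * t                  # price rises $10 per year
        unit_cost = 22 * 1.04 ** t            # variable cost grows 4% per year
        fixed_cost = 1_000_000 * 1.10 ** t    # fixed cost grows 10% per year
        profits.append(units * (price - unit_cost) - fixed_cost)
    return profits


def npv(rate, cash_flows):
    """Present value with the first cash flow one period in the future."""
    return sum(cf / (1 + rate) ** (t + 1) for t, cf in enumerate(cash_flows))


profits = yearly_profits()
for year, profit in enumerate(profits, start=1):
    print(f"Year {year}: profit = {profit:,.2f}")
print(f"NPV at 4% = {npv(0.04, profits):,.2f}")
```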
In: Statistics and Probability
A professor wants to determine whether her department should keep the requirement of college algebra as a prerequisite for an Introductory Statistics course. Accordingly, she allows some students to register for the course on a pass-fail basis regardless of whether or not they have had the prerequisite. At the end of the semester, the professor compares the number of students passing or failing the class with whether or not they had algebra. Of the 70 students in the class, 30 of the 45 who had taken algebra passed the course, and 5 of the 25 who had not taken algebra passed. Are students more likely to pass the course if they have taken college algebra?
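One possible approach is a one-sided two-proportion z test with a pooled standard error, reading the counts as 30 of 45 passing with algebra and 5 of 25 passing without. The question does not name a method, so this is an assumption; a chi-square test of independence on the 2×2 table is an equally common choice.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


z = two_proportion_z(30, 45, 5, 25)       # algebra vs. no algebra
p_value = 1 - NormalDist().cdf(z)         # one-sided: H1 is p1 > p2
print(f"z = {z:.4f}, one-sided p = {p_value:.5f}")
```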
In: Statistics and Probability
A university lecturer in History hypothesizes that more time studying predicts better exam performance. Before the next exam, the lecturer asks students in the class the average amount of time (in minutes) they spend in the library per day. The data are below. What can be concluded with α = 0.05?
time | exam |
---|---|
41 | 75 |
25 | 75 |
48 | 69 |
58 | 72 |
66 | 63 |
81 | 60 |
95 | 66 |
101 | 57 |
121 | 60 |
97 | 70 |
81 | 65 |
111 | 65 |
a) What is the appropriate statistic?
Options: na, Correlation, Slope, Chi-Square
Compute the statistic selected above:
b) Compute the appropriate test statistic(s) to make a decision about H0.
(Hint: Make sure to write down the null and alternative hypotheses to help solve the problem.)
Critical value = ; Test statistic =
Decision: Reject H0 / Fail to reject H0
c) Compute the corresponding effect size(s) and indicate magnitude(s).
If not appropriate, input and/or select "na" below.
Effect size = ; Magnitude: na / trivial effect / small effect / medium effect / large effect
d) Make an interpretation based on the results.
Students who spend more time studying have better exam performance.
Students who spend more time studying have worse exam performance.
Students' time studying does not predict exam performance.
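A sketch of the correlation and its t statistic, pairing the twelve time values with the twelve exam values in the order listed. The two-tailed critical value 2.228 for 10 degrees of freedom comes from a t table and is an assumption about how the test is set up; a one-tailed test would use a different cutoff.

```python
from math import sqrt

time = [41, 25, 48, 58, 66, 81, 95, 101, 121, 97, 81, 111]
exam = [75, 75, 69, 72, 63, 60, 66, 57, 60, 70, 65, 65]


def pearson_r(x, y):
    """Pearson correlation computed from deviation sums."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(time, exam)
df = len(time) - 2
t_stat = r * sqrt(df) / sqrt(1 - r ** 2)   # test of H0: rho = 0
T_CRIT = 2.228                             # two-tailed, alpha = .05, df = 10 (table)

print(f"r = {r:.4f}, r squared = {r ** 2:.4f}, t = {t_stat:.4f}")
print("Reject H0" if abs(t_stat) > T_CRIT else "Fail to reject H0")
```

The sign of r matters for part d): a negative r means more library time goes with lower exam scores in this sample.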
In: Statistics and Probability
According to my records, the population of all past PSY 240 final percentage scores has a mean (μ) of 85 and standard deviation (σ) of 7 points. The new class of 36 students had a mean (M) final percentage score of 87 points. I conducted a hypothesis test to see if the new class would have significantly different final percentage scores from the population of past students. I was interested in any type of “difference,” whether it’s an increase or a decrease in final percentage score. The significance level for my Z test was set at α = .05.
f. Determine the critical value for Z
l. In this statistical test, how high does the mean final percentage score from the new class have to be, at least, to be considered “significantly” higher than the pool of past final percentage scores?
Hint: In other words, when does the calculated Z equal the critical Z? What needs to be the sample mean for that to happen?
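Following the hint, the threshold sample mean is the value at which the calculated Z equals the upper critical Z. A minimal sketch for the two-tailed test at α = .05 (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, sample_mean = 85, 7, 36, 87
alpha = 0.05

standard_error = sigma / sqrt(n)                  # 7 / 6
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96, two-tailed
z_observed = (sample_mean - mu) / standard_error

# Sample mean at which the observed Z would equal the upper critical Z.
threshold_mean = mu + z_critical * standard_error

print(f"critical Z = ±{z_critical:.2f}, observed Z = {z_observed:.3f}")
print(f"mean needed for significance = {threshold_mean:.2f}")
```

Since the class mean of 87 is below this threshold, the observed Z falls short of the critical value.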
In: Statistics and Probability