State whether each of the following is always true (T) or not always true (F).
a) If X is a random variable, Corr(X, (1/3)X) = 1/3.
b) If X and Y are independent random variables, then E(X|Y) = E(X).
c) If f_X(x) is the marginal density of a random variable X and f_Y(y|X = x) is the conditional density of a random variable Y, given a particular realization x of X, then the joint density of X and Y is given simply by f(x, y) = f_X(x) f_Y(y|X = x).
d) If X1, X2, ..., Xn are independent random variables, each following a Bernoulli distribution with the same parameter p, then the sum Σ_{i=1}^{n} X_i is a Binomial random variable with parameters n and p.
e) If X1, X2, ..., X100 are independent, normally distributed random variables, then the average X̄ = (1/100) Σ_{i=1}^{100} X_i of these random variables is itself a random variable following a normal distribution.
f) If Z1 and Z2 are independent random variables, each following a standard normal distribution, then Z1 + Z2 follows a standard normal distribution as well.
g) If X ∼ χ²(4) and Y ∼ t(1), then the 95th percentile of X exceeds the 95th percentile of Y.
h) If X ∼ F(5, 7) and Y ∼ F(7, 5), then the 5th percentile of X is greater than the 5th percentile of Y.
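Statements (a) and (f) can be sanity-checked numerically. The following is a minimal sketch using only Python's standard library; the helper `pearson_corr`, the sample size, and the seed are illustrative assumptions, not part of the question.

```python
import math
import random
from statistics import NormalDist, mean

def pearson_corr(xs, ys):
    """Sample Pearson correlation coefficient (hypothetical helper)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)  # arbitrary seed, chosen only for reproducibility
xs = [random.gauss(0, 1) for _ in range(1000)]

# Statement (a): correlation of X with a positive multiple of itself.
corr_scaled = pearson_corr(xs, [x / 3 for x in xs])

# Statement (f): adding two NormalDist objects models the sum of
# independent normal random variables.
z_sum = NormalDist(0, 1) + NormalDist(0, 1)

print(corr_scaled)
print(z_sum.mean, z_sum.stdev)
```

Rescaling by a positive constant leaves the correlation at 1, and the sum of two independent standard normals has standard deviation √2, so neither statement holds as written.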
In: Statistics and Probability
Refer to the CDI data set in Appendix C.2.
a. For each geographic region, regress the number of serious crimes in a CDI (Y) against population density (X1, total population divided by land area), per capita personal income (X2), and percent high school graduates (X3). Use first-order regression model (6.5) with three predictor variables. State the estimated regression functions.
b. Are the estimated regression functions similar for the four regions? Discuss.
c. Calculate MSE and R² for each region. Are these measures similar for the four regions? Discuss.
d. Obtain the residuals for each fitted model and prepare a box plot of the residuals for each fitted model. Interpret your plots and state your findings.
I can do the calculations myself, but can you explain what parts (b), (c), and (d) are asking for? Written explanations only, please.
In: Statistics and Probability
More than 100 million people around the world are not getting enough sleep; the average adult needs between 7.5 and 8 hours of sleep per night. College students are particularly at risk of not getting enough shut-eye.
A recent survey of several thousand college students indicated that the total hours of sleep time per night, denoted by the random variable X, can be approximated by a normal model with E(X) = 6.84 hours and SD(X) = 1.24 hours.
Question 1. Find the probability that the hours of sleep per night for a random sample of 4 college students has a mean x̄ between 6.7 and 6.93.
(use 4 decimal places in your answer)
Question 2. Find the probability that the hours of sleep per night for a random sample of 16 college students has a mean x̄ between 6.7 and 6.93.
(use 4 decimal places in your answer)
Question 3. Find the probability that the hours of sleep per night for a random sample of 25 college students has a mean x̄ between 6.7 and 6.93.
(use 4 decimal places in your answer)
Question 4. The Central Limit Theorem was needed to answer questions 1, 2, and 3 above.
True / False
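Questions 1 to 3 all use the sampling distribution of the mean, which has mean 6.84 and standard deviation 1.24/√n. A minimal sketch with Python's standard library follows; the function name `prob_mean_between` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 6.84, 1.24   # population mean and SD given in the problem
LOW, HIGH = 6.7, 6.93    # bounds on the sample mean

def prob_mean_between(n, low=LOW, high=HIGH):
    """P(low < sample mean < high) for a random sample of size n."""
    sampling_dist = NormalDist(MU, SIGMA / sqrt(n))
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)

probs = {n: prob_mean_between(n) for n in (4, 16, 25)}
for n, p in probs.items():
    print(f"n = {n:2d}: P = {p:.4f}")
```

Because X itself is modeled as normal, the sample mean is exactly normal for every n, which is the point to weigh when answering Question 4.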
In: Statistics and Probability
1A) Twenty laboratory mice were randomly divided into two groups of 10. Each group was fed according to a prescribed diet. At the end of 3 weeks, the weight gained by each animal was recorded. Do the data in the following table justify the conclusion that the mean weight gained on diet B was greater than the mean weight gained on diet A, at the α = 0.05 level of significance? Assume normality. (Use Diet B - Diet A.)
Diet A | 6 | 9 | 7 | 13 | 11 | 13 | 8 | 11 | 6 | 14 |
Diet B | 21 | 21 | 12 | 9 | 21 | 14 | 9 | 16 | 10 | 23 |
(a) Find t. (Give your answer correct to two decimal places.)
(b) Find the p-value. (Give your answer correct to four decimal places.)
1B) A bakery is considering buying one of two gas ovens. The bakery requires that the temperature remain constant during a baking operation. A study was conducted to measure the variance in temperature of the ovens during the baking process. The variance in temperature before the thermostat restarted the flame for the Monarch oven was 2.2 for 23 measurements. The variance for the Kraft oven was 3.2 for 21 measurements. Does this information provide sufficient reason to conclude that there is a difference in the variances for the two ovens? Assume measurements are normally distributed and use a 0.02 level of significance.
(a) Find F. (Give your answer correct to two decimal places.)
(b) Find the p-value. (Give your answer correct to four decimal places.)
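The two test statistics in 1A and 1B can be computed as sketched below. This is a hedged illustration: the F ratio places the larger sample variance in the numerator, which is a common convention but an assumption here, and the p-values still require t and F distribution tables or a statistics package.

```python
from math import sqrt
from statistics import mean, variance

diet_a = [6, 9, 7, 13, 11, 13, 8, 11, 6, 14]
diet_b = [21, 21, 12, 9, 21, 14, 9, 16, 10, 23]

# 1A: two-sample t statistic for Diet B - Diet A (unpooled standard error).
se_diff = sqrt(variance(diet_a) / len(diet_a) + variance(diet_b) / len(diet_b))
t_stat = (mean(diet_b) - mean(diet_a)) / se_diff

# 1B: F ratio with the larger sample variance on top (assumed convention).
var_monarch, n_monarch = 2.2, 23
var_kraft, n_kraft = 3.2, 21
f_stat = max(var_monarch, var_kraft) / min(var_monarch, var_kraft)
df_num, df_den = n_kraft - 1, n_monarch - 1  # Kraft has the larger variance

print(f"t = {t_stat:.2f}")
print(f"F = {f_stat:.2f} with ({df_num}, {df_den}) degrees of freedom")
```

With equal group sizes, as in 1A, the pooled and unpooled standard errors coincide, so the t statistic is the same under either approach; only the degrees of freedom differ.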
In: Statistics and Probability
Given the following sample information, test the hypothesis that the treatment means are equal at the 0.10 significance level:
Treatment 1 | Treatment 2 | Treatment 3 |
3 | 9 | 6 |
2 | 6 | 3 |
5 | 5 | 5 |
1 | 6 | 5 |
3 | 8 | 5 |
1 | 5 | 4 |
4 | 1 | |
7 | 5 | |
6 | | |
4 | | |
Complete the ANOVA table. (Round the SS, MS, and F values to 2 decimal places.)
Source | SS | DF | MS | F |
Factor | 46.96 (Incorrect) | 2 (Correct) | 23.48 (Incorrect) | 9.30 (Incorrect) |
Error | 53 (Incorrect) | 21 (Correct) | 2.52 (Incorrect) | |
Total | Not attempted | 23 (Correct) | | |
Please explain how you got your answer. I am very new to statistics so the clearer the better.
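The ANOVA table entries follow from three sums of squares. The sketch below assumes a literal reading of the data table, in which the blank cells sit in the later columns (10, 8, and 6 observations per treatment); if the blanks belong elsewhere, the lists need to be rearranged.

```python
from statistics import mean

# Assumed literal reading of the table: blank cells fall in the later columns.
treatments = {
    "Treatment 1": [3, 2, 5, 1, 3, 1, 4, 7, 6, 4],
    "Treatment 2": [9, 6, 5, 6, 8, 5, 1, 5],
    "Treatment 3": [6, 3, 5, 5, 5, 4],
}

all_values = [x for group in treatments.values() for x in group]
grand_mean = mean(all_values)

# Factor (between-treatment) SS: group size times squared distance of each
# group mean from the grand mean.
ss_factor = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in treatments.values())
# Error (within-treatment) SS: squared distance of each value from its group mean.
ss_error = sum((x - mean(g)) ** 2 for g in treatments.values() for x in g)
# Total SS: squared distance of each value from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

df_factor = len(treatments) - 1
df_error = len(all_values) - len(treatments)
ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error
f_stat = ms_factor / ms_error

print(f"Factor: SS={ss_factor:.2f} DF={df_factor} MS={ms_factor:.2f} F={f_stat:.2f}")
print(f"Error:  SS={ss_error:.2f} DF={df_error} MS={ms_error:.2f}")
print(f"Total:  SS={ss_total:.2f} DF={df_factor + df_error}")
```

The factor and error sums of squares always add up to the total, which is a quick check on any hand calculation. The F value is then compared with the F(2, 21) critical value at the 0.10 level from a table.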
In: Statistics and Probability
How do you evaluate each of the assumptions of regression analysis?
In: Statistics and Probability
A) The proportion of subjects who answered the last question of their survey as "Very Satisfied" is 0.80. The hospital is hoping to increase that proportion to 0.9 and they consider a change in the proportion of 0.10 to be significant. For an alpha of 0.05 and a power of 80%, calculate the number of surveys they would need to collect in order to detect a difference.
B) Proportion of subjects who answered the last question of their survey as "Very Satisfied" is 0.80. The hospital is hoping to increase that proportion to 0.9 and they consider a change in the proportion of 0.10 to be significant. For an alpha of 0.01 and a power of 80%, calculate the number of surveys they would need to collect in order to detect a difference.
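One common way to size such a study is the normal-approximation formula for a one-sample test of a proportion, n = [(z_α·√(p0(1−p0)) + z_β·√(p1(1−p1))) / (p1 − p0)]². The sketch below assumes that formula and a two-sided test; the question does not say which formula or sidedness is intended, and the function name is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_one_sample_proportion(p0, p1, alpha, power, two_sided=True):
    """Sample size for detecting a shift from p0 to p1 (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)  # round up to a whole survey

n_alpha_05 = n_one_sample_proportion(0.80, 0.90, alpha=0.05, power=0.80)
n_alpha_01 = n_one_sample_proportion(0.80, 0.90, alpha=0.01, power=0.80)
print(n_alpha_05, n_alpha_01)
```

Tightening alpha from 0.05 to 0.01 raises the required sample size, since a stricter significance threshold needs more data to keep the same power. A two-sample design or a one-sided test would give different numbers.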
In: Statistics and Probability
The Excel file BankData shows the values of the following variables for randomly selected 93 employees of a large bank. (A very similar data set was used in a court lawsuit against discrimination.)
Let
SALARY = monthly salary in dollars,
EDUCAT = years of schooling at the time of hire,
EXPER = number of months of previous work experience,
MONTHS = number of months since the individual was hired by the bank,
MALE = dummy variable coded 1 for males and 0 for females.
Using the t-test studied in Section 10.2, you could find some evidence that the mean salary of all male employees is greater than the mean salary of all female employees, and hence provide some support for a discrimination suit against the employer. It is recognized, however, that a simple comparison of the mean salaries might be insufficient to conclude that the female employees have been discriminated against. Obviously there are other factors that affect the salary. These factors have been identified as EDUCAT, EXPER, and MONTHS and are defined above.
Assume the following multiple linear regression model,
SALARY = β0 + β1·EDUCAT + β2·EXPER + β3·MONTHS + β4·MALE + ε,
and apply Regression in Data Analysis of Excel with the 99% confidence level (see pages 312-314) to find the estimated regression equation.
Note. Of course, Input Y Range is A1:A94, Input X range is B1:E94, and Labels should be checked.
1. Clearly show the estimated regression equation. What is the percentage of variation in the salary explained by this equation? Assuming that the values of EDUCAT, EXPER, and MONTHS are fixed, what is the estimated difference between the predicted monthly salaries of male and female employees?
2. What salary would you predict for a male employee with 12 years of education, 10 months of previous work experience, and with the time hired equal to 15 months? What salary would you predict for a female employee with 12 years of education, 10 months of previous work experience, and with the time hired equal to 15 months? What is the difference between the two predicted salaries? Compare this difference with that stated in Task 1.
3. Is there a significant difference in the predicted salaries for male and female employees after accounting for the effects of the three other independent variables? To answer this question, conduct the t-test for the significance of the MALE coefficient at a 1% level of significance. Clearly show the null and alternative hypotheses to be tested, the value of the test statistic, the p-value of the test, your conclusion and its interpretation; see pages 322-323 and 333-335.
SALARY | EDUCAT | EXPER | MONTHS | MALE |
5620 | 10 | 12 | 22 | 1 |
5040 | 8 | 14 | 3 | 1 |
5100 | 9 | 36 | 15 | 1 |
5100 | 10 | 55 | 2 | 1 |
5220 | 12 | 29 | 14 | 1 |
5400 | 12 | 37 | 21 | 1 |
5400 | 12 | 38 | 11 | 1 |
5400 | 12 | 39 | 3 | 1 |
5400 | 10 | 48 | 8 | 1 |
5400 | 10 | 60 | 11 | 1 |
5700 | 15 | 74 | 5 | 1 |
6000 | 15 | 88 | 21 | 1 |
6000 | 12 | 98 | 12 | 1 |
6000 | 12 | 113 | 17 | 1 |
6000 | 12 | 115 | 14 | 1 |
6000 | 15 | 123 | 33 | 1 |
6000 | 14 | 152 | 11 | 1 |
6500 | 14 | 173 | 19 | 1 |
6000 | 15 | 150 | 13 | 1 |
6400 | 15 | 136 | 32 | 1 |
6000 | 15 | 156 | 12 | 1 |
6900 | 15 | 180 | 33 | 1 |
6000 | 15 | 156 | 16 | 1 |
6000 | 16 | 145 | 13 | 1 |
6300 | 15 | 220 | 17 | 1 |
6600 | 15 | 164 | 16 | 1 |
7800 | 15 | 259 | 33 | 1 |
6600 | 15 | 216 | 16 | 1 |
6840 | 15 | 142 | 17 | 1 |
6900 | 16 | 175 | 20 | 1 |
6900 | 15 | 132 | 24 | 1 |
8100 | 16 | 315 | 33 | 1 |
6300 | 15 | 187 | 30 | 1 |
6400 | 15 | 231 | 33 | 1 |
4620 | 10 | 12 | 22 | 0 |
4020 | 10 | 12 | 7 | 0 |
4290 | 12 | 5 | 10 | 0 |
4380 | 8 | 6 | 7 | 0 |
4380 | 8 | 8 | 6 | 0 |
4380 | 12 | 3 | 7 | 0 |
4380 | 12 | 4 | 10 | 0 |
4380 | 12 | 5 | 6 | 0 |
4440 | 10 | 11 | 2 | 0 |
4500 | 12 | 12 | 3 | 0 |
4500 | 12 | 8 | 19 | 0 |
4620 | 12 | 52 | 13 | 0 |
4800 | 10 | 70 | 20 | 0 |
4800 | 12 | 52 | 23 | 0 |
4800 | 12 | 11 | 12 | 0 |
4800 | 12 | 75 | 17 | 0 |
4800 | 12 | 63 | 22 | 0 |
4800 | 12 | 144 | 24 | 0 |
4800 | 12 | 163 | 12 | 0 |
4800 | 15 | 228 | 26 | 0 |
4800 | 12 | 381 | 10 | 0 |
4800 | 16 | 214 | 15 | 0 |
4980 | 10 | 318 | 25 | 0 |
5100 | 10 | 96 | 33 | 0 |
5100 | 12 | 36 | 15 | 0 |
5100 | 12 | 59 | 14 | 0 |
5100 | 10 | 115 | 1 | 0 |
5100 | 10 | 165 | 4 | 0 |
5100 | 15 | 123 | 12 | 0 |
5160 | 12 | 118 | 12 | 0 |
5220 | 10 | 102 | 29 | 0 |
5220 | 12 | 127 | 29 | 0 |
5280 | 10 | 90 | 11 | 0 |
5280 | 12 | 190 | 31 | 0 |
5280 | 12 | 107 | 11 | 0 |
5400 | 10 | 113 | 34 | 0 |
5400 | 12 | 128 | 33 | 0 |
5400 | 12 | 126 | 11 | 0 |
5400 | 12 | 112 | 33 | 0 |
5400 | 12 | 98 | 22 | 0 |
5400 | 12 | 82 | 29 | 0 |
5400 | 12 | 169 | 27 | 0 |
5400 | 12 | 124 | 31 | 0 |
5400 | 15 | 94 | 13 | 0 |
5400 | 15 | 49 | 27 | 0 |
5400 | 15 | 121 | 21 | 0 |
5400 | 15 | 122 | 33 | 0 |
5520 | 12 | 97 | 17 | 0 |
5520 | 12 | 196 | 32 | 0 |
5580 | 12 | 133 | 30 | 0 |
5640 | 12 | 155 | 9 | 0 |
5700 | 12 | 123 | 23 | 0 |
5700 | 12 | 117 | 25 | 0 |
5700 | 15 | 151 | 17 | 0 |
5700 | 15 | 161 | 11 | 0 |
5700 | 15 | 241 | 34 | 0 |
6000 | 12 | 121 | 30 | 0 |
6000 | 15 | 244 | 22 | 0 |
6120 | 12 | 209 | 21 | 0 |
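The mechanics behind Tasks 1 and 2 can be sketched without Excel. The example below fits least squares through the normal equations using only the first five male and first five female rows of the table, purely to keep it short; it is not the answer to the exercise, which uses all 93 rows. The helpers `solve_linear_system`, `ols`, and `predict` are hypothetical names introduced for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# (SALARY, EDUCAT, EXPER, MONTHS, MALE): ten rows only, for illustration.
sample = [
    (5620, 10, 12, 22, 1), (5040, 8, 14, 3, 1), (5100, 9, 36, 15, 1),
    (5100, 10, 55, 2, 1), (5220, 12, 29, 14, 1),
    (4620, 10, 12, 22, 0), (4020, 10, 12, 7, 0), (4290, 12, 5, 10, 0),
    (4380, 8, 6, 7, 0), (4380, 8, 8, 6, 0),
]
X = [[1.0, *row[1:]] for row in sample]  # leading 1.0 is the intercept column
y = [row[0] for row in sample]
b = ols(X, y)  # [intercept, EDUCAT, EXPER, MONTHS, MALE]

def predict(coefs, educat, exper, months, male):
    """Predicted salary for one employee profile."""
    return sum(c * v for c, v in zip(coefs, (1.0, educat, exper, months, male)))

gap = predict(b, 12, 10, 15, 1) - predict(b, 12, 10, 15, 0)
print(b)
print(gap)
```

Whatever the fitted values are, the male-minus-female gap for identical EDUCAT, EXPER, and MONTHS equals the MALE coefficient, which is the comparison Task 2 asks for.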
In: Statistics and Probability
9. Exercise 4.9 In a study of housing demand, the county assessor develops the following regression model to estimate the market value (i.e., selling price) of residential property within her jurisdiction. The assessor suspects that important variables affecting selling price (Y, measured in thousands of dollars) are the size of a house (X1, measured in hundreds of square feet), the total number of rooms (X2), age (X3), and whether or not the house has an attached garage (X4, No = 0, Yes = 1).
Y = α + β1X1 + β2X2 + β3X3 + β4X4 + ε
Now suppose that the estimate of the model produces the following results: a = 166.048, b1 = 3.459, b2 = 8.015, b3 = −0.319, b4 = 1.186, sb1 = 1.079, sb2 = 5.288, sb3 = 0.789, sb4 = 12.252, R² = 0.838, F-statistic = 12.919, and se = 13.702. Note that the sample consists of 15 randomly selected observations.
According to the estimated model, holding all else constant, an additional 100 square feet of area means the market value ____.
Which of the independent variables (if any) appears to be statistically significant (at the 0.05 level) in explaining the market value of residential property? Check all that apply.
Size of the house (X1)
Total number of rooms (X2)
Age (X3)
Having an attached garage (X4)
What proportion of the total variation in sales is explained by the regression equation?
0.838
0.789
0.129
The given F-value shows that the assessor ____.
Which of the following is an approximate 95 percent prediction interval for the selling price of a 15-year-old house having 18 hundred sq. ft., 5 rooms, and an attached garage?
(237.382, 292.190)
(157.232, 212.040)
(170.934, 198.338)
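The significance and prediction-interval parts reduce to a few arithmetic steps, sketched below. Two assumptions are made: se is treated as the standard error of the estimate, and "approximate" is read as the rough rule of plus or minus two standard errors around the point prediction. The critical value is taken from a t-table for 15 − 4 − 1 = 10 degrees of freedom.

```python
coef = {"size": 3.459, "rooms": 8.015, "age": -0.319, "garage": 1.186}
std_err = {"size": 1.079, "rooms": 5.288, "age": 0.789, "garage": 12.252}
intercept, s_e = 166.048, 13.702
n_obs, n_predictors = 15, 4

df_resid = n_obs - n_predictors - 1  # residual degrees of freedom
T_CRIT = 2.228  # two-sided 5% critical value of t with 10 df (t-table)

# A coefficient is significant when |b / s_b| exceeds the critical value.
t_ratios = {name: coef[name] / std_err[name] for name in coef}
significant = [name for name, t in t_ratios.items() if abs(t) > T_CRIT]

# Point prediction for an 18-hundred sq. ft., 5-room, 15-year-old house
# with an attached garage.
house = {"size": 18, "rooms": 5, "age": 15, "garage": 1}
y_hat = intercept + sum(coef[name] * house[name] for name in coef)
interval = (y_hat - 2 * s_e, y_hat + 2 * s_e)  # rough +/- 2 s_e rule

print(t_ratios)
print(significant)
print(round(y_hat, 3), tuple(round(v, 3) for v in interval))
```

The coefficient on size also answers the first blank directly: since Y is in thousands of dollars and X1 in hundreds of square feet, b1 is the change in value, in thousands of dollars, per additional 100 square feet.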
In: Statistics and Probability
In order to investigate whether systolic blood pressure measurements differ between standing and lying positions, the systolic blood pressure levels of a sample of 12 persons were measured in both positions (first in standing position and then in lying position). The data of this experiment are recorded in an SPSS file.
Please list the steps in SPSS and include screenshots.
The table is for the data.
Lying | 132.00 | 146.00 | 135.00 | 141.00 | 139.00 | 162.00 | 128.00 | 137.00 | 145.00 | 151.00 | 131.00 | 143.00 |
Standing | 136.00 | 145.00 | 140.00 | 147.00 | 142.00 | 160.00 | 137.00 | 136.00 | 149.00 | 158.00 | 120.00 | 150.00 |
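Because each person is measured twice, the natural analysis is a paired t test on the differences. The sketch below is a cross-check outside SPSS; it assumes the first row of values is Lying and the second is Standing, in the order of the column headers.

```python
from math import sqrt
from statistics import mean, stdev

# Assumption: first row of the table is Lying, second row is Standing.
lying = [132, 146, 135, 141, 139, 162, 128, 137, 145, 151, 131, 143]
standing = [136, 145, 140, 147, 142, 160, 137, 136, 149, 158, 120, 150]

# Paired test: work with the per-person differences (standing - lying).
diffs = [s - l for s, l in zip(standing, lying)]
mean_diff = mean(diffs)
se_diff = stdev(diffs) / sqrt(len(diffs))
t_stat = mean_diff / se_diff
df = len(diffs) - 1

print(f"mean difference = {mean_diff}, t = {t_stat:.2f}, df = {df}")
```

In SPSS the corresponding procedure is usually found under Analyze > Compare Means > Paired-Samples T Test, with the two positions entered as the paired variables; its t and df should agree with the values printed here.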
In: Statistics and Probability
From 104 of its restaurants, Noodles & Company managers collected data on per-person sales and the percent of sales due to "potstickers" (a popular food item). Both numerical variables failed tests for normality, so they tried a chi-square test. Each variable was converted into ordinal categories (low, medium, high) using cutoff points that produced roughly equal group sizes. At α = .10, is per-person spending independent of percent of sales from potstickers?
Potsticker % of Sales
Per-Person Spending | Low | Medium | High | Row Total |
Low | 14 | 13 | 8 | 35 |
Medium | 11 | 17 | 5 | 33 |
High | 10 | 8 | 18 | 36 |
Col Total | 35 | 38 | 31 | 104 |
You will need to open the Excel file. Then open Minitab. Copy the data (NOT THE TOTALS) into Minitab. Be sure that the 1st number goes into row 1 in Minitab and that you type the column headings (Low, Medium, High) into the grey shaded top header row in Minitab.
(a) The hypothesis for the given issue is H0: Percentage of Sales and Per-Person Spending are independent. (No / Yes)
(b) Calculate the chi-square test statistic, degrees of freedom, and the p-value. (Round your test statistic value to 2 decimal places and p-value to 4 decimal places. Leave no cells blank - be certain to enter "0" wherever required.)
Test statistic | d.f. | p-value |
(c) We reject the null and find dependence. (No / Yes)
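The Minitab output can be cross-checked by hand: each expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all nine cells. The sketch below does this in plain Python; the closed-form p-value it uses is specific to 4 degrees of freedom.

```python
from math import exp

# Observed counts: rows are per-person spending (Low, Medium, High),
# columns are potsticker % of sales (Low, Medium, High).
observed = [
    [14, 13, 8],
    [11, 17, 5],
    [10, 8, 18],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For exactly 4 degrees of freedom the chi-square upper tail has the closed
# form exp(-x/2) * (1 + x/2); other df values need a table or a stats library.
assert df == 4
p_value = exp(-chi2 / 2) * (1 + chi2 / 2)

print(f"chi-square = {chi2:.2f}, d.f. = {df}, p-value = {p_value:.4f}")
```

The decision in part (c) then comes from comparing the p-value with α = .10.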
In: Statistics and Probability
The following data lists the ages of a random selection of actresses when they won an award in the category of Best Actress, along with the ages of actors when they won in the category of Best Actor. The ages are matched according to the year that the awards were presented. Complete parts (a) and (b) below.
a. Use the sample data with a 0.01 significance level to test the claim that for the population of ages of Best Actresses and Best Actors, the differences have a mean less than 0 (indicating that the Best Actresses are generally younger than Best Actors).
In this example, μd is the mean value of the differences d for the population of all pairs of data, where each individual difference d is defined as the actress's age minus the actor's age. What are the null and alternative hypotheses for the hypothesis test?
H0: μd = 0 year(s)
H1: μd < 0 year(s)
(Type integers or decimals. Do not round.)
Identify the test statistic.
t = −3.68
(Round to two decimal places as needed.)
Identify the P-value.
P-value = 0.002
(Round to three decimal places as needed.)
What is the conclusion based on the hypothesis test?
Since the P-value is less than or equal to the significance level, reject the null hypothesis. There is sufficient evidence to support the claim that actresses are generally younger when they won the award than actors.
b. Construct the confidence interval that could be used for the hypothesis test described in part (a). What feature of the confidence interval leads to the same conclusion reached in part (a)?
The confidence interval is ____ year(s) < μd < ____ year(s).
(Round to one decimal place as needed.)
What feature of the confidence interval leads to the same conclusion reached in part (a)?
Since the confidence interval contains (zero / only positive numbers / only negative numbers), (reject / fail to reject) the null hypothesis.
Actress (years) Actor (years)
30 58
30 41
31 35
28 42
33 31
27 34
25 46
42 38
29 41
31 43
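The paired-differences test and its matching confidence interval can be sketched as follows. A one-tailed test at the 0.01 level corresponds to a 98% two-sided interval, so the t-table value for 9 degrees of freedom with 0.01 in each tail is used. The computation runs on the ten pairs listed here; if its statistic differs from the one recorded in the question, the recorded answer may belong to a different version of the data set.

```python
from math import sqrt
from statistics import mean, stdev

actress = [30, 30, 31, 28, 33, 27, 25, 42, 29, 31]
actor = [58, 41, 35, 42, 31, 34, 46, 38, 41, 43]

# Differences defined as actress's age minus actor's age.
d = [a - b for a, b in zip(actress, actor)]
d_bar = mean(d)
se = stdev(d) / sqrt(len(d))
t_stat = d_bar / se

T_CRIT_98 = 2.821  # t-table value for 9 df with 0.01 in each tail
ci = (d_bar - T_CRIT_98 * se, d_bar + T_CRIT_98 * se)

print(f"mean difference = {d_bar}, t = {t_stat:.2f}")
print(f"98% CI: {ci[0]:.1f} < mu_d < {ci[1]:.1f}")
```

An interval that lies entirely below zero leads to the same decision as a P-value below the significance level.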
In: Statistics and Probability
Question 1: Professor Handy measured the time in seconds required to catch a falling meter stick for 12 randomly selected students' dominant hand and nondominant hand. The Minitab Express file contains these measurements. Professor Handy claims that the reaction time in an individual's dominant hand is less than the reaction time in their nondominant hand. Assuming that the differences follow a normal distribution, test the claim at the 5% significance level.
Question 2: The New England Patriots. The 2017 roster of the New England Patriots, winners of the 2017 NFL Super Bowl, included 12 defensive linemen and 9 offensive linemen. The Minitab Express file for this problem contains the weights in pounds of the offensive and defensive linemen. Use this data set to test the claim that the defensive linemen weigh less than the offensive linemen at the 5% level of significance.
Question 3: Stress and weight in rats. In a study of the effects of stress on weight in rats, 71 rats were randomly assigned to either a stressful environment or a control (nonstressful) environment. After 21 days, the change in weight (in grams) was determined for each rat. The table below summarizes the data on weight gain. Test the claim that stress affects weight. (Use a 10% significance level.)
Group | n | Sample mean | Sample Standard Dev. |
Stress | 20 | 26 | 13.4 |
No stress | 51 | 32 | 14.2 |
Please provide all answers in the following format:
Step 1: State the null and alternative hypothesis. (Use “mu” for the symbol μ.)
Step 2: Calculate the test statistic.
Step 3: Find the p-value.
Step 4: State your conclusion. (Do not just say “Reject H0” or “Do not reject H0.” State the conclusion in the context of the problem.)
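For Question 3, Step 2 can be carried out from the summary table alone. The sketch below assumes an unpooled (Welch) two-sample t test, which is one reasonable choice when equal variances are not stated; a pooled test would give a slightly different statistic and 69 degrees of freedom.

```python
from math import sqrt

n_stress, mean_stress, sd_stress = 20, 26, 13.4
n_ctrl, mean_ctrl, sd_ctrl = 51, 32, 14.2

# Squared standard error contributed by each group.
v_stress = sd_stress ** 2 / n_stress
v_ctrl = sd_ctrl ** 2 / n_ctrl

# Welch t statistic for (stress mean - no-stress mean).
t_stat = (mean_stress - mean_ctrl) / sqrt(v_stress + v_ctrl)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch = (v_stress + v_ctrl) ** 2 / (
    v_stress ** 2 / (n_stress - 1) + v_ctrl ** 2 / (n_ctrl - 1)
)

print(f"t = {t_stat:.2f}, approximate df = {df_welch:.1f}")
```

The two-sided p-value for Step 3 should be read from a t distribution with these degrees of freedom (Minitab Express or a t-table); the result is close enough to the 10% threshold that a normal approximation is not a safe substitute.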
In: Statistics and Probability