In: Statistics and Probability
1) A 17.27 gram sample of aluminum initially at 92 degrees Celsius is added to a container containing water. The final temperature of the metal is 25.1 degrees Celsius. What is the total amount of energy in Joules added to the water? What was the energy lost by the metal?
2) 25.0 mL of 1.2 M HCl and 25.0 mL of 1.1 M NaOH were mixed. The initial temperature of the solutions was 22.4 degrees Celsius. Assuming a heat of neutralization of -55.8 kJ/mol, what would the final temperature be if the specific heat of this solution is 4.03 J/(g·°C)?
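Both calorimetry problems come down to q = m × c × ΔT. The following is a minimal sketch, not a definitive solution. It assumes a specific heat for aluminum of 0.897 J/(g·°C) and a solution density of 1.00 g/mL, neither of which is given in the question, and it assumes no heat is lost to the surroundings. The function names are illustrative.

```python
# Assumed handbook value; the question does not state it.
AL_SPECIFIC_HEAT = 0.897  # J/(g*degC)


def heat_lost_by_metal(mass_g, t_initial, t_final, specific_heat=AL_SPECIFIC_HEAT):
    """Heat released by the metal in joules: q = m * c * (T_initial - T_final)."""
    return mass_g * specific_heat * (t_initial - t_final)


def neutralization_final_temp(vol_acid_ml, conc_acid, vol_base_ml, conc_base,
                              t_initial, heat_kj_per_mol, specific_heat,
                              density_g_per_ml=1.00):
    """Final temperature after mixing; density of 1.00 g/mL is an assumption."""
    # The limiting reagent determines how many moles actually react.
    moles_reacted = min(vol_acid_ml * conc_acid, vol_base_ml * conc_base) / 1000.0
    # A negative heat of neutralization means heat is released into the solution.
    heat_released_j = -heat_kj_per_mol * 1000.0 * moles_reacted
    mass_g = (vol_acid_ml + vol_base_ml) * density_g_per_ml
    return t_initial + heat_released_j / (mass_g * specific_heat)


q_metal = heat_lost_by_metal(17.27, 92.0, 25.1)
final_temp = neutralization_final_temp(25.0, 1.2, 25.0, 1.1, 22.4, -55.8, 4.03)
print(f"Heat lost by aluminum (gained by water): {q_metal:.0f} J")
print(f"Final temperature of the mixture: {final_temp:.1f} degrees Celsius")
```

Under the no-heat-loss assumption, the energy added to the water equals the energy lost by the metal.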
In: Statistics and Probability
A pinball machine has 7 holes through which a ball can drop. Five balls are played and we observe which hole each ball goes down. For example, the first ball could go down hole 1, hole 2, ..., or hole 7 (similarly for the other 4 balls). On each play, assume the ball is equally likely to go down any one of the 7 holes. Find the probability that more than one ball goes down at least one of the holes.
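One way to approach this is through the complement: "at least one hole receives more than one ball" is the opposite of "all five balls go down different holes". The sketch below assumes the balls are independent and each hole is equally likely, as the question states; the function name is illustrative.

```python
from math import perm


def prob_shared_hole(holes, balls):
    """P(at least one hole gets more than one ball) = 1 - P(all balls in distinct holes)."""
    distinct_outcomes = perm(holes, balls)  # ordered ways to pick distinct holes
    total_outcomes = holes ** balls         # every ball may use any hole
    return 1 - distinct_outcomes / total_outcomes


p_shared = prob_shared_hole(7, 5)
print(round(p_shared, 4))
```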
In: Statistics and Probability
Please use R-Studio to answer the following question.
I mentioned that dnorm(x, mean, sd) doesn’t give the probability of being exactly equal to x. So what does it give? It gives you the height of the normal curve at that x. Plot the graph of a normal random variable with mean 50 and standard deviation 10. Note that you will have to create a range of x values from 0 to 100. Then find the height of the curve at each x value using dnorm(), and use the plot() command to draw the graph.
In: Statistics and Probability
For each of the following three cases, explain (i) the hypotheses with a plausible definition of p1 and p2, (ii) whether or not the data indicate practical significance (use common sense and/or general knowledge), and (iii) whether or not the data indicate statistical significance.
(a) A recent study of perfect pitch tested 2,700 students in American music conservatories. It found that 7% of non-Asian students and 32% of Asian students have perfect pitch. A two-sample Z-test of the difference in proportions resulted in a p-value of < 0.0001.
(b) In July 1974, the Pew Research Center selected a large sample of voters. Sixty-six percent of those interviewed disapproved of President Nixon. In July 2007, the same Pew Research Center selected a comparable sample. In this case, 64.5% of those interviewed expressed disapproval of President Bush. The researchers pointed out that the p-value for comparing the two sample results was 0.023.
(c) In a survey conducted in a statistics class at Boston College, students were asked their views on a number of social issues; 56% of the male students and 38% of the female students supported the death penalty. In a statistics lab, the students computed the corresponding p-value as 0.21.
In: Statistics and Probability
We wish to predict the salary for baseball players (y) using the variables RBI (x1) and HR (x2), so we use a regression equation of the form ŷ = b0 + b1x1 + b2x2.
The following is a chart of baseball players' salaries and statistics from 2016.
Player Name | RBI's | HR's | Salary (in millions) |
---|---|---|---|
Miguel Cabrera | 108 | 38 | 28.050 |
Yoenis Cespedes | 86 | 31 | 27.500 |
Ryan Howard | 59 | 25 | 25.000 |
Albert Pujols | 119 | 31 | 25.000 |
Robinson Cano | 103 | 39 | 24.050 |
Mark Teixeira | 44 | 15 | 23.125 |
Joe Mauer | 49 | 11 | 23.000 |
Hanley Ramirez | 111 | 30 | 22.750 |
Justin Upton | 87 | 31 | 22.125 |
Adrian Gonzalez | 90 | 18 | 21.857 |
Jason Heyward | 49 | 7 | 21.667 |
Jayson Werth | 70 | 21 | 21.571 |
Matt Kemp | 108 | 35 | 21.500 |
Jacoby Ellsbury | 56 | 9 | 21.143 |
Chris Davis | 84 | 38 | 21.119 |
Buster Posey | 80 | 14 | 20.802 |
Shin-Soo Choo | 17 | 7 | 20.000 |
Troy Tulowitzki | 79 | 24 | 20.000 |
Ryan Braun | 91 | 31 | 20.000 |
Joey Votto | 97 | 29 | 20.000 |
Hunter Pence | 57 | 13 | 18.500 |
Prince Fielder | 44 | 8 | 18.000 |
Adrian Beltre | 104 | 32 | 18.000 |
Victor Martinez | 86 | 27 | 18.000 |
Carlos Gonzalez | 100 | 25 | 17.454 |
Matt Holliday | 62 | 20 | 17.000 |
Brian McCann | 58 | 20 | 17.000 |
Mike Trout | 100 | 29 | 16.083 |
David Ortiz | 127 | 38 | 16.000 |
Adam Jones | 83 | 29 | 16.000 |
Curtis Granderson | 59 | 30 | 16.000 |
Colby Rasmus | 54 | 15 | 15.800 |
Matt Wieters | 66 | 17 | 15.800 |
J.D. Martinez | 68 | 22 | 6.750 |
Brandon Crawford | 84 | 12 | 6.000 |
Rajai Davis | 48 | 12 | 5.950 |
Aaron Hill | 38 | 10 | 12.000 |
Coco Crisp | 55 | 13 | 11.000 |
Ben Zobrist | 76 | 18 | 10.500 |
Justin Turner | 90 | 27 | 5.100 |
Denard Span | 53 | 11 | 5.000 |
Chris Iannetta | 24 | 7 | 4.550 |
Leonys Martin | 47 | 15 | 4.150 |
Justin Smoak | 34 | 14 | 3.900 |
Jorge Soler | 31 | 12 | 3.667 |
Evan Gattis | 72 | 32 | 3.300 |
Logan Forsythe | 52 | 20 | 2.750 |
Jean Segura | 64 | 20 | 2.600 |
a) Use software to find the multiple linear regression equation. Enter the coefficients rounded to 4 decimal places.
ŷ = ____ + ____ x1 + ____ x2
b) Use the multiple linear regression equation to predict the salary for a baseball player with an RBI of 49 and HR of 22. Round your answer to 1 decimal place; do not convert numbers to dollars.
____ millions of dollars
c) Holding all other variables constant, what is the correct interpretation of the coefficient b1 = 0.111 in the multiple linear regression equation?
d) Holding all other variables constant, what is the correct interpretation of the coefficient b2 = 0.0371 in the multiple linear regression equation?
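Parts (a) and (b) could be sketched as follows. This is a minimal illustration, assuming ordinary least squares with an intercept and solving the two-predictor normal equations directly. The function name `fit_two_predictor_ols` is illustrative, and a statistics package would normally be used instead. The data are copied from the table in the question.

```python
# (RBI, HR, salary in millions), in the same order as the table in the question.
players = [
    (108, 38, 28.050), (86, 31, 27.500), (59, 25, 25.000), (119, 31, 25.000),
    (103, 39, 24.050), (44, 15, 23.125), (49, 11, 23.000), (111, 30, 22.750),
    (87, 31, 22.125), (90, 18, 21.857), (49, 7, 21.667), (70, 21, 21.571),
    (108, 35, 21.500), (56, 9, 21.143), (84, 38, 21.119), (80, 14, 20.802),
    (17, 7, 20.000), (79, 24, 20.000), (91, 31, 20.000), (97, 29, 20.000),
    (57, 13, 18.500), (44, 8, 18.000), (104, 32, 18.000), (86, 27, 18.000),
    (100, 25, 17.454), (62, 20, 17.000), (58, 20, 17.000), (100, 29, 16.083),
    (127, 38, 16.000), (83, 29, 16.000), (59, 30, 16.000), (54, 15, 15.800),
    (66, 17, 15.800), (68, 22, 6.750), (84, 12, 6.000), (48, 12, 5.950),
    (38, 10, 12.000), (55, 13, 11.000), (76, 18, 10.500), (90, 27, 5.100),
    (53, 11, 5.000), (24, 7, 4.550), (47, 15, 4.150), (34, 14, 3.900),
    (31, 12, 3.667), (72, 32, 3.300), (52, 20, 2.750), (64, 20, 2.600),
]


def fit_two_predictor_ols(rows):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via centered normal equations."""
    n = len(rows)
    m1 = sum(r[0] for r in rows) / n
    m2 = sum(r[1] for r in rows) / n
    my = sum(r[2] for r in rows) / n
    # Centered sums of squares and cross-products.
    s11 = sum((r[0] - m1) ** 2 for r in rows)
    s22 = sum((r[1] - m2) ** 2 for r in rows)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows)
    s1y = sum((r[0] - m1) * (r[2] - my) for r in rows)
    s2y = sum((r[1] - m2) * (r[2] - my) for r in rows)
    # Solve the 2x2 system for the slopes, then recover the intercept.
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


b0, b1, b2 = fit_two_predictor_ols(players)
predicted_salary = b0 + b1 * 49 + b2 * 22  # part (b): RBI = 49, HR = 22
print(f"y-hat = {b0:.4f} + {b1:.4f} x1 + {b2:.4f} x2")
print(f"Predicted salary: {predicted_salary:.1f} million dollars")
```

For parts (c) and (d), each slope is read as the estimated change in salary (in millions of dollars) for a one-unit increase in that predictor while the other predictor is held fixed.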
In: Statistics and Probability
Among a student group 46% use Google Chrome, 20% Internet Explorer, 10% Firefox, 5% Mozilla, and the rest use Safari. What is the probability that you need to pick 7 students to find 2 students using Google Chrome? Report answer to 3 decimals.
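One reading of this question is a negative binomial setup: the second Chrome user is found exactly on the 7th pick, so exactly one of the first six picks is a Chrome user. The sketch below assumes that reading and assumes independent picks with a constant 46% chance; the function name is illustrative.

```python
from math import comb


def prob_rth_success_on_trial(n, r, p):
    """P(the r-th success occurs exactly on trial n) for independent trials."""
    # r - 1 successes among the first n - 1 trials, then a success on trial n.
    return comb(n - 1, r - 1) * p ** r * (1 - p) ** (n - r)


p_seventh = prob_rth_success_on_trial(7, 2, 0.46)
print(round(p_seventh, 3))
```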
In: Statistics and Probability
Average Sleep Time on a School Night | Students |
---|---|
4 hours | 8 |
5 hours | 9 |
6 hours | 14 |
7 hours | 12 |
8 hours | 15 |
9 hours | 4 |
10 hours | 0 |
Ho: 72.7% of high school students (grade 9-12) do not get enough sleep at night. (minimum 8 hours)
Ha: 72.7% of high school students (grade 9-12) do get enough sleep at night.
Sample size:
Sample mean:
Sample deviation:
Record the hypothesis test. Use a 5% level of significance. Include a 95% confidence interval on the solution sheet.
Create a graph to illustrate the results.
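The sample size, mean, and standard deviation can be read off the frequency table, and the claim about 72.7% can be checked with a one-proportion z test. The sketch below is only one possible setup. It treats "not enough sleep" as fewer than 8 hours, uses a two-sided test, and uses a Wald interval for the 95% confidence interval. These are assumptions, since the question leaves them open.

```python
from math import sqrt
from statistics import NormalDist

# Hours of sleep -> number of students, from the table in the question.
sleep_counts = {4: 8, 5: 9, 6: 14, 7: 12, 8: 15, 9: 4, 10: 0}

n = sum(sleep_counts.values())
mean_hours = sum(h * c for h, c in sleep_counts.items()) / n
variance = sum(c * (h - mean_hours) ** 2 for h, c in sleep_counts.items()) / (n - 1)
sd_hours = sqrt(variance)  # sample standard deviation

# One-proportion z test of H0: p = 0.727 (proportion sleeping fewer than 8 hours).
p0 = 0.727
short_sleepers = sum(c for h, c in sleep_counts.items() if h < 8)
p_hat = short_sleepers / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 2 * NormalDist().cdf(-abs(z))  # two-sided, an assumption

# 95% Wald confidence interval for the proportion.
z_crit = NormalDist().inv_cdf(0.975)
margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
ci_95 = (p_hat - margin, p_hat + margin)

print(f"n = {n}, mean = {mean_hours:.2f} h, sd = {sd_hours:.2f} h")
print(f"p-hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.3f}")
print(f"95% CI for p: ({ci_95[0]:.3f}, {ci_95[1]:.3f})")
```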
In: Statistics and Probability
Run descriptive statistics on the dementia patients’ memory scores (make sure to include the mean and confidence interval by using the “Explore” option as shown in this week’s presentation).
You want to compare your sample dementia patients with a population of age matched controls, and you only know their mean (m = 12.2). Run a single sample t test. Paste the output below.
Write a results section in current APA style describing the outcome. All homework results sections must follow the example given in the SPSS presentation and in the textbook. Results sections are multiple sentences (a single paragraph) and must include the APA-formatted statistical statement and whether the null hypothesis is accepted or rejected.
You want to compare your sample dementia patients with a population of patients with TBI and you only know their mean (m = 8.1). Run a single sample t test. Paste the output below.
Write a results section in current APA style describing the outcome. All homework results sections must follow the example given in the SPSS presentation and in the textbook. Results sections are multiple sentences (a single paragraph) and must include the APA-formatted statistical statement and whether the null hypothesis is accepted or rejected.
Memory scores: 8, 3, 11, 12, 6, 5, 4, 8, 9, 8, 7, 9, 5, 4, 10
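The question asks for SPSS output, so the following is only a cross-check sketch of the two single-sample t statistics, computed by hand from the 15 memory scores listed. The critical value 2.145 is the two-tailed 5% value for 14 degrees of freedom from a standard t table. The function name is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

memory_scores = [8, 3, 11, 12, 6, 5, 4, 8, 9, 8, 7, 9, 5, 4, 10]


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean = mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - mu0) / standard_error, n - 1


T_CRIT_DF14 = 2.145  # two-tailed, alpha = 0.05, df = 14 (from a t table)

t_controls, df = one_sample_t(memory_scores, 12.2)
t_tbi, _ = one_sample_t(memory_scores, 8.1)

for label, t in [("controls (12.2)", t_controls), ("TBI (8.1)", t_tbi)]:
    decision = "reject H0" if abs(t) > T_CRIT_DF14 else "fail to reject H0"
    print(f"vs {label}: t({df}) = {t:.2f}, {decision}")
```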
In: Statistics and Probability
Identify an issue, problem, or opportunity facing a team member's organization that may be examined using hypothesis testing and a regression analysis.
Based on the following issue:
Proposal for the group project:
An issue facing my organization is evaluation of management by staff. Just last month, a survey was sent out (again) for nurses to evaluate management's performance based on certain criteria. Out of 150 staff, only 48 responded and out of the 48, only 36% believed our managers were doing a good job managing the unit. The expectation was to get a score of more than 50% and management believes that the response is not a true representation of the unit.
This is the only scenario I can think of with true values, and we can analyze it using a p-value.
I
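The survey numbers above fit a one-proportion z test. The sketch below assumes H0: p = 0.50 against the lower-tailed alternative p < 0.50, and it uses the reported 36% directly as the sample proportion. Both choices are assumptions about how the group would frame the test, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(p_hat, p0, n):
    """Return (z statistic, lower-tail p-value) for H0: p = p0."""
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    return z, NormalDist().cdf(z)


z_stat, p_value = one_proportion_z(p_hat=0.36, p0=0.50, n=48)
response_rate = 48 / 150  # share of staff who answered the survey

print(f"z = {z_stat:.2f}, lower-tail p-value = {p_value:.3f}")
print(f"Response rate: {response_rate:.0%}")
```

The low response rate is worth reporting alongside the test, since the test says nothing about whether the 48 respondents are representative of all 150 staff.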
In: Statistics and Probability
Explain the influence that the level of significance and the sample size have on hypothesis testing. Provide an example of the influence and explain how it impacts business decisions. In replies to peers, discuss whether you agree or disagree with the example provided and justify your response.
In: Statistics and Probability
Please use R
GPA ACT ITS RP
3.897 21 122 99
3.885 14 132 71
3.778 28 119 95
2.540 22 99 75
3.028 21 131 46
3.865 31 139 77
2.962 32 113 85
3.961 27 136 99
0.500 29 75 13
3.178 26 106 97
3.310 24 125 69
3.538 30 142 99
3.083 24 120 97
3.013 24 107 55
3.245 33 125 93
2.963 27 121 80
3.522 25 119 63
3.013 31 128 78
2.947 25 106 93
2.118 20 123 22
2.563 24 111 84
3.357 21 113 87
3.731 28 134 98
3.925 27 128 95
3.556 28 126 63
3.101 26 121 79
2.420 28 104 86
2.579 22 113 90
3.871 26 133 97
3.060 21 125 39
3.927 25 128 97
2.375 16 112 57
2.929 28 107 67
3.375 26 115 81
2.857 22 119 75
3.072 24 113 63
3.381 21 115 15
3.290 30 110 95
3.549 27 122 93
3.646 26 118 99
2.978 26 114 90
2.654 30 112 99
2.540 24 106 85
2.250 26 95 84
2.069 29 102 58
2.617 24 114 86
2.183 31 116 82
2.000 15 93 34
2.952 19 120 34
3.806 18 117 23
2.871 27 119 95
3.352 16 115 41
3.305 27 113 28
2.952 26 108 68
3.547 24 116 54
3.691 30 135 77
3.160 21 108 58
2.194 20 110 73
3.323 30 124 94
3.936 29 130 98
2.922 25 118 99
2.716 23 110 91
3.370 25 117 95
3.606 23 123 72
2.642 30 116 65
2.452 21 109 53
2.655 24 110 81
3.714 32 126 41
1.806 18 99 84
3.516 23 121 84
3.039 20 115 35
2.966 23 127 70
2.482 18 99 15
2.700 18 108 47
3.920 29 129 98
2.834 20 103 77
3.222 23 122 72
3.084 26 118 29
4.000 28 135 80
3.511 34 139 88
3.323 20 128 80
3.072 20 120 46
2.079 26 114 89
3.875 32 133 91
3.208 25 123 95
2.920 27 111 83
3.345 27 122 92
3.956 29 136 99
3.808 19 140 41
2.506 21 109 68
3.886 24 133 98
2.183 27 98 59
3.429 25 134 89
3.024 18 124 89
3.750 29 128 92
3.833 24 149 97
3.113 27 121 43
2.875 21 117 52
2.747 19 110 82
2.311 18 104 61
1.841 25 95 72
1.583 18 96 33
2.879 20 117 97
3.591 32 130 97
2.914 24 121 92
3.716 35 125 99
2.800 25 112 61
3.621 28 136 72
3.792 28 129 99
2.867 25 106 76
3.419 22 108 66
3.600 30 138 70
2.394 20 106 44
2.286 20 111 33
1.486 31 101 77
3.885 20 113 57
3.800 29 131 96
3.914 28 140 97
1.860 16 111 65
2.948 28 110 85
The director of admissions of a small college selected 120 students at random from the new freshman class in a study to determine whether a student’s grade point average (GPA) at the end of the freshman year (y) can be predicted from the ACT test score (x1). The results of the study can be found in the hmw6_prob1.txt file. (Note: The hmw6_prob1.txt file also includes data on other variables that will be used in later parts. For parts (a)-(c) use only GPA and ACT.)
(a) Fit a simple linear regression model relating y with x1.
(b) Plot the residuals ei against the fitted values ŷi. What departures from the regression model assumptions can be studied from this plot? What are your findings? (Note: If you are not sure about the validity of any of the assumptions, perform a formal test to verify your answer.)
(c) Prepare a normal probability plot (QQ plot) of the residuals. What assumption can be tested from this plot and what do you conclude? (Note: You can also use the formal test to reinforce your conclusion.)
(d) Information is given for each student on two variables not included in the model, namely, intelligence test score (ITS-x2) and high school class rank percentile (RP-x3). Plot the residuals you obtained in part (b) against x2 and x3 on separate graphs to ascertain whether the model can be improved by including either of these variables. What do you conclude? (Hint: The residuals represent any variability that was not able to be explained by x1. Therefore, if you see any pattern between the residuals and any other predictor omitted from the model, there is an indication that the predictor will be useful to be added in the model.)
Hint: To read the data in R, save the txt file in the working directory used by R. Then use the commands:
data = read.table("hmw6_prob1.txt", header = TRUE)
y = data$GPA
x1 = data$ACT
x2 = data$ITS
x3 = data$RP
In: Statistics and Probability
The quantity of dissolved oxygen is a measure of water pollution in lakes, rivers, and streams. Water samples were taken at four different locations in a river in an effort to determine if water pollution varied from location to location. Location I was 500 meters above an industrial plant water discharge point and near the shore. Location II was 200 meters above the discharge point and in midstream. Location III was 50 meters downstream from the discharge point and near the shore. Location IV was 200 meters downstream from the discharge point and in midstream. The following table shows the results. Lower dissolved oxygen readings mean more pollution. Because of the difficulty in getting midstream samples, ecology students collecting the data had fewer of these samples. Use a 5% level of significance. Do we reject or not reject the claim that the quantity of dissolved oxygen does not vary from one location to another?
Location I | Location II | Location III | Location IV |
---|---|---|---|
7.5 | 6.1 | 4.3 | 4.7 |
6.1 | 7.2 | 5.4 | 5.3 |
7.8 | 7.8 | 4.9 | 6.1 |
6.8 | 7.9 | 5.5 | |
6.5 | | 4.1 | |
(b) Find SSTOT, SSBET, and SSW and check that SSTOT = SSBET + SSW. (Use 3 decimal places.)
SSTOT | = | |
SSBET | = | |
SSW | = |
Find d.f.BET, d.f.W, MSBET, and MSW. (Use 3 decimal places for MSBET and MSW.)
dfBET | = | |
dfW | = | |
MSBET | = | |
MSW | = |
Find the value of the sample F statistic. (Use 3 decimal places.)
What are the degrees of freedom?
(numerator)=
(denominator)=
(f) Make a summary table for your ANOVA test.
Source of Variation | Sum of Squares | Degrees of Freedom | MS | F Ratio | P Value | Test Decision |
---|---|---|---|---|---|---|
Between groups | | | | | | |
Within groups | | | | | | |
Total | | | | | | |
P Value options: p-value > 0.100; 0.050 < p-value < 0.100; 0.025 < p-value < 0.050; 0.010 < p-value < 0.025; 0.001 < p-value < 0.010; p-value < 0.001.
Test Decision options: Do not reject H0; Reject H0.
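The sums of squares, mean squares, and F statistic for a one-way ANOVA can be sketched as below. One assumption is made when reading the table: the last row's second value (4.1) is taken to belong to Location III, since the question says the midstream locations (II and IV) have fewer samples. The function name is illustrative.

```python
dissolved_oxygen = {
    "I": [7.5, 6.1, 7.8, 6.8, 6.5],
    "II": [6.1, 7.2, 7.8, 7.9],
    "III": [4.3, 5.4, 4.9, 5.5, 4.1],  # assumes 4.1 belongs to Location III
    "IV": [4.7, 5.3, 6.1],
}


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares, and F."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ss_tot = sum((x - grand_mean) ** 2 for x in all_values)
    ss_bet = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_w = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_bet = len(groups) - 1
    df_w = n_total - len(groups)
    ms_bet = ss_bet / df_bet
    ms_w = ss_w / df_w
    return {
        "SSTOT": ss_tot, "SSBET": ss_bet, "SSW": ss_w,
        "dfBET": df_bet, "dfW": df_w,
        "MSBET": ms_bet, "MSW": ms_w, "F": ms_bet / ms_w,
    }


anova = one_way_anova(list(dissolved_oxygen.values()))
for name, value in anova.items():
    print(f"{name} = {value:.3f}" if isinstance(value, float) else f"{name} = {value}")
```

The p-value range would then come from comparing F against an F table with (dfBET, dfW) degrees of freedom.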
In: Statistics and Probability
The marketing manager of a firm that produces laundry products decides to test market a new laundry product in each of the firm's two sales regions. He wants to determine whether there will be a difference in mean sales per market per month between the two regions. A random sample of 12 supermarkets from Region 1 had mean sales of 77.7 with a standard deviation of 8.7. A random sample of 17 supermarkets from Region 2 had mean sales of 82.5 with a standard deviation of 6.8. Does the test marketing reveal a difference in potential mean sales per market in Region 2? Let μ1 be the mean sales per market in Region 1 and μ2 be the mean sales per market in Region 2. Use a significance level of α=0.05 for the test. Assume that the population variances are not equal and that the two populations are normally distributed.
Step 1 of 4:
State the null and alternative hypotheses for the test.
Step 2 of 4:
Compute the value of the t test statistic. Round your answer to three decimal places.
Step 3 of 4:
Determine the decision rule for rejecting the null hypothesis H0. Round your answer to three decimal places.
Step 4 of 4:
State the test's conclusion. (reject or fail to reject the null hypothesis)
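With unequal variances, Steps 2 and 3 are usually handled with the Welch t statistic and the Welch-Satterthwaite degrees of freedom. The following is a minimal sketch under that assumption; the function name is illustrative, and the course may use a different rule for the degrees of freedom, such as the smaller of n1 - 1 and n2 - 1.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    v1 = sd1 ** 2 / n1  # squared standard error contribution of sample 1
    v2 = sd2 ** 2 / n2  # squared standard error contribution of sample 2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t_stat, df = welch_t(77.7, 8.7, 12, 82.5, 6.8, 17)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

For a two-tailed test at α = 0.05, the decision rule compares |t| with the critical value from a t table at the (usually rounded-down) degrees of freedom.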
In: Statistics and Probability
(Question 8) 2 pts A process is normally distributed with a mean of 10.2 hits per minute and a standard deviation of 1.04 hits. If a randomly selected minute has 13.9 hits, would the process be considered in control or out of control?
Out of control as this one data point is more than three standard deviations from the mean
In control as only one data point would be outside the allowable range
In control as this one data point is not more than three standard deviations from the mean
Out of control as this one data point is more than two standard deviations from the mean
(Question 9) 2 pts The candy produced by a company has a sugar level that is normally distributed with a mean of 16.8 grams and a standard deviation of 0.9 grams. The company takes readings of every 10th bar off the production line. The reading points are 17.3, 14.9, 18.3, 16.5, 16.1, 17.4, 19.4. Is the process in control or out of control and why?
It is out of control as the values jump above and below the mean
It is in control as the data points more than 2 standard deviations from the mean are far apart
It is in control as none of these data points is more than 3 standard deviations from the mean
It is out of control as two of these data points are more than 2 standard deviations from the mean
(Question 10) 2 pts The toasters produced by a company have a normally distributed life span with a mean of 5.8 years and a standard deviation of 0.9 years. What warranty should be provided so that the company is replacing at most 5% of their toasters sold?
4.6 years
5.9 years
7.3 years
4.3 years
(Question 11) 2 pts A running shoe company wants to sponsor the fastest 5% of runners. You know that in this race, the running times are normally distributed with a mean of 7.2 minutes and a standard deviation of 0.56 minutes. How fast would you need to run to be sponsored by the company?
6.3 minutes
6.1 minutes
8.3 minutes
8.1 minutes
(Question 12) 2 pts The weights of bags of peas are normally distributed with a mean of 13.50 ounces and a standard deviation of 1.06 ounces. Bags in the upper 5% are too heavy and must be repackaged. What is the most that a bag can weigh and not need to be repackaged?
15.36 ounces
11.64 ounces
15.24 ounces
11.76 ounces
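Questions 8 through 12 all rest on z-scores and normal percentiles. The sketch below computes the quantities involved using Python's standard-library `NormalDist`. It does not decide which control rule the course applies in Questions 8 and 9; it only reports how many standard deviations each reading lies from the mean. Variable names are illustrative.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Question 8: a single reading of 13.9 hits.
z_hits = z_score(13.9, 10.2, 1.04)

# Question 9: z-scores of the seven candy readings.
candy_readings = [17.3, 14.9, 18.3, 16.5, 16.1, 17.4, 19.4]
candy_z = [z_score(x, 16.8, 0.9) for x in candy_readings]

# Question 10: life span below which only 5% of toasters fail (5th percentile).
warranty_years = NormalDist(5.8, 0.9).inv_cdf(0.05)

# Question 11: the fastest 5% have the lowest times (5th percentile of time).
sponsor_time = NormalDist(7.2, 0.56).inv_cdf(0.05)

# Question 12: heaviest bag not in the upper 5% (95th percentile).
max_bag_weight = NormalDist(13.50, 1.06).inv_cdf(0.95)

print(f"Q8 z = {z_hits:.2f}")
print("Q9 z-scores:", [round(z, 2) for z in candy_z])
print(f"Q10 warranty = {warranty_years:.1f} years")
print(f"Q11 time = {sponsor_time:.1f} minutes")
print(f"Q12 weight = {max_bag_weight:.2f} ounces")
```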
In: Statistics and Probability