A standard deck of 52 cards is shuffled and dealt. Let X1 be the number of cards appearing before the first ace, X2 the number of cards between the first and second ace (not counting either ace), X3 the number between the second and third ace, X4 the number between the third and fourth ace, and X5 the number after the last ace. It can be shown that each of these random variables Xi has the same distribution, i=1,2,...,5, and you can assume this to be true.
a) Write down a formula for P(Xi=k), 0≤k≤48
b) Show that E(Xi) = 9.6. (Hint: Don't use your answer to part a.)
c) Are X1,X2,...,X5 pairwise independent? Prove your answer.
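A quick numerical check of parts a) to c) can be sketched as follows. It assumes the formula P(X1 = k) = C(51 - k, 3) / C(52, 4), which comes from placing the first ace at position k + 1 and the other three aces among the 51 - k later positions; the function name `pmf` is only illustrative.

```python
from fractions import Fraction
from math import comb

def pmf(k: int) -> Fraction:
    """P(X1 = k): first ace at position k + 1, other 3 aces in the 51 - k later positions."""
    return Fraction(comb(51 - k, 3), comb(52, 4))

# The probabilities over k = 0..48 should sum to exactly 1.
total = sum(pmf(k) for k in range(49))

# Expected value computed directly from the pmf (the symmetry argument gives 48/5).
mean = sum(k * pmf(k) for k in range(49))

# Independence check: X1 = 48 and X2 = 48 cannot happen together (only 48 non-aces),
# so the joint probability is 0 while the product of the marginals is positive.
joint_48_48 = Fraction(0)
product_48_48 = pmf(48) * pmf(48)
independent_at_48 = joint_48_48 == product_48_48

print(total, float(mean), independent_at_48)
```

Exact fractions are used so that the check of E(Xi) is not affected by floating-point rounding.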
In: Math
Download the dataset CARS1 from BlackBoard.
a. Do not worry about outliers. Assume the data is correct and any outliers will remain in the dataset.
b. Make a scatterplot and analyze the results.
c. Test for correlation (correlation coefficient).
d. Regress weight (column 2) against gas mileage in the city (column 1). Make sure you make gas mileage the dependent (Y) variable.
e. Determine and fully explain R².
MPG City  Weight
19  3545
23  2795
23  2600
19  3515
23  3245
17  3930
20  3115
22  3235
17  3995
22  3115
23  3240
17  4020
18  3220
19  3175
20  3450
19  3225
17  3985
32  2440
29  2500
28  2290
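For parts c to e, the correlation coefficient, the least-squares line, and R² can be computed by hand from the sums of squares. The sketch below is one way to do that in plain Python, with city MPG as the dependent (Y) variable; the variable names are illustrative, not part of the assignment.

```python
from math import sqrt

# Data as listed in the question: city MPG (Y) and weight (X).
mpg = [19, 23, 23, 19, 23, 17, 20, 22, 17, 22,
       23, 17, 18, 19, 20, 19, 17, 32, 29, 28]
weight = [3545, 2795, 2600, 3515, 3245, 3930, 3115, 3235, 3995, 3115,
          3240, 4020, 3220, 3175, 3450, 3225, 3985, 2440, 2500, 2290]

n = len(mpg)
mean_x = sum(weight) / n
mean_y = sum(mpg) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in weight)
syy = sum((y - mean_y) ** 2 for y in mpg)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weight, mpg))

r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
slope = sxy / sxx                    # change in MPG per extra unit of weight
intercept = mean_y - slope * mean_x
r_squared = r ** 2                   # share of MPG variation explained by weight

print(f"r = {r:.4f}, MPG = {intercept:.4f} + {slope:.6f} * weight, R^2 = {r_squared:.4f}")
```

In simple linear regression R² equals r², so a single pass over the data answers parts c, d, and e together.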
In: Math
Engineers at the American Lighting Company recently developed a new three-way light bulb that they say is more energy efficient than the company’s existing three-way light bulb. They also claim that the bulb will outlast the current bulb, which has an average lifetime of 700 hours. The standard deviation (σ) for the lifetime of bulbs is 75 hours. The American Lighting Company has decided that before it begins full-scale production of the new light bulbs it should take a sample of 225 bulbs and determine whether the mean life of the new bulb exceeds the old bulb’s 700 hours. The sample of 225 bulbs gave a sample mean of 704 hours. Assuming a significance level of .05, perform all hypothesis testing steps. Does the sample support the claim that the average lifetime of the new bulb is longer?
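Since σ is given, this looks like a one-sided z-test with H0: μ = 700 against H1: μ > 700. A minimal sketch of the computation, assuming that setup:

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 700, 75, 225, 704, 0.05

# Test statistic for a known-sigma, right-tailed test of the mean.
z = (xbar - mu0) / (sigma / sqrt(n))

# Right-tail critical value and p-value from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)

reject_h0 = z > z_crit
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The decision rule is the usual one: reject H0 only if z exceeds the critical value (equivalently, if the p-value is below α).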
In: Math
In the healthy handwashing survey, it was found that 64% of adult Americans operate the flusher of toilets in public restrooms with their foot. Suppose you survey a random sample of 740 adult American women aged 18-24 years. Use the normal approximation to the binomial to approximate the probability of the following.
a) Check the conditions for the normal distribution approximation.
c) Determine the probability that exactly 500 of those surveyed flush toilets in public restrooms with their foot.
d) Determine the probability that no more than 490 of those surveyed flush toilets in public restrooms with their foot.
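The parts above can be sketched with a continuity correction as follows. The code assumes p = 0.64 applies to the sampled group and uses the np ≥ 10 and n(1 - p) ≥ 10 rule of thumb; some textbooks use 5 instead of 10.

```python
from math import sqrt
from statistics import NormalDist

n, p = 740, 0.64
mean = n * p
sd = sqrt(n * p * (1 - p))

# a) Rule-of-thumb check that the normal approximation is reasonable.
conditions_ok = n * p >= 10 and n * (1 - p) >= 10

approx = NormalDist(mean, sd)

# c) P(X = 500) is approximated by the area between 499.5 and 500.5.
p_exactly_500 = approx.cdf(500.5) - approx.cdf(499.5)

# d) P(X <= 490) is approximated by the area to the left of 490.5.
p_at_most_490 = approx.cdf(490.5)

print(conditions_ok, round(p_exactly_500, 4), round(p_at_most_490, 4))
```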
In: Math
Poisson distribution
To control the polishing quality of its lenses, a certain company counts the number of spots on the surface and considers a lens defective if 3 or more spots appear on it. The average rate is 2 defects per cm². Calculate the probability that a 4 cm² lens will not be classified as defective.
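Reading the question as "a lens is defective when it has 3 or more spots", a 4 cm² lens has a Poisson count with mean 2 × 4 = 8, and "not defective" means at most 2 spots. A short sketch under that reading:

```python
from math import exp, factorial

rate_per_cm2 = 2
area_cm2 = 4
lam = rate_per_cm2 * area_cm2  # expected number of spots on the whole lens

def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** k / factorial(k)

# Not defective: 0, 1, or 2 spots.
p_not_defective = sum(poisson_pmf(k, lam) for k in range(3))
print(round(p_not_defective, 4))
```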
In: Math
The health of the bear population in a park is monitored by periodic measurements taken from anesthetized bears. A sample of the weights of such bears is given below. Find a 95% confidence interval estimate of the mean of the population of all such bear weights. The 95% confidence interval for the mean bear weight is the following.
Data table: 80, 344, 416, 348, 166, 220, 262, 360, 204, 144, 332, 34, 140, 180
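With σ unknown and n = 14, a t-interval with 13 degrees of freedom is the usual choice. The sketch below hard-codes the critical value t = 2.160 from a standard t table (an assumption, since the Python standard library has no t distribution).

```python
from math import sqrt
from statistics import mean, stdev

weights = [80, 344, 416, 348, 166, 220, 262, 360, 204, 144, 332, 34, 140, 180]

n = len(weights)
xbar = mean(weights)
s = stdev(weights)   # sample standard deviation (divides by n - 1)

t_crit = 2.160       # table value for 95% confidence, df = 13 (assumed, not computed)
margin = t_crit * s / sqrt(n)
ci = (xbar - margin, xbar + margin)

print(f"mean = {xbar:.1f}, s = {s:.1f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```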
In: Math
We are attempting to see if we can justify the
additional expense of premium fuel over economy; i.e., we will only
use premium fuel if we are convinced that it actually helps mpg.
For each of the 25 cars at our disposal, we randomly picked one
fuel to use first, drove the car until nearly empty, and calculated
mpg when refilling the tank. We then filled it with the other fuel,
repeating the process, obtaining mpg for that fuel, driving over
the same roads.
Explain why it would be important to randomize the
order in which we test two gasoline types. Give a specific example
of how not randomizing might cause a problem with this
design.
In: Math
PART 1: Determining the Appropriate Test
Assume that the following three questions appeared on a survey that is being used to collect data on consumer behavior for your company.
Question 1: Do you subscribe to Netflix? (Circle One) YES NO
Question 2: What is your monthly average income in dollars? $__________
Question 3: In which of the 5 U.S. regions do you reside? (Circle one)
Northeast Southwest West Midwest Southeast
PART 2: Analysis of a Chi-Square test
Assume that a research study was conducted that included the following survey questions:
Question 1: Have you ever attended an event at the city Performing Arts Center? YES NO
Question 2: Have you ever attended an event at the city Athletic Center? YES NO
A sample of 93 people answered the survey questions. The research team utilized Minitab statistical software to create the results shown below. You will find a contingency table with the Chi-Square test statistic and p-value at the bottom.

CHI-SQUARE TEST FOR ASSOCIATION: PERFORMING ARTS CENTER ATTENDANCE, ATHLETIC CENTER ATTENDANCE
Rows: Performing Arts Center Attendance  Columns: Athletic Center Attendance
       NO      YES     All
NO     15      1       16
       9.634   6.366
YES    41      36      77
       46.366  30.634
All    56      37      93
Cell Contents: Count
               Expected Count
            Chi-Square  DF  P-Value
Pearson     9.072       1   0.003
Review the study and the Minitab results. Then answer the following questions:
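The Minitab figures can be reproduced from the four observed counts. The sketch below recomputes the expected counts and the Pearson statistic; for the p-value it relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so the tail area can be written with `math.erfc`.

```python
from math import erfc, sqrt

# Observed counts: rows = Performing Arts (NO, YES), columns = Athletic Center (NO, YES).
observed = [[15, 1],
            [41, 36]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# df = 1, so P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi2 / 2))

print([[round(e, 3) for e in row] for row in expected], round(chi2, 3), round(p_value, 3))
```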
In: Math
The next two questions (7 and 8) refer to the following:
The weight of bags of organic fertilizer is normally distributed with a mean of 60 pounds and a standard deviation of 2.5 pounds.
7. What is the probability that a random sample of 33 bags of organic fertilizer has a total weight between 1963.5 and 1996.5 pounds?
8. If we take a random sample of 9 bags of organic fertilizer, there is a 75% chance that their mean weight will be less than what value? Keep 4 decimal places in intermediate calculations and report your final answer to 4 decimal places.
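Questions 7 and 8 both follow from the sampling distributions of a sum and a mean of independent normal weights. A sketch using `statistics.NormalDist`, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 60, 2.5

# Question 7: the total of 33 bags is normal with mean 33*mu and sd sigma*sqrt(33).
total_33 = NormalDist(33 * mu, sigma * sqrt(33))
p_total = total_33.cdf(1996.5) - total_33.cdf(1963.5)

# Question 8: the mean of 9 bags is normal with mean mu and sd sigma/sqrt(9);
# the requested value is its 75th percentile.
mean_9 = NormalDist(mu, sigma / sqrt(9))
cutoff = mean_9.inv_cdf(0.75)

print(round(p_total, 4), round(cutoff, 4))
```

Answers based on a printed z table may differ slightly in the last decimal places because the table rounds z.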
The next two questions (10 and 11) refer to the following:
Suppose that 40% of students at a university drive to campus.
10. If we randomly select 100 students from this university, what is the approximate probability that less than 35% of them drive to campus?
Keep 6 decimal places in intermediate calculations and report your final answer to 4 decimal places.
11. If we randomly select 100 students from this university, what is the approximate probability that more than 50 of them drive to campus?
Keep 6 decimal places in intermediate calculations and report your final answer to 4 decimal places.
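For questions 10 and 11 the sample proportion is approximately normal with mean 0.40 and standard deviation sqrt(0.4 × 0.6 / 100). The sketch below assumes no continuity correction, which is how questions phrased in terms of the sample proportion are usually answered; applying one would change the results slightly.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.40, 100
p_hat = NormalDist(p, sqrt(p * (1 - p) / n))  # approximate distribution of the sample proportion

# Question 10: P(sample proportion < 0.35)
p_less_35pct = p_hat.cdf(0.35)

# Question 11: more than 50 of 100 students is the same event as proportion > 0.50.
p_more_50 = 1 - p_hat.cdf(0.50)

print(round(p_less_35pct, 4), round(p_more_50, 4))
```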
12. Suppose that IQs of adult Canadians follow a normal distribution with standard deviation 15. A random sample of 30 adult Canadians has a mean IQ of 112.
We would like to construct a 97% confidence interval for the true mean IQ of all adult Canadians. What is the critical value z* to be used in the interval? (You do not need to calculate the confidence interval. Simply find z*. Input a positive number since we always use the positive z* value when calculating confidence intervals.)
Report your answer to 2 decimal places.
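For a 97% interval, 3% of the area is split between the two tails, so z* is the 98.5th percentile of the standard normal. A one-line check:

```python
from statistics import NormalDist

confidence = 0.97
tail = (1 - confidence) / 2            # area in each tail
z_star = NormalDist().inv_cdf(1 - tail)  # 98.5th percentile of the standard normal

print(round(z_star, 2))
```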
In: Math
Components of a certain type are shipped to a supplier in batches of ten. Suppose that 51% of all such batches contain no defective components, 33% contain one defective component, and 16% contain two defective components. Two components from a batch are randomly selected and tested. What are the probabilities associated with 0, 1, and 2 defective components being in the batch under each of the following conditions? (Round your answers to four decimal places.)
(a) Neither tested component is defective.
no defective components :
one defective component :
two defective components :
(b) One of the two tested components is defective. [Hint: Draw a tree diagram with three first-generation branches for the three different types of batches.]
no defective components :
one defective component :
two defective components :
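The tree diagram in the hint amounts to Bayes' rule with hypergeometric likelihoods (two components drawn without replacement from a batch of ten). A sketch of that calculation, with hypothetical helper names `likelihood` and `posterior`:

```python
from math import comb

# Prior probability that a batch contains 0, 1, or 2 defective components.
priors = {0: 0.51, 1: 0.33, 2: 0.16}

def likelihood(defects_in_batch: int, defects_found: int,
               batch_size: int = 10, tested: int = 2) -> float:
    """Hypergeometric probability of finding `defects_found` defectives among the tested items."""
    return (comb(defects_in_batch, defects_found)
            * comb(batch_size - defects_in_batch, tested - defects_found)
            / comb(batch_size, tested))

def posterior(defects_found: int) -> dict:
    """P(batch has d defectives | test result) for d = 0, 1, 2."""
    joint = {d: priors[d] * likelihood(d, defects_found) for d in priors}
    total = sum(joint.values())
    return {d: joint[d] / total for d in joint}

part_a = posterior(0)  # neither tested component is defective
part_b = posterior(1)  # exactly one tested component is defective

print({d: round(v, 4) for d, v in part_a.items()})
print({d: round(v, 4) for d, v in part_b.items()})
```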
In: Math
The FBI wants to determine the effectiveness of their 10 Most Wanted list. To do so, they need to find out the fraction of people who appear on the list that are actually caught.
Suppose a sample of 362 suspected criminals is drawn. Of these people, 119 were captured. Using the data, construct the 90% confidence interval for the population proportion of people who are captured after appearing on the 10 Most Wanted list. Round your answers to three decimal places.
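The interval asked for here is the standard large-sample (Wald) interval for a proportion. A minimal sketch, assuming that method is the one intended:

```python
from math import sqrt
from statistics import NormalDist

captured, n, confidence = 119, 362, 0.90

p_hat = captured / n
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.645 for 90% confidence
margin = z * sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - margin, p_hat + margin
print(round(lower, 3), round(upper, 3))
```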
In: Math
*Repeated Measures Analysis of Variance*
Examining differences between groups on one or more variables /
same participants being tested more than once / with more than two
groups.
What test and method would be used to examine the difference between male and female users across the different variables (Pain Reliever, Sedative, Tranquilizer, and Stimulant)?
Create a graph illustration.
Describe the Graph.
TABLE 1.22A: Misuse by age, 2016 and 2017  
Age  Misuse_2016  Misuse_2017  
12  66  55  
13  90  105  
14  160  127  
15  253  234  
16  322  295  
17  426  415  
18  537  466  
19  631  503  
20  692  671  
21  700  661  
22  659  728  
23  581  660  
24  648  681  
25  577  585  
AGE  PR2016  PR2017  TR2016  TR2017  STIM2016  STIM2017  SED2016  SED2017 
12  49  40  12  6  6  7  5  74 
13  78  78  8  23  11  23  8  55 
14  111  84  37  48  47  38  15  15 
15  192  152  92  69  74  83  19  12 
16  196  188  122  132  96  98  25  18 
17  255  226  162  181  193  202  28  18 
18  259  233  232  184  254  229  21  17 
19  272  236  271  209  313  259  40  25 
20  303  304  255  252  431  352  22  14 
21  341  317  226  228  376  397  42  35 
22  301  353  221  282  355  407  16  22 
23  281  334  234  245  284  323  37  18 
24  369  365  214  278  302  316  43  44 
25  327  318  193  202  263  264  34  25 
Misuse of Prescription Drugs, Gender, Age  
Table 1.53A PAIN RELIEVERS (DEMOGRAPHICS)  
Gender  12-17 (2016)  12-17 (2017)  18-25 (2016)  18-25 (2017)  Total  
Male  413  342  1328  1263  3346  
Female  469  425  1126  1197  3217  
Table 1.57A TRANQUILIZERS (DEMOGRAPHICS)  
Gender  12-17 (2016)  12-17 (2017)  18-25 (2016)  18-25 (2017)  Total  
Male  203  227  914  1004  2348  
Female  231  231  930  877  2269  
Table 1.60A STIMULANTS (DEMOGRAPHICS)  
Gender  12-17 (2016)  12-17 (2017)  18-25 (2016)  18-25 (2017)  Total  
Male  243  238  1377  1474  3332  
Female  184  214  1201  1071  2670  
Table 1.63A SEDATIVES (DEMOGRAPHICS)  
Gender  12-17 (2016)  12-17 (2017)  18-25 (2016)  18-25 (2017)  Total  
Male  39  41  114  105  299  
Female  61  32  141  94  328  
In: Math
Workers in several industries were surveyed to determine the proportion of workers who feel their industry is understaffed. In the government sector, 37% of the respondents said they were understaffed, in the health care sector 33% said they were understaffed and in the education sector 28% said they were understaffed (USA Today, January 11, 2010). Suppose that 200 workers were surveyed in each industry.
b) Assuming the same sample size will be used in each industry, how large would the sample need to be to ensure that the margin of error is 5% or less for each of the three confidence intervals? Perform the calculation using an appropriate pilot study proportion as well as a worst case scenario.
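The required sample size follows from n = z² p(1 - p) / E². The sketch below assumes 95% confidence, since part b) does not state a level, and takes 0.37 as the pilot proportion because it is the sample proportion closest to 0.5 and so gives the largest n among the three sectors.

```python
from math import ceil
from statistics import NormalDist

confidence = 0.95   # assumed; the confidence level is not given in part b)
margin = 0.05
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def required_n(p: float) -> int:
    """Smallest n with margin of error z*sqrt(p(1-p)/n) no larger than `margin`."""
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_pilot = required_n(0.37)       # pilot proportion from the government sector
n_worst_case = required_n(0.50)  # p = 0.5 maximizes p(1 - p)

print(n_pilot, n_worst_case)
```

Rounding up with `ceil` matters here: rounding to the nearest integer could leave the margin of error slightly above 5%.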
In: Math
Determine and interpret the linear correlation coefficient, and use linear regression to find a best fit line for a scatter plot of the data and make predictions.
Scenario
According to the U.S. Geological Survey (USGS), the probability of a magnitude 6.7 or greater earthquake in the Greater Bay Area is 63%, about 2 out of 3, in the next 30 years. In April 2008, scientists and engineers released a new earthquake forecast for the State of California called the Uniform California Earthquake Rupture Forecast (UCERF). As a junior analyst at the USGS, you are tasked to determine whether there is sufficient evidence to support the claim of a linear correlation between the magnitudes and depths from the earthquakes. Your deliverables will be a PowerPoint presentation you will create summarizing your findings and an Excel document to show your work.
Concepts Being Studied
• Correlation and regression
• Creating scatterplots
• Constructing and interpreting a Hypothesis Test for Correlation using r as the test statistic
You are given a spreadsheet that contains the following information:
• Magnitude measured on the Richter scale
• Depth in km
Using the spreadsheet, you will answer the problems below in a PowerPoint presentation.
What to Submit
The PowerPoint presentation should answer and explain the following questions based on the spreadsheet provided above.
• Slide 1: Title slide
• Slide 2: Introduce your scenario and data set including the variables provided.
• Slide 3: Construct a scatterplot of the two variables provided in the spreadsheet. Include a description of what you see in the scatterplot.
• Slide 4: Find the value of the linear correlation coefficient r and the critical value of r using α = 0.05. Include an explanation on how you found those values.
• Slide 5: Determine whether there is sufficient evidence to support the claim of a linear correlation between the magnitudes and the depths from the earthquakes. Explain.
• Slide 6: Find the regression equation. Let the predictor (x) variable be the magnitude. Identify the slope and the y-intercept within your regression equation.
• Slide 7: Is the equation a good model? Explain. What would be the best predicted depth of an earthquake with a magnitude of 2.0? Include the correct units.
• Slide 8: Conclude by recapping your ideas by summarizing the information presented in context of the scenario.
Along with your PowerPoint presentation, you should include your Excel document which shows all calculations.
MAG  DEPTH 
0.70  7.2 
0.74  2.2 
0.64  13.9 
0.39  15.5 
0.70  3.0 
2.20  2.4 
1.98  14.4 
0.64  5.7 
1.22  6.1 
0.20  9.1 
1.64  17.2 
1.32  8.7 
2.95  9.3 
0.90  12.3 
1.76  7.4 
1.01  7.0 
1.26  17.1 
0.00  8.8 
0.65  6.0 
1.46  19.1 
1.62  12.7 
1.83  4.7 
0.99  8.6 
1.56  6.0 
0.40  14.6 
1.28  4.9 
0.83  19.1 
1.34  9.9 
0.54  16.1 
1.25  4.6 
0.92  4.9 
1.00  16.1 
0.79  14.0 
0.79  4.2 
1.44  5.9 
1.00  5.4 
2.24  15.6 
2.50  7.7 
1.79  15.4 
1.25  16.4 
1.49  4.9 
0.84  8.1 
1.42  7.5 
1.00  14.1 
1.25  11.1 
1.42  16.0 
1.35  5.5 
0.93  7.3 
0.40  3.1 
1.39  6.0 
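The calculations behind Slides 4 to 7 can be sketched in plain Python as a cross-check on the Excel work. The critical value 0.279 for n = 50 and α = 0.05 is taken from a standard table of critical values of r and is an assumption here, not something computed by the code.

```python
from math import sqrt

# (magnitude, depth in km) pairs from the table above.
data = [
    (0.70, 7.2), (0.74, 2.2), (0.64, 13.9), (0.39, 15.5), (0.70, 3.0),
    (2.20, 2.4), (1.98, 14.4), (0.64, 5.7), (1.22, 6.1), (0.20, 9.1),
    (1.64, 17.2), (1.32, 8.7), (2.95, 9.3), (0.90, 12.3), (1.76, 7.4),
    (1.01, 7.0), (1.26, 17.1), (0.00, 8.8), (0.65, 6.0), (1.46, 19.1),
    (1.62, 12.7), (1.83, 4.7), (0.99, 8.6), (1.56, 6.0), (0.40, 14.6),
    (1.28, 4.9), (0.83, 19.1), (1.34, 9.9), (0.54, 16.1), (1.25, 4.6),
    (0.92, 4.9), (1.00, 16.1), (0.79, 14.0), (0.79, 4.2), (1.44, 5.9),
    (1.00, 5.4), (2.24, 15.6), (2.50, 7.7), (1.79, 15.4), (1.25, 16.4),
    (1.49, 4.9), (0.84, 8.1), (1.42, 7.5), (1.00, 14.1), (1.25, 11.1),
    (1.42, 16.0), (1.35, 5.5), (0.93, 7.3), (0.40, 3.1), (1.39, 6.0),
]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

sxx = sum((x - mean_x) ** 2 for x, _ in data)
syy = sum((y - mean_y) ** 2 for _, y in data)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)

r = sxy / sqrt(sxx * syy)            # linear correlation coefficient
slope = sxy / sxx                    # km of depth per unit of magnitude
intercept = mean_y - slope * mean_x

r_critical = 0.279                   # table value for n = 50, alpha = 0.05 (assumed)
significant = abs(r) > r_critical

# Common textbook convention: use the regression line for prediction only when
# the correlation is significant; otherwise use the mean depth.
predicted_depth = intercept + slope * 2.0 if significant else mean_y

print(f"r = {r:.3f}, depth = {intercept:.3f} + {slope:.3f} * magnitude")
print(f"significant: {significant}, best predicted depth at magnitude 2.0: {predicted_depth:.1f} km")
```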
In: Math
The table below summarizes baseline characteristics of patients participating in a clinical trial.
a) Are there any statistically significant differences in baseline characteristics between treatment groups? Justify your answer.
b) Write the hypotheses and the test statistic used to compare ages between groups. (No calculations – just H0, H1 and form of the test statistic.)
c) Write the hypotheses and the test statistic used to compare % females between groups. (No calculations – just H0, H1 and form of the test statistic.)
d) Write the hypotheses and the test statistic used to compare % females between groups. (No calculations – just H0, H1 and form of the test statistic.)
Characteristic  Placebo (n = 125)  Experimental (n = 125)  P
Mean (± SD) Age  54 ± 4.5  53 ± 4.9  0.7856
% Female  39%  52%  0.0289
% Less than High School Education  24%  22%  0.0986
% Completing High School  37%  36%
% Completing Some College  39%  42%
Mean (± SD) Systolic Blood Pressure  136 ± 13.8  134 ± 12.4  0.4736
Mean (± SD) Total Cholesterol  214 ± 24.9  210 ± 23.1  0.8954
% Current Smokers  17%  15%  0.5741
% with Diabetes  8%  3%  0.0438
In: Math