Questions
Suppose the probability mass function of a random variable X is given by f(x) = C(x−1, r−1) p^r (1−p)^(x−r), if x = r, r+1, r+2, ...

Suppose the probability mass function of a random variable X is given by

\[
f(x) =
\begin{cases}
\binom{x-1}{r-1}\, p^{r} (1-p)^{x-r}, & \text{if } x = r,\ r+1,\ r+2,\ \dots \\
0, & \text{otherwise}
\end{cases}
\]

If this is the case, then we say X is distributed as a Negative Binomial random variable with parameters r and p, and we write X ∼ NegBin(r, p). (Math 241 Quiz 3, August 2019)

(a) If we set r = 1, what distribution do we get?

(b) Explain what this random variable models and justify the formula. (Hint: See Section 4.8.2 in Ross.)

(c) For this random variable, what is E[X] and Var[X]? There is no need to prove your answer or show any work for this part. (Hint: See Section 4.8.2 in Ross.)

(d) In tossing a fair die repeatedly (and independently on successive tosses), find the probability of getting the third “1” on the xth toss. (Hint: Let X denote the number of tosses required until we get our third “1”, or equivalently our third success. Then X is what kind of random variable?)

(e) In tossing a fair die repeatedly (and independently on successive tosses), find the probability of getting the third “1” on the fifth toss.

(f) What is the average number of trials it will take to get our third “1”? (Hint: Use the results of part (c) for your solution.)
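As a rough numerical check on parts (e) and (f), the pmf above can be evaluated directly. This is only a sketch: the function name `negbin_pmf` is made up for illustration, and it assumes the standard result E[X] = r/p for the "number of trials until the r-th success" form of the distribution.

```python
from fractions import Fraction
from math import comb

def negbin_pmf(x, r, p):
    """P(X = x): the r-th success occurs exactly on trial x."""
    if x < r:
        return Fraction(0)
    # comb(x-1, r-1) ways to place the first r-1 successes among the first x-1 trials
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

p = Fraction(1, 6)                 # chance of rolling a "1" on a fair die
prob_fifth = negbin_pmf(5, 3, p)   # third "1" on the fifth toss
mean_tosses = 3 / p                # E[X] = r / p (assumed standard result)

print(prob_fifth, float(prob_fifth), mean_tosses)
```

Exact fractions are used so that the probability can be compared against a hand calculation without rounding noise.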

In: Math

Problem 11-13 The Martin-Beck Company operates a plant in St. Louis with an annual capacity of...

Problem 11-13

The Martin-Beck Company operates a plant in St. Louis with an annual capacity of 30,000 units. Product is shipped to regional distribution centers located in Boston, Atlanta, and Houston. Because of an anticipated increase in demand, Martin-Beck plans to increase capacity by constructing a new plant in one or more of the following cities: Detroit, Toledo, Denver, or Kansas City. The estimated annual fixed cost and the annual capacity for the four proposed plants are as follows:

Proposed Plant    Annual Fixed Cost    Annual Capacity
Detroit           $175,000             10,000
Toledo            $300,000             20,000
Denver            $375,000             30,000
Kansas City       $500,000             40,000

The company's long-range planning group developed forecasts of the anticipated annual demand at the distribution centers as follows:

Distribution Center    Annual Demand
Boston                 30,000
Atlanta                20,000
Houston                20,000

The shipping cost per unit from each plant to each distribution center is shown in table below.

A network representation of the potential Martin-Beck supply chain is shown in figure below.

Each potential plant location is shown; capacities and demands are shown in thousands of units. This network representation is for a transportation problem with a plant at St. Louis and at all four proposed sites. However, the decision has not yet been made as to which new plant or plants will be constructed.

  a. Formulate a model that could be used for choosing the best plant locations and for determining how much to ship from each plant to each distribution center. There is a policy restriction that a plant must be located either in Detroit or in Toledo, but not both. For those boxes in which you must enter subtractive or negative numbers use a minus sign. (Example: -300)

Let

y1 = 1 if a plant is constructed in Detroit; 0 if not

y2 = 1 if a plant is constructed in Toledo; 0 if not

y3 = 1 if a plant is constructed in Denver; 0 if not

y4 = 1 if a plant is constructed in Kansas City; 0 if not

xij = the units shipped in thousands from plant i to distribution center j

i= 1,2,3,4,5, and j = 1,2,3

Min ___ x11 + ___ x12 + ___ x13 + ___ x21 + ___ x22 + ___ x23 + ___ x31 + ___ x32 + ___ x33 + ___ x41 + ___ x42 + ___ x43 + ___ x51 + ___ x52 + ___ x53 + ___ y1 + ___ y2 + ___ y3 + ___ y4
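The shipping-cost table is not reproduced in the text, so the full mixed-integer model cannot be written out here. What can be sketched from the data given is the binary part of the decision: enumerate the plant choices y1..y4, keep those that satisfy the "Detroit or Toledo, but not both" restriction, and check that total capacity (St. Louis plus new plants) covers total demand. The function name `feasible_plans` is hypothetical, and only fixed costs are compared, not shipping costs.

```python
from itertools import product

# Data from the problem statement: (fixed cost in $1000s, capacity in 1000s of units).
plants = {
    "Detroit": (175, 10),
    "Toledo": (300, 20),
    "Denver": (375, 30),
    "Kansas City": (500, 40),
}
ST_LOUIS_CAPACITY = 30          # existing plant, thousands of units
TOTAL_DEMAND = 30 + 20 + 20     # Boston + Atlanta + Houston, thousands of units

def feasible_plans():
    """Return (fixed cost, plants built) for every choice meeting both checks."""
    names = list(plants)
    plans = []
    for ys in product((0, 1), repeat=len(names)):
        y = dict(zip(names, ys))
        # Policy restriction: exactly one of Detroit and Toledo (y1 + y2 = 1).
        if y["Detroit"] + y["Toledo"] != 1:
            continue
        built = tuple(n for n in names if y[n])
        capacity = ST_LOUIS_CAPACITY + sum(plants[n][1] for n in built)
        if capacity < TOTAL_DEMAND:
            continue
        plans.append((sum(plants[n][0] for n in built), built))
    return sorted(plans)

plans = feasible_plans()
for fixed_cost, built in plans:
    print(fixed_cost, built)
```

In a spreadsheet or solver model the same restriction would appear as the constraint y1 + y2 = 1, with each plant's shipments bounded by its capacity times its y variable.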

  b. Formulate a model that could be used for choosing the best plant locations and for determining how much to ship from each plant to each distribution center. There is a policy restriction that no more than two plants can be located in Denver, Kansas City, and St. Louis. For those boxes in which you must enter subtractive or negative numbers use a minus sign. (Example: -300)

Let

y1 = 1 if a plant is constructed in Detroit; 0 if not

y2 = 1 if a plant is constructed in Toledo; 0 if not

y3 = 1 if a plant is constructed in Denver; 0 if not

y4 = 1 if a plant is constructed in Kansas City; 0 if not

xij = the units shipped in thousands from plant i to distribution center j

i= 1,2,3,4,5, and j = 1,2,3

Min ___ x11 + ___ x12 + ___ x13 + ___ x21 + ___ x22 + ___ x23 + ___ x31 + ___ x32 + ___ x33 + ___ x41 + ___ x42 + ___ x43 + ___ x51 + ___ x52 + ___ x53 + ___ y1 + ___ y2 + ___ y3 + ___ y4

Please show how to solve parts a and b using Excel

In: Math

2. Make a data frame consisting of 20 rows and 10 columns. Each column j should consist...

2. Make a data frame consisting of 20 rows and 10 columns. Each column j should consist of 20 values from a normal distribution with mean (j-1) and standard deviation 0.5j. For example, the third column should be normal(mean=2, sd=1.5). Using this data frame, do each of the following (using code, of course):
a. Find the mean and standard deviation for each column.
b. Write code that counts the number of columns for which the sample mean and sample standard deviation are within 20% of the values used to generate the data.
c. Write code that writes the columns from part b to a new data frame.
d. For each value in the new data frame, subtract its column mean and divide by the column standard deviation.

Solution using r and python
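One possible Python sketch of steps a to d follows. It uses only the standard library, with a dict of column lists standing in for a data frame (an assumption made to keep the example dependency-free; the same steps map onto a pandas DataFrame). The fixed seed and the names `targets`, `frame`, and `within_20_percent` are illustrative choices, not part of the question. Note that column 1 has target mean 0, so "within 20%" of that mean can only hold if the sample mean is exactly 0.

```python
import random
import statistics

random.seed(0)  # assumption: fixed seed so the sketch is reproducible

N_ROWS, N_COLS = 20, 10
# Column j (1-based) is generated with mean j - 1 and standard deviation 0.5 * j.
targets = {f"col{j}": (j - 1, 0.5 * j) for j in range(1, N_COLS + 1)}
frame = {name: [random.gauss(mu, sd) for _ in range(N_ROWS)]
         for name, (mu, sd) in targets.items()}

# (a) sample mean and sample standard deviation of each column
summary = {name: (statistics.mean(col), statistics.stdev(col))
           for name, col in frame.items()}

def within_20_percent(estimate, target):
    """True if the estimate is within 20% of the generating value."""
    return abs(estimate - target) <= 0.2 * abs(target)

# (b) columns where both the mean and the sd are within 20% of their targets
kept_names = [name for name, (m, s) in summary.items()
              if within_20_percent(m, targets[name][0])
              and within_20_percent(s, targets[name][1])]
count = len(kept_names)

# (c) copy those columns to a new frame
kept = {name: list(frame[name]) for name in kept_names}

# (d) subtract the column mean and divide by the column standard deviation
standardized = {name: [(v - summary[name][0]) / summary[name][1] for v in col]
                for name, col in kept.items()}

print(count, kept_names)
```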

In: Math

The age distribution of the Canadian population and the age distribution of a random sample of...

The age distribution of the Canadian population and the age distribution of a random sample of 455 residents in the Indian community of a village are shown below.

Age (years)     Percent of Canadian Population    Observed Number in the Village
Under 5         7.2%                              44
5 to 14         13.6%                             75
15 to 64        67.1%                             289
65 and older    12.1%                             47

Use a 5% level of significance to test the claim that the age distribution of the general Canadian population fits the age distribution of the residents of Red Lake Village.

(a) What is the level of significance?
.05

State the null and alternate hypotheses.

H0: The distributions are the same.
H1: The distributions are the same.

H0: The distributions are different.
H1: The distributions are the same.    

H0: The distributions are the same.
H1: The distributions are different.

H0: The distributions are different.
H1: The distributions are different.


(b) Find the value of the chi-square statistic for the sample. (Round your answer to three decimal places.)


Are all the expected frequencies greater than 5?

  


What sampling distribution will you use?

Student's t

uniform

   normal

binomial

chi-square


What are the degrees of freedom?


(c) Estimate the P-value of the sample test statistic.

P-value > 0.100

0.050 < P-value < 0.100

0.025 < P-value < 0.050

0.010 < P-value < 0.025

0.005 < P-value < 0.010

P-value < 0.005


(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis that the population fits the specified distribution of categories?

Since the P-value > α, we fail to reject the null hypothesis.

Since the P-value > α, we reject the null hypothesis.

    Since the P-value ≤ α, we reject the null hypothesis.

Since the P-value ≤ α, we fail to reject the null hypothesis.


(e) Interpret your conclusion in the context of the application.

At the 5% level of significance, the evidence is insufficient to conclude that the village population does not fit the general Canadian population.

At the 5% level of significance, the evidence is sufficient to conclude that the village population does not fit the general Canadian population.    
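For part (b), the goodness-of-fit statistic is the sum of (O − E)²/E over the four age groups, with E equal to the sample size times the population percentage. A minimal sketch with the table values follows; the critical value 7.815 (chi-square, 3 degrees of freedom, upper 5% tail) is taken from a standard table rather than computed, since the standard library has no chi-square distribution.

```python
# Population proportions and observed village counts from the table.
proportions = {"Under 5": 0.072, "5 to 14": 0.136, "15 to 64": 0.671, "65 and older": 0.121}
observed = {"Under 5": 44, "5 to 14": 75, "15 to 64": 289, "65 and older": 47}

n = sum(observed.values())                          # sample size
expected = {k: n * p for k, p in proportions.items()}

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
dof = len(observed) - 1                             # categories minus one

CRITICAL_05_DF3 = 7.815  # table value for alpha = 0.05, 3 degrees of freedom
print(round(chi2, 3), dof, chi2 > CRITICAL_05_DF3, min(expected.values()) > 5)
```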

In: Math

Two plots at Rothamsted Experimental Station were studied for production of wheat straw. For a random...

Two plots at Rothamsted Experimental Station were studied for production of wheat straw. For a random sample of years, the annual wheat straw production (in pounds) from one plot was as follows.

5.70 5.91 5.70 6.68 7.31 7.18
7.06 5.79 6.24 5.91 6.14

Use a calculator to verify that, for this plot, the sample variance is s² ≈ 0.382.

Another random sample of years for a second plot gave the following annual wheat production (in pounds).

7.24 5.91 6.82 8.01 7.22 5.58 5.47 5.86

Use a calculator to verify that the sample variance for this plot is s² ≈ 0.873.

Test the claim that there is a difference (either way) in the population variance of wheat straw production for these two plots. Use a 5% level of significance.

(a) What is the level of significance?
.05
State the null and alternate hypotheses.

H0: σ₁² = σ₂²; H1: σ₁² > σ₂²

H0: σ₁² > σ₂²; H1: σ₁² = σ₂²

H0: σ₂² = σ₁²; H1: σ₂² > σ₁²

H0: σ₁² = σ₂²; H1: σ₁² ≠ σ₂²



(b) Find the value of the sample F statistic. (Use 2 decimal places.)


What are the degrees of freedom?

dfN
dfD

What assumptions are you making about the original distribution?

The populations follow dependent normal distributions. We have random samples from each population.

The populations follow independent normal distributions. We have random samples from each population.    

The populations follow independent chi-square distributions. We have random samples from each population.

The populations follow independent normal distributions.


(c) Find or estimate the P-value of the sample test statistic. (Use 4 decimal places.)

p-value > 0.200

0.100 < p-value < 0.200

0.050 < p-value < 0.100

0.020 < p-value < 0.050

0.002 < p-value < 0.020

p-value < 0.002


(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?

At the α = 0.05 level, we reject the null hypothesis and conclude the data are not statistically significant.

At the α = 0.05 level, we reject the null hypothesis and conclude the data are statistically significant.    

At the α = 0.05 level, we fail to reject the null hypothesis and conclude the data are not statistically significant.

At the α = 0.05 level, we fail to reject the null hypothesis and conclude the data are statistically significant.


(e) Interpret your conclusion in the context of the application.

Fail to reject the null hypothesis, there is sufficient evidence that the variance in annual wheat production differs between the two plots.

Reject the null hypothesis, there is insufficient evidence that the variance in annual wheat production differs between the two plots.    

Reject the null hypothesis, there is sufficient evidence that the variance in annual wheat production differs between the two plots.

Fail to reject the null hypothesis, there is insufficient evidence that the variance in annual wheat production differs between the two plots.
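The two sample variances and the F ratio in part (b) can be checked with a few lines of code. The sketch below assumes the common textbook convention of putting the larger sample variance in the numerator; the variable names are illustrative.

```python
from statistics import variance

plot1 = [5.70, 5.91, 5.70, 6.68, 7.31, 7.18, 7.06, 5.79, 6.24, 5.91, 6.14]
plot2 = [7.24, 5.91, 6.82, 8.01, 7.22, 5.58, 5.47, 5.86]

s1_sq = variance(plot1)   # sample variance, divisor n - 1
s2_sq = variance(plot2)

# Assumed convention: the larger variance goes in the numerator.
(num_var, num_n), (den_var, den_n) = sorted(
    [(s1_sq, len(plot1)), (s2_sq, len(plot2))], reverse=True
)
F = num_var / den_var
df_n, df_d = num_n - 1, den_n - 1   # numerator and denominator degrees of freedom

print(round(s1_sq, 3), round(s2_sq, 3), round(F, 2), df_n, df_d)
```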

In: Math

A quality survey asked recent customers of their experience at a local department store. One...

A quality survey asked recent customers about their experience at a local department store. One question asked for the customer's rating of the service using categorical responses of average, outstanding, and exceptional. Another question asked for the customer's education level with categorical responses of Some HS, HS Grad, Some College, and College Grad. The sample data below are for 700 customers who recently visited the department store.

Quality Rating    Some HS    HS Grad    Some College    College Grad
Average           55         80         50              35
Outstanding       60         105        65              70
Exceptional       35         65         35              45

Using a level of significance of 0.01, is there evidence to suggest that the customer's Education level and Quality Rating are independent? In other words, is there a relationship or is there NO relationship between Education and Quality Rating?

a. State the null and alternative hypotheses.
b. What is the statistic you would use to analyze this?
c. State your decision rule.
d. Show your calculation.
e. What is your conclusion? Quality Rating and Education level are independent OR Quality Rating and Education level are NOT independent.
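The calculation in part d follows the usual chi-square test of independence: each expected count is (row total × column total) / grand total, and the degrees of freedom are (rows − 1)(columns − 1). The sketch below works from the counts as read from the question (rows are quality ratings, columns are education levels); the critical value 16.812 is the table value for 6 degrees of freedom at the 0.01 level.

```python
# Rows: Average, Outstanding, Exceptional.
# Columns: Some HS, HS Grad, Some College, College Grad.
table = [
    [55, 80, 50, 35],
    [60, 105, 65, 70],
    [35, 65, 35, 45],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / grand_total  # expected count under independence
        chi2 += (obs - exp) ** 2 / exp

dof = (len(table) - 1) * (len(table[0]) - 1)
CRITICAL_01_DF6 = 16.812  # table value for alpha = 0.01, 6 degrees of freedom

print(round(chi2, 2), dof, chi2 > CRITICAL_01_DF6)
```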

In: Math

A high school is examining whether or not a certain college admissions test prep course is...

A high school is examining whether or not a certain college admissions test prep course is helpful. To evaluate this, 15 students took the college admissions test. Afterwards, they went through the prep course and then took the admissions test again. Their before and after scores are shown below. With a significance level of 0.90, is the admissions test prep course effective?

Student    Before    After
1          27        29
2          28        29
3          30        31
4          32        31
5          16        20
6          25        27
7          27        27
8          25        26
9          27        30
10         23        28
11         25        26
12         24        24
13         22        25
14         31        32
15         25        25
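Because the same 15 students are measured before and after the course, one reasonable approach is a paired t-test on the differences (after minus before). The sketch below computes only the test statistic and degrees of freedom; turning it into a decision needs a t table or a statistics package, and the choice of a paired test is an assumption based on the study design.

```python
from math import sqrt
from statistics import mean, stdev

before = [27, 28, 30, 32, 16, 25, 27, 25, 27, 23, 25, 24, 22, 31, 25]
after = [29, 29, 31, 31, 20, 27, 27, 26, 30, 28, 26, 24, 25, 32, 25]

diffs = [a - b for a, b in zip(after, before)]   # positive means improvement
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)                               # sample sd of the differences

t_stat = d_bar / (s_d / sqrt(n))                 # paired t statistic
dof = n - 1

print(round(d_bar, 3), round(s_d, 3), round(t_stat, 3), dof)
```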

In: Math

The U.S. Department of Transportation, National Highway Traffic Safety Administration, reported that 77% of all fatally...

The U.S. Department of Transportation, National Highway Traffic Safety Administration, reported that 77% of all fatally injured automobile drivers were intoxicated. A random sample of 51 records of automobile driver fatalities in Kit Carson County, Colorado, showed that 37 involved an intoxicated driver. Do these data indicate that the population proportion of driver fatalities related to alcohol is less than 77% in Kit Carson County? Use α = 0.10.

(a) What is the level of significance?

State the null and alternate hypotheses. Will you use a left-tailed, right-tailed, or two-tailed test?

Ho: p = 0.77; H1: p > 0.77; right-tailed

Ho: p = 0.77; H1: p < 0.77; left-tailed

Ho: p = 0.77; H1: p ≠ 0.77; two-tailed

Ho: p < 0.77; H1: p = 0.77; left-tailed

(b) What sampling distribution will you use? Do you think the sample size is sufficiently large?

The normal distribution, since the sample size is large.

The t distribution, since the sample size is large.

What is the value of the sample test statistic? (Use 2 decimal places.)

(c) Find the P-value of the test statistic. (Use 4 decimal places.)

Sketch the sampling distribution and show the area corresponding to the P-value.

(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α?

At the α = 0.10 level, we reject the null hypothesis and conclude the data are not statistically significant.

At the α = 0.10 level, we fail to reject the null hypothesis and conclude the data are not statistically significant.

At the α = 0.10 level, we reject the null hypothesis and conclude the data are statistically significant.

At the α = 0.10 level, we fail to reject the null hypothesis and conclude the data are statistically significant.

(e) State your conclusion in the context of the application.

Reject the null hypothesis, there is sufficient evidence that the true proportion of driver fatalities related to alcohol is less than 0.77 in Kit Carson County.

Fail to reject the null hypothesis, there is insufficient evidence that the true proportion of driver fatalities related to alcohol is less than 0.77 in Kit Carson County.

Fail to reject the null hypothesis, there is sufficient evidence that the true proportion of driver fatalities related to alcohol is less than 0.77 in Kit Carson County.

Reject the null hypothesis, there is insufficient evidence that the true proportion of driver fatalities related to alcohol is less than 0.77 in Kit Carson County.
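For parts (b) and (c), the test statistic for a single proportion is z = (p̂ − p0) / sqrt(p0(1 − p0)/n), and the left-tailed P-value is the standard normal area below z. A minimal sketch using the numbers in the question:

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.77        # hypothesized proportion
n, x = 51, 37    # sample size and number of intoxicated-driver fatalities
alpha = 0.10

p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # one-proportion z statistic
p_value = NormalDist().cdf(z)                # left-tailed area below z

# Usual large-sample check: n*p0 and n*(1 - p0) should both exceed 5.
large_enough = n * p0 > 5 and n * (1 - p0) > 5

print(round(z, 2), round(p_value, 4), large_enough, p_value <= alpha)
```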

In: Math

You are part of a team investigating motor vehicle accidents. A multiple regression model...

You are part of a team investigating motor vehicle accidents. A multiple regression model is to be constructed to predict the number of motor vehicle accidents in a town per year based upon the population of the town, the number of recorded traffic offenses per year and the average annual temperature in the town.

Data has been collected on 30 randomly selected towns:

Accidents per year    Population (× 1000)    Recorded traffic offences (× 100)    Average temperature (°F)
355 181 29 78
490 257 56 82
597 441 34 81
475 50 95 81
922 495 102 82
736 38 165 81
305 167 25 84
1,128 378 191 78
745 369 86 76
476 237 63 84
143 100 4 84
203 118 21 79
909 489 106 78
410 210 39 77
642 138 131 81
847 308 138 82
604 418 40 77
719 194 132 78
350 319 8 84
327 70 61 76
1,038 259 192 78
756 299 115 81
635 440 40 79
796 283 131 85
301 64 56 81
135 26 26 79
639 31 150 81
325 210 13 77
441 43 98 79
522 370 26 82

a)Find the multiple regression equation using all three explanatory variables. Assume that X1 is population, X2 is number of recorded traffic offenses per year and X3 is average annual temperature. Give your answers to 3 decimal places.

y^ =  + population + no. traffic offences + average temp

b)At a level of significance of 0.05, the result of the F test for this model is that the null hypothesis (is / is not) rejected.

For parts c) and d), using the data, separately calculate the correlations between the response variable and each of the three explanatory variables.

c)The explanatory variable that is most correlated with number of motor vehicle accidents per year is:

population
number of traffic offenses
average annual temperature

d)The explanatory variable that is least correlated with number of motor vehicle accidents per year is:

population
number of traffic offenses
average annual temperature

e)The value of R² for this model, to 2 decimal places, is equal to

f)The value of s_e for this model, to 3 decimal places, is equal to

g)Construct a new multiple regression model by removing the variable average annual temperature. Give your answers to 3 decimal places.

The new regression model equation is:

y^ =  + population + no. traffic offences

h)In the new model compared to the previous one, the value of R² (to 2 decimal places) is:

increased
decreased
unchanged

i)In the new model compared to the previous one, the value of s_e (to 3 decimal places) is:

increased
decreased
unchanged
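Parts c) and d) ask for the correlation between the response and each explanatory variable separately. The sketch below does that with a hand-written Pearson correlation over the 30 rows; the helper name `pearson` and the dictionary keys are illustrative. It does not fit the regression itself, which is more conveniently done in a spreadsheet or a statistics package.

```python
from math import sqrt

# (accidents, population x1000, offences x100, temperature F), one tuple per town
rows = [
    (355, 181, 29, 78), (490, 257, 56, 82), (597, 441, 34, 81), (475, 50, 95, 81),
    (922, 495, 102, 82), (736, 38, 165, 81), (305, 167, 25, 84), (1128, 378, 191, 78),
    (745, 369, 86, 76), (476, 237, 63, 84), (143, 100, 4, 84), (203, 118, 21, 79),
    (909, 489, 106, 78), (410, 210, 39, 77), (642, 138, 131, 81), (847, 308, 138, 82),
    (604, 418, 40, 77), (719, 194, 132, 78), (350, 319, 8, 84), (327, 70, 61, 76),
    (1038, 259, 192, 78), (756, 299, 115, 81), (635, 440, 40, 79), (796, 283, 131, 85),
    (301, 64, 56, 81), (135, 26, 26, 79), (639, 31, 150, 81), (325, 210, 13, 77),
    (441, 43, 98, 79), (522, 370, 26, 82),
]

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

accidents = [r[0] for r in rows]
r = {
    "population": pearson([row[1] for row in rows], accidents),
    "offences": pearson([row[2] for row in rows], accidents),
    "temperature": pearson([row[3] for row in rows], accidents),
}

most = max(r, key=lambda k: abs(r[k]))    # strongest correlation with accidents
least = min(r, key=lambda k: abs(r[k]))   # weakest correlation with accidents
print({k: round(v, 3) for k, v in r.items()}, most, least)
```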

In: Math

5). Finally, which of the following would you use to write out the results in an...

5). Finally, which of the following would you use to write out the results in an APA formatted results section? Note that this one is tricky – some answer options differ in only a single number or word! Pay close attention to details here.

A. We ran a One Way ANOVA with condition (FITD vs. DITF vs. Control) as our dependent variable and willingness to participate in the 30 minute study as our independent variable. The One Way ANOVA was significant, F(2, 87) = 4.05, p < .05. Tukey post hoc tests showed that participants were significantly less willing to participate in the 30 minute study in the control condition (M = 6.63, SD = 1.30) than in both the FITD condition (M = 7.40, SD = 1.00) and the DITF condition (M = 7.37, SD = 1.22), though the DITF and FITD conditions did not differ from each other.

B. We ran a One Way ANOVA with condition (FITD vs. DITF vs. Control) as our independent variable and willingness to participate in the 30 minute study as our dependent variable. The One Way ANOVA was significant, F(2, 87) = 4.05, p < .05. Tukey post hoc tests showed that participants were significantly less willing to participate in the 30 minute study in the control condition (M = 6.63, SD = 1.30) than in both the FITD condition (M = 7.40, SD = 1.00) and the DITF condition (M = 7.37, SD = 1.22), though the DITF and FITD conditions did not differ from each other.

C. We ran a One Way ANOVA with condition (FITD vs. DITF vs. Control) as our independent variable and willingness to participate in the 30 minute study as our dependent variable. The One Way ANOVA was significant, F(2, 87) = 4.05, p < .001. Tukey post hoc tests showed that participants were significantly less willing to participate in the 30 minute study in the control condition (M = 6.63, SD = 1.30) than in both the FITD condition (M = 7.40, SD = 1.00) and the DITF condition (M = 7.37, SD = 1.22), though the DITF and FITD conditions did not differ from each other.

D. We ran a One Way ANOVA with condition (FITD vs. DITF vs. Control) as our independent variable and willingness to participate in the 30 minute study as our dependent variable. The One Way ANOVA was not significant, F(2, 89) = 4.05, p > .05. Since p was greater than .05, there was no need to conduct post hoc tests.

E. We ran a One Way ANOVA with condition (FITD vs. DITF vs. Control) as our independent variable and willingness to participate in the 30 minute study as our dependent variable. The One Way ANOVA was significant, F(2, 87) = 4.05, p < .05. Tukey post hoc tests showed that participants were significantly less willing to do the study in the control condition (M = 6.63, SD = 1.30) than in both the FITD condition (M = 7.40, SD = 1.00) and the DITF condition (M = 7.37, SD = 1.22). In addition, those in the DITF condition were significantly less willing to participate in the 30 minute study than those in the FITD condition.

In: Math

A coin will be tossed 5 times. The chance of getting exactly 2 heads among 5...

A coin will be tossed 5 times.

The chance of getting exactly 2 heads among 5 tosses is %.

The chance of getting exactly 4 heads among 5 tosses is %.

A coin will now be tossed 10 times.

The chance of getting exactly 2 heads in the first five tosses and exactly 4 heads in the last 5 tosses is BLANK %.

All answers must have three numbers following the decimal.

I just need the answer to the bolded one (where I put "BLANK"); I provided the whole question in case it is needed.
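Assuming the coin is fair (the question does not say so explicitly), each count of heads in 5 tosses is binomial, and the first five and last five tosses are independent, so the requested probability is the product of the two 5-toss probabilities. A short sketch:

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """P(exactly k heads in n tosses); p = 0.5 assumes a fair coin."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_two = binom_pmf(2, 5)     # exactly 2 heads among 5 tosses
p_four = binom_pmf(4, 5)    # exactly 4 heads among 5 tosses
p_both = p_two * p_four     # independent blocks of 5 tosses, so multiply

for label, prob in [("2 of 5", p_two), ("4 of 5", p_four), ("both", p_both)]:
    print(f"{label}: {100 * prob:.3f}%")
```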

In: Math

Banking fees have received much attention during the recent economic recession as banks look for ways...

Banking fees have received much attention during the recent economic recession as banks look for ways to recover from the crisis. A sample of 32 customers paid an average fee of $12.33 per month on their interest-bearing checking accounts. Assume the population standard deviation is $1.75. Complete parts a and b below.

a. Construct a 95% confidence interval to estimate the average fee for the population. The 95% confidence interval has a lower limit of $___ and an upper limit of $___. (Round to the nearest cent as needed.)

b. What is the margin of error for this interval? $___ (Round to the nearest cent as needed.)
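With a known population standard deviation, the interval is x̄ ± z·σ/√n, where z is the standard normal quantile for the chosen confidence level. A minimal sketch with the numbers from the question:

```python
from math import sqrt
from statistics import NormalDist

n = 32
x_bar = 12.33      # sample mean fee in dollars
sigma = 1.75       # population standard deviation in dollars
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
margin = z * sigma / sqrt(n)                         # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"margin = {margin:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```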

In: Math

9) (CH. 9-2) Is there a difference between the average NBA Championship Final game winning scores...

9) (CH. 9-2) Is there a difference between the average NBA Championship Final game winning scores of the 1970’s versus the average of the winning scores of the 2000’s? Use a 0.01 significance level to test the claim that there is a difference.

1970’s    2000’s
97        99
105       131
109       83
87        95
96        81
102       100
102       88
114       113
118       108
113       116

Use the data from problem #9 to construct a 99% confidence interval estimate for the mean of the differences. Does this interval contain zero? Do the results of this problem support the results of problem #9?

  • What is the Confidence Interval Estimate:

  • Does the interval indicate a difference or not? Explain your answer.

  • Do the results of the hypothesis test (from problem #9) and the confidence interval estimate support each other?
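As a starting point for problem #9, the sketch below treats the two decades as independent samples and computes the unequal-variance (Welch) t statistic for the difference in means. That treatment is an assumption: the follow-up wording "mean of the differences" could instead point to a paired analysis. The table is read row by row, with the first value in each row belonging to the 1970's. Only the statistic and standard error are computed; the critical value for the 0.01 level has to come from a t table.

```python
from math import sqrt
from statistics import mean, variance

scores_1970s = [97, 105, 109, 87, 96, 102, 102, 114, 118, 113]
scores_2000s = [99, 131, 83, 95, 81, 100, 88, 113, 108, 116]

m1, m2 = mean(scores_1970s), mean(scores_2000s)
v1, v2 = variance(scores_1970s), variance(scores_2000s)
n1, n2 = len(scores_1970s), len(scores_2000s)

# Standard error of the difference in means without pooling the variances
se = sqrt(v1 / n1 + v2 / n2)
t_stat = (m1 - m2) / se

print(round(m1 - m2, 2), round(se, 3), round(t_stat, 3))
```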

In: Math

A study of undergraduate computer science students examined changes in major after the first year. The...

A study of undergraduate computer science students examined changes in major after the first year. The study examined the fates of 256 students who enrolled as first-year students in the same fall semester. The students were classified according to gender and their declared major at the beginning of the second year. The students studied were enrolled at a large Midwestern university several years ago. Discuss how you would conduct a similar study at a college or university of your choice today. Include a description of all the variables that you would collect for your study.

In: Math

How is variation within categories and between categories relevant to ANOVA? What is dfw? dfb? In...

How is variation within categories and between categories relevant to ANOVA? What is dfw? dfb?

In ANOVA, if the null is false, the means of the sample should be very (different? Similar? zero?) and the standard deviation of the different samples should be (very large? Low?)

ANOVA proceeds by developing two separate estimates of what?

The population variance is a measure of what? What assumptions are required for ANOVA? When will ANOVA tolerate some violation of model assumptions?
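To make the between/within distinction concrete, here is a small sketch on made-up data (the three groups are purely hypothetical). It computes dfb = k − 1 and dfw = N − k, then the two variance estimates, mean square between and mean square within, whose ratio is the F statistic.

```python
from statistics import mean

# Hypothetical data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

k = len(groups)                       # number of categories
N = sum(len(g) for g in groups)       # total number of observations
grand_mean = mean(v for g in groups for v in g)

# Between-group sum of squares: spread of group means around the grand mean
ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of observations around their own group mean
ssw = sum((v - mean(g)) ** 2 for g in groups for v in g)

dfb, dfw = k - 1, N - k
msb, msw = ssb / dfb, ssw / dfw       # the two estimates of the population variance
F = msb / msw

print(dfb, dfw, round(msb, 3), round(msw, 3), round(F, 3))
```

When the group means are far apart relative to the spread inside each group, the between estimate is much larger than the within estimate and F is large.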

In: Math