Statistics and Probability Homework Answers | Statistics and Probability Homework Help

Questions

In a 2008 survey, people were asked their opinions on astrology - whether it was very...

In a 2008 survey, people were asked their opinions on astrology: whether it was very scientific, somewhat scientific, or not at all scientific. Of the 1436 who responded, 71 said astrology was very scientific.

a. Find the proportion of people in the survey who believe astrology is very scientific. Answer ______ (Round to four decimal places as needed)

b. Find a 95% confidence interval for the population proportion with this belief. Answer (____, and _____)

c. Suppose a TV news anchor said that 5% of people in the general population think astrology is very scientific. Would you say that is plausible? Which one is correct?

Choose the correct answer below (A, B, C, or D). Answer ______

A. This is not plausible because 5% is outside the interval.

B. This is not plausible because 5% is inside the interval.

C. This is plausible because 5% is inside the interval.

D. This is plausible because 5% is outside the interval.
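A minimal sketch of parts (a) to (c), assuming the usual large-sample (Wald) interval p̂ ± z·sqrt(p̂(1 − p̂)/n) is the intended method. The helper name `proportion_ci` is illustrative and does not come from the question.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Wald (normal-approximation) interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Survey figures from the question: 71 of 1436 respondents.
p_hat, lower, upper = proportion_ci(71, 1436)
print(f"sample proportion = {p_hat:.4f}")
print(f"95% CI = ({lower:.4f}, {upper:.4f})")
# A claimed population value is plausible if it falls inside the interval.
print("5% inside the interval:", lower <= 0.05 <= upper)
```

Other interval methods (Wilson, Agresti-Coull) would give slightly different endpoints, so the rounding in the final digit may differ from a textbook answer key.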

In: Statistics and Probability

A statistics professor is at a supermarket waiting in line to buy some groceries. While waiting...

A statistics professor is at a supermarket waiting in line to buy some groceries. While waiting for the line to move, he listens to several people complaining about the long delays. From the different conversations going on, he quickly gathers some information and estimates from the large sample that the mean waiting time is about 12 minutes. He then estimates the population standard deviation to be 1.5 minutes.

1. (a) Explain in detail using full sentences how he should go about finding a 90% confidence interval for the time he has to wait in line to be served. Enumerate each step.

2. Let’s assume that from the question in part (1) it is determined that its margin of error is approximately 3.5 minutes. (a) What does this mean exactly for a 90% confidence interval?

3. Let’s assume now that the professor finished his computation and determined that the 90% confidence interval will be within (8.7 – 15.3) minutes. (a) If he now wants a 95% confidence interval, will the range be bigger or smaller than (8.7 – 15.3) minutes? Explain.

4. It is normal to think that more is better. In the question from part (3) (a) Is the 95% confidence interval better than the 90% interval? Explain. (b) If that is the case why do we not use a 100% confidence interval? Explain.

In: Statistics and Probability

Provide an appropriate answer for each of the mean confidence interval problems. 1) Construct a 94%...

Provide an appropriate answer for each of the mean confidence interval problems.

1) Construct a 94% confidence interval for the population mean, μ. Assume the population has a normal distribution. A sample of 40 part-time workers had mean annual earnings of $3120 with a standard deviation of $677. Round to the nearest dollar. (Show work)

2) In order to set rates, an insurance company is trying to estimate the number of sick days that full time workers at a local bank take per year. Based on earlier studies it is known that the standard deviation is 12.3 days per year. How large a sample must be selected if the company wants to be 95% confident that their estimate is within 3 days of the true mean?

Provide an appropriate answer for each of the proportion confidence interval problems.

3) A survey of 500 non-fatal accidents showed that 122 involved uninsured drivers. Construct a 96% confidence interval for the proportion of fatal accidents that involved uninsured drivers. (Show work)
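For the sample-size question in problem 2, one common approach is n = (z·σ/E)², rounded up to the next whole number. The sketch below assumes that formula is the intended one; the function name `sample_size_for_mean` is illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin (known sigma)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would miss the target margin.
    return ceil((z * sigma / margin) ** 2)


# Problem 2: sigma = 12.3 days, desired margin = 3 days, 95% confidence.
n_required = sample_size_for_mean(sigma=12.3, margin=3, confidence=0.95)
print("required sample size:", n_required)
```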

In: Statistics and Probability

Cotinine level(ng/ml) was measured in the meconium of newborns of mothers who were active, passive or...

Cotinine level (ng/ml) was measured in the meconium of newborns of mothers who were active smokers, passive smokers, or nonsmokers. The mothers were consecutive women arriving for delivery at one hospital. The alkaloid cotinine is the main metabolite of nicotine. With a half-life of around 20 hours and detectable for several days after exposure, it is a biomarker for exposure to tobacco smoke.

Cotinine level (ng/ml)

Active Smokers (490, 418, 405, 328, 700, 292, 295, 272, 240, 232)

Passive Smokers (254, 219, 287, 257, 271, 282, 148, 273, 350, 293)

Nonsmokers (158, 163, 153, 207, 211, 159, 199, 187, 200, 213)

1. Create your own Reditol file to store, analyze, and graph this data set as called for in the questions below. Save this R program file as smoking Elba.R if your name is Elba; otherwise use your own name. I should be able to execute your code to produce the answers and the graph you submitted for this problem.

Descriptive statistics for Cotinine level for these 3 smoking groups. Round-off to appropriate levels.

Smoking group	N	Mean	SD	95% CI for mean (by t-distribution)
Active Smokers	10	______	______	______
Passive Smokers	10	______	______	______
Nonsmokers	10	______	______	______

2. Perform one-way ANOVA. Report the p-value to full resolution; round off the other statistics to appropriate levels.

source df SS MS F P

Age group 2 ______ _______ _______ _____

Unexplained 27 ______ ________ ______ ______

3. Perform Kruskal-Wallis test.

P= ______

4. R-square : Among groups= _______%

5. Cohen's D= ___________( standardized effect size)

6. Using plotmeans() in the gplots package, construct a publication-quality graph of the means with their 95% confidence intervals calculated using the t-distribution.

7. In several sentences suitable for a scientific journal, express the results of the one-way ANOVA and associated analysis, including insight from effect sizes, CIs, multiple comparisons, and graphical visualization.

In: Statistics and Probability

In a section of an English 201 class, the professor decides to be “generous” with the...

In a section of an English 201 class, the professor decides to be “generous” with the students and will grade the next exam in a unique way. Grades will be assigned according to the following rule: The top 10% receive A’s, the next 20% receive B’s, the middle 40% receive C’s, the next 20% receive D’s, and the bottom 10% receive F’s. Some may refer to this type of grading as “curving” which gave rise to the phrase, “Professor, do you curve the grades?”

1. (a) Where did the term “curving” come from? (b) Which curve is this referring to? (c) What do you think is the purpose of grading exams on a “curve”?

2. Students usually like it when professors grade on a curve, even though it is likely they do not understand what that involves. (a) Who do you think benefits from grading exams on a curve? Do you think all students will like this method? Explain.

3. Do you think this method of grading is fair to all students? Under what circumstances would some students NOT like this method of grading? Explain.

4. Under pressure from the students to grade an exam on a curve, a Math Professor proposes the following curving method: The top 5% receive A’s, the next 10% receive B’s, the middle 30% receive C’s, the next 35% receive D’s, and the bottom 20% receive F’s. (a) What is the difference between this curving method and the one specified at the outset? (b) Do you think the students will accept this “curving” approach? Explain.

In: Statistics and Probability

15 17 15 18 13 13 15 18 17 11 (i) Use a calculator with sample...

15 17 15 18 13 13 15 18 17 11

(i) Use a calculator with sample mean and standard deviation keys to find x̄ and s. (Round your answers to two decimal places.)

x̄ = ______

s = ______
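As a check on the calculator keys, here is a short sketch using Python's standard library. It assumes s is the sample standard deviation (divisor n − 1), which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev

data = [15, 17, 15, 18, 13, 13, 15, 18, 17, 11]

x_bar = mean(data)   # sample mean
s = stdev(data)      # sample standard deviation (divides by n - 1)

print(f"x_bar = {x_bar:.2f}")
print(f"s = {s:.2f}")
```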

In: Statistics and Probability

5. Let n = 60, not a product of distinct prime numbers. Let Bn= the set...

5. Let n = 60, which is not a product of distinct prime numbers. Let Bn = the set of all positive divisors of n. Define addition and multiplication to be lcm and gcd, respectively. Now show that Bn cannot form a Boolean algebra under those two operators.
Hint: Find the 0 and 1 elements first. Now find an element of Bn whose complement cannot be found to satisfy both equalities, no matter how we define the complement operator.
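The hint can be explored by brute force. The sketch below is not a proof; it only searches the divisors of 60 for elements with no complement, assuming that a complement y of x must satisfy lcm(x, y) = 60 (the 1 element) and gcd(x, y) = 1 (the 0 element). The names `complements` and `no_complement` are illustrative.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple, playing the role of 'addition'."""
    return a * b // gcd(a, b)


n = 60
divisors = [d for d in range(1, n + 1) if n % d == 0]

# In this structure the 0 element is 1 (identity for lcm)
# and the 1 element is n (identity for gcd).


def complements(x):
    """All y in Bn with lcm(x, y) == n and gcd(x, y) == 1."""
    return [y for y in divisors if lcm(x, y) == n and gcd(x, y) == 1]


no_complement = [x for x in divisors if not complements(x)]
print("divisors:", divisors)
print("elements with no complement:", no_complement)
```

Any element the search reports can then be used in the written argument; the failure comes from the repeated prime factor 2 in 60 = 2² · 3 · 5.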

In: Statistics and Probability

A retail company has started a new advertising campaign in order to increase sales. In the...

A retail company has started a new advertising campaign in order to increase sales. In the past, the mean spending in both the 18–35 and 35+ age groups was at most $70.00.

a. Formulate a hypothesis test to determine if the mean spending has statistically increased to more than $70.00.

b. After the new advertising campaign was launched, a marketing study found that the sample mean spending for 400 respondents in the 18–35 age group was $73.65, with a sample standard deviation of $56.60. Is there sufficient evidence to conclude that the advertising strategy significantly increased sales in this age group with significance level of 5%?

c. For 600 respondents in the 35+ age group, the sample mean and sample standard deviation were $73.42 and $45.44, respectively. Is there sufficient evidence to conclude that the advertising strategy significantly increased sales in this age group with significance level of 5%?
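A sketch of parts (b) and (c), assuming H0: μ ≤ 70 against Ha: μ > 70 and a large-sample z approximation (with n = 400 and n = 600 the t and z results are nearly identical). The helper `upper_tail_z_test` is an illustrative name, not part of the question.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean, sample_sd, n, mu0):
    """Test statistic and one-sided p-value for Ha: mu > mu0."""
    z = (sample_mean - mu0) / (sample_sd / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


alpha = 0.05
# Part (b): 18-35 age group.
z_b, p_b = upper_tail_z_test(73.65, 56.60, 400, 70.0)
# Part (c): 35+ age group.
z_c, p_c = upper_tail_z_test(73.42, 45.44, 600, 70.0)

for label, z, p in [("18-35", z_b, p_b), ("35+", z_c, p_c)]:
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{label}: z = {z:.2f}, p = {p:.4f}, {decision}")
```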

In: Statistics and Probability

In an outpatient clinic, a nurse practitioner observes a high prevalence of obesity in young female...

In an outpatient clinic, a nurse practitioner observes a high prevalence of obesity in young female patients. The nurse practitioner further observes that there appears to be an association between the economic status of the patient and their obesity. The nurse practitioner reviews the related literature but does not find a study that directly relates obesity and economic status. The nurse practitioner decides to conduct a study to determine if a relationship exists between obesity in younger females, ages 15 to 18 years, and their economic status. The nurse practitioner elicits help from four outpatient clinics and receives data on the Body Mass Index (BMI) for all female patients from age 15 to 18 years. The nurse practitioner also obtains information on whether each patient's family is above or below the federal poverty line. The nurse practitioner uses a bivariate correlation between BMI and the dichotomous variable of economic status. The results did not reflect a statistically significant relationship.

What kind of study is this according to the methodology used? Justify your choice.

Write a research hypothesis/hypotheses or question(s) for this study and identify the variables in the study.

Critique the internal and external validity of the study as it is designed.

Critique the data analyses used in the study.

If you were going to conduct this study, how would you redesign it to enhance its validity and at the same time make it reasonable to conduct in an outpatient clinical setting? In your redesign, be sure to address the appropriateness of the tests used to measure the outcomes of the study and the statistical procedure(s) that you would use to analyze the data.

In: Statistics and Probability

The overhead reach distances of adult females are normally distributed with a mean of 205 cm...

The overhead reach distances of adult females are normally distributed with a mean of 205 cm and a standard deviation of 8 cm.

a. Find the probability that an individual distance is greater than 217.50 cm.

b. Find the probability that the mean for randomly selected distances is greater than 203.50 cm.

c. Why can the normal distribution be used in part (b), even though the sample size does not exceed 30?

a. The probability is 0.0591. (Round to four decimal places as needed.)

b. The probability is ..........

An engineer is going to redesign an ejection seat for an airplane. The seat was designed for pilots weighing between 130 lb and 171 lb. The new population of pilots has normally distributed weights with a mean of 135 lb and a standard deviation of 30.1 lb.

a. If a pilot is randomly selected, find the probability that his weight is between 130 lb and 171 lb. The probability is approximately ....... (Round to four decimal places as needed.)
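Both single-observation probabilities above can be read off the normal CDF. The sketch assumes the threshold in the reach question is 217.50 cm, which is consistent with the stated answer of 0.0591. Part (b) of the reach question is not computed because its sample size is not given here.

```python
from statistics import NormalDist

# Overhead reach: mean 205 cm, standard deviation 8 cm.
reach = NormalDist(mu=205, sigma=8)
p_reach = 1 - reach.cdf(217.50)          # P(X > 217.50)

# Pilot weights: mean 135 lb, standard deviation 30.1 lb.
weight = NormalDist(mu=135, sigma=30.1)
p_weight = weight.cdf(171) - weight.cdf(130)   # P(130 < X < 171)

print(f"P(reach > 217.50 cm) = {p_reach:.4f}")
print(f"P(130 lb < weight < 171 lb) = {p_weight:.4f}")
```

For a sample mean of n observations, the same call would apply with sigma replaced by sigma / sqrt(n).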

In: Statistics and Probability

A deficiency of the trace element selenium in the diet can negatively impact growth, immunity, muscle...

A deficiency of the trace element selenium in the diet can negatively impact growth, immunity, muscle and neuromuscular function, and fertility. The introduction of selenium supplements to dairy cows is justified when pastures have low selenium levels. Authors of a research paper supplied the following data on milk selenium concentration (mg/L) for a sample of cows given a selenium supplement (the treatment group) and a control sample given no supplement, both initially and after a 9-day period.

Initial Measurement
Treatment	Control
11.3	9.1
9.6	8.7
10.1	9.7
8.5	10.8
10.4	10.9
10.6	10.6
11.9	10.1
9.9	12.3
10.8	8.8
10.4	10.4
10.2	10.9
11.3	10.4
9.2	11.6
10.6	10.9
10.9
8.2

After 9 Days
Treatment	Control
138.3	9.4
104	8.9
96.4	8.9
89	10.1
88	9.6
103.8	8.6
147.3	10.3
97.1	12.3
172.6	9.4
146.3	9.5
99	8.3
122.3	8.7
103	12.5
117.8	9.1
121.5
93

(a) Use the given data for the treatment group to determine if there is sufficient evidence to conclude that the mean selenium concentration is greater after 9 days of the selenium supplement. (Use α = 0.05. Use a statistical computer package to calculate the P-value. Use μ_d = μ_initial − μ_9-day. Round your test statistic to two decimal places, your df down to the nearest whole number, and your P-value to three decimal places.)

t = ______

df = ______

P-value = ______

(b) Are the data for the cows in the control group (no selenium supplement) consistent with the hypothesis of no significant change in mean selenium concentration over the 9-day period? (Use α = 0.05. Use a statistical computer package to calculate the P-value. Use μ_d = μ_initial − μ_9-day. Round your test statistic to two decimal places, your df down to the nearest whole number, and your P-value to three decimal places.)

df=

P-value=

In: Statistics and Probability

Please use R "Team","WINS","HR","BA","ERA" "Anaheim Angels",99,152,.282,3.69 "Baltimore Orioles",67,165,.246,4.46 "Boston Red Sox",93,177,.277,3.75 "Chicago White Sox",81,217,.268,4.53 "Cleveland Indians",74,192,.249,4.91...

Please use R

"Team","WINS","HR","BA","ERA"

"Anaheim Angels",99,152,.282,3.69

"Baltimore Orioles",67,165,.246,4.46

"Boston Red Sox",93,177,.277,3.75

"Chicago White Sox",81,217,.268,4.53

"Cleveland Indians",74,192,.249,4.91

"Detroit Tigers",55,124,.248,4.93

"Kansas City Royals",62,140,.256,5.21

"Minnesota Twins",94,167,.272,4.12

"New York Yankees",103,223,.275,3.87

"Oakland Athletics",103,205,.261,3.68

"Seattle Mariners",93,152,.275,4.07

"Tampa Bay Devil Rays",55,133,.253,5.29

"Texas Rangers",72,230,.269,5.15

"Toronto Blue Jays",78,187,.261,4.8

"Arizona Diamondbacks",98,165,.267,3.92

"Atlanta Braves",101,164,.26,3.13

"Chicago Cubs",67,200,.246,4.29

"Cincinnati Reds",78,169,.253,4.27

"Colorado Rockies",73,152,.274,5.2

"Florida Marlins",79,146,.261,4.36

"Houston Astros",84,167,.262,4

"Los Angeles Dodgers",92,155,.264,3.69

"Milwaukee Brewers",56,139,.253,4.73

"Montreal Expos",83,162,.261,3.97

"New York Mets",75,160,.256,3.89

"Philadelphia Phillies",80,165,.259,4.17

"Pittsburgh Pirates",72,142,.244,4.23

"St. Louis Cardinals",97,175,.268,3.7

"San Diego Padres",66,136,.253,4.62

"San Francisco Giants",95,198,.267,3.54

The data above give the following variables for the 30 major league baseball teams during the 2002 season:

• WINS: number of games won
• HR: number of home runs hit
• BA: team batting average
• ERA: earned run average

(a) Using WINS as the dependent variable, run the regression relating the three predictor variables to WINS. Report the fitted regression line.

(b) Construct the ANOVA table of the above model.

(c) Plot the residuals e_i against the fitted values ŷ_i. What departures from the regression model assumptions can be studied from this plot? What are your findings? (Note: If you are not sure about the validity of any of the assumptions, perform a formal test to verify your answer.)

(d) Prepare a normal probability plot (QQ plot) of the residuals. Which assumption can be tested from this plot and what do you conclude? (Note: You can also use the formal test to reinforce your conclusion).

(e) If there is no problem with any of the assumptions, you can safely continue on making inference. Test for the significance of the regression using a 0.05 significance level.

(f) What percentage of the variability in y is explained by the regression?

(g) Using the individual t-tests, comment on the significance of each predictor variable, using a 0.05 significance level.

Hint:
data = read.table('hmw6_prob2.txt', header=T, sep=',')
y = data$WINS
x1 = data$HR
x2 = data$BA
x3 = data$ERA

In: Statistics and Probability

"FATALS","CUTTING" 270,15692 183,16198 319,17235 103,18463 149,18959 124,19103 62,19618 298,20436 330,21229 486,18660 302,17551 373,17466 187,17388 347,15261 168,14731...

"FATALS","CUTTING"

270,15692

183,16198

319,17235

103,18463

149,18959

124,19103

62,19618

298,20436

330,21229

486,18660

302,17551

373,17466

187,17388

347,15261

168,14731

234,14237

68,13216

162,12017

27,11845

40,11905

26,11881

41,11974

116,11892

84,11810

43,12076

292,12342

89,12608

148,13049

166,11656

32,13305

72,13390

27,13625

154,13865

44,14445

3,14424

3,14315

153,13761

11,12471

9,10960

17,9218

2,9054

5,9218

63,8817

41,7744

10,6907

3,6440

26,6021

52,5561

31,5309

3,5320

19,4784

10,4311

12,3663

88,3060

0,2779

41,2623

2,2058

5,1890

2,1535

0,1515

0,1595

23,1803

4,1495

0,1432

The data above contain the following two variables:

• FATALS: the annual number of fatalities from gas and dust explosions in coal mines for years 1915 to 1978.

• CUTTING: the number of cutting machines in use

(a) Fit the regression model using FATALS as the dependent variable and CUTTING as the independent variable.

(b) Using appropriate residual plots and formal tests, investigate the violation of any assumptions. Do any assumptions of the linear regression model appear to be violated? If so, which one (or ones)?

(Hint: Plot of residuals versus fitted values can be used for linearity, zero mean, and constant variance. Normal probability plot of the residuals can be used for normality. We also have formal tests for the constant variance and normality assumptions that you can do in R).

Hint:
data = read.table('hmw6_prob3.txt', header=T, sep=',')
y = data$FATALS
x = data$CUTTING

In: Statistics and Probability

The data worksheet entitled "FUELCON4" contains the following variables for all 50 states plus the District...

The data worksheet entitled "FUELCON4" contains the following variables for all 50 states plus the District of Columbia.

FUELCON (y):	Per capita fuel consumption in gallons
DRIVERS (x₁):	The ratio of licensed drivers to private and commercial motor vehicles registered
HWYMILES (x₂):	The number of miles of federally funded highways
GASTAX (x₃) :	The tax per gallon of gasoline in cents
INCOME (x₄):	The average household income in dollars

Run the regression analysis with FUELCON as the dependent variable and the other four variables as independent variables and obtain the appropriate model diagnostic statistics. Use the Shapiro-Wilk test statistic to test the assumption of the normality of the model residuals.

(a) What is being tested here? (Choose one)

The assumption of linearity.

The assumption of normally-distributed disturbances.

Whether there is a linear relationship between x and y.

The assumption of constant variance.

The assumption of independence.

Whether all of the x variables are important in predicting y.

(b) Which hypotheses are being tested? (Choose one)

H₀: β₁ = 1.0
H_a: β₁ ≠ 1.0

H₀: All of the x variables in the model are not important
H_a: At least one of the x variables is important

H₀: The model variance is constant
H_a: The model variance is not constant

H₀: Disturbances are normal
H_a: Disturbances are non-normal

H₀: β₁ = 0
H_a: β₁ ≠ 0

Reject H₀ if p < 0.10.
Do not reject H₀ if p ≥ 0.10.

Reject H₀ if p > 0.10.
Do not reject H₀ if p ≤ 0.10.

Reject H₀ if p < 0.05.
Do not reject H₀ if p ≥ 0.05.

Reject H₀ if p > 0.05.
Do not reject H₀ if p ≤ 0.05.

(d) What is the name of the test statistic? (Choose one)

Anderson-Darling's A²

Shapiro-Wilk's W

Test of Constant Variance

Kolmogorov-Smirnov's D

The Partial F Test

Test of Independence

(e) State the appropriate test statistic name, test statistic value, and the associated p-value (Enter the test statistic value to three decimal places, and the p-value to four decimal places).

Test statistic (select one of z, A, D, F, t, W) = ______, p (select one of <, ≥, ≤, >, =) ______

(f) What conclusion can be drawn from the test result?

Do not reject H₀. The assumption of normally-distributed disturbances has been met.

Do not reject H₀. The assumption of independence has been met.

Reject H₀. There is a linear relationship between x and y.

Reject H₀. The assumption of independence has not been met.

Do not reject H₀. There is not a linear relationship between x and y.

Reject H₀. The assumption of constant variance has not been met.

Do not reject H₀. The assumption of constant variance has been met.

Reject H₀. The assumption of normally-distributed disturbances has not been met.

FUELCON	DRIVERS	HWYMILES	GASTAX	INCOME
547.92	0.85	11,849	18	24426
440.38	0.81	4,532	8	30997
456.9	0.9	9,455	18	25479
530.08	1.07	7,949	21.7	22912
426.21	0.76	32,478	18	32678
474.78	0.71	11,015	22	32957
432.44	0.92	3,820	25	41930
492.97	0.88	1,260	23	32121
461.55	0.91	17,272	13.6	28493
564.82	0.81	16,950	7.5	28438
336.97	0.92	1,089	16	28554
484.83	0.69	6,466	25	24257
406.99	0.8	19,700	19	32755
524.01	0.74	10,261	15	27532
532.39	0.61	10,037	20	27283
483.31	0.81	10,494	21	28507
532.77	0.77	10,302	16.4	25057
513.8	0.77	8,954	20	24084
472.68	0.94	3,474	22	36385
463.46	0.89	6,387	23.5	34950
436.57	0.9	7,264	21	38845
504.95	0.84	16,942	19	29538
532.52	0.66	12,509	20	32791
541.06	0.97	8,747	18.4	21643
549.16	0.92	13,580	17	28029
549.35	0.68	10,456	27	23532
503.1	0.79	8,067	24.5	28564
448.81	1.13	5,976	24.75	29860
541.67	0.87	2,405	19.5	33928
465.52	0.89	9,150	10.5	38153
504.77	0.89	9,654	18.5	23162
296.44	1.1	18,998	22	35884
510.05	0.97	13,632	24.1	27418
580.32	0.66	7,415	21	25538
458.31	0.74	16,807	22	28619
523.89	0.68	11,123	17	24787
439.09	0.85	10,138	24	28000
417.36	0.87	18,448	26	30617
382.82	0.88	1,037	29	29984
557.53	0.92	9,272	16	24594
577.84	0.7	7,753	22	26301
506.3	0.83	12,036	20	26758
502.17	0.93	49,678	20	28486
430.53	0.87	7,310	24.5	24202
555.78	0.99	2,138	20	27992
529.52	0.81	14,453	17.5	32295
446.63	0.83	10,802	23	31582
466.31	0.94	5,390	25.65	22725
466.08	0.83	13,088	27.3	28911
715.55	0.67	7,841	14	28807
289.99	1.38	391	20	40498

In: Statistics and Probability

Consider the probability that no less than 92 out of 157 registered voters will vote in...

Consider the probability that no fewer than 92 out of 157 registered voters will vote in the presidential election. Assume the probability that a given registered voter will vote in the presidential election is 64%.

Approximate the probability using the normal distribution. Round your answer to four decimal places.
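A sketch of the normal approximation to the binomial, assuming a continuity correction is wanted, so that P(X ≥ 92) is approximated by P(Y ≥ 91.5). Some courses omit the correction, which would shift the answer slightly.

```python
from math import sqrt
from statistics import NormalDist

n, p = 157, 0.64
mu = n * p                        # binomial mean
sigma = sqrt(n * p * (1 - p))     # binomial standard deviation

approx = NormalDist(mu=mu, sigma=sigma)
# "No fewer than 92" means X >= 92; with the continuity correction
# the cutoff moves half a unit down to 91.5.
p_at_least_92 = 1 - approx.cdf(91.5)

print(f"mean = {mu:.2f}, sd = {sigma:.4f}")
print(f"P(X >= 92) is approximately {p_at_least_92:.4f}")
```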

In: Statistics and Probability
