Statistics and Probability Homework Answers | Statistics and Probability Homework Help

Questions

What is a comparative experiment, and what is being compared when a comparative experiment is performed?

In: Statistics and Probability

Use an alpha value of 0.05 if it is not specified in the problem.

**Everything should be done in R code.

4. Back to the iris dataset one last time! We wish to estimate the probability of a flower’s species based on the available measurements.

a. Build a Multinomial model to predict Species based on Sepal.Length. Use it to estimate the probability of each species for a flower with a sepal 6.3 cm long.

b. Build a Multinomial model to predict Species based on Petal.Length. Use it to estimate the probability of each species for a flower with a petal 5.1 cm long.

c. Compare both the residual deviances and the AICs for the two previous models. Which appears to be the “better” model based on these metrics?

d. Build a Multinomial model to predict Species based on all four of the measurements (no interactions). Use the model to estimate the probability of each species for a flower with a Sepal.Length of 6.3 cm, a Sepal.Width of 2.8 cm, a Petal.Length of 5.1 cm, and a Petal.Width of 1.5 cm.

e. The flower described in part d is actually in the dataset. What species was it?

In: Statistics and Probability

If there are 1296 different ways that 4 (6-sided) dice can be rolled, how many of the 1296 possibilities have fewer than 2 fives rolled?
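One way to check a count like this is brute-force enumeration. The sketch below treats the four dice as distinguishable (which is what gives 1296 outcomes) and compares the enumeration with the counting argument "no fives plus exactly one five". Variable names are illustrative only.

```python
from itertools import product

# Enumerate all 6**4 = 1296 ordered outcomes of four six-sided dice.
outcomes = list(product(range(1, 7), repeat=4))

# Keep the outcomes that show zero fives or exactly one five.
fewer_than_two_fives = sum(1 for roll in outcomes if roll.count(5) < 2)

# Cross-check by counting: no fives (5**4) plus exactly one five (4 * 5**3).
by_formula = 5**4 + 4 * 5**3

print(len(outcomes), fewer_than_two_fives, by_formula)
```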

In: Statistics and Probability

Hank would like to know how many customers are entering his propane store within a given timeframe. Prior data indicate that on average 8 customers arrive in a given hour.

a. Create the appropriate probability distribution below for 0-12 arrivals.

b. What is the probability that 8 or fewer customers will arrive in the next hour?

c. What is the probability that exactly 10 customers arrive in the next hour?

d. What is the probability that more than 12 customers will arrive in the next hour?

e. How likely is it that Hank has a "large crowd" entering his store in the next hour?
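A minimal sketch of parts a through d, assuming arrivals follow a Poisson distribution with a mean of 8 per hour (the problem does not name the distribution, so this is an assumption). Part e does not define a "large crowd"; one possible reading is "more than 12 arrivals", which is the quantity computed for part d.

```python
from math import exp, factorial

RATE = 8  # average arrivals per hour, from the problem statement


def poisson_pmf(k, lam=RATE):
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


# Part a: probability of each arrival count from 0 through 12.
table = {k: poisson_pmf(k) for k in range(13)}

p_at_most_8 = sum(table[k] for k in range(9))  # part b
p_exactly_10 = table[10]                       # part c
p_more_than_12 = 1 - sum(table.values())       # part d

for k, p in table.items():
    print(f"{k:2d}  {p:.4f}")
print(round(p_at_most_8, 4), round(p_exactly_10, 4), round(p_more_than_12, 4))
```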

In: Statistics and Probability

The table below gives the number of hours five randomly selected students spent studying and their corresponding midterm exam grades. Using this data, consider the equation of the regression line, ŷ = b0 + b1x, for predicting the midterm exam grade that a student will earn based on the number of hours spent studying. Keep in mind, the correlation coefficient may or may not be statistically significant for the data given. Remember, in practice, it would not be appropriate to use the regression line to make a prediction if the correlation coefficient is not statistically significant.

Hours Studying	1	2	3	4	5
Midterm Grades	70	77	84	88	95

1. Find the estimated slope. Round your answer to three decimal places.

2. Find the value of the coefficient of determination. Round your answer to three decimal places.

3. Find the estimated y-intercept. Round your answer to three decimal places.

4. Determine the value of the dependent variable ŷ at x = 0.

5. According to the equation of the regression line, if the independent variable is increased by one unit, what is the change in the dependent variable y?

6. Not all points predicted by the linear model fall on the same line. True or False?

7. Substitute the values found in 1 and 3 into the equation of the regression line to find the linear model. According to this model, if the value of the independent variable is increased by one unit, find the change in the dependent variable y.
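The slope, intercept, and coefficient of determination can be sketched with the usual least-squares sums. This assumes the doubled digits in the table are an extraction artifact, so the data are hours 1 through 5 and grades 70, 77, 84, 88, 95; if the real table differs, only the two lists need to change.

```python
# Assumed data: the table's values read as 1..5 hours and 70..95 grades.
hours = [1, 2, 3, 4, 5]
grades = [70, 77, 84, 88, 95]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(grades) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in grades)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, grades))

b1 = s_xy / s_xx                      # estimated slope
b0 = y_bar - b1 * x_bar               # estimated y-intercept (also y-hat at x = 0)
r_squared = s_xy**2 / (s_xx * s_yy)   # coefficient of determination

print(round(b1, 3), round(b0, 3), round(r_squared, 3))
```

In simple linear regression the slope is also the change in ŷ per one-unit increase in x, which is what questions 5 and 7 ask about.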

In: Statistics and Probability

What two z scores cut off the middle 95% of the normal distribution?
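The middle 95% leaves 2.5% in each tail, so the cutoffs are the 2.5th and 97.5th percentiles of the standard normal distribution. A short sketch using Python's standard library:

```python
from statistics import NormalDist

# Upper cutoff: 97.5% of the standard normal lies below it.
z_upper = NormalDist().inv_cdf(0.975)
# The distribution is symmetric about zero, so the lower cutoff mirrors it.
z_lower = -z_upper

print(round(z_lower, 2), round(z_upper, 2))
```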

In: Statistics and Probability

In order to analyze water samples using a spectrophotometer or plate reader, it is necessary to turn the molecules of nitrate into a dye molecule that can be quantified. The first step in turning nitrate (NO₃^-) into a dye molecule is reducing it to a molecule of nitrite (NO₂^-). This is done by reacting the NO₃^- with cadmium.

After the reduction reaction, the NO₂^- is reacted with two additional reagents. The first reagent, Reagent A, is a solution of sulfanilamide and hydrochloric acid. The second reagent, Reagent B, is a solution of N-(1-naphthyl)-ethylenediamine, called NNED for short. The compounds are mixed with the water sample and produce a purple color. The intensity of the purple color is directly related to the concentration of nitrite in the water sample. We can measure how purple the water turns as absorbance on a spectrophotometer and then convert the absorbance to concentration of nitrate.

To make Reagent A, we will need to make a solution of 10.0 g of sulfanilamide in 1 L of 2.4 molar hydrochloric acid (HCl).

The stock solution of HCl is 6 molar HCl. How many milliliters (mL) of 12 M HCl would you add to produce 0.15 liters (L) of 2.4 M HCl? (Answer in mL of HCl.)

After creating 0.15 L of 2.4 molar HCl solution, how many grams of sulfanilamide will be added? (Answer in g of sulfanilamide.)

After reacting the nitrate with cadmium to produce nitrite, the nitrite is then reacted with sulfanilamide and N-(1-naphthyl)-ethylenediamine to produce a purple dye molecule that can be quantified on a spectrophotometer.

The N-(1-naphthyl)-ethylenediamine reagent, called NNED for convenience, is made by mixing 1 gram of NNED in 1 liter of water. However, we don't always want to make an entire liter of solution because the NNED solution only lasts about 1 month before going bad and turning brown.

How many milligrams of NNED will need to be added to make 0.125 liters of solution?
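The reagent questions above come down to the dilution relation C1·V1 = C2·V2 and to scaling a grams-per-liter recipe to a smaller volume. The sketch below is illustrative only; the helper names are made up here. Because the question mentions both a 6 M and a 12 M stock, both cases are computed rather than guessing which one is intended.

```python
def stock_volume_ml(stock_molar, target_molar, target_liters):
    """Volume of stock solution (mL) needed, from C1 * V1 = C2 * V2."""
    return target_molar * target_liters / stock_molar * 1000


def solute_mass_g(grams_per_liter, liters):
    """Mass of solute (g) when a per-liter recipe is scaled to a given volume."""
    return grams_per_liter * liters


# 0.15 L of 2.4 M HCl, for each stock concentration mentioned in the question.
hcl_from_6m_ml = stock_volume_ml(6, 2.4, 0.15)
hcl_from_12m_ml = stock_volume_ml(12, 2.4, 0.15)

# Reagent A uses 10.0 g sulfanilamide per liter; NNED uses 1 g per liter.
sulfanilamide_g = solute_mass_g(10.0, 0.15)
nned_mg = solute_mass_g(1.0, 0.125) * 1000

print(hcl_from_6m_ml, hcl_from_12m_ml, sulfanilamide_g, nned_mg)
```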

After converting the nitrate into a purple dye, and measuring the absorbance of the purple dye on a spectrophotometer, a standard curve is used to convert the absorbance into concentration.

To make a standard curve, samples with known concentrations of NO₃^- are run on the spectrophotometer. The samples with known concentrations are called standards. A linear regression is then performed to relate the concentration of NO₃^- to measured absorbance values.

The DATA section below contains a simulated data set. There are standards and their related absorbance values, and there are samples from two sites that were diluted prior to processing and measuring their absorbances. The groundwater originates from the upslope site, and the hope is that the microbes in the soil are removing the NO₃^- from the groundwater before it reaches the downslope site.

Using the given data, create a standard curve in Excel, and use Trendline to add a linear regression with the equation. Then use the standard curve and the dilutions to determine the concentration of NO₃^- in all the samples. Using the Data Analysis ToolPak, perform the appropriate t-test to deduce whether the nitrate concentration upslope is less than or greater than the nitrate concentration downslope. When performing a t-test using the Data Analysis ToolPak, the output will include the means for both groups.

What is the average NO₃^- concentration at the upslope site?

Report your answer, from the Data Analysis ToolPak output, to 3 decimal places.

What is the average NO₃^- concentration at the downslope site?

Report your answer, from the Data Analysis ToolPak output, to 3 decimal places.

Given the EPA drinking water quality standard is 10 mg/L of nitrate, is the upslope site safe to drink based only on nitrate content? (Enter yes or no)

Is the downslope site safe to drink, based only on NO₃^- concentration? (Enter yes or no)

Assuming the two sites are hydrologically well connected, the transit time between the two sites is fast, and the two sites cannot be treated as independent samples, what kind of t-test should be performed to show that the upslope site is greater than the downslope site? Enter the letter of your answer choice in the answer blank

A. one-tailed unpaired t-test
B. two-tailed unpaired t-test
C. one-tailed paired t-test
D. two-tailed paired t-test

What is the calculated t statistic, rounded to 4 decimal places?

Is the calculated t statistic greater or less than the critical t value reported by the Data Analysis ToolPak? (Enter greater or less)

Is the nitrate concentration at the upslope site significantly greater than the downslope site? (Enter yes or no)

Based on this statistical result, and assuming no diffusion or dilution occurs between the upslope and downslope site, do you think microbes are removing NO₃^- from the ground water? (Enter yes or no)

DATA

mg N per L	Abs	Sample ID	Upslope Absorbance	Dilution	mg N	Downslope Absorbance	Dilution
0	0	1	0.449	0.01		0.316	0.5
0.1	0.12	2	0.243	0.01		0.251	0.5
0.2	0.225	3	0.331	0.01		0.256	1
0.4	0.432	4	0.45	0.1		0.2	1
0.6	0.585	5	0.551	0.01		0.563	1
		6	0.561	0.01		0.316	0.5
		7	0.541	0.02		0.951	1
		8	0.244	0.01		0.317	1
		9	0.532	0.01		0.2	0.5
		10	0.5	0.02		0.269	1
		11	0.332	0.01		0.2	0.5
		12	0.443	0.02		0.313	0.5
		13	0.655	0.1		0.2	1
		14	0.675	0.01		0.745	1
		15	0.5	0.1		0.119	0.5
		16	0.39	0.01		0.103	1
		17	0.5	0.02		0.149	1
		18	0.532	0.01		0.311	0.5
		19	0.5	0.1		0.918	1
		20	0.108	0.01		0.328	1
		21	0.119	0.1		0.2	0.5
		22	0.689	0.01		0.206	1
		23	0.5	0.02		0.2	0.5
		24	0.329	0.1		0.508	0.5
		25	0.753	0.01		0.256	0.5
		26	0.511	0.01		0.294	0.5
		27	0.839	0.02		0.417	0.5
		28	0.543	0.01		0.149	1
		29	0.392	0.02		0.118	0.5
		30	0.444	0.01		0.201	1
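The standard-curve step can be sketched outside Excel as an ordinary least-squares fit of absorbance against the five standards, followed by inverting the line for a sample. Two assumptions are made here: the Dilution column is read as the fraction of original sample in the measured mixture (so the undiluted concentration is the measured value divided by that fraction), and the helper name `concentration` is invented for illustration. The t-test itself is not shown.

```python
# Standards from the first two columns of the data table.
std_conc = [0, 0.1, 0.2, 0.4, 0.6]        # mg N per L
std_abs = [0, 0.12, 0.225, 0.432, 0.585]  # measured absorbance

n = len(std_conc)
x_bar = sum(std_conc) / n
y_bar = sum(std_abs) / n

# Least-squares line: absorbance = slope * concentration + intercept.
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(std_conc, std_abs)) / sum(
    (x - x_bar) ** 2 for x in std_conc
)
intercept = y_bar - slope * x_bar


def concentration(absorbance, dilution):
    """Invert the standard curve, then undo the dilution (assumed to be a fraction)."""
    measured = (absorbance - intercept) / slope
    return measured / dilution


# Sample 1 from each site, as an example of the conversion.
upslope_1 = concentration(0.449, 0.01)
downslope_1 = concentration(0.316, 0.5)
print(round(slope, 4), round(intercept, 4))
```

Applying `concentration` to all 30 rows of each site gives the two columns that the t-test compares.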

In: Statistics and Probability

Assume that a simple random sample has been selected and test the given claim. Identify the null and alternative hypotheses, test statistic, P-value, and state the final conclusion that addresses the original claim. Listed below are brain volumes in cm^3 of unrelated subjects used in a study. Use a 0.05 significance level to test the claim that the population of brain volumes has a mean equal to 1099.8 cm^3.

964
1028
1273
1080
1070
1173
1067
1347
1099
1203
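A sketch of the test statistic for this claim, assuming a two-tailed one-sample t-test (H0: μ = 1099.8, H1: μ ≠ 1099.8) because the population standard deviation is unknown. Python's standard library has no t distribution, so the critical value 2.262 (df = 9, α = 0.05, two-tailed) is taken from a t table and the exact P-value is left to software.

```python
from math import sqrt
from statistics import mean, stdev

volumes = [964, 1028, 1273, 1080, 1070, 1173, 1067, 1347, 1099, 1203]
MU_0 = 1099.8       # claimed population mean (cm^3)
T_CRITICAL = 2.262  # assumed t-table value: two-tailed, alpha = 0.05, df = 9

n = len(volumes)
standard_error = stdev(volumes) / sqrt(n)   # sample standard deviation over sqrt(n)
t_stat = (mean(volumes) - MU_0) / standard_error

# Reject H0 only if the statistic falls beyond the critical value in either tail.
reject_null = abs(t_stat) > T_CRITICAL
print(round(t_stat, 3), reject_null)
```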

In: Statistics and Probability

A test for diabetes classifies 99% of people with the disease as diabetic and 10% of those who don't have the disease as diabetic. It is known that 12% of the population is diabetic.

a) What are the false positive and false negative rates?

b) What is the probability that someone classified as diabetic does in fact have the disease?

i) Solve the problem by drawing up a contingency table, and

ii) solve the problem using conditional probability and the law of total probability.
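Both routes in part b can be sketched in a few lines. The first uses the law of total probability and Bayes' rule directly; the second builds the contingency table for a hypothetical population of 10,000 people (the population size is an arbitrary choice made here for illustration).

```python
sensitivity = 0.99          # P(test positive | diabetic)
false_positive_rate = 0.10  # P(test positive | not diabetic)
prevalence = 0.12           # P(diabetic)
false_negative_rate = 1 - sensitivity

# Law of total probability: overall chance of a positive classification.
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
# Bayes' rule: chance of disease given a positive classification.
p_disease_given_positive = sensitivity * prevalence / p_positive

# Contingency-table route with a hypothetical population of 10,000.
population = 10_000
diabetic = prevalence * population
true_positives = sensitivity * diabetic
false_positives = false_positive_rate * (population - diabetic)
from_table = true_positives / (true_positives + false_positives)

print(round(false_negative_rate, 2), round(p_disease_given_positive, 4), round(from_table, 4))
```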

In: Statistics and Probability

Suppose that in one region of the country, the mean amount of credit card debt per household in households having credit card debt is $8,000, with standard deviation $1,000. Find the probability that the mean amount of credit card debt in a sample of 400 such households will be between $7,925 and $8,100.
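With n = 400 the central limit theorem suggests treating the sample mean as approximately normal with mean 8,000 and standard error 1,000/√400 = 50. A sketch under that assumption:

```python
from math import sqrt
from statistics import NormalDist

POP_MEAN, POP_SD, N = 8000, 1000, 400

# Sampling distribution of the mean: same center, spread shrunk by sqrt(n).
sampling_dist = NormalDist(mu=POP_MEAN, sigma=POP_SD / sqrt(N))

# Probability that the sample mean lands between the two bounds.
p_between = sampling_dist.cdf(8100) - sampling_dist.cdf(7925)
print(round(p_between, 4))
```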

In: Statistics and Probability

Give descriptive statistics about the current COVID-19 crisis (Mean, Mode, Median, Variance, Correlation), and state your sources.

In: Statistics and Probability

Question one

A researcher in a large supermarket wishes to study sickness absences among its employees. The organisation has branches in all the provinces, and each branch keeps full records of sickness leave. A random sample of ten such branches produced the following data showing the number of days of sickness per branch in the year 2017.

18 23 26 30 32 35 39 45 48 54

Required:

Using the above data:

a) Calculate (manually and using computer software such as Excel, SPSS, etc.) a 95% confidence interval for the mean number of sickness days per branch.

b) Estimate the number of branches that should be included in a simple random sample so that a 95% confidence interval for the mean number of days of sickness has a width of no more than 4 days. (5 marks)

c) After the sample was collected, it became apparent that the branches fell into three natural groups in terms of sales: small, medium and large. From the data on all of the branches in the provinces, the researcher found that of 210 randomly selected staff, 90 worked in small branches, 36 in medium-sized branches, and the rest worked in large branches. In total, 96 of the selected staff had no days off for sickness, of whom 52 worked in small branches and 29 worked in large branches.

i) Form a table showing the information clearly. (5 marks)
ii) Carry out an appropriate statistical test to investigate whether the size of branch influences the occurrence of sickness absence, and interpret your results clearly.
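A sketch of parts a and c, under two assumptions: the interval in part a is t-based (small sample, unknown population standard deviation), and the "appropriate test" in part c is a chi-square test of independence on the 3 x 2 table of branch size by absence status. The critical values are taken from standard tables, and the missing table cells are filled in from the totals given in the question.

```python
from math import sqrt
from statistics import mean, stdev

# Part a: 95% confidence interval for the mean number of sickness days.
days = [18, 23, 26, 30, 32, 35, 39, 45, 48, 54]
T_CRIT_DF9 = 2.262  # assumed t-table value: two-tailed, alpha = 0.05, df = 9

x_bar = mean(days)
margin = T_CRIT_DF9 * stdev(days) / sqrt(len(days))
ci = (x_bar - margin, x_bar + margin)

# Part c: complete the table from the stated totals.
TOTAL_STAFF, TOTAL_NO_ABSENCE = 210, 96
staff = {"small": 90, "medium": 36}
staff["large"] = TOTAL_STAFF - sum(staff.values())
no_absence = {"small": 52, "large": 29}
no_absence["medium"] = TOTAL_NO_ABSENCE - sum(no_absence.values())

# Chi-square statistic: sum of (observed - expected)^2 / expected over all six cells.
chi_sq = 0.0
for size, n_staff in staff.items():
    cells = (
        (no_absence[size], TOTAL_NO_ABSENCE),
        (n_staff - no_absence[size], TOTAL_STAFF - TOTAL_NO_ABSENCE),
    )
    for observed, column_total in cells:
        expected = n_staff * column_total / TOTAL_STAFF
        chi_sq += (observed - expected) ** 2 / expected

CHI_SQ_CRIT_DF2 = 5.991  # assumed table value: alpha = 0.05, df = (3 - 1) * (2 - 1)
print(ci, round(chi_sq, 3), chi_sq > CHI_SQ_CRIT_DF2)
```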

In: Statistics and Probability

The margin of error in a confidence interval does not account for all types of error.

(a) What kind of error does the margin of error in a CI account for?

(b) Give an example of a kind of error which the margin of error does NOT account for.

In: Statistics and Probability

Forty-minute workouts of one of the following activities three days a week will lead to a loss of weight. The following sample data show the number of calories burned during 40-minute workouts for three different activities.

Swimming	Tennis	Cycling
415	385	408
380	485	250
425	450	295
400	420	402
427	530	268

Use a .05 level of significance. Use Table 1 of Appendix B.

a. What is the sum of the ranks for Swimming, Tennis and Cycling (to the nearest whole number)?

Sum of Rank Swimming
Sum of Rank Tennis
Sum of Rank Cycling

b. What is the value of the test statistic (to 2 decimals)?

How many degrees of freedom?

c. What is the p-value?

Select your answer: less than .005; between .005 and .01; between .01 and .025; between .025 and .05; between .05 and .10; greater than .10

Do these data indicate differences in the amount of calories burned for the three activities?

Select your answer: Yes; No

What is your conclusion?

Select your answer:
Conclude that the populations of calories burned by the three activities are identical.
Conclude that the populations of calories burned by the three activities are not identical.
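The rank sums and test statistic can be sketched as follows, assuming the intended procedure is the Kruskal-Wallis test (the question asks for rank sums, a test statistic, and degrees of freedom, but does not name the test). The data contain no tied values, so each observation's rank is simply its position in the pooled, sorted sample.

```python
calories = {
    "Swimming": [415, 380, 425, 400, 427],
    "Tennis": [385, 485, 450, 420, 530],
    "Cycling": [408, 250, 295, 402, 268],
}

# Rank all 15 observations together (1 = smallest); valid because there are no ties.
pooled = sorted(v for values in calories.values() for v in values)
rank = {value: position + 1 for position, value in enumerate(pooled)}

rank_sums = {name: sum(rank[v] for v in values) for name, values in calories.items()}

# Kruskal-Wallis H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
n_total = len(pooled)
h_stat = (
    12 / (n_total * (n_total + 1))
    * sum(rank_sums[name] ** 2 / len(values) for name, values in calories.items())
    - 3 * (n_total + 1)
)
degrees_of_freedom = len(calories) - 1

print(rank_sums, round(h_stat, 2), degrees_of_freedom)
```

The statistic is then compared with a chi-square distribution on the stated degrees of freedom to bracket the p-value.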

In: Statistics and Probability

The following data show the brand, price ($), and the overall score for six stereo headphones that were tested by a certain magazine. The overall score is based on sound quality and effectiveness of ambient noise reduction. Scores range from 0 (lowest) to 100 (highest). The estimated regression equation for these data is

ŷ = 21.258 + 0.327x,

where x = price ($) and y = overall score.

Brand	Price ($)	Score
A	180	76
B	150	69
C	95	63
D	70	54
E	70	38
F	35	24

(a)

Compute SST (Total Sum of Squares), SSR (Regression Sum of Squares), and SSE (Error Sum of Squares). (Round your answers to three decimal places.)

SST =
SSR =
SSE =

(b)

Compute the coefficient of determination r². (Round your answer to three decimal places.)

r² =

Comment on the goodness of fit. (For purposes of this exercise, consider a proportion large if it is at least 0.55.)

Select one:
The least squares line provided a good fit as a large proportion of the variability in y has been explained by the least squares line.
The least squares line did not provide a good fit as a small proportion of the variability in y has been explained by the least squares line.
The least squares line provided a good fit as a small proportion of the variability in y has been explained by the least squares line.
The least squares line did not provide a good fit as a large proportion of the variability in y has been explained by the least squares line.
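A sketch of the sums of squares for this data set. Rather than plugging in the rounded coefficients from the question, the code refits the least-squares line so that SST = SSR + SSE holds exactly, then checks that the fitted coefficients round to the stated 21.258 and 0.327. Answers computed from the rounded equation may differ slightly in the later decimal places.

```python
prices = [180, 150, 95, 70, 70, 35]
scores = [76, 69, 63, 54, 38, 24]

n = len(prices)
x_bar = sum(prices) / n
y_bar = sum(scores) / n

# Least-squares coefficients (should round to the equation given in the question).
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(prices, scores)) / sum(
    (x - x_bar) ** 2 for x in prices
)
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in prices]

sst = sum((y - y_bar) ** 2 for y in scores)                # total variation in y
sse = sum((y - f) ** 2 for y, f in zip(scores, fitted))    # unexplained variation
ssr = sum((f - y_bar) ** 2 for f in fitted)                # explained variation
r_squared = ssr / sst

print(round(sst, 3), round(ssr, 3), round(sse, 3), round(r_squared, 3))
```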

In: Statistics and Probability
