Questions
A study of the effects of exercise used rats bred to have high or low capacity...

A study of the effects of exercise used rats bred to have high or low capacity for exercise. The 8 high-capacity rats had mean blood pressure 89 mm Hg and standard deviation 9 mm Hg; the 8 low-capacity rats had mean blood pressure 105 mm Hg with standard deviation 13 mm Hg. What is the value of the two-sample t statistic for comparing the two population means?
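
The arithmetic can be checked with a short sketch like the one below. It assumes the unpooled (Welch) form of the two-sample t statistic, which the question does not specify; the variable names are illustrative only.

```python
import math

# Summary statistics from the question (blood pressure in mm Hg).
n_high, mean_high, sd_high = 8, 89.0, 9.0
n_low, mean_low, sd_low = 8, 105.0, 13.0

# Assumption: unpooled (Welch) standard error of the difference in means.
se = math.sqrt(sd_high**2 / n_high + sd_low**2 / n_low)

# Two-sample t statistic for (high-capacity mean) - (low-capacity mean).
t_stat = (mean_high - mean_low) / se

print(f"standard error = {se:.4f}, t = {t_stat:.3f}")
```

Because both groups contain 8 rats, the pooled-variance form of the statistic gives the same value here.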

In: Statistics and Probability

NASA has been working with utility companies to set up expensive power generating windmills. In order...

NASA has been working with utility companies to set up expensive power generating windmills. In order for a site to be effective, the average wind speed must be more than 20 mph. At a specific site, NASA took a sample of forty-five wind speed readings. The sample average wind speed is 21.7 mph with a standard deviation of 4.2 mph. NASA will only build the windmill if they are sure that the true average wind speed is greater than 20 mph.

a. Should they build the windmill? Justify by conducting a hypothesis test at α = 0.05.

b. Describe in the words of the problem what making a Type I error would be and its likely consequences.

c. Describe in the words of the problem what making a Type II error would be and its likely consequences.
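
For part a, a minimal sketch of the test statistic follows. It assumes a right-tailed test of H0: mu = 20 against Ha: mu > 20, and it uses the standard normal distribution as an approximation to the t distribution with 44 degrees of freedom, since the Python standard library has no t distribution. The names are illustrative.

```python
import math
from statistics import NormalDist

# Sample summary from the question (wind speed in mph).
n, xbar, s = 45, 21.7, 4.2
mu0, alpha = 20.0, 0.05

# H0: mu = 20 versus Ha: mu > 20 (right-tailed test).
t_stat = (xbar - mu0) / (s / math.sqrt(n))

# Assumption: normal approximation in place of the t distribution (44 df).
z_crit = NormalDist().inv_cdf(1 - alpha)
p_approx = 1 - NormalDist().cdf(t_stat)
reject_h0 = t_stat > z_crit

print(f"t = {t_stat:.3f}, approximate critical value = {z_crit:.3f}")
print(f"approximate p-value = {p_approx:.4f}, reject H0: {reject_h0}")
```

The exact t critical value for 44 degrees of freedom is slightly larger than the normal value, so a t table or statistical software should be used for the final decision.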

In: Statistics and Probability

We have two possible models in two-way ANOVA: the fixed effects model and the random effects...

We have two possible models in two-way ANOVA: the fixed effects model and the random effects model. What are the differences between these two models and how does the analysis differ?

In: Statistics and Probability

Suppose that 32% of people have a dog, 27% of people have a cat, and 12%...

Suppose that 32% of people have a dog, 27% of people have a cat, and 12% have both. What is the probability that someone owns a dog but not a cat?
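
A quick sketch of the calculation, using the fact that the event "dog" splits into "dog and cat" and "dog but not cat"; the variable names are illustrative.

```python
# Probabilities given in the question.
p_dog, p_cat, p_both = 0.32, 0.27, 0.12

# P(dog and not cat) = P(dog) - P(dog and cat).
p_dog_not_cat = p_dog - p_both

print(f"P(dog but not cat) = {p_dog_not_cat:.2f}")
```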

In: Statistics and Probability

Recorded in the table below are the blood pressure measurements (in mm Hg) for a sample of...

Recorded in the table below are the blood pressure measurements (in mm Hg) for a sample of 12 adults. Does there appear to be a linear relationship between the diastolic and systolic blood pressures? At the 5% significance level, test the claim that systolic blood pressure and diastolic blood pressure have a linear relationship.

Systolic    Diastolic
107         71
110         74
133         91
115         83
118         88
134         87
123         77
154         94
119         69
130         76
108         69
112         75

Data Table: Blood Pressure 7

Hypotheses:
H0: Slope and Correlation are both zero
H1: Slope and Correlation are both not zero

Results:
What is the correlation coefficient? Use 4 decimal places in answer.
r = __________

What percent of the variation in diastolic blood pressure is explained by the model? Round to the nearest hundredth of a percent (e.g., 65.31%).
R² = ____________

What is the equation for the regression line? Use 2 decimal places in answers.
Diastolic = _________ (Systolic) + _________

State the p-value. Round answer to nearest hundredth percent (i.e. 2.55%).
p-value = ________

Conclusion:
We ___________ sufficient evidence to support the claim that the correlation coefficient and slope of the regression line are both statistically different than zero (p_____ 0.05).
(Use “have” or “lack” for the first blank and “<” or “>” for the second blank.)
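
The quantities requested above can be computed with a sketch like the following, which uses only the 12 data pairs from the table. It computes r, R², the least-squares line, and the t statistic for the slope; the p-value itself requires a t distribution with n - 2 = 10 degrees of freedom, which the Python standard library does not provide, so it is left to a table or statistical software. Variable names are illustrative.

```python
import math

# Data pairs from the table (mm Hg).
systolic = [107, 110, 133, 115, 118, 134, 123, 154, 119, 130, 108, 112]
diastolic = [71, 74, 91, 83, 88, 87, 77, 94, 69, 76, 69, 75]
n = len(systolic)

mean_x = sum(systolic) / n
mean_y = sum(diastolic) / n

# Corrected sums of squares and cross-products.
sxx = sum((x - mean_x) ** 2 for x in systolic)
syy = sum((y - mean_y) ** 2 for y in diastolic)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(systolic, diastolic))

r = sxy / math.sqrt(sxx * syy)       # correlation coefficient
r_squared = r**2                     # share of variation explained
slope = sxy / sxx                    # Diastolic = slope * Systolic + intercept
intercept = mean_y - slope * mean_x

# Test statistic for H0: slope (and correlation) equal zero, df = n - 2.
t_stat = r * math.sqrt((n - 2) / (1 - r_squared))

print(f"r = {r:.4f}, R^2 = {100 * r_squared:.2f}%")
print(f"Diastolic = {slope:.2f} (Systolic) + {intercept:.2f}")
print(f"t = {t_stat:.3f} with {n - 2} degrees of freedom")
```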

In: Statistics and Probability

Three fair coins are tossed simultaneously 10 times. Find the probability that "2 heads and one...

Three fair coins are tossed simultaneously 10 times. Find the probability that "2 heads and one tail" will show up (a) at least once and (b) at most once.
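
One way to set this up is as a binomial problem: each of the 10 tosses of three coins either shows "2 heads and one tail" or it does not. The sketch below assumes the 10 tosses are independent; the variable names are illustrative.

```python
from math import comb

# Probability of exactly 2 heads among 3 fair coins in a single toss.
p = comb(3, 2) * 0.5**3

n = 10  # number of times the three coins are tossed

p_none = (1 - p) ** n
p_exactly_one = comb(n, 1) * p * (1 - p) ** (n - 1)

p_at_least_once = 1 - p_none              # part (a)
p_at_most_once = p_none + p_exactly_one   # part (b)

print(f"single-toss probability = {p}")
print(f"(a) at least once: {p_at_least_once:.4f}")
print(f"(b) at most once:  {p_at_most_once:.4f}")
```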

In: Statistics and Probability

Given the scores on a certain exam are normally distributed with a mean of 75 and...

Given the scores on a certain exam are normally distributed with a mean of 75 and a standard deviation of 5:

a. Calculate the z-score for 80. Find the percentage of students with scores above 80.

b. Calculate the z-score for 60. Find the percentage of students with scores below 60.

c. Calculate the z-scores for 70 and 90. Find the percentage of students with scores between 70 and 90.

d. What is the median?

e. What test score value has a z-score of -2.25?

f. What test score is the 85th percentile?
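
Parts a through f can be checked with Python's `statistics.NormalDist`, as in the sketch below. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 75, 5
scores = NormalDist(mu=mu, sigma=sigma)

# a. z-score for 80 and percentage of scores above 80.
z_80 = (80 - mu) / sigma
pct_above_80 = (1 - scores.cdf(80)) * 100

# b. z-score for 60 and percentage of scores below 60.
z_60 = (60 - mu) / sigma
pct_below_60 = scores.cdf(60) * 100

# c. z-scores for 70 and 90 and percentage of scores between them.
z_70, z_90 = (70 - mu) / sigma, (90 - mu) / sigma
pct_between = (scores.cdf(90) - scores.cdf(70)) * 100

# d. The median of a normal distribution equals its mean.
median = scores.median

# e. Score whose z-score is -2.25.
score_at_z = mu + (-2.25) * sigma

# f. 85th percentile.
p85 = scores.inv_cdf(0.85)

print(z_80, round(pct_above_80, 2))
print(z_60, round(pct_below_60, 2))
print(z_70, z_90, round(pct_between, 2))
print(median, score_at_z, round(p85, 2))
```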

In: Statistics and Probability

An experiment was performed on a certain metal to determine if the strength is a function...

An experiment was performed on a certain metal to determine if the strength is a function of heating time (hours). Results based on 25 metal sheets are given below. Use the simple linear regression model.
∑X = 50
∑X² = 200
∑Y = 75
∑Y² = 1600
∑XY = 400

Find the estimated y intercept and slope. Write the equation of the least squares regression line and explain the coefficients. Estimate Y when X is equal to 4 hours. Also determine the standard error, the Mean Square Error, the coefficient of determination and the coefficient of correlation. Check the relation between correlation coefficient and Coefficient of Determination. Test the significance of the slope.
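
The requested quantities follow directly from the five sums and n = 25. The sketch below uses the standard simple linear regression formulas; the variable names are illustrative, and the p-value for the slope test is left to a t table with n - 2 = 23 degrees of freedom.

```python
import math

# Summary sums from the question (X = heating time in hours, Y = strength).
n = 25
sum_x, sum_x2 = 50, 200
sum_y, sum_y2 = 75, 1600
sum_xy = 400

# Corrected sums of squares and cross-products.
sxx = sum_x2 - sum_x**2 / n
syy = sum_y2 - sum_y**2 / n
sxy = sum_xy - sum_x * sum_y / n

slope = sxy / sxx
intercept = sum_y / n - slope * sum_x / n
y_hat_4 = intercept + slope * 4          # estimate of Y at X = 4 hours

sse = syy - slope * sxy                  # residual sum of squares
mse = sse / (n - 2)                      # mean square error
std_error = math.sqrt(mse)               # standard error of the estimate

r_squared = sxy**2 / (sxx * syy)         # coefficient of determination
r = math.copysign(math.sqrt(r_squared), slope)  # r takes the sign of the slope

# t statistic for H0: slope = 0, with n - 2 degrees of freedom.
t_slope = slope / (std_error / math.sqrt(sxx))

print(f"Y-hat = {intercept:.2f} + {slope:.2f} X, Y-hat(4) = {y_hat_4:.2f}")
print(f"MSE = {mse:.3f}, standard error = {std_error:.3f}")
print(f"R^2 = {r_squared:.4f}, r = {r:.4f}, t = {t_slope:.3f}")
```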

In: Statistics and Probability

You will be using your Framingham dataset for the following questions. 7. Calculate a multivariable regression...

You will be using your Framingham dataset for the following questions.

7. Calculate a multivariable regression where the outcome is total serum cholesterol and the independent variables are BMI, age, sex and smoking status. Interpret.
8. Use the regression from question 7 to answer the following.
a. What is the predicted total serum cholesterol for a 50 year-old man who doesn’t smoke and whose BMI is 25?
b. What is the predicted total serum cholesterol for a 25 year-old woman who smokes and whose BMI is 32?

In: Statistics and Probability

Describe the conditions in which a nonparametric test would be a better selection than a parametric...

Describe the conditions in which a nonparametric test would be a better selection than a parametric test. Illustrate your ideas with a specific example of when you would use each type of test using similar variables for each example.

In: Statistics and Probability

A data set of 27 male African elephants shows that their weights are normally distributed, and...

A data set of 27 male African elephants shows that their weights are normally distributed, with an average weight of 111 kg and a 95% confidence interval of (104, 119). I was asked to find an appropriate hypothesis test for this data set, do the calculations, and interpret the results; if I feel there is no natural hypothesis test to carry out, I should state why not. I personally feel there is no natural hypothesis test to do on this data because the weights are normally distributed and there are no outliers, and, despite the sample size being only 27, I feel that 111 kg is a reasonable weight for baby elephants. Please let me know if my answer is correct; if not, please explain how to arrive at the right answer. Thank you.

In: Statistics and Probability

The Book of R (Question 20.2) Please answer using R code. Continue using the survey data...

The Book of R (Question 20.2) Please answer using R code.

Continue using the survey data frame from the package MASS for the next few exercises.

  a. The survey data set has a variable named Exer, a factor with k = 3 levels describing the amount of physical exercise time each student gets: none, some, or frequent. Obtain a count of the number of students in each category and produce side-by-side boxplots of student height split by exercise.
  b. Assuming independence of the observations and normality as usual, fit a linear regression model with height as the response variable and exercise as the explanatory variable (dummy coding). What’s the default reference level of the predictor? Produce a model summary.
  c. Draw a conclusion based on the fitted model from (b): does it appear that exercise frequency has any impact on mean height? What is the nature of the estimated effect?
  d. Predict the mean heights of one individual in each of the three exercise categories, accompanied by 95 percent prediction intervals.
  e. Do you arrive at the same result and interpretation for the height-by-exercise model if you construct an ANOVA table using aov?
  f. Is there any change to the outcome of (e) if you alter the model so that the reference level of the exercise variable is “none”? Would you expect there to be?

Now, turn back to the ready-to-use mtcars data set. One of the variables in this data frame is qsec, described as the time in seconds it takes to race a quarter mile; another is gear, the number of forward gears (cars in this data set have either 3, 4, or 5 gears).

  g. Using the vectors straight from the data frame, fit a simple linear regression model with qsec as the response variable and gear as the explanatory variable and interpret the model summary.
  h. Explicitly convert gear to a factor vector and refit the model. Compare the model summary with that from (g). What do you find?
  i. Explain, with the aid of a relevant plot in the same style as the right image of Figure 20-6, why you think there is a difference between the two models (g) and (h).

In: Statistics and Probability

In 2012 the Centers for Disease Control and Prevention reported that in a sample of 4,349...

In 2012 the Centers for Disease Control and Prevention reported that in a sample of 4,349 African Americans 31% were Vitamin D deficient. A 90% confidence interval based on this sample is (0.30, 0.32). It is believed that among the general population of Americans 8% suffer from Vitamin D deficiency.

  1. Define the appropriate parameter and state the appropriate hypotheses for testing the claim that, among African Americans, Vitamin D deficiency occurs at a rate other than 8%.

  2. Does this confidence interval provide evidence that among African Americans Vitamin D deficiency occurs at a rate other than 8%? What significance level is being used to make this decision? Briefly justify your answer.

  3. Using the definition of a p-value, explain why the area in the tail of a randomization distribution is used to compute a p-value.

  4. In a test of the hypotheses vs. , the observed sample results in a p-value of 0.0256. Would you expect a 95% confidence interval for based on this sample to contain 0? Briefly explain why or why not.

In: Statistics and Probability

Suppose a medical researcher claims that µ, the mean concentration of lead (in mcg/g, micrograms...

Suppose a medical researcher claims that µ, the mean concentration of lead (in mcg/g, micrograms of lead per gram of medicine), is less than 16 mcg/g. Express in symbolic form the null and alternative hypotheses needed to test the researcher's claim.

H0 : µ ≥ 16 mcg/g
HA : µ < 16 mcg/g

H0 : µ = 16 mcg/g
HA : µ < 16 mcg/g

H0 : µ = 16 mcg/g
HA : µ ≠ 16 mcg/g

H0 : µ > 16 mcg/g
HA : µ = 16 mcg/g

To perform the hypothesis test in Question 4, the researcher selected a simple random sample of the medicine and measured the lead concentration in each unit sampled. (The sample data are in the StatCrunch data set for this problem.) Use the data set and the results from Question 4 to calculate the p-value for the hypothesis test. Assume lead concentrations are approximately normally distributed. Round your answer to three decimal places; add trailing zeros as needed.

The p-value = [LeadPValue].

DATA

Lead (mcg/g)   var2
18   
6.5
22  
19.5  
11.5  
16.5  
5.5   
3
13.5  
4  
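
As a starting point for the p-value, the sketch below computes the one-sample t statistic from the ten listed values. It assumes a left-tailed test against µ = 16 mcg/g, matching the researcher's claim; the variable names are illustrative. The p-value is the left-tail area of a t distribution with n - 1 = 9 degrees of freedom at this statistic, which needs a t table or statistical software because the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev

# Lead concentrations (mcg/g) from the DATA listing.
lead = [18, 6.5, 22, 19.5, 11.5, 16.5, 5.5, 3, 13.5, 4]
mu0 = 16  # hypothesized mean under H0

n = len(lead)
xbar = mean(lead)
s = stdev(lead)  # sample standard deviation (n - 1 in the denominator)

# Assumption: left-tailed one-sample t test (HA: mu < 16).
t_stat = (xbar - mu0) / (s / math.sqrt(n))
df = n - 1

print(f"n = {n}, mean = {xbar}, s = {s:.4f}")
print(f"t = {t_stat:.4f} with {df} degrees of freedom")
```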

In: Statistics and Probability

A programming team is in the process of testing a new software module. As part of...

A programming team is in the process of testing a new software module. As part of the effort, they need to estimate the success rate of the module when used with a particular operating system. To do this, they plan to run the module on a randomly selected set of computers, record how many individual runs execute properly, and use that result to calculate the sample success rate (p-hat, the number of successes divided by the total number of tests). Assuming a confidence level of 99%, calculate n, the number of computers they need to use for the test in order to ensure a 0.03 margin of error in the success rate. Calculate n for the following two cases: (1) no assumption is made about the value of the sample success rate, and (2) in a recent test of a similar software module, that module ran successfully in 94% of the tests. Round your answers upward to the next higher integer.

(1) If no assumptions are made about the sample success rate, the sample size required to ensure a margin of error of 0.03 is n = ________.

(2) If it is assumed that the new module will run successfully in roughly 94% of the tests, the sample size required to ensure a margin of error of 0.03 is n = ________.
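
Both cases use the usual sample-size formula for a proportion, n = z² p(1 - p) / E², rounded up. The sketch below is one way to evaluate it; `required_n` is a hypothetical helper name, and the critical value z is taken from `statistics.NormalDist` rather than a rounded table value.

```python
import math
from statistics import NormalDist

def required_n(p, margin, confidence):
    """Smallest n giving the stated margin of error for a proportion.

    Hypothetical helper: uses n = z^2 * p * (1 - p) / margin^2, rounded up.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p * (1 - p) / margin**2)

# Case (1): no prior estimate, so use the most conservative value p = 0.5.
n_no_assumption = required_n(0.5, 0.03, 0.99)

# Case (2): prior estimate of a 94% success rate.
n_prior = required_n(0.94, 0.03, 0.99)

print(n_no_assumption, n_prior)
```

A course that uses a rounded table value for z (for example 2.575) may arrive at a slightly different n in case (1), so the expected answer depends on the rounding convention in use.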

In: Statistics and Probability