Using the mtcars dataset, answer the following questions:
Fill in the following table:
Variable | Correlation with mpg |
---|---|
cyl | -0.85216 |
disp | -0.84755 |
hp | -0.77617 |
drat | 0.681172 |
wt | -0.86766 |
qsec | 0.418684 |
vs | 0.664039 |
am | 0.599832 |
gear | 0.480285 |
carb | -0.55093 |
Model | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
Mazda RX4 | 21 | 6 | 160 | 110 | 3.9 | 2.62 | 16.46 | 0 | 1 | 4 | 4 | |
Mazda RX4 Wag | 21 | 6 | 160 | 110 | 3.9 | 2.875 | 17.02 | 0 | 1 | 4 | 4 | |
Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.32 | 18.61 | 1 | 1 | 4 | 1 | |
Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 | |
Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.44 | 17.02 | 0 | 0 | 3 | 2 | |
Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.46 | 20.22 | 1 | 0 | 3 | 1 | |
Duster 360 | 14.3 | 8 | 360 | 245 | 3.21 | 3.57 | 15.84 | 0 | 0 | 3 | 4 | |
Merc 240D | 24.4 | 4 | 146.7 | 62 | 3.69 | 3.19 | 20 | 1 | 0 | 4 | 2 | |
Merc 230 | 22.8 | 4 | 140.8 | 95 | 3.92 | 3.15 | 22.9 | 1 | 0 | 4 | 2 | |
Merc 280 | 19.2 | 6 | 167.6 | 123 | 3.92 | 3.44 | 18.3 | 1 | 0 | 4 | 4 | |
Merc 280C | 17.8 | 6 | 167.6 | 123 | 3.92 | 3.44 | 18.9 | 1 | 0 | 4 | 4 | |
Merc 450SE | 16.4 | 8 | 275.8 | 180 | 3.07 | 4.07 | 17.4 | 0 | 0 | 3 | 3 | |
Merc 450SL | 17.3 | 8 | 275.8 | 180 | 3.07 | 3.73 | 17.6 | 0 | 0 | 3 | 3 | |
Merc 450SLC | 15.2 | 8 | 275.8 | 180 | 3.07 | 3.78 | 18 | 0 | 0 | 3 | 3 | |
Cadillac Fleetwood | 10.4 | 8 | 472 | 205 | 2.93 | 5.25 | 17.98 | 0 | 0 | 3 | 4 | |
Lincoln Continental | 10.4 | 8 | 460 | 215 | 3 | 5.424 | 17.82 | 0 | 0 | 3 | 4 | |
Chrysler Imperial | 14.7 | 8 | 440 | 230 | 3.23 | 5.345 | 17.42 | 0 | 0 | 3 | 4 | |
Fiat 128 | 32.4 | 4 | 78.7 | 66 | 4.08 | 2.2 | 19.47 | 1 | 1 | 4 | 1 | |
Honda Civic | 30.4 | 4 | 75.7 | 52 | 4.93 | 1.615 | 18.52 | 1 | 1 | 4 | 2 | |
Toyota Corolla | 33.9 | 4 | 71.1 | 65 | 4.22 | 1.835 | 19.9 | 1 | 1 | 4 | 1 | |
Toyota Corona | 21.5 | 4 | 120.1 | 97 | 3.7 | 2.465 | 20.01 | 1 | 0 | 3 | 1 | |
Dodge Challenger | 15.5 | 8 | 318 | 150 | 2.76 | 3.52 | 16.87 | 0 | 0 | 3 | 2 | |
AMC Javelin | 15.2 | 8 | 304 | 150 | 3.15 | 3.435 | 17.3 | 0 | 0 | 3 | 2 | |
Camaro Z28 | 13.3 | 8 | 350 | 245 | 3.73 | 3.84 | 15.41 | 0 | 0 | 3 | 4 | |
Pontiac Firebird | 19.2 | 8 | 400 | 175 | 3.08 | 3.845 | 17.05 | 0 | 0 | 3 | 2 | |
Fiat X1-9 | 27.3 | 4 | 79 | 66 | 4.08 | 1.935 | 18.9 | 1 | 1 | 4 | 1 | |
Porsche 914-2 | 26 | 4 | 120.3 | 91 | 4.43 | 2.14 | 16.7 | 0 | 1 | 5 | 2 | |
Lotus Europa | 30.4 | 4 | 95.1 | 113 | 3.77 | 1.513 | 16.9 | 1 | 1 | 5 | 2 | |
Ford Pantera L | 15.8 | 8 | 351 | 264 | 4.22 | 3.17 | 14.5 | 0 | 1 | 5 | 4 | |
Ferrari Dino | 19.7 | 6 | 145 | 175 | 3.62 | 2.77 | 15.5 | 0 | 1 | 5 | 6 | |
Maserati Bora | 15 | 8 | 301 | 335 | 3.54 | 3.57 | 14.6 | 0 | 1 | 5 | 8 | |
Volvo 142E | 21.4 | 4 | 121 | 109 | 4.11 | 2.78 | 18.6 | 1 | 1 | 4 | 2 | |
correlation | | -0.85216 | -0.84755 | -0.77617 | 0.681172 | -0.86766 | 0.418684 | 0.664039 | 0.599832 | 0.480285 | -0.55093 |
Which of the variables is the best predictor of mpg? Justify your answer using the correlation coefficient and a scatterplot.
Fit a regression model based on your answer to question 2. Write your model below.
Is the slope significant? Justify your answer.
Interpret the slope of your regression model.
Interpret the r² value.
Do you believe your model is a good model for predicting mpg? Justify your answer.
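The regression questions above can be checked with a short Python sketch. It assumes wt is chosen as the predictor because its correlation with mpg (-0.86766) is the largest in absolute value in the table; the variable names and the hand-coded least-squares formulas are illustrative choices, not part of the original question.

```python
import math

# mpg and wt columns copied from the mtcars table above (32 cars, same order).
mpg = [21, 21, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,
       16.4, 17.3, 15.2, 10.4, 10.4, 14.7, 32.4, 30.4, 33.9, 21.5, 15.5,
       15.2, 13.3, 19.2, 27.3, 26, 30.4, 15.8, 19.7, 15, 21.4]
wt = [2.62, 2.875, 2.32, 3.215, 3.44, 3.46, 3.57, 3.19, 3.15, 3.44, 3.44,
      4.07, 3.73, 3.78, 5.25, 5.424, 5.345, 2.2, 1.615, 1.835, 2.465, 3.52,
      3.435, 3.84, 3.845, 1.935, 2.14, 1.513, 3.17, 2.77, 3.57, 2.78]

n = len(mpg)
mean_x = sum(wt) / n
mean_y = sum(mpg) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in wt)
syy = sum((y - mean_y) ** 2 for y in mpg)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(wt, mpg))

r = sxy / math.sqrt(sxx * syy)          # correlation coefficient
slope = sxy / sxx                       # least-squares slope
intercept = mean_y - slope * mean_x     # least-squares intercept
r_squared = r ** 2

# Standard error of the slope and t statistic for H0: slope = 0 (n - 2 df).
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(wt, mpg))
se_slope = math.sqrt(sse / (n - 2) / sxx)
t_slope = slope / se_slope

print(f"mpg = {intercept:.4f} + ({slope:.4f}) * wt")
print(f"r = {r:.5f}, r^2 = {r_squared:.4f}, t = {t_slope:.3f}")
```

To judge significance, |t| can be compared with the two-sided 5% critical value for 30 degrees of freedom, which is about 2.042 in standard t tables. A scatterplot of mpg against wt is still needed to confirm that a straight line is a reasonable shape.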
In: Math
1. 100,000 Massachusetts adults were randomly sampled with two factors recorded: whether or not the individual
had diabetes, and whether or not the person ate Kale. The following gives a table of the results.
Group | Diabetes | No Diabetes |
---|---|---|
Kale | 796 | 9187 |
No Kale | 9900 | 80117 |
(a) We write p(Diabetes | Kale) for the probability that a Massachusetts adult who eats kale has diabetes.
Either give a value for p(Diabetes | Kale) or explain why it cannot be computed.
(b) We write p̂(Diabetes | Kale) for the proportion of Kale-eating members of our sample above that had
diabetes. Either give a value for p̂(Diabetes | Kale) or explain why it cannot be computed.
(c) Give 95% confidence intervals for the probability of having diabetes for both the kale-eating and
non-kale-eating members of our Massachusetts adults.
(d) Can you conclude that the kale-eaters are less likely to have diabetes? Explain your reasoning.
(e) Can you conclude that kale consumption causes a lower diabetes rate in this population? Explain your
reasoning.
(f) Come up with a possible theory that explains why kale-eaters have a lower rate of diabetes, but does not
assume that kale causes the lower rate.
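For parts (b) and (c), the sample proportions and intervals can be sketched in Python as follows. This assumes the usual large-sample (Wald) interval with z = 1.96; the function and variable names are hypothetical, and other interval methods would give slightly different endpoints.

```python
import math

# (diabetes, no diabetes) counts from the table above.
counts = {
    "kale": (796, 9187),
    "no_kale": (9900, 80117),
}
Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def wald_interval(successes, failures, z=Z_95):
    """Return (p_hat, lower, upper) for a large-sample proportion interval."""
    n = successes + failures
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - half_width, p_hat + half_width


results = {group: wald_interval(*c) for group, c in counts.items()}
for group, (p_hat, lower, upper) in results.items():
    print(f"{group}: p_hat = {p_hat:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Whether the two intervals overlap bears on part (d). It does not settle part (e), because the data are observational.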
In: Math
I do not have a strong background in probability. Please present an easy-to-understand solution to this problem with a detailed explanation. Thank you.
An urn contains three white, six red, and five black balls. Six of these balls are randomly selected from the urn. Let X and Y denote respectively the number of white and black balls selected. Compute the conditional probability mass function of X given that Y = 3. Also compute E[X|Y = 1].
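One way to reason about this: once Y = y is known, the other 6 - y selected balls are a random sample from the 9 non-black balls (3 white, 6 red), so X given Y = y is hypergeometric. The Python sketch below computes this with exact fractions; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

WHITE, RED, BLACK, DRAWN = 3, 6, 5, 6


def conditional_pmf_white(y):
    """P(X = x | Y = y): the other DRAWN - y balls come from the non-black balls."""
    remaining = DRAWN - y
    total = comb(WHITE + RED, remaining)  # ways to choose the non-black balls
    return {
        x: Fraction(comb(WHITE, x) * comb(RED, remaining - x), total)
        for x in range(min(WHITE, remaining) + 1)
    }


pmf_y3 = conditional_pmf_white(3)
expected_y1 = sum(x * p for x, p in conditional_pmf_white(1).items())

for x, p in pmf_y3.items():
    print(f"P(X = {x} | Y = 3) = {p}")
print(f"E[X | Y = 1] = {expected_y1}")
```

The expectation can be cross-checked against the hypergeometric mean: 5 draws from 9 balls of which 3 are white gives 5 × 3/9.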
In: Math
Describe and discuss at least 1 other business scenario in which you believe Chi-square testing would be helpful to a company.
Use the following data to conduct a Chi-square test for each region of the company
Region | Expected | Actual |
---|---|---|
Southeast | | |
Defined | 100 | 98 |
Open | 100 | 104 |
Northeast | | |
Defined | 150 | 188 |
Open | 150 | 214 |
Midwest | | |
Defined | 125 | 120 |
Open | 125 | 108 |
Pacific | | |
Defined | 200 | 205 |
Open | 200 | 278 |
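The per-region statistic, the sum of (Actual - Expected)² / Expected over the two rows, can be sketched in Python as below. The p-value helper assumes 1 degree of freedom (two categories per region), which is an assumption about how the test is meant to be set up; the names are hypothetical.

```python
import math

# (expected, actual) pairs for the Defined and Open rows of each region.
data = {
    "Southeast": {"Defined": (100, 98), "Open": (100, 104)},
    "Northeast": {"Defined": (150, 188), "Open": (150, 214)},
    "Midwest": {"Defined": (125, 120), "Open": (125, 108)},
    "Pacific": {"Defined": (200, 205), "Open": (200, 278)},
}


def chi_square(cells):
    """Sum of (actual - expected)^2 / expected over the cells."""
    return sum((actual - expected) ** 2 / expected for expected, actual in cells)


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 df (assumed df)."""
    return math.erfc(math.sqrt(stat / 2))


results = {region: chi_square(rows.values()) for region, rows in data.items()}
for region, stat in results.items():
    print(f"{region}: chi-square = {stat:.3f}, p (df = 1) = {p_value_df1(stat):.4g}")
```

With 1 degree of freedom the 5% critical value is 3.841, so a region whose statistic exceeds that would be flagged as departing from expectations under this setup.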
In: Math
For 300 trading days, the daily closing price of a stock (in $) is well modeled by a Normal model with mean $195.89 and standard deviation $7.18.
According to this model, what cutoff value of price would separate the
a) lowest 15% of the days?
b) highest 0.61%?
c) middle 80%?
d) highest 50%?
c) Select the correct answer below and fill in the answer box(es) within your choice.
A. The cutoff points are _________ and _________.
(Use ascending order. Round to two decimal places as needed.)
B. The cutoff point is _________. (Round to two decimal places as needed.)
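Each cutoff is a quantile of the Normal model, so the four parts can be sketched with the inverse CDF in Python's standard library. The dictionary keys are just labels for the parts; "middle 80%" is read as the 10th and 90th percentiles.

```python
from statistics import NormalDist

# Normal model for the daily closing price, from the problem statement.
price = NormalDist(mu=195.89, sigma=7.18)

cutoffs = {
    "a": price.inv_cdf(0.15),                           # lowest 15% lie below this
    "b": price.inv_cdf(1 - 0.0061),                     # highest 0.61% lie above this
    "c": (price.inv_cdf(0.10), price.inv_cdf(0.90)),    # middle 80% lie between these
    "d": price.inv_cdf(0.50),                           # highest 50% lie above the mean
}

for part, value in cutoffs.items():
    print(part, value)
```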
In: Math
A concrete service owns 3 mixer trucks to deliver concrete to work sites in a local area. For each order, a truck is loaded and weighed with mix at the single station for the order. Then the truck makes the delivery and returns for the next load. Assume that there is always an order for each truck. The delivery times are 15 minutes plus an exponential amount with mean 30 minutes and load/weigh times are 8 minutes. This is a variation on the single server queue with a fixed population of trucks. Trucks are served at the loader and return to it as arrivals following their delivery. Simulate this system by hand for 3 hours simulated time and report on the number of deliveries completed. The trucks begin empty, ready to be loaded.
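The problem asks for a hand simulation, but the bookkeeping can be checked against a small Python sketch of the same logic. It makes two assumptions that are not stated in the problem: the loader serves trucks first-come, first-served, and a delivery counts as completed when the truck is back at the loader within the 180 minutes. The function name and the seed are hypothetical; a hand simulation would use its own random numbers.

```python
import heapq
import random


def simulate_trucks(horizon=180.0, n_trucks=3, load_time=8.0,
                    fixed_delivery=15.0, mean_extra=30.0, seed=0):
    """Count deliveries finished within `horizon` minutes (one random run)."""
    rng = random.Random(seed)
    # Times at which each truck joins the loader queue; all trucks start at 0.
    arrivals = [0.0] * n_trucks
    heapq.heapify(arrivals)
    loader_free_at = 0.0
    completed = 0
    while arrivals:
        arrive = heapq.heappop(arrivals)        # earliest arrival is served next
        start = max(arrive, loader_free_at)     # wait if the loader is busy
        loader_free_at = start + load_time      # load/weigh takes 8 minutes
        # Delivery time: 15 minutes plus an exponential amount with mean 30.
        delivery = fixed_delivery + rng.expovariate(1.0 / mean_extra)
        back = loader_free_at + delivery
        if back <= horizon:
            completed += 1
            heapq.heappush(arrivals, back)      # truck rejoins the queue
    return completed


print(simulate_trucks(seed=1))
```

A single run is only one realization; repeating with different seeds gives a sense of how much the count varies.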
In: Math
1. A $1 scratch-off lotto ticket is a winner one time out of 10. For a shipment of n = 200 lotto tickets, use the Poisson distribution for parts a, b, and c to find the probability that there are:
a. somewhere between 75 and 95 prizes.
b. somewhere between 15 and 25 prizes.
c. more than 50 prizes.
d. If a customer keeps buying tickets until she finds a winner, find the probability that her 10th ticket will be a winner.
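A Python sketch of these calculations follows. It assumes the Poisson mean is λ = np = 200 × 0.1 = 20, that "between" includes both endpoints, and that part d is a geometric probability (nine losers followed by a winner); the helper names are hypothetical.

```python
import math

N_TICKETS, P_WIN = 200, 0.10
LAM = N_TICKETS * P_WIN  # Poisson mean, lambda = n * p


def poisson_pmf(k, lam=LAM):
    """P(X = k) for a Poisson variable, computed on the log scale for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_between(lo, hi, lam=LAM):
    """P(lo <= X <= hi), endpoints included (an assumption about 'between')."""
    return sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))


p_a = poisson_between(75, 95)
p_b = poisson_between(15, 25)
p_c = 1.0 - poisson_between(0, 50)      # more than 50 prizes
p_d = (1 - P_WIN) ** 9 * P_WIN          # first winner on the 10th ticket

print(p_a, p_b, p_c, p_d)
```

Parts a and c sit far in the upper tail of a Poisson(20) distribution, so their probabilities are extremely small.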
In: Math
Each of three barrels from a manufacturing line is classified as either above (a) or below (b) the target weight. Provide the ordered sample space.
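The ordered sample space is every length-3 sequence of a and b, which a short Python sketch can enumerate (the variable name is hypothetical):

```python
from itertools import product

# Every ordered classification of the three barrels: 2 * 2 * 2 = 8 outcomes.
sample_space = ["".join(outcome) for outcome in product("ab", repeat=3)]
print(sample_space)
```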
In: Math
An online poll at a popular web site asked the following:
A nationwide ban of the diet supplement ephedra went into effect recently. The herbal stimulant has been linked to 155 deaths and many more heart attacks and strokes. Ephedra manufacturer NVE Pharmaceuticals, claiming that the FDA lacked proof that ephedra is dangerous if used as directed, was denied a temporary restraining order on the ban yesterday by a federal judge. Do you think that ephedra should continue to be banned nationwide?
65% of the 17,303 respondents said “yes.” Comment on each of the following statements about this poll in one or two complete sentences.
(a) With a sample size that large, we can be pretty certain we know the true proportion of Americans who think ephedra should be banned.
(b) The wording of the question is clearly very biased.
(c) The sampling frame is all Internet users.
(d) This is a voluntary response survey, so the results can’t be reliably generalized to any population of interest.
In: Math
Describe how and why you should partition your data when using classification techniques like k-nearest neighbors and logistic regression.
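As an illustration of the "how", a minimal Python sketch of a random train/validation/test partition is given below. The 60/20/20 split, the seed, and the function name are assumptions chosen for the example, not a prescribed standard.

```python
import random


def partition(rows, train=0.6, valid=0.2, seed=42):
    """Shuffle the rows, then cut them into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # shuffle so each set is a random sample
    n_train = int(len(shuffled) * train)
    n_valid = int(len(shuffled) * valid)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])


train_set, valid_set, test_set = partition(range(100))
print(len(train_set), len(valid_set), len(test_set))
```

The model is fit on the training set, tuning choices such as k are compared on the validation set, and the test set is held back for a final estimate of performance on unseen data.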
In: Math
Eight pawns are placed at random on an 8 × 8 chessboard, with all configurations equally likely. What is the probability of each of these events:
1. The pawns are in a straight line. (Don’t forget the diagonals!)
2. All the pawns occupy white squares.
3. No two pawns share the same row.
4. No two pawns share the same row or the same column.
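These four probabilities are counting problems over the C(64, 8) equally likely placements, sketched below in Python with exact fractions. The sketch assumes a "straight line" of eight pawns means a full row, a full column, or one of the two long diagonals, and it reads event 4 as "no two pawns share the same row or the same column"; the variable names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

total = comb(64, 8)  # ways to choose 8 of the 64 squares

# 1. Eight in a line: 8 rows + 8 columns + 2 long diagonals (assumed reading).
p_line = Fraction(8 + 8 + 2, total)

# 2. All pawns on white squares: choose 8 of the 32 white squares.
p_white = Fraction(comb(32, 8), total)

# 3. No two pawns in the same row: one pawn per row, 8 column choices each.
p_distinct_rows = Fraction(8 ** 8, total)

# 4. No shared row or column: one pawn per row, columns form a permutation.
p_rows_and_cols = Fraction(factorial(8), total)

for name, p in [("line", p_line), ("white", p_white),
                ("rows", p_distinct_rows), ("rows+cols", p_rows_and_cols)]:
    print(name, p, float(p))
```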
In: Math
7. A student measures the mass percent chloride in their unknown four times by each of two methods. (a) Find the mean, standard deviation, and 95% confidence limit for each method. (b) Determine if each method’s result is “significantly” different from the expected value of 48.86% Chloride. (c) Use the F test to decide whether the standard deviations are “significantly” different. (d) Use the t test to decide whether the means are different from one another at the 95% confidence level.
Method | Replicate Measurements | | | | |
---|---|---|---|---|---|
A | 47.62% | 47.91% | 47.83% | 47.79% | 48.01% |
B | 47.11% | 49.63% | 48.72% | 49.17% | 47.99% |
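Parts (a) through (c) can be sketched in Python as below. The sketch uses all five values listed for each method, so it hard-codes the tabulated two-sided 95% t value for 4 degrees of freedom (2.776); that constant and the names are assumptions tied to n = 5. Part (d) is left out because the choice of t test depends on the outcome of the F test.

```python
import math
from statistics import mean, stdev

EXPECTED = 48.86        # expected mass percent chloride
T_CRIT_95_DF4 = 2.776   # tabulated two-sided 95% t value for 4 df (n = 5)

method_a = [47.62, 47.91, 47.83, 47.79, 48.01]
method_b = [47.11, 49.63, 48.72, 49.17, 47.99]


def summarize(values):
    """Mean, sample SD, 95% confidence limits, and t statistic vs. EXPECTED."""
    n = len(values)
    m, s = mean(values), stdev(values)
    half_width = T_CRIT_95_DF4 * s / math.sqrt(n)
    t_vs_expected = abs(m - EXPECTED) * math.sqrt(n) / s
    return {"mean": m, "sd": s, "ci": (m - half_width, m + half_width),
            "t": t_vs_expected}


summary_a = summarize(method_a)
summary_b = summarize(method_b)

# F statistic: larger variance over smaller variance.
sds = sorted([summary_a["sd"], summary_b["sd"]])
f_ratio = sds[1] ** 2 / sds[0] ** 2

print(summary_a)
print(summary_b)
print("F =", f_ratio)
```

A method differs significantly from 48.86% when its t statistic exceeds 2.776. The F ratio is compared with the tabulated critical F for 4 and 4 degrees of freedom.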
In: Math
A manufacturer of television sets is interested in the effect on tube conductivity of four different types of coating for color picture tubes. A completely randomized experiment is conducted, and the following conductivity data are obtained:
Coating type | Conductivity | | | |
---|---|---|---|---|
1 | 143 | 141 | 150 | 146 |
2 | 152 | 149 | 137 | 143 |
3 | 134 | 136 | 132 | 127 |
4 | 129 | 127 | 132 | 129 |
a) Is there a difference in conductivity due to coating type? Use α = 0.05.
b) Estimate the overall mean and the treatment effects.
c) Calculate a 95% confidence interval for the mean of coating type 4 and a 99% confidence interval for the difference between the means of coating types 1 and 4.
d) Test all pairs of means using the Fisher LSD method with α= 0.05.
e) Use the graphical method discussed in section 3-5.3 to compare the means. What is the type of coating which produces the highest conductivity?
f) Assuming that coating type 4 is currently being used, what would you recommend to the manufacturer, who wants to minimize conductivity?
g) Analyze the residuals and draw conclusions about the adequacy of the model.
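Parts a) and b) rest on the one-way ANOVA decomposition, which can be sketched in Python as follows. The critical value 3.49 is the tabulated F for α = 0.05 with 3 and 12 degrees of freedom; the variable names are hypothetical, and the remaining parts (intervals, LSD, residual analysis) are not covered here.

```python
# Conductivity observations by coating type, from the table above.
conductivity = {
    1: [143, 141, 150, 146],
    2: [152, 149, 137, 143],
    3: [134, 136, 132, 127],
    4: [129, 127, 132, 129],
}
F_CRIT_05_3_12 = 3.49  # tabulated F(0.05; 3, 12)

all_values = [x for group in conductivity.values() for x in group]
grand_mean = sum(all_values) / len(all_values)
group_means = {k: sum(v) / len(v) for k, v in conductivity.items()}

# Estimated treatment effects: group mean minus overall mean.
effects = {k: m - grand_mean for k, m in group_means.items()}

# Between-treatment and within-treatment (error) sums of squares.
ss_treat = sum(len(v) * (group_means[k] - grand_mean) ** 2
               for k, v in conductivity.items())
ss_error = sum((x - group_means[k]) ** 2
               for k, v in conductivity.items() for x in v)

df_treat = len(conductivity) - 1
df_error = len(all_values) - len(conductivity)
f_stat = (ss_treat / df_treat) / (ss_error / df_error)

print("grand mean:", grand_mean)
print("effects:", effects)
print("F =", f_stat, "reject H0:", f_stat > F_CRIT_05_3_12)
```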
In: Math
Let t₀ be a specific value of t. Use the table of critical values of t to find t₀-values such that the following statements are true.
a. P(t ≥ t₀) = .025, where df = 11
b. P(t ≥ t₀) = .01, where df = 18
c. P(t ≤ t₀) = .005, where df = 7
d. P(t ≤ t₀) = .05, where df = 14
In: Math
7. Two fair six-sided dice are rolled.
(i) What is the probability that the sum of the two results is 6?
(ii) What is the probability that the larger value of the two results is 4?
(iii) What is the probability that both results are at most 4?
(iv) What is the probability that the number 3 appears at least once?
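With only 36 equally likely outcomes, each part can be checked by direct enumeration. The Python sketch below does this with exact fractions; the helper name is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two fair six-sided dice.
rolls = list(product(range(1, 7), repeat=2))


def prob(event):
    """Fraction of the equally likely outcomes for which `event` holds."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))


p_sum_6 = prob(lambda r: sum(r) == 6)        # (i)
p_max_4 = prob(lambda r: max(r) == 4)        # (ii)
p_both_le_4 = prob(lambda r: max(r) <= 4)    # (iii)
p_some_3 = prob(lambda r: 3 in r)            # (iv)

print(p_sum_6, p_max_4, p_both_le_4, p_some_3)
```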
In: Math