The table below gives the list price and the number of bids received for five randomly selected items sold through online auctions. Using this data, consider the equation of the regression line, ŷ = b0 + b1x, for predicting the number of bids an item will receive based on the list price. Keep in mind that the correlation coefficient may or may not be statistically significant for the data given. Remember, in practice, it would not be appropriate to use the regression line to make a prediction if the correlation coefficient is not statistically significant.
Price in Dollars | 23 | 25 | 32 | 36 | 40 |
Number of Bids | 2 | 4 | 5 | 8 | 10 |
Step 4 of 6: Find the estimated value of y when x = 23. Round your answer to three decimal places.
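One way to check the arithmetic for this step is a short least-squares sketch in Python. The variable names are illustrative only, and the sketch does not test whether the correlation is statistically significant, which the question says matters before using the line in practice.

```python
# Least-squares fit of number of bids (y) on list price (x) for the five items.
prices = [23, 25, 32, 36, 40]
bids = [2, 4, 5, 8, 10]

n = len(prices)
mean_x = sum(prices) / n
mean_y = sum(bids) / n

# Slope b1 = Sxy / Sxx, intercept b0 = mean_y - b1 * mean_x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(prices, bids))
s_xx = sum((x - mean_x) ** 2 for x in prices)
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x

# Estimated number of bids at a list price of 23 dollars.
y_hat_23 = b0 + b1 * 23
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, y_hat(23) = {y_hat_23:.3f}")
```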
In: Statistics and Probability
The following density function describes a random variable X: f(x) = x/64 if 0 < x < 8, and f(x) = (16 - x)/64 if 8 < x < 16.
A. Find the probability that X lies between 2 and 6.
B. Find the probability that X lies between 5 and 12.
C. Find the probability that X is less than 11.
D. Find the probability that X is greater than 4.
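Integrating the density gives a piecewise cumulative distribution function, and each part is then a difference of CDF values. The sketch below assumes the density is zero outside (0, 16); the helper name cdf is made up for illustration.

```python
def cdf(x: float) -> float:
    """CDF of f(x) = x/64 on (0, 8) and (16 - x)/64 on (8, 16)."""
    if x <= 0:
        return 0.0
    if x <= 8:
        return x * x / 128             # integral of t/64 from 0 to x
    if x < 16:
        return 1 - (16 - x) ** 2 / 128  # 1 minus the remaining upper-tail area
    return 1.0

p_a = cdf(6) - cdf(2)    # P(2 < X < 6)
p_b = cdf(12) - cdf(5)   # P(5 < X < 12)
p_c = cdf(11)            # P(X < 11)
p_d = 1 - cdf(4)         # P(X > 4)
print(p_a, p_b, p_c, p_d)
```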
In: Statistics and Probability
Question (4) [6 marks]
I am a firm believer in sketching things on tests/assignments. I think that sometimes the sketch can help remind you what you have to do next, or correctly guide your thought process. One of my TAs for a previous Statistics course gathered data from marked tests to determine if sketching had an influence on grades. For each student in the class, we recorded whether or not they sketched during a specific question, and also recorded whether or not they obtained the correct p-value. The results were as follows:
At the 10% level of significance, is there any statistical reason to believe that sketching is associated with grades?
Note: You can use the function qchisq() in R to help you solve the following. Why are we using the qchisq() function in R?
The qchisq() function in R allows us to specify a desired area in a tail and the number of degrees of freedom. From that information, qchisq() computes the x-value that gives the specified area in the specified tail for the specified number of degrees of freedom.
O | sketch | No sketch | Total |
correct | 50 | 30 | |
incorrect | 60 | 90 | |
Total | | | |

E | sketch | No sketch | Total |
correct | | | |
incorrect | | | |
Total | | | |
a) State the two hypotheses of interest.
b) Calculate an appropriate test statistic for (a) by hand.
χ² = Σ (O − E)² / E
c) Write your conclusion using the rejection region method (the “critical value method”). Include both a statistical interpretation and a practical interpretation related to the topic of the question. Use the function qchisq() in R.
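The by-hand calculation can be cross-checked with a small sketch. It is written in Python rather than the R the question asks for, and it obtains the critical value from the standard normal quantile, a shortcut that only works for 1 degree of freedom (a chi-square variable with 1 df is the square of a standard normal). No continuity correction is applied, matching the formula in part (b).

```python
from statistics import NormalDist

# Observed counts: rows are correct / incorrect, columns are sketch / no sketch.
observed = [[50, 30],
            [60, 90]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))

# df = (2 - 1) * (2 - 1) = 1, so the upper-tail critical value at alpha
# equals the squared standard normal quantile at 1 - alpha / 2.
alpha = 0.10
critical = NormalDist().inv_cdf(1 - alpha / 2) ** 2

reject = chi_sq > critical
print(f"chi-square = {chi_sq:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

In R, the same critical value would come from qchisq(0.90, df = 1).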
Question 5
In this question we’ll use the housetasks dataset from STHDA: http://www.sthda.com/sthda/RDoc/data/housetasks.txt. The dataset is a contingency table containing 13 house tasks and their distribution in the couple:
→ rows are the different tasks
→ values are the frequencies of the tasks done:
1) by the wife only
2) alternatively
3) by the husband only
4) or jointly
Using R, test whether the two variables (house tasks and their distribution in the couple) are statistically significantly associated (dependent) by answering the following questions.
a) (1 mark) State the two hypotheses of interest.
b) (0.5 mark) Import the data into R using the function read.table.
Note: show your R code but not the output (the dataset).
c) (0.5 mark) Calculate the Chi-square statistic using the function chisq.test() in R. Note: show your R code and output.
d) (1 mark) Using α = 0.05, write your conclusion using the p-value (include both a statistical interpretation and an interpretation related to the topic of the question).
DATA for Q5
Task | Wife | Alternating | Husband | Jointly |
Laundry | 156 | 14 | 2 | 4 |
Main_meal | 124 | 20 | 5 | 4 |
Dinner | 77 | 11 | 7 | 13 |
Breakfeast | 82 | 36 | 15 | 7 |
Tidying | 53 | 11 | 1 | 57 |
Dishes | 32 | 24 | 4 | 53 |
Shopping | 33 | 23 | 9 | 55 |
Official | 12 | 46 | 23 | 15 |
Driving | 10 | 51 | 75 | 3 |
Finances | 13 | 13 | 21 | 66 |
Insurance | 8 | 1 | 53 | 77 |
Repairs | 0 | 3 | 160 | 2 |
Holidays | 0 | 1 | 6 | 153 |
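The question asks for R's chisq.test(), so the following is only a cross-check sketch in Python: it recomputes the Pearson chi-square statistic and degrees of freedom directly from the table above. It does not compute the p-value, since the Python standard library has no chi-square distribution. The dictionary name housetasks simply mirrors the dataset name.

```python
# Counts per task in column order: Wife, Alternating, Husband, Jointly.
housetasks = {
    "Laundry":    [156, 14, 2, 4],
    "Main_meal":  [124, 20, 5, 4],
    "Dinner":     [77, 11, 7, 13],
    "Breakfeast": [82, 36, 15, 7],
    "Tidying":    [53, 11, 1, 57],
    "Dishes":     [32, 24, 4, 53],
    "Shopping":   [33, 23, 9, 55],
    "Official":   [12, 46, 23, 15],
    "Driving":    [10, 51, 75, 3],
    "Finances":   [13, 13, 21, 66],
    "Insurance":  [8, 1, 53, 77],
    "Repairs":    [0, 3, 160, 2],
    "Holidays":   [0, 1, 6, 153],
}

rows = list(housetasks.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
grand_total = sum(row_totals)

# Pearson statistic: sum over all cells of (O - E)^2 / E,
# with E = row total * column total / grand total.
chi_sq = 0.0
for row, r_tot in zip(rows, row_totals):
    for o, c_tot in zip(row, col_totals):
        e = r_tot * c_tot / grand_total
        chi_sq += (o - e) ** 2 / e

df = (len(rows) - 1) * (len(col_totals) - 1)
print(f"X-squared = {chi_sq:.1f}, df = {df}")
```

The statistic and df printed here should agree with the chisq.test() output from part (c).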
In: Statistics and Probability
We have a normal population of scores, with µ = 55 and σ = 17. If we select a random sample of 100 participants and obtain a mean of 57, is that a typical mean value for this distribution based on the values that cut off the middle 95%?
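A sketch of the calculation, assuming the two unreadable symbols in the question are the population mean (55) and standard deviation (17): the sample mean has standard error 17/√100, and the middle 95% of sample means lies within about 1.96 standard errors of 55.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 55, 17, 100     # assumed population mean, SD, and sample size
sample_mean = 57

standard_error = sigma / sqrt(n)
z_crit = NormalDist().inv_cdf(0.975)   # cuts off the middle 95%

lower = mu - z_crit * standard_error
upper = mu + z_crit * standard_error
is_typical = lower <= sample_mean <= upper

print(f"middle 95% of sample means: ({lower:.3f}, {upper:.3f}); typical: {is_typical}")
```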
In: Statistics and Probability
1. Think of three hypothetical datasets that you believe would have the binomial distribution, the uniform distribution, and the normal distribution. Use the textbook homework exercises as a reference, but as much as possible, use your own original examples. Try to be as realistic as possible. For example, 'height' is not a good example of data with uniform distribution, because you won't find the same number of people who are 7 feet tall as there are 5 feet tall.
2. Create a new post with the three examples you came up with.
Then for each of the examples, create problems for students to solve:
a) Binomial distribution. Write one problem each that's solved using the following: binomcdf, 1 - binomcdf, binompdf, μ = np.
In: Statistics and Probability
In the game of roulette, a steel ball is rolled onto a wheel that contains 18 red, 18 black, and 2 green slots. If the ball is rolled 38 times, find the probability of the following events.
A. The ball falls into the green slots 4 or more times.
B. The ball does not fall into any green slots.
C. The ball falls into black slots 15 or more times.
D. The ball falls into red slots 10 or fewer times.
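Each part is a binomial probability with n = 38 independent rolls. The sketch below builds the pmf and cdf from math.comb; the helper names binom_pmf and binom_cdf are illustrative, loosely mirroring the calculator functions binompdf and binomcdf.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n = 38
p_green, p_black, p_red = 2 / 38, 18 / 38, 18 / 38

p_a = 1 - binom_cdf(3, n, p_green)    # green 4 or more times
p_b = binom_pmf(0, n, p_green)        # no green at all
p_c = 1 - binom_cdf(14, n, p_black)   # black 15 or more times
p_d = binom_cdf(10, n, p_red)         # red 10 or fewer times

for label, prob in zip("ABCD", (p_a, p_b, p_c, p_d)):
    print(f"{label}: {prob:.4f}")
```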
In: Statistics and Probability
The amount of time Americans spend in front of a screen has a mean of 11 hours with a standard deviation of 2.7 hours. What is the probability that a randomly chosen American spends more than 12 hours in front of a screen? Represent the probability with a graph.
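The question does not name a distribution. The sketch below assumes screen time is normally distributed, which is the usual intent of this kind of exercise. It computes only the tail area; the graph would be the normal curve centred at 11 with the region to the right of 12 shaded.

```python
from statistics import NormalDist

# Assumption: screen time ~ Normal(mean 11 hours, SD 2.7 hours).
screen_time = NormalDist(mu=11, sigma=2.7)

z = (12 - 11) / 2.7                      # z-score of 12 hours
p_more_than_12 = 1 - screen_time.cdf(12)  # upper-tail area

print(f"z = {z:.2f}, P(X > 12) = {p_more_than_12:.4f}")
```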
In: Statistics and Probability
Consider the monthly time series shown in the table.
Month | t | Y |
January | 1 | 185 |
February | 2 | 192 |
March | 3 | 189 |
April | 4 | 201 |
May | 5 | 195 |
June | 6 | 199 |
July | 7 | 206 |
August | 8 | 203 |
September | 9 | 208 |
October | 10 | 209 |
November | 11 | 218 |
December | 12 | 216 |
In: Statistics and Probability
A national chain of women’s clothing stores with locations in the large shopping malls thinks that it can do a better job of planning more renovations and expansions if it understands what variables impact sales. It plans a small pilot study on stores in 25 different mall locations. The data it collects consist of monthly sales, store size (sq. ft), number of linear feet of window display, number of competitors located in the mall, size of the mall (sq. ft), and distance to nearest competitor (ft). Use Excel functions.
Sales | Size | Windows | Competitors | Mall Size | Nearest Competitor |
4453 | 3860 | 39 | 12 | 943700 | 227 |
4770 | 4150 | 41 | 15 | 532500 | 142 |
4821 | 3880 | 39 | 15 | 390500 | 263 |
4912 | 4000 | 39 | 13 | 545500 | 219 |
4774 | 4140 | 40 | 10 | 329600 | 232 |
4638 | 4370 | 48 | 14 | 802600 | 257 |
4076 | 3570 | 37 | 16 | 463300 | 241 |
3967 | 3870 | 39 | 16 | 855200 | 220 |
4000 | 4020 | 44 | 21 | 443000 | 188 |
4379 | 3990 | 38 | 16 | 613400 | 209 |
5761 | 4930 | 50 | 15 | 420300 | 220 |
3561 | 3540 | 34 | 15 | 626700 | 167 |
4145 | 3950 | 36 | 14 | 601500 | 187 |
4406 | 3770 | 36 | 12 | 593000 | 199 |
4972 | 3940 | 38 | 11 | 347100 | 204 |
4414 | 3590 | 35 | 10 | 355900 | 146 |
4363 | 4090 | 38 | 13 | 490100 | 206 |
4499 | 4580 | 45 | 16 | 649200 | 144 |
3573 | 3580 | 35 | 18 | 685900 | 178 |
5287 | 4380 | 42 | 15 | 106200 | 149 |
5339 | 4330 | 40 | 10 | 354900 | 231 |
4656 | 4060 | 37 | 11 | 598700 | 225 |
3943 | 3380 | 34 | 16 | 381800 | 163 |
5121 | 4760 | 44 | 17 | 597900 | 224 |
4557 | 3800 | 36 | 14 | 745300 | 195 |
In: Statistics and Probability
For a study conducted by the research department of a pharmaceutical company, 295 randomly selected individuals were asked to report the amount of money they spend annually on prescription allergy relief medication. The sample mean was found to be $17.60 with a standard deviation of $5.70. A random sample of 235 individuals was selected independently of the first sample. These individuals reported their annual spending on non-prescription allergy relief medication. The mean of the second sample was found to be $18.40 with a standard deviation of $4.40. As the sample sizes were quite large, it was assumed that the respective population standard deviations of the spending for prescription and non-prescription allergy relief medication could be estimated as the respective sample standard deviation values given above. Construct a 95% confidence interval for the difference between the mean spending on prescription allergy relief medication (µ1) and the mean spending on non-prescription allergy relief medication (µ2). Carry your intermediate computations to at least three decimal places. Round your answers to at least two decimal places. What is the lower limit of the 95% confidence interval? What is the upper limit of the 95% confidence interval?
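A sketch of the large-sample z interval for a difference of two means, treating the sample standard deviations as the population values as the question instructs. The difference is taken as prescription minus non-prescription; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample 1: prescription; sample 2: non-prescription.
n1, mean1, sd1 = 295, 17.60, 5.70
n2, mean2, sd2 = 235, 18.40, 4.40

diff = mean1 - mean2
standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
z_crit = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value

lower = diff - z_crit * standard_error
upper = diff + z_crit * standard_error

print(f"95% CI for the difference: ({lower:.2f}, {upper:.2f})")
```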
In: Statistics and Probability
A (very large) population has a mean µ of 800 and a standard deviation σ of 25. What is the probability that a sample mean x̄ will be within ± 5 units of the population mean for each of the following sample sizes?
a. n = 50
b. n = 75
c. n = 100
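A sketch for all three sample sizes, assuming the sample mean is approximately normal with standard error σ/√n (reasonable by the central limit theorem for samples this large). The function name prob_within is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

SIGMA = 25    # population standard deviation
MARGIN = 5    # "within +/- 5 units of the population mean"

def prob_within(n: int) -> float:
    """P(|sample mean - population mean| < MARGIN) for a sample of size n."""
    standard_error = SIGMA / sqrt(n)
    z = MARGIN / standard_error
    return 2 * NormalDist().cdf(z) - 1

for n in (50, 75, 100):
    print(f"n = {n}: {prob_within(n):.4f}")
```

The probability rises with n because the standard error shrinks, so the same ± 5 window covers more standard errors.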
In: Statistics and Probability
The taxi and takeoff time for commercial jets is a random variable x with a mean of 8.3 minutes and a standard deviation of 3.5 minutes. Assume that the distribution of taxi and takeoff times is approximately normal. You may assume that the jets are lined up on a runway so that one taxies and takes off immediately after the other, and that they take off one at a time on a given runway.
(a) What is the probability that for 37 jets on a given runway, total taxi and takeoff time will be less than 320 minutes? (Round your answer to four decimal places.)
(b) What is the probability that for 37 jets on a given runway, total taxi and takeoff time will be more than 275 minutes? (Round your answer to four decimal places.)
(c) What is the probability that for 37 jets on a given runway, total taxi and takeoff time will be between 275 and 320 minutes? (Round your answer to four decimal places.)
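A sketch of parts (a) through (c), under the added assumption that the 37 individual times are independent. The total is then normal with mean 37 × 8.3 and standard deviation 3.5 × √37.

```python
from math import sqrt
from statistics import NormalDist

n_jets = 37
mean_one, sd_one = 8.3, 3.5   # per-jet mean and SD in minutes

# Sum of independent normal times: means add, variances add.
total = NormalDist(mu=n_jets * mean_one, sigma=sd_one * sqrt(n_jets))

p_a = total.cdf(320)                     # total < 320 minutes
p_b = 1 - total.cdf(275)                 # total > 275 minutes
p_c = total.cdf(320) - total.cdf(275)    # 275 < total < 320 minutes

print(f"(a) {p_a:.4f}  (b) {p_b:.4f}  (c) {p_c:.4f}")
```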
In: Statistics and Probability
According to the Air Transport Association of America, the average operating cost of an MD-80 jet airliner is $2,087 per hour. Suppose the operating costs of an MD-80 jet airliner are normally distributed with a standard deviation of $163 per hour. (Round the value of z to 2 decimal places. Round your answers to 2 decimal places.)
(a) At what operating cost would only 20% of the operating costs be less?
(b) At what operating cost would 65% of the operating costs be more?
(c) What operating cost would be more than 85% of operating costs?
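Each part asks for a percentile of the normal cost distribution. The sketch below uses exact normal quantiles, so its results can differ slightly from the expected answers, which the exercise bases on z rounded to 2 decimal places.

```python
from statistics import NormalDist

# Operating cost per hour ~ Normal(mean 2087, SD 163), as stated in the question.
cost = NormalDist(mu=2087, sigma=163)

cost_a = cost.inv_cdf(0.20)   # 20% of costs fall below this value
cost_b = cost.inv_cdf(0.35)   # 65% of costs above means 35% below
cost_c = cost.inv_cdf(0.85)   # this value exceeds 85% of costs

print(f"(a) ${cost_a:.2f}  (b) ${cost_b:.2f}  (c) ${cost_c:.2f}")
```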
In: Statistics and Probability
Anyone who has been outdoors on a summer evening has probably heard crickets. Did you know that it is possible to use the cricket as a thermometer? Crickets tend to chirp more frequently as temperatures increase. This phenomenon was studied in detail by George W. Pierce, a physics professor at Harvard. In the following data, x is a random variable representing chirps per second and y is a random variable representing temperature (°F).
x | 19.4 | 16.6 | 20.6 | 17.7 | 16.3 | 15.5 | 14.7 | 17.1 | 15.4 | 16.2 | 15.0 | 17.2 | 16.0 | 17.0 | 14.4 |
y | 87.4 | 70.8 | 92.7 | 83.1 | 82.6 | 75.2 | 69.7 | 82.0 | 69.4 | 83.3 | 79.6 | 82.6 | 80.6 | 83.5 | 76.3 |
Complete parts (a) through (f), given Σx = 249.1, Σy = 1198.8, Σx² = 4176.81, Σy² = 96,414.66, Σxy = 20,030.86, and r ≈ 0.787.
(a) Draw a scatter diagram displaying the data.
(b) Verify the given sums Σx, Σy, Σx², Σy², Σxy, and the value of the sample correlation coefficient r. (Round your value for r to three decimal places.)
(c) Find x̄ and ȳ. Then find the equation of the least-squares line ŷ = a + bx. (Round your answers for x̄ and ȳ to two decimal places. Round your answers for a and b to three decimal places.)
(d) Graph the least-squares line. Be sure to plot the point (x̄, ȳ) as a point on the line.
(e) Find the value of the coefficient of determination r². What percentage of the variation in y can be explained by the corresponding variation in x and the least-squares line? What percentage is unexplained? (Round your answer for r² to three decimal places. Round your answers for the percentages to one decimal place.)
(f) What is the predicted temperature when x = 18.0 chirps per second? (Round your answer to two decimal places.)
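Parts (b), (c), (e), and (f) can be worked from the given sums alone. The sketch below uses the computational formulas for r and the least-squares coefficients, with n = 15 taken from the number of data pairs listed; variable names are illustrative.

```python
from math import sqrt

n = 15   # number of (x, y) pairs in the data
sum_x, sum_y = 249.1, 1198.8
sum_x2, sum_y2, sum_xy = 4176.81, 96414.66, 20030.86

# Computational forms: each is n times the corresponding sum of squares.
ss_xy = n * sum_xy - sum_x * sum_y
ss_xx = n * sum_x2 - sum_x ** 2
ss_yy = n * sum_y2 - sum_y ** 2

r = ss_xy / sqrt(ss_xx * ss_yy)   # sample correlation coefficient
b = ss_xy / ss_xx                 # slope
x_bar, y_bar = sum_x / n, sum_y / n
a = y_bar - b * x_bar             # intercept

r_squared = r ** 2                # share of variation in y explained by the line
predicted = a + b * 18.0          # temperature at 18.0 chirps per second

print(f"r = {r:.3f}, y_hat = {a:.3f} + {b:.3f}x, r^2 = {r_squared:.3f}")
print(f"predicted temperature at x = 18.0: {predicted:.2f} °F")
```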
In: Statistics and Probability
You are the foreman of the Bar-S cattle ranch in Colorado. A neighboring ranch has calves for sale, and you are going to buy some calves to add to the Bar-S herd. How much should a healthy calf weigh? Let x be the age of the calf (in weeks), and let y be the weight of the calf (in kilograms).
x | 1 | 5 | 8 | 16 | 26 | 36 |
y | 42 | 52 | 71 | 100 | 150 | 200 |
Complete parts (a) through (f), given Σx = 92, Σy = 615, Σx² = 2318, Σy² = 82,009, Σxy = 13,570, and r ≈ 0.998.
(a) Draw a scatter diagram displaying the data.
(b) Verify the given sums Σx, Σy, Σx², Σy², Σxy, and the value of the sample correlation coefficient r. (Round your value for r to three decimal places.)
(c) Find x̄ and ȳ. Then find the equation of the least-squares line ŷ = a + bx. (Round your answers for x̄ and ȳ to two decimal places. Round your answers for a and b to three decimal places.)
(d) Graph the least-squares line. Be sure to plot the point (x̄, ȳ) as a point on the line.
(e) Find the value of the coefficient of determination r². What percentage of the variation in y can be explained by the corresponding variation in x and the least-squares line? What percentage is unexplained? (Round your answer for r² to three decimal places. Round your answers for the percentages to one decimal place.)
(f) The calves you want to buy are 13 weeks old. What does the least-squares line predict for a healthy weight? (Round your answer to two decimal places.)
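With only six data pairs, the sums in part (b) can be verified directly from the raw data before fitting the line. The sketch below does that and then evaluates the fitted line at 13 weeks; variable names are illustrative.

```python
from math import sqrt

ages = [1, 5, 8, 16, 26, 36]            # x, weeks
weights = [42, 52, 71, 100, 150, 200]   # y, kilograms
n = len(ages)

# Part (b): recompute the sums given in the question.
sum_x, sum_y = sum(ages), sum(weights)
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in weights)
sum_xy = sum(x * y for x, y in zip(ages, weights))

ss_xy = n * sum_xy - sum_x * sum_y
ss_xx = n * sum_x2 - sum_x ** 2
ss_yy = n * sum_y2 - sum_y ** 2

r = ss_xy / sqrt(ss_xx * ss_yy)         # sample correlation coefficient
b = ss_xy / ss_xx                       # slope, kg per week
a = sum_y / n - b * sum_x / n           # intercept, kg

predicted = a + b * 13                  # predicted weight at 13 weeks
print(f"sums: {sum_x}, {sum_y}, {sum_x2}, {sum_y2}, {sum_xy}")
print(f"r = {r:.3f}, y_hat = {a:.3f} + {b:.3f}x, weight at 13 weeks = {predicted:.2f} kg")
```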
In: Statistics and Probability