Price | Age (months) | Mileage (km) | Weight (kg) |
13500 | 23 | 46986 | 1165 |
13750 | 23 | 72937 | 1165 |
13950 | 24 | 41711 | 1165 |
14950 | 26 | 48000 | 1165 |
13750 | 30 | 38500 | 1170 |
12950 | 32 | 61000 | 1170 |
16900 | 27 | 94612 | 1245 |
18600 | 30 | 75889 | 1245 |
21500 | 27 | 19700 | 1185 |
12950 | 23 | 71138 | 1105 |
20950 | 25 | 31461 | 1185 |
19950 | 22 | 43610 | 1185 |
19600 | 25 | 32189 | 1185 |
21500 | 31 | 23000 | 1185 |
22500 | 32 | 34131 | 1185 |
22000 | 28 | 18739 | 1185 |
22750 | 30 | 34000 | 1185 |
17950 | 24 | 21716 | 1105 |
16750 | 24 | 25563 | 1065 |
16950 | 30 | 64359 | 1105 |
15950 | 30 | 67660 | 1105 |
16950 | 29 | 43905 | 1170 |
15950 | 28 | 56349 | 1120 |
16950 | 28 | 32220 | 1120 |
16250 | 29 | 25813 | 1120 |
15950 | 25 | 28450 | 1120 |
17495 | 27 | 34545 | 1120 |
15750 | 29 | 41415 | 1120 |
16950 | 28 | 44142 | 1120 |
17950 | 30 | 11090 | 1120 |
12950 | 29 | 9750 | 1100 |
15750 | 22 | 35199 | 1100 |
15950 | 27 | 29510 | 1100 |
14950 | 26 | 32692 | 1100 |
15500 | 22 | 41000 | 1100 |
15750 | 26 | 43000 | 1100 |
15950 | 25 | 25000 | 1100 |
14950 | 23 | 10000 | 1100 |
15750 | 32 | 25329 | 1100 |
14750 | 27 | 27500 | 1100 |
13950 | 22 | 49059 | 1100 |
16750 | 27 | 44068 | 1100 |
13950 | 22 | 46961 | 1100 |
16950 | 27 | 110404 | 1255 |
16950 | 22 | 100250 | 1255 |
19000 | 23 | 84000 | 1270 |
17950 | 27 | 79375 | 1255 |
15800 | 22 | 75048 | 1110 |
17950 | 22 | 72215 | 1255 |
21950 | 31 | 64982 | 1195 |
17950 | 22 | 62636 | 1255 |
15750 | 30 | 57086 | 1110 |
20500 | 26 | 56000 | 1180 |
21950 | 27 | 49866 | 1195 |
15500 | 25 | 49163 | 1165 |
13250 | 32 | 45725 | 1075 |
15250 | 28 | 43210 | 1110 |
15250 | 26 | 43000 | 1110 |
18950 | 23 | 39704 | 1180 |
15999 | 30 | 38950 | 1130 |
14950 | 22 | 37400 | 1110 |
16500 | 27 | 37177 | 1130 |
18750 | 31 | 36544 | 1130 |
17950 | 30 | 33511 | 1130 |
17950 | 27 | 32809 | 1110 |
16950 | 26 | 32181 | 1075 |
18950 | 28 | 30993 | 1130 |
14950 | 22 | 30400 | 1110 |
22250 | 22 | 30000 | 1275 |
15950 | 25 | 29719 | 1110 |
15950 | 28 | 29206 | 1110 |
12995 | 32 | 29198 | 1060 |
18950 | 28 | 28817 | 1130 |
15750 | 23 | 28227 | 1110 |
19950 | 28 | 28000 | 1130 |
16950 | 23 | 28000 | 1115 |
18750 | 31 | 25266 | 1130 |
18450 | 27 | 23489 | 1115 |
16895 | 29 | 22575 | 1115 |
14900 | 30 | 22000 | 1110 |
18950 | 25 | 20019 | 1180 |
17250 | 29 | 20000 | 1115 |
15450 | 25 | 17003 | 1110 |
17950 | 31 | 16238 | 1180 |
16650 | 25 | 15414 | 1110 |
17450 | 28 | 8537 | 1130 |
14900 | 30 | 7000 | 1100 |
17950 | 20 | 66966 | 1245 |
15950 | 19 | 51884 | 1100 |
21950 | 19 | 50005 | 1265 |
16450 | 20 | 48110 | 1100 |
22250 | 20 | 37500 | 1260 |
19950 | 16 | 34472 | 1260 |
15950 | 20 | 33329 | 1100 |
18900 | 20 | 31850 | 1120 |
19950 | 17 | 30351 | 1260 |
15950 | 19 | 29435 | 1100 |
15950 | 19 | 25948 | 1100 |
18750 | 11 | 24500 | 1120 |
Data Set Preparation
1. Using the “Toyota Corolla” data set on D2L (Content → “JMP” → “JMP Data Sets” folder), we will analyze the “Price” of a car as the dependent variable (Y). Select one independent variable (X) that you think may help explain Price from the following three: “Age”, “Mileage”, or “Weight” of a car. In the space below, state your choice and explain why you chose it.
2. Randomly select a subset (sample) of 100 observations from the data file using the commands: Tables → Subset → Random - sample size: → (select 100 observations). After doing so, please use the newly created data window (your randomly selected subset), and move on to the Data Exploration section below.
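For readers working outside JMP, the random subset step in item 2 can be sketched in Python. This is only an illustration under stated assumptions: `random_subset` is a hypothetical helper, and `full_table` is a placeholder list of row indices whose length (500) is made up, not the size of the real data file.

```python
import random


def random_subset(rows, n=100, seed=None):
    """Draw n rows without replacement (hypothetical helper, not a JMP call)."""
    if n > len(rows):
        raise ValueError("sample size exceeds the number of rows")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(rows, n)


# Placeholder for the full data table: row indices only, length is an assumption.
full_table = list(range(500))
subset = random_subset(full_table, n=100, seed=42)
print(len(subset), len(set(subset)))  # sample size, number of distinct rows
```

Sampling without replacement means no observation can appear twice in the subset.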
Data Exploration
3. Explore the dependent variable (Price) and independent variable visually, by creating histograms for each one. Paste them below. Under each histogram, note the mean, say whether the data is skewed, and note if there are any outliers.
4. Write down the 95% confidence interval for the mean, for both the dependent and independent variable.
5. For the Price variable, test whether the mean µ differs from 11,500 at the 5% level of significance:
(a) State your null and alternative hypotheses.
(b) Find the relevant p-value and write it down.
(c) What do you conclude, based on your findings?
6. Using Analyze → Fit Y by X, create a scatterplot between Price (Y) and the independent variable (X) you chose. Paste it below, and comment on the following 4 concepts: direction, shape, strength of relationship, and whether there are any outliers.
7. Using Analyze → Fit Y by X, find what the correlation is between the two variables and write it down. Then, comment on the strength of the relationship.
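The calculations behind items 4, 5, and 7 can be sketched in Python as a cross-check on the JMP output. This is a minimal sketch, not the JMP procedure: the helper names are hypothetical, only the first ten Price and Age rows of the table are used, and the interval and p-value use a normal approximation, whereas JMP reports t-based values (the two are close for a sample of 100, less so for ten rows).

```python
import math
from statistics import NormalDist, mean, stdev

# First ten Price and Age values from the table above (illustration only).
price = [13500, 13750, 13950, 14950, 13750, 12950, 16900, 18600, 21500, 12950]
age = [23, 23, 24, 26, 30, 32, 27, 30, 27, 23]


def mean_ci(xs, level=0.95):
    """Confidence interval for the mean (normal approximation, an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(xs) / math.sqrt(len(xs))
    m = mean(xs)
    return m - half_width, m + half_width


def two_sided_test(xs, mu0):
    """Test statistic and two-sided p-value for H0: mu = mu0 vs Ha: mu != mu0."""
    stat = (mean(xs) - mu0) / (stdev(xs) / math.sqrt(len(xs)))
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_value


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


print("95% CI for mean Price:", mean_ci(price))
print("Test of mu = 11500 (stat, p):", two_sided_test(price, 11500))
print("Correlation of Age and Price:", pearson_r(age, price))
```

Reject the null hypothesis at the 5% level when the p-value is below 0.05; a correlation near +1 or -1 indicates a strong linear relationship, and one near 0 a weak one.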
Simple Linear Regression Modeling
8. Fit the regression model:
(a) Run the regression of Y (Price) on X (your independent variable), and plot the regression line over the data. Paste your output below.
(b) Identify the results of the hypothesis test (p-value) on the regression slope coefficient. What do you conclude?
(c) Write down the regression equation, by hand.
(d) Make a prediction for Y (Price), by using one of the relevant bullets below:
- If your X variable for the project is Age, use 15 for X.
- If your X variable for the project is Mileage, use 25,000 for X.
- If your X variable for the project is Weight, use 1,200 for X.
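The regression equation in 8(c) and the prediction in 8(d) follow the usual least-squares formulas, which can be sketched in Python to check hand calculations. The sketch assumes Age was chosen as X and uses only the first ten rows of the table, so its numbers will not match a fit on the full 100-observation subset; `fit_line` and `predict` are hypothetical helper names.

```python
from statistics import mean

# First ten Age and Price values from the table above (illustration only).
age = [23, 23, 24, 26, 30, 32, 27, 30, 27, 23]
price = [13500, 13750, 13950, 14950, 13750, 12950, 16900, 18600, 21500, 12950]


def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted Y at a given X from the fitted line."""
    return intercept + slope * x


b0, b1 = fit_line(age, price)
print(f"Price = {b0:.2f} + {b1:.2f} * Age")
print("Predicted Price at Age = 15:", predict(b0, b1, 15))
```

For Mileage or Weight, swap in that column for `age` and use 25,000 or 1,200 as the X value in `predict`.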