The value of a sports franchise is directly related to the amount of revenue that a franchise can generate. The data below represents the value in 2017 ($ in millions) and the annual revenue ($ in millions) for the 30 Major League Baseball teams.
| Team | Revenue ($ in millions) | Value ($ in millions) |
| --- | --- | --- |
| Baltimore Orioles | 253 | 1175 |
| Boston Red Sox | 434 | 2700 |
| Chicago White Sox | 269 | 1350 |
| Cleveland Indians | 271 | 920 |
| Detroit Tigers | 275 | 1200 |
| Houston Astros | 299 | 1450 |
| Kansas City Royals | 246 | 950 |
| Los Angeles Angels | 350 | 1750 |
| Minnesota Twins | 249 | 1025 |
| New York Yankees | 526 | 3700 |
| Oakland Athletics | 216 | 880 |
| Seattle Mariners | 289 | 1400 |
| Tampa Bay Rays | 205 | 825 |
| Texas Rangers | 298 | 1550 |
| Toronto Blue Jays | 278 | 1300 |
| Arizona Diamondbacks | 253 | 1150 |
| Atlanta Braves | 275 | 1500 |
| Chicago Cubs | 434 | 2675 |
| Cincinnati Reds | 229 | 915 |
| Colorado Rockies | 248 | 1000 |
| Los Angeles Dodgers | 462 | 2750 |
| Miami Marlins | 206 | 940 |
| Milwaukee Brewers | 239 | 925 |
| New York Mets | 332 | 2000 |
| Philadelphia Phillies | 325 | 1650 |
| Pittsburgh Pirates | 265 | 1250 |
| St. Louis Cardinals | 310 | 1800 |
| San Diego Padres | 259 | 1125 |
| San Francisco Giants | 428 | 2650 |
| Washington Nationals | 304 | 1600 |
1. Using Excel or JMP, construct a scatterplot of value versus the revenue for the 30 MLB teams in 2016. Provide a copy of the resulting scatterplot. (3 points)
2. Based upon your scatterplot, does it appear that the linear model is a reasonable approximation of the data? Comment on the direction and form of the relationship. (2 points)
3. Using Minitab provide (or attach) the simple linear regression analysis for predicting a team’s value based upon its revenue. (2 points)
4. State the slope for the simple linear regression analysis and interpret this value in this context. (2 points)
5. State the y-intercept for the simple linear regression analysis and interpret, if applicable. (1 point)
6. State the standard error of the regression analysis and interpret that value. (2 points)
7. State the coefficient of determination and interpret the value in this context. (2 points)
8. State the sum of square errors. (1 point)
9. State the standard error of the slope. (1 point)
Source: www.forbes.com
10. Calculate and interpret the 95% confidence interval for slope. (2 points)
11. From the coefficient of determination, standard error of regression, and the confidence interval for slope does that model appear to fit well? Explain. (2 points)
## 1. Using Excel or JMP, construct a scatterplot of value versus the revenue for the 30 MLB teams in 2016. Provide a copy of the resulting scatterplot.
## 2. Based upon your scatterplot, does it appear that the linear model is a reasonable approximation of the data? Comment on the direction and form of the relationship.
Answer: Yes. The points in the scatterplot follow a roughly straight-line pattern, so a linear model is a reasonable approximation of the data.
The relationship is positive: as revenue increases, value increases.
## 3) Using Minitab provide (or attach) the simple linear regression analysis for predicting a team’s value based upon its revenue.
Regression Equation
value = -1066.2 + 8.651 Revenue
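As a cross-check on the Minitab equation, the least-squares fit can be recomputed directly from the table in the question. The following is a minimal sketch in plain Python, not the Minitab procedure itself; the variable names (`data`, `s_xx`, `s_xy`, `s_yy`, `r_sq`) are illustrative choices, and the data pairs are assumed to be transcribed correctly from the table.

```python
import math

# (revenue, value) pairs in $ millions, copied from the table in the question.
data = [
    (253, 1175), (434, 2700), (269, 1350), (271, 920), (275, 1200),
    (299, 1450), (246, 950), (350, 1750), (249, 1025), (526, 3700),
    (216, 880), (289, 1400), (205, 825), (298, 1550), (278, 1300),
    (253, 1150), (275, 1500), (434, 2675), (229, 915), (248, 1000),
    (462, 2750), (206, 940), (239, 925), (332, 2000), (325, 1650),
    (265, 1250), (310, 1800), (259, 1125), (428, 2650), (304, 1600),
]

n = len(data)
x_bar = sum(x for x, _ in data) / n
y_bar = sum(y for _, y in data) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x, _ in data)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in data)
s_yy = sum((y - y_bar) ** 2 for _, y in data)

# Least-squares slope and intercept.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Sum of squared errors, standard error of the regression, and R-squared.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in data)
s = math.sqrt(sse / (n - 2))
r_sq = 1 - sse / s_yy

print(f"value = {intercept:.1f} + {slope:.3f} Revenue")
print(f"S = {s:.3f}, R-sq = {100 * r_sq:.2f}%, SSE = {sse:.4f}")
```

The printed coefficients and summary statistics can be compared with the Minitab figures quoted in the answers to questions 3 through 8.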
## 4) State the slope for the simple linear regression analysis and interpret this value in this context.
Answer: The slope is 8.651. For each additional $1 million in annual revenue, the predicted value of a team increases by about $8.651 million.
The slope is positive, so there is a positive relationship between value and revenue.
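To make the slope interpretation concrete, the sketch below evaluates the fitted equation at two revenues that differ by 1 (in $ millions). The function name `predicted_value` and the revenues 300 and 301 are illustrative assumptions, chosen only because they fall inside the observed revenue range of 205 to 526.

```python
# Coefficients as reported in the regression equation above.
INTERCEPT = -1066.2
SLOPE = 8.651


def predicted_value(revenue):
    """Predicted team value ($ millions) for a given revenue ($ millions)."""
    return INTERCEPT + SLOPE * revenue


# Two arbitrary revenues one unit apart, both inside the observed range.
low = predicted_value(300)
high = predicted_value(301)
print(f"{low:.3f} -> {high:.3f}, difference {high - low:.3f}")
```

Whatever starting revenue is chosen, the difference between the two predictions equals the slope.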
## 5) State the y-intercept for the simple linear regression analysis and interpret, if applicable.
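Answer: The y-intercept is -1066.2. Taken literally, it is the predicted value ($ in millions) of a team with zero revenue. That interpretation is not applicable here: a franchise value cannot be negative, and a revenue of 0 lies far outside the observed range of 205 to 526, so the intercept only positions the regression line.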
## 6) State the standard error of the regression analysis and interpret that value.
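Answer: S = standard error of the regression = 134.277
S represents the typical distance between the observed values and the regression line: actual team values differ from the values predicted by the model by roughly $134.277 million.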
Model Summary

| S | R-sq | R-sq(adj) | PRESS | R-sq(pred) |
| --- | --- | --- | --- | --- |
| 134.277 | 96.51% | 96.39% | 612504 | 95.77% |
## 7) State the coefficient of determination and interpret the value in this context.
Answer: The coefficient of determination is R-sq = 96.51%.
About 96.51% of the variation in team value is explained by the linear relationship with revenue. A value this close to 100% indicates that the model fits the data very well.
## 8) State the sum of square errors.
SSE=504849.5953
It is the sum of the squared deviations of the actual values from the predicted values.
## 9) State the standard error of the slope.
SEb(1) = sqrt(504849.5953/(30-2))/sqrt(186742.7) = 134.277/432.137 = 0.3107
Here 186742.7 is the sum of squared deviations of revenue from its mean.
The standard error of the slope estimates how much the fitted slope would vary from sample to sample.
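The arithmetic for the standard error of the slope can be reproduced in a few lines. This is a small sketch using only the figures quoted in the answer; the variable names are illustrative, and 186742.7 is taken to be the sum of squared deviations of revenue from its mean.

```python
import math

sse = 504849.5953   # sum of squared errors, from question 8
n = 30              # number of teams
s_xx = 186742.7     # sum of squared deviations of revenue from its mean

# Standard error of the regression: sqrt(SSE / (n - 2)).
s = math.sqrt(sse / (n - 2))

# Standard error of the slope: S / sqrt(Sxx).
se_slope = s / math.sqrt(s_xx)

print(f"S = {s:.3f}, SE(b1) = {se_slope:.4f}")
```

Note that the divisor is n - 2 = 28, the residual degrees of freedom for simple linear regression, so the parentheses around 30 - 2 matter.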
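## 10) Calculate and interpret the 95% confidence interval for slope.
8.6507 ± 2.048(0.3107) = 8.6507 ± 0.6363 = (8.0144, 9.2870)
Here 2.048 is the t critical value with 30 - 2 = 28 degrees of freedom.
We are 95% confident that the true slope is between 8.0144 and 9.2870: each additional $1 million in revenue is associated with an increase in average team value of between about $8.01 million and $9.29 million.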
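The interval can be checked with a short calculation. The sketch below hard-codes the critical value 2.048 used in the answer (the t value for 28 degrees of freedom) rather than computing it from a statistics library; the variable names are illustrative.

```python
slope = 8.6507      # estimated slope from the regression
se_slope = 0.3107   # standard error of the slope, from question 9
t_crit = 2.048      # t critical value used in the answer (28 degrees of freedom)

# Margin of error and interval endpoints.
margin = t_crit * se_slope
lower, upper = slope - margin, slope + margin

# An interval that excludes 0 indicates a significant linear relationship.
excludes_zero = lower > 0 or upper < 0

print(f"95% CI for slope: ({lower:.4f}, {upper:.4f})")
print(f"Interval excludes 0: {excludes_zero}")
```

Because the whole interval lies above 0, the data support a positive linear relationship between revenue and value.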
## 11) From the coefficient of determination, standard error of regression, and the confidence interval for slope does that model appear to fit well? Explain.
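Answer: Yes, the model appears to fit well.
1) The coefficient of determination is very high: the model explains 96.51% of the variation in value.
2) The standard error of the regression, S = 134.277, is small relative to the team values, which range from 825 to 3700.
3) The 95% confidence interval for the slope, (8.0144, 9.287), does not contain 0, so the linear relationship between revenue and value is significant.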
## here attached Minitab output :