Question

In: Statistics and Probability

Our societal values: do taller basketball players get better paid? Consider the data set labeled NBA 2008-2009 Data.

(a) Select 25 basketball players (use random.org as explained in Problem 2), and record their heights and annual salary in two columns. Display your data values. There should be 25 data values in each column.

Player #    Height    Annual salary
13          81        $10,000,000
21          73        $2,700,000
49          84        $711,517
53          83        $2,295,480
57          77        $1,931,160
79          80        $1,600,000
83          76        $21,372,000
121         79        $25,000
127         77        $442,114
270         84        $2,986,080
283         82        $1,141,838
290         80        $1,173,480
294         81        $9,226,550
350         82        $1,542,600
363         78        $12,222,221
386         79        $1,983,453
392         83        $5,784,480
397         81        $3,395,760
400         79        $4,000,000
418         82        $7,348,018
419         75        $4,250,000
428         83        $21,372,000
433         78        $1,081,440
451         75        $998,398
452         84        $5,500,000

(b) You would like to see whether there is a correlation between the players’ height and annual salary. Let height be the explanatory (X) variable, and annual salary be the response (Y) variable. Use appropriate software to obtain a full regression output.

(c) Identify the intercept and slope, and write the regression equation. Identify the coefficient of determination, and interpret the result.

(d) Calculate the coefficient of correlation, and interpret the result.

(e) Find a 95% confidence interval for the population slope. Does the population slope exceed 0? To answer this question, state the hypotheses, identify the p-value, and interpret the result.

(f) Provide a scatter diagram produced by EXCEL.

(g) Identify and interpret the greatest positive residual. Provide the complete list of residuals.

Solutions

Expert Solution

(b) You would like to see whether there is a correlation between the players’ height and annual salary. Let height be the explanatory (X) variable, and annual salary be the response (Y) variable. Use appropriate software to obtain a full regression output.

1. Enter the data in Excel, with height in one column and annual salary in the next.

2. Open the Regression option in Data Analysis (found under the Data tab).

3. Fill in the dialogue box, with annual salary as the Input Y Range and height as the Input X Range.

4. Excel generates the regression output.

5. The regression equation is obtained from the coefficients in the output.
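For readers without Excel, the same least-squares fit can be reproduced in a few lines of Python. This is only a sketch using the standard formulas (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X); the function name fit_line is made up for illustration and is not part of the original solution.

```python
# Sample from part (a): heights and annual salaries (USD) of the 25 players.
heights = [81, 73, 84, 83, 77, 80, 76, 79, 77, 84, 82, 80, 81,
           82, 78, 79, 83, 81, 79, 82, 75, 83, 78, 75, 84]
salaries = [10_000_000, 2_700_000, 711_517, 2_295_480, 1_931_160,
            1_600_000, 21_372_000, 25_000, 442_114, 2_986_080,
            1_141_838, 1_173_480, 9_226_550, 1_542_600, 12_222_221,
            1_983_453, 5_784_480, 3_395_760, 4_000_000, 7_348_018,
            4_250_000, 21_372_000, 1_081_440, 998_398, 5_500_000]


def fit_line(x, y):
    """Return (intercept, slope) of the ordinary least-squares line.

    Hypothetical helper name, used here for illustration only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of X and sum of cross-products of X and Y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(heights, salaries)
print(f"Annual Salary = {intercept:.2f} + {slope:.2f} (Height)")
```

The printed coefficients should agree with the Excel output up to rounding.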


(c) Identify the intercept and slope, and write the regression equation. Identify the coefficient of determination, and interpret the result.

Intercept = -977566.51
Slope = 74911.19

The regression equation is
Annual Salary = -977566.51 + 74911.19 (Height)

The coefficient of determination is the R Square value in the regression output. It lies between 0 and 1, and it measures the proportion of the variability in annual salary that is explained by the model. The higher the value, the better the model fits.

In this case R Square = 0.001587082, so height explains only about 0.16% of the variation in annual salary. This is very low, and the model is not a good fit at all.
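R Square can also be computed directly as 1 - SSE/SST, where SSE is the sum of squared residuals and SST is the total sum of squares of salary around its mean. The sketch below assumes the 25 data pairs from part (a); the function name r_squared is hypothetical.

```python
# Sample from part (a): heights and annual salaries (USD) of the 25 players.
heights = [81, 73, 84, 83, 77, 80, 76, 79, 77, 84, 82, 80, 81,
           82, 78, 79, 83, 81, 79, 82, 75, 83, 78, 75, 84]
salaries = [10_000_000, 2_700_000, 711_517, 2_295_480, 1_931_160,
            1_600_000, 21_372_000, 25_000, 442_114, 2_986_080,
            1_141_838, 1_173_480, 9_226_550, 1_542_600, 12_222_221,
            1_983_453, 5_784_480, 3_395_760, 4_000_000, 7_348_018,
            4_250_000, 21_372_000, 1_081_440, 998_398, 5_500_000]


def r_squared(x, y):
    """Coefficient of determination of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: variation left unexplained by the line; SST: total variation in y.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst


r2 = r_squared(heights, salaries)
print(f"R Square = {r2:.6f} (about {100 * r2:.2f}% of the variation explained)")
```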

(d) Calculate the coefficient of correlation, and interpret the result.

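The coefficient of correlation is r = sqrt(R Square) = sqrt(0.001587082) ≈ 0.0398, taking the positive root because the slope is positive. This is very close to zero, which indicates that there is essentially no linear relationship between annual salary and the height of the player in this sample.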
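In simple linear regression the correlation coefficient is the square root of R Square, carrying the sign of the slope. A minimal sketch of that arithmetic, using the R Square and slope values reported in part (c):

```python
import math

r_square = 0.001587082   # coefficient of determination from part (c)
slope = 74911.19         # fitted slope from part (c); only its sign is used

# r has the same sign as the slope and magnitude sqrt(R Square).
r = math.copysign(math.sqrt(r_square), slope)
print(f"r = {r:.4f}")
```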


(e) Find a 95% confidence interval for the population slope. Does the population slope exceed 0? To answer this question, state the hypotheses, identify the p-value, and interpret the result.

Hypotheses:
H0: The slope (beta) coefficient of the independent variable, height, is equal to zero.
H1: The slope (beta) coefficient of height is not equal to zero.

From the regression output we get
t stat = 0.191209
p-value = 0.8500

The p-value for the independent variable is 0.8500, which is greater than 0.05. Hence we fail to reject the null hypothesis and conclude that the slope is not significantly different from zero, so height is not a significant predictor of annual salary. The same conclusion holds for the one-sided alternative that the slope exceeds 0: because the t statistic is positive, the one-sided p-value is half the two-sided value, about 0.425, which is still greater than 0.05.


95% confidence interval for the population slope: (-732148.0057, 881970.4024)
Since the confidence interval contains zero, the slope is not significantly different from zero.
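The interval has the form slope ± t critical × standard error, with n - 2 = 23 degrees of freedom. The sketch below recovers the standard error from the reported slope and t statistic. The critical value 2.0687 is an assumption taken from a standard t table (0.975 quantile, 23 degrees of freedom), not from the original output, so the endpoints it prints can differ slightly from the interval quoted above, depending on the critical value used.

```python
slope = 74911.19      # fitted slope from the regression output
t_stat = 0.191209     # t statistic for the slope from the regression output
n = 25                # sample size
df = n - 2            # degrees of freedom in simple linear regression
t_crit = 2.0687       # ASSUMPTION: tabulated 0.975 quantile of t with 23 df

# The t statistic is slope / standard error, so the standard error follows.
std_err = slope / t_stat

lower = slope - t_crit * std_err
upper = slope + t_crit * std_err
print(f"95% CI for the slope: ({lower:.1f}, {upper:.1f})")

# Two-sided test at the 5% level: reject H0 only if |t| exceeds t_crit.
reject_h0 = abs(t_stat) > t_crit
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Checking whether zero lies inside the interval and comparing |t| with the critical value are two views of the same two-sided test, so they always agree.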


(f) Provide a scatter diagram produced by EXCEL.

(g) Identify and interpret the greatest positive residual. Provide the complete list of residuals.
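A residual is the observed salary minus the salary predicted by the regression line. A minimal sketch of how the full list and the greatest positive residual could be obtained, assuming the data from part (a) and refitting the line from that data:

```python
# Sample from part (a): player number, height, annual salary (USD).
players = [13, 21, 49, 53, 57, 79, 83, 121, 127, 270, 283, 290, 294,
           350, 363, 386, 392, 397, 400, 418, 419, 428, 433, 451, 452]
heights = [81, 73, 84, 83, 77, 80, 76, 79, 77, 84, 82, 80, 81,
           82, 78, 79, 83, 81, 79, 82, 75, 83, 78, 75, 84]
salaries = [10_000_000, 2_700_000, 711_517, 2_295_480, 1_931_160,
            1_600_000, 21_372_000, 25_000, 442_114, 2_986_080,
            1_141_838, 1_173_480, 9_226_550, 1_542_600, 12_222_221,
            1_983_453, 5_784_480, 3_395_760, 4_000_000, 7_348_018,
            4_250_000, 21_372_000, 1_081_440, 998_398, 5_500_000]

# Least-squares fit of salary on height.
n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(salaries) / n
sxx = sum((x - mean_x) ** 2 for x in heights)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, salaries))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = observed salary - predicted salary, one per player.
residuals = [y - (intercept + slope * x) for x, y in zip(heights, salaries)]

print("Player #  Height  Residual")
for p, x, e in zip(players, heights, residuals):
    print(f"{p:<9d} {x:<7d} {e:,.2f}")

# Greatest positive residual: the player furthest above the fitted line.
max_resid, max_player = max(zip(residuals, players))
print(f"Greatest positive residual: player {max_player}, {max_resid:,.2f}")
```

On this sample the greatest positive residual belongs to player 83 (height 76, salary $21,372,000) and is roughly $16.66 million, meaning that this player earns about that much more than the regression line predicts for his height.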

