Question

Wageweb conducts surveys of salary data and presents summaries on its website. Based on salary data as of October 1, 2002, Wageweb reported that the average annual salary for sales vice presidents was $142,111, with an average annual bonus of $15,432. Assume the following data are a sample of the annual salary and bonus for 10 sales vice presidents. Data are in thousands of dollars.

Vice President    Salary    Bonus
1                 135       12
2                 115       14
3                 146       16
4                 167       19
5                 165       22
6                 176       24
7                 98        7
8                 136       17
9                 163       18
10                119       11

  1. Compute SST, SSR, and SSE.

  2. Compute the coefficient of determination r². Comment on the goodness of fit.

  3. What is the value of the sample correlation coefficient?

  4. Develop the null and alternative hypotheses to test the linear relationship between salary and bonus.

  5. At the .05 level of significance, determine whether salary and bonus are linearly related. Use the t test.

  6. Solve the problem in Excel and compare your results.

Solutions

Expert Solution

f) Compute SST, SSR, and SSE

RELATIONSHIP AMONG SST, SSR, AND SSE

SST = SSR + SSE

where SST is the total sum of squares, SSR is the sum of squares due to regression, and SSE is the sum of squares due to error. SSR can be thought of as the explained portion of SST, and SSE can be thought of as the unexplained portion of SST.
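The answer above states the relationship but does not show the numbers. A minimal Python sketch of the calculation follows, assuming salary is the independent variable x and bonus is the dependent variable y; the variable names are illustrative and not part of the original answer.

```python
# Sample data from the problem statement (thousands of dollars).
salary = [135, 115, 146, 167, 165, 176, 98, 136, 163, 119]  # x (assumed predictor)
bonus = [12, 14, 16, 19, 22, 24, 7, 17, 18, 11]             # y (assumed response)

n = len(salary)
x_bar = sum(salary) / n
y_bar = sum(bonus) / n

# Least squares slope (b1) and intercept (b0).
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(salary, bonus))
s_xx = sum((x - x_bar) ** 2 for x in salary)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Fitted values and the three sums of squares.
y_hat = [b0 + b1 * x for x in salary]
sst = sum((y - y_bar) ** 2 for y in bonus)                 # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)               # explained by regression
sse = sum((y - yh) ** 2 for y, yh in zip(bonus, y_hat))    # unexplained (error)

print(f"b0 = {b0:.2f}, b1 = {b1:.4f}")
print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
```

For this sample the sketch should report SST = 240, SSR ≈ 205.26, and SSE ≈ 34.74, with SSR + SSE adding back up to SST.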

g) Compute the coefficient of determination r². Comment on the goodness of fit.

The ratio SSR/SST, which takes values between zero and one, is used to evaluate the goodness of fit of the estimated regression equation. This ratio is called the coefficient of determination and is denoted by r². When we express the coefficient of determination as a percentage, r² can be interpreted as the percentage of the total sum of squares that can be explained by using the estimated regression equation.

Solution:

The least squares line provides a very good fit: about 85% of the variability in y (bonus) is explained by the least squares line ŷ = -10.16 + 0.18x, where x is salary.
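The 85% figure can be checked directly from the data. The sketch below is one way to do it, using the shortcut SSR = Sxy² / Sxx (equivalent to summing squared deviations of the fitted values); the variable names are assumptions for illustration.

```python
# Sample data from the problem statement (thousands of dollars).
salary = [135, 115, 146, 167, 165, 176, 98, 136, 163, 119]  # x
bonus = [12, 14, 16, 19, 22, 24, 7, 17, 18, 11]             # y

n = len(salary)
x_bar = sum(salary) / n
y_bar = sum(bonus) / n

s_xx = sum((x - x_bar) ** 2 for x in salary)
s_yy = sum((y - y_bar) ** 2 for y in bonus)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(salary, bonus))

sst = s_yy                 # total sum of squares
ssr = s_xy ** 2 / s_xx     # sum of squares due to regression
r_squared = ssr / sst      # coefficient of determination

print(f"r^2 = {r_squared:.4f}")
```

This should print a value of roughly 0.855, consistent with the "85%" quoted in the solution.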

h) What is the value of the sample correlation coefficient?

The correlation coefficient is a descriptive measure of the strength of linear association between two variables, x and y. Values of the correlation coefficient are always between -1 and +1. A value of +1 indicates that the two variables x and y are perfectly related in a positive linear sense. That is, all data points are on a straight line that has a positive slope. A value of -1 indicates that x and y are perfectly related in a negative linear sense, with all data points on a straight line that has a negative slope. Values of the correlation coefficient close to zero indicate that x and y are not linearly related. The sign of the sample correlation coefficient is positive if the estimated regression equation has a positive slope (b1 > 0) and negative if the estimated regression equation has a negative slope (b1 < 0).

Solution:
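The numerical value is not shown in the answer, so here is a hedged Python sketch that computes it two ways: directly from the sums of squares, and as (sign of b1) × √r². Variable names are illustrative assumptions.

```python
import math

# Sample data from the problem statement (thousands of dollars).
salary = [135, 115, 146, 167, 165, 176, 98, 136, 163, 119]  # x
bonus = [12, 14, 16, 19, 22, 24, 7, 17, 18, 11]             # y

n = len(salary)
x_bar = sum(salary) / n
y_bar = sum(bonus) / n

s_xx = sum((x - x_bar) ** 2 for x in salary)
s_yy = sum((y - y_bar) ** 2 for y in bonus)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(salary, bonus))

# Direct formula for the sample correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

# Equivalent route: sign of the slope times the square root of r^2.
b1 = s_xy / s_xx
r_squared = s_xy ** 2 / (s_xx * s_yy)
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)

print(f"r = {r:.4f}")
```

Both routes should agree on r ≈ 0.92, a strong positive linear association, which matches the positive slope b1.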

i) Develop the null and alternative hypotheses to test the linear relationship between salary and bonus.

We would like to test whether a significant linear relationship exists between x and y. The simple linear regression model is y = β0 + β1x + ε. If x and y are linearly related, we must have β1 ≠ 0. The purpose of the t test is to see whether we can conclude that β1 ≠ 0. We will use the sample data to test the following hypotheses about the parameter β1.

H0:β1=0

Ha:β1≠0

If H0 is rejected, we will conclude that β1≠0 and that a statistically significant relationship exists between the two variables. However, if H0 cannot be rejected, we will have insufficient evidence to conclude that a significant relationship exists. The properties of the sampling distribution of b1, the least squares estimator of β1, provide the basis for the hypothesis test.

Solution:

j) At the .05 level of significance, determine whether salary and bonus are linearly related.

To find the p-value we can use Excel: p-value = TDIST(t, df, number of tails).

t ≈ 6.8, df = n - 2 = 10 - 2 = 8. This is a two-tailed test, so p-value = TDIST(6.8769, 8, 2) ≈ 0.00013.

[p-value ≈ 0.00013] < [α = .05]

Since p-value < α, we reject H0 and conclude that β1 ≠ 0. There is a statistically significant linear relationship between salary and bonus.
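As a cross-check on the Excel result, the sketch below computes the t statistic from the data and approximates the two-tailed p-value by integrating the t density with Simpson's rule, so it needs only the standard library. The helper names `t_pdf` and `two_tailed_p` are hypothetical, and the critical value 2.306 is the usual table value for t with 8 degrees of freedom at α/2 = .025.

```python
import math

# Sample data from the problem statement (thousands of dollars).
salary = [135, 115, 146, 167, 165, 176, 98, 136, 163, 119]  # x
bonus = [12, 14, 16, 19, 22, 24, 7, 17, 18, 11]             # y

n = len(salary)
df = n - 2
x_bar = sum(salary) / n
y_bar = sum(bonus) / n
s_xx = sum((x - x_bar) ** 2 for x in salary)
s_yy = sum((y - y_bar) ** 2 for y in bonus)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(salary, bonus))

b1 = s_xy / s_xx                   # estimated slope
sse = s_yy - s_xy ** 2 / s_xx      # SSE = SST - SSR
s = math.sqrt(sse / df)            # standard error of the estimate
s_b1 = s / math.sqrt(s_xx)         # estimated standard deviation of b1
t_stat = b1 / s_b1


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t_value, df, steps=10_000):
    """Two-tailed p-value: 1 - 2 * (area under the density on [0, |t|])."""
    b = abs(t_value)
    h = b / steps                  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


p_value = two_tailed_p(t_stat, df)
T_CRITICAL = 2.306                 # table value t_.025 with 8 df (assumed from a t table)
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, p-value = {p_value:.5f}, reject H0: {reject_h0}")
```

The statistic should come out near 6.88 with a p-value of roughly 0.00013, so both the critical value rule and the p-value rule lead to rejecting H0.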

Where do you find the t statistic and its associated p-value on the Excel printout?

They appear in the table below the ANOVA table, in the row labeled with the name you used for X. The t statistic is given in the column labeled t Stat and the p-value is given in the column labeled P-value.

Testing method 2: F-test

An F test, based on the F probability distribution, can also be used to test for significance in regression. With only one independent variable, the F test will provide the same conclusion as the t test; that is, if the t test indicates β1≠ 0 and hence a significant relationship, the F test will also indicate a significant relationship. But with more than one independent variable, only the F test can be used to test for an overall significant relationship.

The logic behind the use of the F test for determining whether the regression relationship is statistically significant is based on the development of two independent estimates of σ². We explained how MSE provides an estimate of σ². If the null hypothesis H0: β1 = 0 is true, the sum of squares due to regression, SSR, divided by its degrees of freedom provides another independent estimate of σ². This estimate is called the mean square due to regression, or simply the mean square regression, and is denoted MSR. In general,

MSR = SSR / regression degrees of freedom

For the models we consider in this text, the regression degrees of freedom is always equal to the number of independent variables in the model:

MSR = SSR / number of independent variables

Because we consider only regression models with one independent variable in Chapter 12, we have MSR = SSR/1 = SSR.

If the null hypothesis (H0: β1 = 0) is true, MSR and MSE are two independent estimates of σ² and the sampling distribution of MSR/MSE follows an F distribution with numerator degrees of freedom equal to one and denominator degrees of freedom equal to n - 2. Therefore, when β1 = 0, the value of MSR/MSE should be close to 1. However, if the null hypothesis is false (β1 ≠ 0), MSR will overestimate σ² and the value of MSR/MSE will be inflated; thus, large values of MSR/MSE lead to the rejection of H0 and the conclusion that the relationship between x and y is statistically significant.

Solution:

Use F = MSR/MSE as the test statistic. If the model assumptions are valid and H0 is true, F follows, over repeated sampling, the F distribution with 1 numerator df and n - 2 denominator df (k = 1 independent variable for simple linear regression).

Rejection rule (p-value approach): reject H0 if p-value < α.

Rejection rule (critical value approach): reject H0 if the F statistic exceeds the critical value of F.

We can calculate F manually, or we can find F and its associated p-value in the ANOVA table on the Excel printout. The test statistic F is given in the F column. The p-value is given in the Significance F column.

The numerator degrees of freedom is 1. The denominator degrees of freedom is n - 2.

df1 = 1, df2 = 8.

Excel

p-value = FDIST(F, numerator degrees of freedom df1, denominator degrees of freedom df2)

In our case, p-value = FDIST(47, 1, 8) ≈ 0.00013.

Since p-value < α, we reject H0 and conclude that β1 ≠ 0.

This means salary and bonus are linearly related.
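The F statistic quoted above (about 47) can be reproduced with a short Python sketch. It applies the critical value rule, using 5.32 as the usual table value for F with 1 and 8 degrees of freedom at α = .05 (an assumption taken from an F table, not from the original answer), and also checks that F equals t² in simple linear regression.

```python
import math

# Sample data from the problem statement (thousands of dollars).
salary = [135, 115, 146, 167, 165, 176, 98, 136, 163, 119]  # x
bonus = [12, 14, 16, 19, 22, 24, 7, 17, 18, 11]             # y

n = len(salary)
x_bar = sum(salary) / n
y_bar = sum(bonus) / n
s_xx = sum((x - x_bar) ** 2 for x in salary)
s_yy = sum((y - y_bar) ** 2 for y in bonus)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(salary, bonus))

ssr = s_xy ** 2 / s_xx      # sum of squares due to regression
sse = s_yy - ssr            # sum of squares due to error

msr = ssr / 1               # one independent variable
mse = sse / (n - 2)         # error degrees of freedom = n - 2
f_stat = msr / mse

F_CRITICAL = 5.32           # table value F_.05 with (1, 8) df (assumed from an F table)
reject_h0 = f_stat > F_CRITICAL

# With one independent variable, the F statistic is the square of the t statistic.
t_stat = (s_xy / s_xx) / (math.sqrt(mse) / math.sqrt(s_xx))

print(f"MSR = {msr:.3f}, MSE = {mse:.3f}, F = {f_stat:.2f}, reject H0: {reject_h0}")
print(f"t^2 = {t_stat ** 2:.2f}")
```

F should come out near 47.3, far above the critical value, and equal to t² up to rounding. That is why the t test and the F test give the same conclusion here.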

k) Solve the problem in Excel and compare your results.

Solution: use file Regression Excel tutorials

General form of the ANOVA table for simple linear regression
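For comparing against the Excel regression output, the following sketch fills in the ANOVA table for this sample. The layout (Source, SS, df, MS, F) is the usual textbook arrangement and is an assumption about what the missing table showed.

```python
# Sample data from the problem statement (thousands of dollars).
salary = [135, 115, 146, 167, 165, 176, 98, 136, 163, 119]  # x
bonus = [12, 14, 16, 19, 22, 24, 7, 17, 18, 11]             # y

n = len(salary)
x_bar = sum(salary) / n
y_bar = sum(bonus) / n
s_xx = sum((x - x_bar) ** 2 for x in salary)
s_yy = sum((y - y_bar) ** 2 for y in bonus)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(salary, bonus))

sst = s_yy
ssr = s_xy ** 2 / s_xx
sse = sst - ssr

# Each row: (source, sum of squares, degrees of freedom, mean square or None).
rows = [
    ("Regression", ssr, 1, ssr / 1),
    ("Error", sse, n - 2, sse / (n - 2)),
    ("Total", sst, n - 1, None),
]
f_stat = rows[0][3] / rows[1][3]   # F = MSR / MSE

print(f"{'Source':<12}{'SS':>10}{'df':>5}{'MS':>10}{'F':>8}")
for name, ss, df, ms in rows:
    ms_text = f"{ms:10.3f}" if ms is not None else " " * 10
    f_text = f"{f_stat:8.2f}" if name == "Regression" else ""
    print(f"{name:<12}{ss:>10.3f}{df:>5}{ms_text}{f_text}")
```

The SS and df columns should each add up to the Total row, and the F value should match the F column of Excel's ANOVA block up to rounding.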
