The dataset Golfers2008.xlsx saved in Datasets in Blackboard contains data on the top 40 golfers in 2008. This was the year when Tiger Woods won the U.S. Open in June and then had year-ending surgery.

Using all the explanatory variables, run a regression predicting Earnings per Round.

Determine the best fit model by removing any insignificant x-variables. Rerun the analysis with your best fit model. Make a clear notation of which model is your best-fit model by labeling the worksheet of that model “BEST FIT MODEL”.

Age	Events	Rounds	Cuts Made	Top 10s	Wins	Earnings per Round
45	23	82	18	8	3	$80,501
32	6	23	6	6	4	$251,087
37	21	79	20	8	2	$65,682
28	19	70	18	6	1	$69,403
47	26	97	24	7	3	$48,080
22	22	81	19	8	2	$57,485
26	22	78	19	7	2	$56,701
36	15	51	12	6	2	$84,579
35	23	85	19	7	1	$46,815
35	25	96	24	8	1	$41,079
36	28	108	27	9	0	$33,395
38	26	94	23	9	0	$36,763
31	25	81	16	5	1	$37,400
38	26	88	20	8	0	$34,320
31	20	64	14	6	1	$45,002
38	21	72	16	5	1	$37,270
31	22	80	18	5	0	$32,697
43	26	98	22	6	0	$26,340
28	22	70	14	3	1	$36,660
38	16	50	11	5	1	$50,746
30	29	110	25	5	1	$22,841
37	23	84	21	7	0	$29,579
41	22	74	16	6	0	$32,950
34	28	95	19	7	0	$25,313
34	24	83	19	5	1	$28,901
27	27	94	21	4	1	$24,515
44	24	83	19	7	0	$27,539
39	33	116	24	5	0	$19,301
39	22	74	15	6	0	$29,984
26	27	87	18	5	0	$25,389
36	31	103	20	6	1	$21,413
26	26	86	19	3	1	$25,188
44	30	107	24	6	0	$20,060
28	32	119	27	6	0	$17,599
25	25	82	16	3	1	$25,486
27	20	67	15	3	1	$30,815
36	30	114	26	3	0	$17,893
29	28	89	16	3	0	$22,465
27	15	50	12	3	1	$39,583
34	27	91	18	5	0	$21,648

Expert Solution

Solution:

We use Excel to fit a multiple regression model that predicts the response variable, Earnings per Round, from all of the explanatory variables in the dataset (Age, Events, Rounds, Cuts Made, Top 10s, and Wins).
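For readers without Excel, the same full-model fit can be sketched in plain Python. This is only an illustration, not the method used in the solution: it solves the normal equations directly, the helper name solve_normal_equations is hypothetical, and the data are transcribed from the table in the question (dollar signs and commas removed).

```python
# Columns: Age, Events, Rounds, Cuts Made, Top 10s, Wins, Earnings per Round ($)
DATA = """
45 23 82 18 8 3 80501
32 6 23 6 6 4 251087
37 21 79 20 8 2 65682
28 19 70 18 6 1 69403
47 26 97 24 7 3 48080
22 22 81 19 8 2 57485
26 22 78 19 7 2 56701
36 15 51 12 6 2 84579
35 23 85 19 7 1 46815
35 25 96 24 8 1 41079
36 28 108 27 9 0 33395
38 26 94 23 9 0 36763
31 25 81 16 5 1 37400
38 26 88 20 8 0 34320
31 20 64 14 6 1 45002
38 21 72 16 5 1 37270
31 22 80 18 5 0 32697
43 26 98 22 6 0 26340
28 22 70 14 3 1 36660
38 16 50 11 5 1 50746
30 29 110 25 5 1 22841
37 23 84 21 7 0 29579
41 22 74 16 6 0 32950
34 28 95 19 7 0 25313
34 24 83 19 5 1 28901
27 27 94 21 4 1 24515
44 24 83 19 7 0 27539
39 33 116 24 5 0 19301
39 22 74 15 6 0 29984
26 27 87 18 5 0 25389
36 31 103 20 6 1 21413
26 26 86 19 3 1 25188
44 30 107 24 6 0 20060
28 32 119 27 6 0 17599
25 25 82 16 3 1 25486
27 20 67 15 3 1 30815
36 30 114 26 3 0 17893
29 28 89 16 3 0 22465
27 15 50 12 3 1 39583
34 27 91 18 5 0 21648
"""

rows = [[float(v) for v in line.split()] for line in DATA.strip().splitlines()]
X = [[1.0] + r[:-1] for r in rows]  # leading 1.0 is the intercept column
y = [r[-1] for r in rows]


def solve_normal_equations(X, y):
    """Least-squares coefficients from (X'X) b = X'y via Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        # Partial pivoting: move the largest remaining entry of column c onto the diagonal
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


beta = solve_normal_equations(X, y)
fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
y_bar = sum(y) / len(y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
r2 = 1 - sse / sst

names = ["Intercept", "Age", "Events", "Rounds", "Cuts Made", "Top 10s", "Wins"]
for name, b in zip(names, beta):
    print(f"{name:>10}: {b:,.2f}")
print(f"  R Square: {r2:.4f}")
```

If the transcribed rows match the workbook, the printed coefficients and R Square should be close to the Excel output reported in this solution.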

The Excel regression output for the full model is given below:

Regression Statistics
Multiple R	0.889899769
R Square	0.791921598
Adjusted R Square	0.754089162
Standard Error	18716.83797
Observations	40

ANOVA
	df	SS	MS	F	Significance F
Regression	6	43998116543	7333019424	20.93234451	5.89276E-10
Residual	33	11560560782	350320023.7
Total	39	55558677325

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	133191.7361	27969.62313	4.76201397	3.71311E-05	76287.11022	190096.3619
Age	-217.8285416	550.5505735	-0.395655825	0.694905337	-1337.9321	902.2750166
Events	-15255.38379	4171.90591	-3.656694116	0.000881233	-23743.19014	-6767.577442
Rounds	5162.007669	1717.492414	3.005549037	0.005034762	1667.743098	8656.272241
Cuts Made	-10206.14138	3537.932973	-2.884775222	0.006850496	-17404.1201	-3008.162659
Top 10s	4668.563854	2301.427516	2.028551333	0.050637046	-13.72560943	9350.853317
Wins	15009.16535	3925.005194	3.823986112	0.00055291	7023.682284	22994.64842

In the full model above, two explanatory variables, Age and Top 10s, are not statistically significant: their p-values (0.6949 and 0.0506, respectively) are greater than the 5% significance level (α = 0.05). Note that Top 10s misses the cutoff only narrowly.
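The screening rule applied here (drop any variable whose p-value exceeds α) can be written out as a short Python sketch. The numbers are copied from the full-model coefficient table; the variable names full_model, ALPHA, insignificant, and keep are illustrative only.

```python
# Full-model coefficient table: name -> (coefficient, standard error, p-value)
full_model = {
    "Age": (-217.8285416, 550.5505735, 0.694905337),
    "Events": (-15255.38379, 4171.90591, 0.000881233),
    "Rounds": (5162.007669, 1717.492414, 0.005034762),
    "Cuts Made": (-10206.14138, 3537.932973, 0.006850496),
    "Top 10s": (4668.563854, 2301.427516, 0.050637046),
    "Wins": (15009.16535, 3925.005194, 0.00055291),
}
ALPHA = 0.05  # significance level used in the solution

# The t statistic is the coefficient divided by its standard error
t_stats = {name: coef / se for name, (coef, se, _) in full_model.items()}

# Variables whose p-value exceeds alpha are candidates for removal
insignificant = [name for name, (_, _, p) in full_model.items() if p > ALPHA]
keep = [name for name in full_model if name not in insignificant]

for name, (_, _, p) in full_model.items():
    flag = "drop" if name in insignificant else "keep"
    print(f"{name:>10}: t = {t_stats[name]:7.3f}, p = {p:.4f} -> {flag}")
```

With these inputs, the variables flagged for removal are Age and Top 10s, which matches the decision taken in the solution.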

We therefore remove these two variables from the model and rerun the regression.

Rerunning the regression in Excel with the four remaining variables (Events, Rounds, Cuts Made, and Wins) gives the following output:

Regression Statistics
Multiple R	0.874288986
R Square	0.76438123
Adjusted R Square	0.737453371
Standard Error	19339.57245
Observations	40

ANOVA
	df	SS	MS	F	Significance F
Regression	4	42468010134	10617002534	28.38626048	1.48365E-10
Residual	35	13090667191	374019062.6
Total	39	55558677325

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	144725.6433	23535.56187	6.149232555	4.91279E-07	96945.91281	192505.3737
Events	-15169.03589	4276.771411	-3.546842801	0.001131595	-23851.34338	-6486.728396
Rounds	4666.583541	1743.158061	2.677085713	0.011227519	1127.784563	8205.382518
Cuts Made	-7718.911929	3428.149071	-2.251626685	0.03072514	-14678.42449	-759.3993655
Wins	15911.90544	4023.767478	3.954479359	0.000356073	7743.223232	24080.58765
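As an optional check that is not part of the original solution, the two ANOVA tables can be compared directly. The sketch below recomputes R Square and Adjusted R Square from the reported sums of squares and then computes a partial F statistic for dropping Age and Top 10s together. The closed-form p-value used here is valid only because exactly two variables are dropped (numerator degrees of freedom equal to 2); all names are illustrative.

```python
n = 40                                   # observations
sst = 55558677325                        # total sum of squares (both ANOVA tables)
sse_full, k_full = 11560560782, 6        # residual SS and predictor count, full model
sse_reduced, k_reduced = 13090667191, 4  # residual SS and predictor count, reduced model


def r_squared(sse):
    return 1 - sse / sst


def adj_r_squared(sse, k):
    # Ratio of residual mean square to total mean square
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))


q = k_full - k_reduced      # number of variables dropped (2)
df_resid = n - k_full - 1   # residual degrees of freedom of the full model (33)

# Partial F: increase in residual SS per dropped variable, relative to full-model MSE
partial_f = ((sse_reduced - sse_full) / q) / (sse_full / df_resid)

# For an F distribution with exactly 2 numerator df, P(F > f) = (1 + 2f/d2)^(-d2/2)
assert q == 2, "closed-form p-value below only holds for 2 numerator df"
p_value = (1 + q * partial_f / df_resid) ** (-df_resid / 2)

print(f"Full model:    R2 = {r_squared(sse_full):.4f}, adj R2 = {adj_r_squared(sse_full, k_full):.4f}")
print(f"Reduced model: R2 = {r_squared(sse_reduced):.4f}, adj R2 = {adj_r_squared(sse_reduced, k_reduced):.4f}")
print(f"Partial F({q}, {df_resid}) = {partial_f:.3f}, p = {p_value:.3f}")
```

Under these inputs the partial F statistic is about 2.18 with a p-value of roughly 0.13, so the two variables are not jointly significant at the 5% level, even though Adjusted R Square falls slightly (from about 0.754 to about 0.737) in the reduced model.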

The final (best-fit) regression equation is:

Earnings per Round = 144725.6433 - 15169.03589*Events + 4666.583541*Rounds - 7718.911929*Cuts Made + 15911.90544*Wins
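To show how the fitted equation is used, here is a minimal Python sketch that evaluates it for one golfer. The function name predict_earnings_per_round is hypothetical; the coefficients are those of the reduced model, and the example inputs are the first row of the dataset (23 events, 82 rounds, 18 cuts made, 3 wins).

```python
# Coefficients of the best-fit (reduced) model
COEFFICIENTS = {
    "Intercept": 144725.6433,
    "Events": -15169.03589,
    "Rounds": 4666.583541,
    "Cuts Made": -7718.911929,
    "Wins": 15911.90544,
}


def predict_earnings_per_round(events, rounds, cuts_made, wins):
    """Predicted Earnings per Round (in dollars) from the reduced model."""
    return (COEFFICIENTS["Intercept"]
            + COEFFICIENTS["Events"] * events
            + COEFFICIENTS["Rounds"] * rounds
            + COEFFICIENTS["Cuts Made"] * cuts_made
            + COEFFICIENTS["Wins"] * wins)


# First golfer in the table: observed Earnings per Round is $80,501
predicted = predict_earnings_per_round(events=23, rounds=82, cuts_made=18, wins=3)
residual = 80501 - predicted
print(f"Predicted: ${predicted:,.2f}  Residual: ${residual:,.2f}")
```

For this golfer the model predicts about $87,293 per round, roughly $6,800 above the observed $80,501.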

orchestra answered 3 years ago

