Question

I'm using 2005 NFL stats to build a multiple linear regression model with winning percentage as the dependent variable. My question is: which variables are most significant in determining an NFL team's capacity to win? My independent variables include passing yards, rushing game, defense, and field goals, and I'm also considering adding defensive stats to the regression. How do I complete the introduction and model subtopics for my presentation?

I. Introduction

A. Topic: Select a topic of research in an area of applied science in which you are interested. Make sure to get instructor approval before proceeding. Use research literature to provide an introductory summary of the topic that establishes your interest and expertise within the area of study.

B. Research Question: Formulate an analytical, researchable question based on your topic. The scope of your research question should be reasonable given the time constraints of the course.

C. Information: Gather and summarize applicable data, research, or other information that you intend to use in the creation and analysis of your model. For example, you might first consider principles or data sets to inform the creation and analysis of your model.

D. Assessment: Assess the appropriateness of the information you gathered. How will the information help you create an effective mathematical model to address your research question? Are there any underlying assumptions or limitations in the information you gathered?

II. Model

A. Selection: Select a model type to create. Defend your selection, comparing it to other types of models. In other words, why did you choose your model type? Does the model type you chose have any limitations? What comparative tests can you perform to support your model choice? Use research to support your model type selection.

B. Creation: Create the model using the information you gathered. Remember to be cognizant of the research question and to use appropriate mathematical tools, technology, theoretical underpinnings, and/or data, if applicable.

C. Process: Explain the process you used to build your model. Include your reasoning for specific choices and decisions you made while building your model. Use research to support your choices.

D. Tools: What tools and techniques did you use to create your model? Why are these tools and techniques appropriate for the information you gathered, and the model type you selected? Be sure to provide support for your choices.

E. Analysis: Analyze the results of your model to determine whether the model fits the research question and information you gathered. How well do the model results fit the question and information? Consider including computational tests or graphical displays to support your argument.

F. Limitations: What limitations does your model have? Use algorithmic, tabular, and graphical displays to articulate the limitations of your model.

G. Approach: Explain the approach that you took to answer your research question. Be sure to fully explain the process and steps you used to achieve your results.

H. Applicability: Articulate the purpose for your model. How applicable is the model to the research question? How does the model help you answer your research question? How well does it align to the research

Solutions

Expert Solution

You can solve the problem using Microsoft Excel and its Data Analysis tool (part of the Analysis ToolPak add-in, which must be enabled).

Data Collection

Before making any predictions, you must first collect your data. For NFL data, Pro Football Reference offers an abundant supply.

Determining the Relevant Variables

Too much irrelevant data can be a problem. Importing the Team Offense and Defense tables from Pro Football Reference into Microsoft Excel gives you 25 columns of statistics, so you need to narrow the data down to the variables you consider most relevant to a team's ability to win.

What are the most important variables that are used in determining an NFL team's ability to win? The answer to this question varies depending on your opinion. Personally, I feel that these variables are the best determinants of an NFL team's ability to win:

  • Margin - Difference between Points For (PF) and Points Against (PA)
  • Sc% OFF - Percentage of the team's offensive drives ending in a score
  • TO% OFF - Percentage of the team's offensive drives ending in a turnover
  • YdsPen OFF - Penalty yards committed by the offense
  • Sc% DEF - Percentage of opponent drives ending in a score against the team
  • TO% DEF - Percentage of opponent drives ending in a turnover forced by the defense
  • YdsPen DEF - Penalty yards committed by the defense

Using only these variables, you can now transform your data into a less intimidating table. By carefully studying the variables, you may conclude that the Margin variable is determined by the other six variables. Therefore, the dependent variable of my data set is Margin and the independent, a.k.a. explanatory, variables are Sc% OFF, TO% OFF, YdsPen OFF, Sc% DEF, TO% DEF, YdsPen DEF. This information will be important for the next step.
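The reduction described above can also be sketched outside Excel. The following is a minimal Python illustration, assuming the imported tables have already been read into a list of dictionaries; the team names, the extra "Rush Att" column, and every number are made up for the example and are not real 2005 statistics.

```python
# Hypothetical rows standing in for the imported Team Offense/Defense tables.
# All values are invented for illustration only.
raw_rows = [
    {"Team": "Team A", "PF": 400, "PA": 300, "Sc% OFF": 38.0, "TO% OFF": 10.0,
     "YdsPen OFF": 800, "Sc% DEF": 28.0, "TO% DEF": 15.0, "YdsPen DEF": 850,
     "Rush Att": 450},
    {"Team": "Team B", "PF": 280, "PA": 350, "Sc% OFF": 29.0, "TO% OFF": 14.0,
     "YdsPen OFF": 900, "Sc% DEF": 34.0, "TO% DEF": 11.0, "YdsPen DEF": 780,
     "Rush Att": 410},
]

# The six explanatory variables chosen in the text.
KEEP = ["Sc% OFF", "TO% OFF", "YdsPen OFF", "Sc% DEF", "TO% DEF", "YdsPen DEF"]


def reduce_table(rows):
    """Keep only the chosen variables and add Margin = PF - PA."""
    reduced = []
    for row in rows:
        slim = {"Team": row["Team"], "Margin": row["PF"] - row["PA"]}
        slim.update({name: row[name] for name in KEEP})
        reduced.append(slim)
    return reduced


table = reduce_table(raw_rows)
for row in table:
    print(row)
```

Each reduced row then holds the team name, the dependent variable (Margin), and the six independent variables, which is the layout the regression step expects.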

How to Make Use of the Data

With this data, you need to establish a relationship between your variables that will serve as the formula for computing your ratings. Linear regression models the relationship between a dependent variable and one or more independent variables in a data set. Since this model has one dependent variable explained by several independent variables, multiple linear regression is the method that we will use.
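Written out, a multiple linear regression on these variables would take roughly the following form. The β symbols and the error term ε are standard textbook notation introduced here for illustration; they do not appear in the original answer.

```latex
\text{Margin} = \beta_0 + \beta_1\,\text{ScOFF} + \beta_2\,\text{TOOFF} + \beta_3\,\text{YdsPenOFF}
              + \beta_4\,\text{ScDEF} + \beta_5\,\text{TODEF} + \beta_6\,\text{YdsPenDEF} + \varepsilon
```

Here β0 is the intercept and each remaining β is the change in Margin associated with a one-unit change in that variable when the others are held fixed.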

In Microsoft Excel, you can run a linear regression by going to the Data tab, clicking Data Analysis, and scrolling down to Regression. The Input Y Range (dependent variable) in my model is the Margin column. The Input X Range (independent variables) covers the columns containing Sc% OFF, TO% OFF, YdsPen OFF, Sc% DEF, TO% DEF, YdsPen DEF, which need to sit in adjacent columns. Once your linear regression is set up, simply press OK to see your results.
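For readers who want to see what the Regression tool computes, here is a minimal sketch of ordinary least squares in plain Python. It is not Excel's implementation; it solves the normal equations directly, uses only two predictors to stay short, and the data are made up (Margin is generated from a known formula so the fit can be checked). All function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(x_rows, y):
    """Return [intercept, b1, ..., bk] minimising the squared error."""
    design = [[1.0] + list(row) for row in x_rows]  # leading 1.0 = intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Fitted value for one row of predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def r_squared(coefs, x_rows, y):
    """Share of the variation in y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(coefs, row)) ** 2 for row, yi in zip(x_rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up (Sc% OFF, TO% OFF) pairs; Margin follows -50 + 4*Sc - 6*TO exactly.
x_rows = [(30.0, 12.0), (35.0, 10.0), (40.0, 14.0), (28.0, 15.0), (45.0, 9.0)]
y = [-50 + 4 * sc - 6 * to for sc, to in x_rows]

coefs = fit_ols(x_rows, y)
r2 = r_squared(coefs, x_rows, y)
print(coefs, r2)
```

Because the toy Margin values are an exact linear function of the predictors, the recovered coefficients should match the generating formula and R Squared should be essentially 1. Real data will not fit this cleanly.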

The results of this linear regression were good. The regression output reported an R Squared value of 0.9077, which means the six independent variables explain about 91% of the variation in Margin. Using the coefficients in the regression summary, this is the formula that I will use to determine each team's rating:

Rating = A*ScOFF - B*TOOFF - C*YdsPenOFF - D*ScDEF + E*TODEF + F*YdsPenDEF

Here is how you can read the formula, holding the other variables constant in each case:

  • Every 1 percentage point increase in offensive scoring percentage increases the team rating by A
  • Every 1 percentage point increase in offensive turnover percentage decreases the team rating by B
  • Every 1 yard increase in offensive penalty yards decreases the team rating by C
  • Every 1 percentage point increase in the opponents' scoring percentage decreases the team rating by D
  • Every 1 percentage point increase in defensive turnover percentage increases the team rating by E
  • Every 1 yard increase in defensive penalty yards increases the team rating by F (a positive value for this coefficient is counter-intuitive)
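The reading of the coefficients above can be checked with a small Python sketch of the rating formula. The values assigned to A through F below are placeholders, since the original answer does not report the fitted coefficients, and the input statistics are likewise invented.

```python
# Placeholder magnitudes for A..F; the source does not give the real values.
COEFS = {"A": 5.0, "B": 3.0, "C": 0.01, "D": 5.0, "E": 3.0, "F": 0.01}


def rating(sc_off, to_off, ydspen_off, sc_def, to_def, ydspen_def, c=COEFS):
    """A*ScOFF - B*TOOFF - C*YdsPenOFF - D*ScDEF + E*TODEF + F*YdsPenDEF."""
    return (c["A"] * sc_off - c["B"] * to_off - c["C"] * ydspen_off
            - c["D"] * sc_def + c["E"] * to_def + c["F"] * ydspen_def)


# Invented team statistics used as a baseline.
base = rating(35.0, 12.0, 800, 30.0, 13.0, 850)

# Raise only the offensive scoring percentage by one point.
delta_sc_off = rating(36.0, 12.0, 800, 30.0, 13.0, 850) - base

# Raise only the offensive turnover percentage by one point.
delta_to_off = rating(35.0, 13.0, 800, 30.0, 13.0, 850) - base

print(delta_sc_off, delta_to_off)
```

Changing one input by a single unit while leaving the rest alone shifts the rating by exactly that variable's coefficient, with the sign shown in the formula.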

Ranking Each Team

With a formula in place, all you have to do now is calculate the rating of each team by inputting the variable data into the rating formula. The VLOOKUP() and SUM() functions in Microsoft Excel make this an easy task. Now, you can rank the teams based on their rating. The higher the rating, the better the rank.
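As a rough equivalent of the ranking step outside Excel, the sketch below sorts teams by rating and assigns ranks. The ratings are made-up numbers, `rank_teams` is a hypothetical helper, and giving tied teams the same rank is an assumption of this example rather than something the original answer specifies.

```python
# Invented ratings for illustration only.
ratings = {"Team A": 28.5, "Team B": -4.0, "Team C": 12.25, "Team D": 12.25}


def rank_teams(ratings):
    """Return (rank, team, rating) tuples, best rating first; ties share a rank."""
    ordered = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
    ranked = []
    for position, (team, value) in enumerate(ordered, start=1):
        if ranked and value == ranked[-1][2]:
            rank = ranked[-1][0]  # same rating as the previous team
        else:
            rank = position
        ranked.append((rank, team, value))
    return ranked


ranked = rank_teams(ratings)
for rank, team, value in ranked:
    print(rank, team, value)
```

The team with the highest rating gets rank 1, matching the rule that a higher rating means a better rank.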

Predictions & Results

You can predict results using the final ranking table you have obtained.

Conclusion

Obviously, there is no perfect method for predicting the outcome of a football game or any sports event for that matter. There are simply too many unpredictable variables to account for. However, by choosing your variables wisely it is evident that you can make a quality prediction.

