“OJ (Simpson) had to of known he wasn't ever gonna get away with a head that size. Only person u could mistake his silhouette for is stewey griffin.”
-Channing Crowder
The first (and maybe only) major revision of our model predicting NFL wins is here. Keeping in mind that the foremost goal of this project is to maximize the adjusted R² of this multiple regression model, I decided to compare models built on different lengths of history. More specifically, I used the same procedure to select the best model using data going back to 1996, 2001, and 2006. The corresponding adjusted R² values were as follows:
Year | Adj. R² |
1996 | .128 |
2001 | .143 |
2006 | .193 |
As you can see, the model using data from 2006 raised the adjusted R² by .065, or 6.5 percentage points, over the model using data from 1996 (.193 versus .128). That is substantial given that my stated goal is to get the adjusted R² to .3. One quick note: there is another model that could have gotten the value up to .197. However, I made the executive decision to manually include winning last year’s Super Bowl as a variable because the Packers were being predicted very low (as in winning only 5 games low).
Since I was reanalyzing all 30 or so variables, a model with different predictors came out. Thanks to the inadvertent (maybe) advice of a professor, I was able to use a procedure in SAS that selects the combination of variables producing the highest adjusted R². Below is a table comparing the variables in the original model with those in the revised model.
Original Model | Revised Model |
Lost the previous Super Bowl | New Coaching Staff |
Won the previous Super Bowl | Making the Playoffs the Previous Year |
Defensive Take Aways | Offensive Points Scored |
Offensive Passing Yards | Offensive 1st Downs |
Passing Yards Allowed | Offensive Passing Yards |
Rushing TDs Allowed | Turnovers |
Rushing Yards Gained | Offensive Rushing Attempts |
Offensive Plays | Rushing Yards Allowed |
Offensive Rushing Attempts | Won the previous Super Bowl |
Offensive Passing Attempts | Defensive Take Aways |
Offensive Points Scored | |
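For anyone curious what "select the combination of variables that produces the highest adjusted R²" looks like mechanically, here is a minimal Python sketch. It is not the SAS procedure and it does not use the NFL data: the data are synthetic, and the names `solve`, `adjusted_r2`, and `best_subset` are made up for illustration. It simply fits every subset of candidate predictors by ordinary least squares and keeps the subset with the highest adjusted R².

```python
import itertools
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def adjusted_r2(rows, y, cols):
    """Fit OLS (with intercept) on the chosen columns; return adjusted R-squared."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1  # number of fitted parameters, intercept included
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - (sse / (n - k)) / (sst / (n - 1))


def best_subset(rows, y):
    """Try every non-empty subset of columns; keep the best adjusted R-squared."""
    n_cols = len(rows[0])
    candidates = (
        cols
        for size in range(1, n_cols + 1)
        for cols in itertools.combinations(range(n_cols), size)
    )
    return max(candidates, key=lambda cols: adjusted_r2(rows, y, cols))


# Synthetic stand-in data (NOT the NFL data set): 5 candidate predictors,
# of which only columns 0 and 2 actually drive the outcome.
random.seed(0)
rows = [[random.gauss(0, 1) for _ in range(5)] for _ in range(120)]
y = [3.0 * r[0] + 2.0 * r[2] + random.gauss(0, 1) for r in rows]

best_cols = best_subset(rows, y)
best_score = adjusted_r2(rows, y, best_cols)
print(best_cols, round(best_score, 3))
```

With 30 or so candidates, an exhaustive search like this would mean roughly a billion subsets (2^30), so real software has to be smarter than this brute-force loop. Five candidates keep the sketch fast.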
Comparing the two columns above, some variables appear in both models while others were dropped or replaced. Of particular note: new coaching staff became highly predictive. Further, the table below shows that the variable measuring take aways is behaving oddly. That is, it predicts that the more take aways a team generates, the lower its win total. This could be a result of intercorrelations between predictors; however, it has appeared in both models, so I’m concerned it may be measuring some other variable. Below are the regression coefficients:
Parameter Estimates
Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
Intercept | 1 | -9.20641 | 5.54052 | -1.66 | 0.0987 |
New_Coach | 1 | -0.85085 | 0.58044 | -1.47 | 0.1448 |
Playoffs_Prev | 1 | 0.78022 | 0.64109 | 1.22 | 0.2255 |
PPoints | 1 | 0.01734 | 0.00675 | 2.57 | 0.0112 |
PO1stD | 1 | 0.02443 | 0.01297 | 1.88 | 0.0616 |
POPYards | 1 | -0.00092570 | 0.00062871 | -1.47 | 0.1430 |
POTO | 1 | 0.06768 | 0.04225 | 1.60 | 0.1113 |
PDRAtt | 1 | 0.03470 | 0.01195 | 2.90 | 0.0043 |
PDRYards | 1 | -0.00443 | 0.00142 | -3.12 | 0.0022 |
PDTO | 1 | -0.07588 | 0.04272 | -1.78 | 0.0777 |
SB_W_Prev | 1 | 0.79177 | 1.33607 | 0.59 | 0.5543 |
A quick reminder of the interpretation: each value in the Estimate column is the change in Y (wins) for a one-unit increase in that X, holding the other variables constant. For example, for every point scored (PPoints), wins increase by .01734. Another example is the offensive 1st down variable: for every offensive 1st down gained, we can expect to gain .02443 wins. Note how this interpretation doesn’t make sense for take aways (PDTO), whose negative estimate says each take away costs .07588 wins.
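To make that interpretation concrete, here is a small Python sketch that multiplies a coefficient by a change in its variable. The coefficients are copied from the table above; the function name `change_in_wins` and the example changes (50 points, 20 first downs, 10 take aways) are hypothetical and only there for illustration.

```python
# Coefficients copied from the Parameter Estimates table in this post.
COEF = {
    "New_Coach": -0.85085,
    "Playoffs_Prev": 0.78022,
    "PPoints": 0.01734,
    "PO1stD": 0.02443,
    "POPYards": -0.00092570,
    "POTO": 0.06768,
    "PDRAtt": 0.03470,
    "PDRYards": -0.00443,
    "PDTO": -0.07588,
    "SB_W_Prev": 0.79177,
}


def change_in_wins(variable, delta):
    """Predicted change in wins when `variable` moves by `delta`,
    with every other variable held fixed."""
    return COEF[variable] * delta


# Hypothetical changes, chosen only to show the scale of each effect.
print(round(change_in_wins("PPoints", 50), 3))  # 50 more points scored
print(round(change_in_wins("PO1stD", 20), 3))   # 20 more offensive 1st downs
print(round(change_in_wins("PDTO", 10), 3))     # 10 more take aways
```

Fifty extra points work out to a bit under one extra win, while ten extra take aways come out as roughly three quarters of a win lost, which is the odd behavior mentioned above.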
Below are the model fit statistics. Again I won’t go into detail unless someone asks me to.
Analysis of Variance
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 10 | 376.91710 | 37.69171 | 4.81 | <.0001 |
Error | 149 | 1168.07665 | 7.83944 | | |
Corrected Total | 159 | 1544.99375 | | | |
Dependent Mean | 7.99375 | Adj R-Sq | 0.1932 |
Coeff Var | 35.02612 | | |
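For anyone who does want a little detail, here is a short Python check of how the F value, adjusted R², and coefficient of variation follow from the sums of squares. The numbers are copied from the output above and the variable names are my own. The unadjusted R² and root MSE do not appear in the output above, so the values computed here are derived, not reported by SAS.

```python
# Sums of squares and degrees of freedom copied from the ANOVA output.
ss_model, df_model = 376.91710, 10
ss_error, df_error = 1168.07665, 149
ss_total, df_total = 1544.99375, 159
dependent_mean = 7.99375

ms_model = ss_model / df_model   # mean square for the model
ms_error = ss_error / df_error   # mean square error

f_value = ms_model / ms_error
r2 = 1 - ss_error / ss_total                     # derived, not in the output
adj_r2 = 1 - ms_error / (ss_total / df_total)    # penalizes extra predictors
root_mse = ms_error ** 0.5                       # derived, not in the output
coeff_var = 100 * root_mse / dependent_mean

print(f"F = {f_value:.2f}")
print(f"R-squared = {r2:.4f}, adjusted R-squared = {adj_r2:.4f}")
print(f"Coeff Var = {coeff_var:.3f}")
```

The adjusted figure is lower than the plain R² because it divides each sum of squares by its degrees of freedom, so every extra predictor has to earn its place.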
Finally, I may update this twice more by the end of the week. The first update would add a variable for the number of playoff games won previously. I may not have time to collect that data. However, I will definitely be reviewing this model for outliers and influential observations, so expect some predictions to change. Below are the current predictions. Like before, I weighted the predictions by order of finish in the division; however, I changed the weights (now 1st place gains 1.5 wins, 2nd gains 1, 3rd loses 1, and 4th loses 1.5). The last column is what I would bet using the information from this model and common sense.
Division | Prediction | Lines 9/5* | My Bet |
AFC East | |||
Dolphins | 8 | 7.5 | Under |
Bills | 6 | 5.5 | Over |
Patriots | 11 | 11.5 | Under |
Jets | 10 | 10 | Over |
AFC North | |||
Ravens | 10 | 10 | Under |
Bengals | 6 | 5.5 | Over |
Browns | 6 | 7 | Under |
Steelers | 9 | 10.5 | Under |
AFC South | |||
Texans | 11 | 9 | Over |
Colts | 12 | N/A | N/A |
Jags | 9 | 6.5 | Over |
Titans | 6 | 6.5 | Under |
AFC West | |||
Broncos | 6 | 6 | Over |
Chiefs | 10 | N/A | N/A |
Raiders | 7 | 6.5 | Over |
Chargers | 12 | 10 | Over |
NFC West | |||
49ers | 6 | 7.5 | Under |
Seahawks | 9 | 6 | Over |
Rams | 4 | 7.5 | Over |
Cardinals | 8 | 7.5 | Over |
NFC South | |||
Falcons | 9 | 10 | Under |
Panthers | 4 | 4.5 | Under |
Buccaneers | 5 | 8 | Under |
Saints | 11 | 10 | Under |
NFC North | |||
Lions | 6 | 8 | Under |
Packers | 9 | 11.5 | Under |
Vikings | 5 | 7 | Under |
Bears | 8 | 8 | Over |
NFC East | |||
Giants | 9 | 9 | Over |
Eagles | 11 | 10.5 | Over |
Redskins | 5 | 6 | Under |
Cowboys | 6 | 9 | Under |
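The division-finish adjustment described before the table can be sketched in a few lines of Python. Only the four weights come from this post. The team names and raw win totals are hypothetical placeholders (the unadjusted predictions are not listed here), and the finishing place is passed in directly rather than derived from anything.

```python
# Wins added or subtracted by place in the division (weights from this post).
FINISH_ADJUSTMENT = {1: 1.5, 2: 1.0, 3: -1.0, 4: -1.5}


def adjust_for_finish(raw_wins, place):
    """Shift a raw predicted win total by the weight for `place` (1 to 4)."""
    return raw_wins + FINISH_ADJUSTMENT[place]


# Hypothetical division: (team, raw predicted wins, place in division).
hypothetical_division = [
    ("Team A", 9.4, 1),
    ("Team B", 8.1, 2),
    ("Team C", 7.6, 3),
    ("Team D", 6.2, 4),
]

for team, raw_wins, place in hypothetical_division:
    print(team, round(adjust_for_finish(raw_wins, place), 1))
```

The four weights sum to zero, so the adjustment spreads teams out within a division without changing the division's total predicted wins.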
The Colts are still off the table because of the Peyton situation, and for some reason the Chiefs are as well. Not to toot my own horn, but look at how close the predictions are to the set lines. My model’s prediction lands less than one game from the line for 46.7% of teams (14 of 30) and matches the line exactly for 16.7% (5 of 30).
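Those two percentages can be recomputed from the table with a short Python tally. The (prediction, line) pairs are copied from the table above, leaving out the Colts and Chiefs because they have no line. The name `PICKS` is just a label for this sketch.

```python
# (model prediction, betting line) for each team with a posted line.
PICKS = {
    "Dolphins": (8, 7.5), "Bills": (6, 5.5), "Patriots": (11, 11.5),
    "Jets": (10, 10), "Ravens": (10, 10), "Bengals": (6, 5.5),
    "Browns": (6, 7), "Steelers": (9, 10.5), "Texans": (11, 9),
    "Jags": (9, 6.5), "Titans": (6, 6.5), "Broncos": (6, 6),
    "Raiders": (7, 6.5), "Chargers": (12, 10), "49ers": (6, 7.5),
    "Seahawks": (9, 6), "Rams": (4, 7.5), "Cardinals": (8, 7.5),
    "Falcons": (9, 10), "Panthers": (4, 4.5), "Buccaneers": (5, 8),
    "Saints": (11, 10), "Lions": (6, 8), "Packers": (9, 11.5),
    "Vikings": (5, 7), "Bears": (8, 8), "Giants": (9, 9),
    "Eagles": (11, 10.5), "Redskins": (5, 6), "Cowboys": (6, 9),
}

# Absolute gap between the prediction and the line for every team.
gaps = [abs(prediction - line) for prediction, line in PICKS.values()]
n_teams = len(gaps)

exact = sum(gap == 0 for gap in gaps)      # prediction equals the line
under_one = sum(gap < 1 for gap in gaps)   # strictly less than one game off
within_one = sum(gap <= 1 for gap in gaps) # one game off or less

print(f"exact: {exact}/{n_teams} = {100 * exact / n_teams:.1f}%")
print(f"less than one game off: {under_one}/{n_teams} = {100 * under_one / n_teams:.1f}%")
print(f"one game off or less: {within_one}/{n_teams} = {100 * within_one / n_teams:.1f}%")
```

The 46.7% figure corresponds to gaps strictly smaller than one game. Counting the four teams that are exactly one game off as well would bring the share to 18 of 30.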
Finally, I have some announcements about future plans. First, I’ll be keeping a summary of the bets I would’ve made if I wasn’t broke as hell and actually trusted the model. My expectation is that this summary will show I’ve lost money when the season is done. Secondly, due to recommendations from people who have read this blog, I am going to attempt to do the same for weekly lines. Kyle Kelly and Matthew Cornelius Wojay have graciously volunteered to help me create a model and collect data from weekly games. I’ve also had colleagues offer to help me with data analysis on weekly lines. Weekly line prediction is slated to start in Week 3, so look forward to that.
Any Thoughts?