Weekly game probabilities are available now at the nytimes.com Fifth Down. This week I also look at how the Chargers can be so dominant statistically yet have only two wins to show for it. It's something more in-depth than my usual lead-ins for the game probabilities.

Here's an excerpt along with a graph I did not include in the original post:

"Successful plays are not enough. *Consecutive* successes are required to win.

...Two equal teams could each have 12 first downs in a game. One team could have three drives of four consecutive first downs, each leading to a touchdown, and the rest of its drives could be three-and-outs. The other team could have 12 drives consisting of one first down followed by a punt. Both teams could have equal yards, first downs and efficiency stats, and yet one team could win, 21-0. It’s easy to imagine a game in which one team has many more first downs and yards, but still loses. Could something like this bunching effect be cursing the Chargers?

It’s a given that N.F.L. offenses tend to score in proportion to their yards gained. It’s actually an extremely tight correlation, and the best-fit estimate of a team’s points per game is to take just under 10 percent of its yards per game and subtract 10. For the Chargers, who lead the N.F.L. with 433 yards gained per game, we’d expect the offense to score about 32 points per game, but they’ve actually scored only 26."
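The linear rule of thumb in the excerpt ("just under 10 percent of yards per game, minus 10") is easy to sketch. The slope used here (0.097) is an assumed value chosen to match the description, not the actual fitted coefficient:

```python
def expected_points_per_game(yards_per_game, slope=0.097, intercept=-10.0):
    """Estimate offensive points per game from yards per game,
    using an assumed linear fit (slope and intercept are illustrative)."""
    return slope * yards_per_game + intercept

# Chargers: 433 yards/game -> roughly 32 expected points/game,
# versus the 26 they have actually scored.
print(round(expected_points_per_game(433)))  # 32
```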

## NYT: Game Probabilities - Week 7

By Brian Burke, published on 10/21/2010 in analysis, predictions, research


Continuing what I said yesterday:

if the model gives the Patriots a 15% chance to win this weekend, and the multi-million-dollar betting market says the Patriots are +125 (44%), the model is busted. It needs to be massaged with some qualitative input.
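The +125 → 44% figure comes from the standard American moneyline conversion, which can be sketched as follows (vig ignored for simplicity):

```python
def moneyline_to_prob(ml):
    """Convert an American moneyline to its implied win probability
    (ignoring the bookmaker's vig)."""
    if ml > 0:                         # underdog, e.g. +125
        return 100.0 / (ml + 100.0)
    else:                              # favorite, e.g. -150
        return -ml / (-ml + 100.0)

print(round(moneyline_to_prob(125), 3))  # 0.444 -- the 44% quoted above
```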

That's an interesting analysis given that all of San Diego's losses have been 8 points or less.

Vegas is also adjusted for what the public believes...the stats don't consider the public.

I'll say it another way: the line isn't really reflective of "Vegas"; it's much more reflective of the collective knowledge (i.e., $$) of every NFL bettor out there. It's a nearly efficient market, as can be seen by the number of people who can consistently beat closing lines (very few). A discrepancy between the wisdom of the entire public and this model is very hard to believe.

I meant to say, "a discrepancy this large..."

mwh-Put a gun to my head and I would not say SD is an 85% favorite over the Pats. However, consider their respective passing games: SD offense 7.9 net YPA (#1); NE defense 6.9 net YPA (#27).

Throw in that the game is at SD, and this suggests the betting markets are saying NE is the slightly better team. Bettors are basically banking that turnover rates and special teams break-downs will continue on pace, and that's a bad bet.

So while I'd agree NE's chances are better than 15%, it's almost certainly a lot less than 45%.

@makewayhomer,

Your point only makes sense if the betting market is rational. Economists often assume markets are rational, but behavioral economists have shown in recent years that these assumptions are incorrect. A stat that performs better than Vegas would be but one more piece of evidence in the trend.

"Bettors are basically banking that turnover rates and special teams break-downs will continue on pace, and that's a bad bet."

Turnover rates, yes. But bad special teams play? Absolutely liable to continue. It's not just a matter of luck, it's a combination of bad coaching and bad players.

I think a lot of readers are struggling over your use of the word "luck". Maybe another word like "unpredictable" is better. The idea is exactly the same, but people tend to view any human-influenced event as inherently skill based. I can see people scoffing at the idea that a fumble was "luck", but recognizing that it was unpredictable.

I took the liberty of evaluating your week 6 picks by mean squared error and comparing to several other simple systems. The results are not very good, but I realize this was not an ordinary week. I haven't checked, but I suspect the week 5 picks would have come out on top.

Vegas Picks (100% for favorite): .1818

Vegas moneyline (converted to %): .2128

Home picks (at 57%): .2485

Zero info picks (all 50%): .25

AdvancedNFLStats: .2813
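For reference, the metric behind these numbers (mean squared error of probability forecasts, also known as the Brier score) can be computed as below. This is a generic sketch of the metric, not a reproduction of the commenter's figures:

```python
def mean_squared_error(probs, outcomes):
    """Brier-style MSE: probs are predicted win probabilities for a
    chosen side; outcomes are 1 if that side won, else 0. Lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# A zero-information forecaster (all 50%) always scores exactly 0.25,
# matching the baseline in the list above.
print(mean_squared_error([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 1]))  # 0.25
```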

@tgt

"A stat that performs better than Vegas would be but one more piece of evidence in the trend"

It would be. However, all kinds of results over the years have shown that beating the NFL point spread over any long sample size is extremely difficult or impossible, -110 juice considered. That means the point spreads are pretty damn efficient, if perhaps not 100% perfectly so, and any discrepancy of 15% vs. 44%, to me, points to systematic error. Now, Brian has said that he thinks 15% is probably too low as well, but I think the lowest it could possibly be is 30-35% or so.

Accuscore's predictions are good. Just what Vegas doesn't want: a lopsided bet. So, is the NFL rigged? I hope not, but if it is, people can be predictable too. I don't know if Vegas shows the balance of the bets on the weekly games before the games actually start, but I would like to see that before I laid my money down. Would an accurate prediction system force almighty Vegas to cheat to win? I think so. You?

Does your calculation of yards factor in special teams? We know the Chargers have been brutal on special teams this year, and in fact, your remark about adding and subtracting those 8 pts/game is virtually identical to considering the impact of special teams (if you didn't in your analysis). Special teams has surrendered 30 pts in their 3 losses. In fact, if you simply consider the average impact of each special teams play (such as, an average punt return, kick, etc.) then San Diego is 5-0. A lot of hypotheticals, but special teams has been massive for San Diego this season.

I think the discrepancy between this model and a relatively efficient market raises very interesting questions. It's not impossible that as of today this model catches things that the wisdom of the crowds does not. If so, the market would react very quickly and would incorporate the considerations found in the model into the odds. For now, the "market" seems to be discounting this model for one reason or another.

On a related note - any objective model that closely mirrored vegas odds would be an impressive analytical feat.

@makewayhomer

Basically, your position is: if it hasn't been done before, it won't be done now. Also, Vegas builds a lot of wiggle room in, so you have to beat it badly to win consistently, which somehow supports the Vegas line being more accurate than this stat.

I don't follow your "logic."

@Aaron Gordon,

Brian ignores special teams as being impossible to predict confidently. It's a known flaw in his work.

@Anonymous

The Vegas market is not efficient. The casinos are, but the bettors are not. Also, without knowing the confidence levels of the various advanced stats, it is actually more rational and efficient for bettors not to follow the new models.

"Brian ignores special teams as being impossible to predict confidently. It's a known flaw in his work."

Speak for yourself. I challenge anyone to measure ST performance meaningfully and be able to show that previous ST outcomes predict future ST outcomes in a season. I would say that the flaw lies in the fallacy that week-to-week ST performance is predictable.

Sure, by the end of the season some teams will have more missed FGs, and some will have a few kick returns, and some will have a blocked punt or two. But this would also be true if ST outcomes were overwhelmingly random.

If I take 32 pennies, and flip them each 16 times, some pennies will appear to be far more capable than others at landing on heads. That does not mean that you can predict how each penny will land on the next flip.
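The penny analogy above is easy to simulate. A quick sketch (the seed is arbitrary, chosen only for reproducibility):

```python
import random

random.seed(1)  # arbitrary seed so the run is reproducible
# 32 identical "pennies", 16 flips each -- all pure 50/50 chance.
heads = [sum(random.random() < 0.5 for _ in range(16)) for _ in range(32)]

# Despite identical coins, natural variance makes some look far more
# "capable" of landing heads than others.
print(sorted(heads))
```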

@Brian,

Your ratings do ignore special teams, you do think they are impossible to predict confidently, and special teams do have an impact on the game. I see a flaw in your ratings. Your contention that flawlessness is impossible does not impact the existence of said flaw.

More interesting than the knee-jerk defensiveness is the reasoning behind your belief.

The largest ST plays (touchdowns and blocks) cause a huge impact to the game without being predictable each week. Sure. Why that means you should ignore all special teams plays is inscrutable.

First off, some features of ST are predictable each week: kickoff distance (touchbacks) and punting distance. Similarly, median kickoff return distance and punt return distance are correlated from week to week.

The real difficulty I see is that you insist on rate stats. While offensive and defensive rate stats are easy to create, pretty well correlated with winning, and relatively consistent, ST rate stats are pretty much junk. The main reason rate stats work for offense and defense is the size of the sample. One 50-yard pass a week does not throw off the sample and reflects something reasonably likely to occur each game, but the same number of 50-yard returns would be one every four games. Moreover, as special teams performance diverges from the mean, the rate stats lose predictive value.

I'm not sure what the best solution is. Maybe you could try tying EPA of special teams plays in with the rate stats for offense and defense? Maybe use median values instead of averages for special teams plays? Maybe vary the strength of each special teams factor based on how often that rating is expected in a given game (e.g., if two teams have awesome offenses and no defenses, punting and punt returns would be set low while kickoffs and kick returns would be set high)? Maybe include more uncertainty around teams who have more big special teams plays (positive and negative), and less around teams with consistent special teams?

Like you, I don't think STs rate stats are worth even the 32 pennies you referenced, but that doesn't mean that there aren't other stats that could fill that hole.

tgt-Allow me to rephrase your comment from "It's a known flaw in his work." to:

"I think it might be a flaw."

Let's keep separate things that are 'known' and accepted from the things that are personal theories.

Brian doesn't ignore special teams, as his post about my favorite punter, Zoltan Mesko, proved. He also understands how important it can be, as the WPA analysis of his punt suggested. That is why I was so puzzled to see it not even mentioned in this post about the Chargers' WP this weekend against those very Patriots. It is just such an obvious consideration. One team has consistently given up points in ST play through five games, while the other has consistently been the beneficiary of ST points. Even if he gave it lip service to say "...but ST is completely unpredictable like fumbles."

I think one of the biggest things to keep in consideration for the Chargers this week against the Pats is the health of Gates, Floyd, and Naanee (with V-Jax already out). Can their passing efficiency remain so high with so many receivers out? That information is being taken into consideration in the Vegas odds and not in the model. That isn't a flaw in the model but instead something that is a given and can help explain a certain percentage of the variance between the two.

That said, most people in my confidence level pool picked the Pats and I picked the Chargers (although with a lower confidence than the model would suggest). Will be interesting to see if this great team on paper can show up despite some injuries.

The Vegas books have a bias to account for the proximity to the CA bettors, and especially the So. Cal market. So, online its Chargers -1, in my local book its Chargers -3.

Like everyone, I was curious about how this model compares to Vegas because this is the best way to gauge success rate. Since Brian has predictions archived back to 2007, there is a pretty good base to work with for anyone willing to dedicate the time. So I did.

Based on this post - http://www.advancednflstats.com/2010/10/how-accurate-is-prediction-model.html - the prediction model is too confident with large spreads as the highest accuracy occurs within the 25-75% range. To give his model a fair chance, I disregarded all games that had a >75.0% favorite.

For the next step I converted all Vegas game lines to win percentage to make for an easy comparison. Since low percentage differences do little to separate the two methods, difference thresholds of 10% and 20% for Weeks 4-10 and 8% and 15% for Weeks 11-17 were used (the win percentages from Vegas and Brian's model converge as the season progresses, which is why the thresholds fall later in the year). This narrows down the number of games used to an average of 2-3 per week.
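Chuck doesn't say how he converted point spreads to win percentages. One common approach (a sketch only, not necessarily his method) models the margin of victory as normally distributed around the spread; the standard deviation of about 13.86 points is an assumed historical figure:

```python
from math import erf, sqrt

def spread_to_win_prob(spread, sigma=13.86):
    """Approximate win probability from a point spread via a normal
    model of the victory margin. `spread` is positive when the team
    is favored; sigma is an assumed std. dev. of NFL margins."""
    return 0.5 * (1 + erf(spread / (sigma * sqrt(2))))

# A 3-point favorite comes out around 58-59%.
print(round(spread_to_win_prob(3), 2))
```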

When Brian's predictions were different from Vegas, this is how the prediction model did week-by-week:

10.0-19.9%/8.0-14.9%

04: 5-7

05: 7-2

06: 0-8

07: 4-0

08: 1-3

09: 4-2

10: 1-5

11: 5-0

12: 6-3

13: 5-0

14: 5-2

15: 5-5

16: 6-5

17: 2-0

Total: 56-42 (57.1%)

20.0+%/15.0+%

04: 4-1

05: 4-4

06: 3-1

07: 2-1

08: 1-2

09: 2-2

10: 0-4

11: 0-2

12: 2-3

13: 1-3

14: 1-4

15: 0-0

16: 4-1

17: 1-2

Total: 25-30 (45.5%)

Combined Total: 81-72 (52.9%)

Since a +9 win record over a 153-game period falls within common statistical variability for something that is 50/50, it can be said that the Game Prediction Model has thus far shown no skill over Vegas.

To see if a small percentage advantage is legitimate, we would look for a higher win percentage in the "bigger difference" games. The fact that the biggest differential games (20%+/15%+) came out with a -5 overall record in 55 attempts does more to show that this model falls close to the 50/50 range.
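For what it's worth, the claim that 81-72 over 153 games is statistically unremarkable can be checked with a simple two-sided binomial test against a fair-coin null (a sketch, not anything Chuck necessarily ran):

```python
from math import comb

def two_sided_binom_p(wins, n, p=0.5):
    """Probability of a record at least as far from n*p as `wins`,
    under a null hypothesis of win probability p."""
    dev = abs(wins - n * p)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1) if abs(k - n * p) >= dev)

# 81-72 over 153 games: the p-value is roughly 0.5 -- nowhere near
# significant evidence against a 50/50 coin.
print(round(two_sided_binom_p(81, 153), 2))
```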

In compiling this data, I did notice that there were two periods that performed better than the rest - Most of the 2007 Season, and in the later weeks of last Season, 2009. It would be interesting to know if the model has undergone any changes since the beginning. Also, is a 7% win percentage factored into the home team (it actually should be 8%, but the former is what Vegas uses)? That variable could provide a significant difference in how this compares.

This has quickly become my favorite site to use for NFL information so please don't take this post to mean anything negative. My main motivation for doing the research and posting these statistics is in hopes that Brian can re-evaluate his model and improve it or create something better. But as it stands now, Vegas is the better alternative for weekly game predictions.

@Brian

Which of these premises do you not agree with:

* special teams affect who wins each game,

* special teams are not factored into the game probabilities, or

* a win probability system is flawed if it ignores factors which contribute toward winning?

You've shown support for the first two and the last one is just the definition of a flawed system.

I stand by my objective reasoning. The only thing possibly subjective is my definition of a flawed system, and I doubt many people would disagree with it. If that isn't a flaw, what would be?

As an aside, flawed does not mean bad. All the nfl, nba, and mlb ratings I've seen are flawed. All attempts to model the stock market and our economy are flawed. Flaws are. We just need to handle them as best we can.

Instead of attempting to sweep an obvious flaw under the rug, you should explain the impact of that flaw. It looks to me like the following occurs:

* At the special teams margins, your game probabilities are unreliable.

* At the special teams median, your game probabilities slightly underestimate confidence.

Much like Chuck above me, I'm just trying to be helpful.

I'm guessing the gist of the original post is to say that the Chargers' statistical performance indicates that they are a better team than their record. I don't think that's a surprise to anyone. Their poor turnover and special teams numbers will regress to the mean in the remainder of the season and their record will recover. The only disappointment is that they've played the softest part of their schedule (Chiefs, Jags, Cardinals, Seahawks, Raiders, and Rams) and are only 2-4 when they "should" be 6-0. But when you go 14-2 and fire your coach, you gotta expect a few bad breaks.

Chuck,

Thanks for doing that. I thought it would be cool if somebody did something like that.

I'm not sure you quite answered the question I would have asked. Do you still have the spreadsheet available? Could you answer some follow-up questions?

The biggest question is would Brian's system have beaten Vegas over the last 5 years?

To beat Vegas long term, you don't need to win all the games or even a majority of the games, you just need to successfully identify games in which your teams have a greater chance of winning than Vegas says they do. If you identify a group of games where you think a team has a 60% chance of winning and Vegas says 30%, you'll still win a lot of money if they only win 45%. If you say 60% and Vegas says 30% and they win 25%, well you've got a problem.
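The arithmetic in that paragraph can be made concrete. Assuming fair odds at the Vegas-implied probability (vig ignored for simplicity), expected profit per dollar staked is:

```python
def bet_edge(true_win_rate, vegas_prob):
    """Expected profit per $1 staked at fair odds implied by Vegas,
    given the bettor's actual win rate (vig ignored)."""
    decimal_odds = 1.0 / vegas_prob          # payout per $1, stake included
    return true_win_rate * decimal_odds - 1.0

# Vegas implies 30%: winning only 45% of the time still earns
# +$0.50 per dollar...
print(round(bet_edge(0.45, 0.30), 2))  # 0.5
# ...while winning 25% of the time loses money.
print(round(bet_edge(0.25, 0.30), 2))  # -0.17
```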

I guess the big question is what were the average odds via Vegas for those games and what were the average odds via Brian's model for those games.

Frankly, I don't think the Chargers have an 85% chance of winning, but I do think they have more than a 55% chance of winning. If I were to bet on the game, that would make it worthwhile to bet on the Chargers even if I thought they only had a 60% shot.

Aaron-Reread the article. You must have missed the paragraph detailing the impact of SD's ST.

tgt-You have to be kidding. Airlines don't provide airbags behind each seat. It would improve safety in crashes, right? So is that a "known flaw" in airliner design? Of course not. It is a net negative effect due to the expense and complexity, and its benefit is negligible. Airline engineers know all about airbags and their potential benefit and choose to worry about more important things. Not a flaw.

All the data is available. Produce an open model that will accurately predict ST performance measured in a way that correlates with team wins in weeks 9-17 this season, and I'll send you a free ANS sweatshirt.

Chuck-Thanks. I'll take a +9 game system any day. To be fair, keep in mind this model is completely ignorant of playoff teams resting starters or top QBs or other players who are injured.

Since this system has been more accurate than Vegas, it's more likely that the true flaw is in trying to take special teams into account.

@tgt

I think maybe using the word flaw brings up the wrong connotation (at least in my mind). "Flaw" makes it sound like Brian should run out and change his model to include special teams. But if you are just trying to say that this is one aspect that is not modeled, and could result in a "delta" between what is predicted and what actually happens, then you are right. Probability is all about what is known and unknown. Prediction models have to treat what is unknown (or unknowable) as random. I would say that even though one could predict small pockets of special teams performance reliably (kickoff distance), special teams taken as a whole are random (unpredictable). And randomness just is; that's why predictions are in terms of probabilities. So I wouldn't call it a "flaw"... at least not a flaw in the model.

Also, I understand that the conversation is about over/underestimating game probabilities, but I would like to see a straight-up wins comparison against Vegas. I don't remember where I read it (somewhere on this site), but I thought this model had beaten the Vegas picks for a few years in a row?

Brian, if I could add my own two pence. This is what you're talking about when you mention the difference between descriptive or explanatory stats and predictive ones.

Yes, past special teams play correlates with past wins. It is a big factor in explaining why a team won a certain game. But past special teams play does not correlate with future wins, at least not in any meaningful way yet found.

So if Brian was in the wild-card week explaining why your team missed out on the playoffs, he'd likely include special teams in his calculations. But sat here going into week 7, San Diego's past special teams play doesn't provide any clue (mathematically speaking) as to how they'll do against the Patriots.

Bingo.

Wow. This blog sure is getting a lot of interest lately!

It's been a great ride.

@Andy

I don't see the word flaw with as many connotations as other people, so I didn't see it as a huge negative. It does appear to have struck a nerve though, so feel free to replace "flaw" with any of the following "design decision," "feature," "aspect," or "horrible, atrocious idea of horribleness" wherever you see fit. :)

We do seem to agree that SOME special teams are likely measurable and predictable. I think the incremental improvement might be worthwhile, but I don't have the patience or ability to factor it in myself. Of course, Brian is under no compulsion to act on my wishes.

@Brian

As above, pretend I didn't use the word flaw, instead used something with less negative connotations.

Continuing on my nitpicks, though, your airplane example is pretty poor. Modelling a system is not analogous to building a physical object. True, in both situations you make tradeoffs between improvements and costs, but that's roughly where the similarities end; the basic definitions of improvements and costs aren't even the same between the two kinds of problems.

@Chuck Winkler

Where do you see the archived predictions? I would be very interested in doing something a little more in depth in evaluating the predictions.

@Chuck Winkler

Never mind. It looks like you went through all the old posts and compiled it yourself. I'm pretty sure I don't have the patience for that, but if you or Brian or anyone else wants to send me what you put together, preferably with moneylines, I think I could add a lot to the discussion.