This is a guest post from Ed Feng. Ed founded The Power Rank to bring more analytics and visualization to sports, football in particular. It all started when he got inspired to apply his Ph.D. research in physics to ranking sports teams. Now he hopes to get young people interested in math through sports.
How good is an NFL coach?
How do you compare established coaches like Bill Belichick with upstarts like Jim Harbaugh?
As a reader of Advanced NFL Stats, you know the answer is not Super Bowl rings. In the playoffs, a team plays at most 4 games, a small sample size.
The best coaches will not always win the Super Bowl. Most who saw the two Super Bowl wins by the New York Giants or last year's run by Baltimore would agree.
Instead, let's look at winning percentage in the regular season to evaluate coaches. However, sample size is still an issue.
Let me share a story about why.
How to make a stat head puke
In 2009, Josh McDaniels got off to a 6-0 start as the new head coach of the Denver Broncos. If you remember any game from this streak, it was their win over Cincinnati in the opener. Late in the 4th quarter, Brandon Stokley caught a tipped pass and rumbled into the end zone for a game-winning touchdown. The play was immortalized as the Immaculate Deflection.
After his 6-0 start, ESPN's Tom Jackson proclaimed McDaniels "one of the great ones." I nearly soiled the carpet with vomit.
Denver would win only 2 more games the rest of the season. They missed the playoffs by losing the final game of the season to Kansas City, a team that finished 4-12.
Denver started the next season 3-9. Coach McDaniels was fired.
Never judge a coach after 6 games.
Let's look at why no one should judge a coach after 6 games.
Consider the fortunes of Coach Average. He coaches in a league with such parity that the outcome of each game is like flipping a coin. Coach Average has a 50% chance to win each game.
Coin flipping isn't a bad model for NFL games in the salary cap era. Later, we'll look at which coaches do better than random coin flips.
But first, let's look at some simulation results. The visual shows the first 50 games of Coach Average's career.
Just for the record, I was committed to only generating this sequence once. In no way did I ask Python to perform this calculation numerous times in hopes of finding a streak of 6 consecutive wins.
However, there are clearly streaks of wins and losses. In fact, Coach Average goes on a 10-game tear starting in his 19th game. It's easy to see order (or coaching skill) in randomness.
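To try this at home, here is a minimal Python sketch of Coach Average's career. It is not the script behind the visual; the function names and the seed are assumptions chosen for illustration, so the exact sequence of wins and losses will differ from the one shown.

```python
import random


def simulate_career(n_games, p_win=0.5, seed=None):
    """Return one boolean per game: True for a win, False for a loss."""
    rng = random.Random(seed)
    return [rng.random() < p_win for _ in range(n_games)]


def longest_win_streak(results):
    """Return (length, start_game) of the longest run of wins.

    start_game is 1-based; (0, 0) means there were no wins at all.
    """
    best_len, best_start = 0, 0
    run_len = 0
    for game, won in enumerate(results, start=1):
        if won:
            run_len += 1
            if run_len > best_len:
                best_len = run_len
                best_start = game - run_len + 1
        else:
            run_len = 0
    return best_len, best_start


# The seed is arbitrary; change it to get a different career.
career = simulate_career(50, seed=2013)
streak, start = longest_win_streak(career)
print("".join("W" if won else "L" for won in career))
print(f"{sum(career)} wins in 50 games; longest streak {streak}, starting at game {start}")
```

Run it with a few different seeds and streaks of 5 or more wins show up often, even though every game is a fair coin flip.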
How the law of large numbers applies to football
Overall, Coach Average wins 31 of his first 50 games for a win percentage of 62%. Since 1991, only Tony Dungy, Bill Belichick and Bill Cowher have coached 200 or more games with that high a win rate. Tom Jackson would start Coach Average's petition for the Hall of Fame.
Since I didn't know how many games would fit in the visual, I simulated 200 coin flips for Coach Average. He won 106 of those 200 games for a win percentage of 53%. As the number of flips increases, the win rate converges towards 50%, the true win rate in the model.
This is an example of the law of large numbers, the reason why no one should judge a coach after 6 games.
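The convergence is easy to watch in a short Python sketch that tracks Coach Average's running win percentage. The function name, the seed and the checkpoints are assumptions for illustration, and the printed values depend on the seed.

```python
import random


def running_win_pct(n_games, p_win=0.5, seed=None):
    """Return the cumulative win percentage after each game of a simulated career."""
    rng = random.Random(seed)
    wins = 0
    pcts = []
    for game in range(1, n_games + 1):
        if rng.random() < p_win:
            wins += 1
        pcts.append(wins / game)
    return pcts


# A career far longer than any real coach's, to make the trend visible.
pcts = running_win_pct(100_000, seed=7)
for n in (6, 16, 50, 200, 1_000, 100_000):
    print(f"after {n:>7,} games: {pcts[n - 1]:.3f}")
```

The early checkpoints can land far from 0.500, while the late ones hug it. Six games sits firmly in the noisy end of that list.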
A look at great coaches
The coin flipping model provides a baseline to compare coaches.
In the visual, the black line represents one standard deviation away from the expected 50% win probability. For example, the standard deviation is 12.5% for one 16-game regular season, since sqrt(0.5 x 0.5 / 16) = 0.125.
Let's call a set of 16 coin flips a season. If we simulate a season a million times, roughly 2 out of every 3 seasons would have a win percentage within 12.5% of the expected 50% for Coach Average.
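The width of that band at any career length follows from the standard binomial formula, sqrt(p(1 - p) / n). Here is a small Python sketch; the function name and the list of game counts are assumptions for illustration, not the code used to draw the visual.

```python
import math


def win_pct_sd(n_games, p_win=0.5):
    """Standard deviation of win percentage over n_games independent games."""
    return math.sqrt(p_win * (1 - p_win) / n_games)


# One season, 50 games, and a 200-game career for a coin-flip coach.
for n in (16, 50, 200):
    sd = win_pct_sd(n)
    print(f"{n:>3} games: 50% +/- {100 * sd:.1f}%")
```

The band narrows with the square root of the number of games, so it takes four times as many games to cut the uncertainty in half.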
The visual also shows the regular season win percentage for every coach since 1994, the year the NFL imposed a salary cap. I also included Bill Belichick and Bill Cowher, even though their tenures as head coach started a few years before 1994. (Cowher is the unlabeled data point above Andy Reid; Tom Coughlin is the other coach with over 200 games that I couldn't label.)
The visual lets you see the greatness of coaches like Tony Dungy and Bill Belichick. No tests of statistical significance required.
Three other coaches stand out in this visual.
Andy Reid. Despite a 4-12 record in 2012, Reid did a remarkable job, winning more than 58% of his games in Philadelphia. Eagles fans never gave him enough respect because he never won a Super Bowl. The visual shows his win percentage is more than 2 standard deviations better than 50%.
Jim Harbaugh. The San Francisco coach is off to a fast start, winning 24 of 32 games after two seasons. But judging from the elite coaches with more than 200 games under their belt, Harbaugh will not continue to win in excess of 76% of his games.
Mike Smith. The Atlanta coach finally won his first playoff game, over Seattle, last season. His regular season record, however, has been stellar: a win percentage of about 70% over 5 seasons. Keep that guy around.
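The "standard deviations better than 50%" yardstick used for Reid can be applied to any record. Below is a rough Python sketch, using only records quoted in this post; the function name is an assumption, and ties are ignored (each game counts as a win or a loss).

```python
import math


def sd_above_coin_flip(wins, games, p_win=0.5):
    """How many standard deviations a record sits above a coach who wins with p_win."""
    sd = math.sqrt(p_win * (1 - p_win) / games)
    return (wins / games - p_win) / sd


records = {
    "Jim Harbaugh (24 of 32)": (24, 32),
    "Coach Average, first 50 games (31 of 50)": (31, 50),
    "Coach Average, 200 games (106 of 200)": (106, 200),
}
for label, (wins, games) in records.items():
    print(f"{label}: {sd_above_coin_flip(wins, games):+.2f} standard deviations")
```

Note how Coach Average's hot 50-game start looks impressive on this scale while his 200-game record does not, which is why the number of games matters as much as the win percentage.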
In a league as competitive as the NFL, random coin flips are not a bad model for game outcomes. Coaches that outperform this model should be kept, and data visualization captures the big picture.