Traditional NFL stats often act like funhouse mirrors, making a quarterback’s performance look like something it isn’t.
For example, take a look at these stat lines from the 2015 NFC wild-card game between the Green Bay Packers and Washington Redskins:
Aaron Rodgers: 21 of 36 passing, 210 yards, 2 touchdowns, 0 interceptions, 93.5 passer rating.
Kirk Cousins: 29 of 46, 329 yards, 1 touchdown, 0 interceptions, 91.7 passer rating.
If you asked 100 random people in a “Pepsi-Coke”-type challenge which quarterback had the better game based on these stats, chances are Cousins would win in a landslide. But any objective observer who watched this game would acknowledge that Rodgers was the better quarterback in Green Bay’s 35-18 win.
Traditional box score stats distort the performances of Rodgers and Cousins in this game because they (1) fail to account for all of the ways a quarterback can affect a game, (2) don’t put plays into the proper context (a 5-yard gain on second-and-5 is very different from a 5-yard gain on third-and-10), and (3) don’t acknowledge that a quarterback has teammates who affect each play and should also get credit for everything that happens on the field.
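To see how narrow the traditional view is, here is a short sketch of the standard NFL passer rating formula. It takes only five box-score inputs, so sacks, fumbles, rushes and penalties cannot move it. The function and variable names are our own; the inputs are the stat lines quoted above.

```python
def passer_rating(completions, attempts, yards, touchdowns, interceptions):
    """Standard NFL passer rating, built from five passing stats only."""
    def clamp(value):
        # Each component is capped between 0 and 2.375.
        return max(0.0, min(value, 2.375))

    completion_part = clamp((completions / attempts - 0.3) * 5)
    yardage_part = clamp((yards / attempts - 3) * 0.25)
    touchdown_part = clamp((touchdowns / attempts) * 20)
    interception_part = clamp(2.375 - (interceptions / attempts) * 25)
    return (completion_part + yardage_part
            + touchdown_part + interception_part) / 6 * 100


rodgers = passer_rating(21, 36, 210, 2, 0)
cousins = passer_rating(29, 46, 329, 1, 0)
print(round(rodgers, 1), round(cousins, 1))  # prints "93.5 91.7"
```

Nothing in the function's signature has room for a sack, a fumble or the down and distance, which is the gap the rest of this piece is about.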
Examines all of a quarterback's contributions
ESPN’s Total Quarterback Rating (Total QBR), which was released in 2011, has never claimed to be perfect, but unlike other measures of quarterback performance, it incorporates all of a quarterback’s contributions to winning, including how he impacts the game on passes, rushes, turnovers and penalties. Also, since QBR is built from the play level, it accounts for a team’s level of success or failure on every play to provide the proper context and then allocates credit to the quarterback and his teammates to produce a clearer measure of quarterback efficiency.
Leaving out key areas of impact can make a quarterback’s performance look very different. Omitted from Cousins’ stat line, for example, are his 6 sacks taken, 3 fumbles (1 lost) and 2 pre-snap penalties on Washington’s offense. Rodgers, on the other hand, took only one sack, did not fumble and drew a number of defensive penalties that kept drives alive. Each quarterback impacted the game through these plays, but none of them are reflected in the traditional stats.
The lack of context for each play also increases the distortion of the performance. Most would acknowledge that a 7-yard completion on third-and-10 is not a successful play, but base-level statistics treat all yards equally. Coaches, players and fans know what wins games; it only makes sense that the statistics that judge the most important position in the game do, too.
In the NFC wild-card game referred to above, Rodgers started slow but manufactured five straight scoring drives and posted an 87 Total QBR. In comparison, Cousins’ errors cost him the game, and despite throwing for 119 more yards than Rodgers, Cousins had a Total QBR nearly 30 points lower. QBR is a measure of efficiency, so Rodgers created far more value per play than Cousins did.
Degrees of success on each play
So how does QBR actually work?
For each play, QBR begins by asking: How successful was the play for the team, given its context?
Context for each play includes the down, yards to go for a first down, distance to the end zone and time remaining in the half. All of these factors can be used before the ball is snapped to estimate the future net score advantage the team currently on offense can expect. This estimate is known as “expected points.” After the play, the change in those factors leads to a change (positive or negative) in the team’s net point advantage. That change in the expected points caused by the outcome of the play represents the play’s value, or its Expected Points Added (EPA), given all the context.
When a team fails to convert on third down, struggles in the red zone, takes a lot of sacks or turns the ball over, it generally registers as negative EPA for the offense. But not all turnovers are created equal: A Hail Mary interception at the end of the half is not as impactful as one in the middle of the second quarter, and EPA knows that.
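The idea of EPA as a before-and-after difference can be sketched in a few lines. Everything numeric here is an assumption for illustration: the expected-points values in the toy table are made up, and a real model would estimate them from historical play-by-play data rather than look them up. The names `GameState`, `TOY_EXPECTED_POINTS` and `expected_points_added` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameState:
    down: int
    yards_to_go: int
    yards_to_end_zone: int
    seconds_left_in_half: int


# Made-up expected-points values, for illustration only.
TOY_EXPECTED_POINTS = {
    GameState(2, 5, 60, 900): 1.5,   # second-and-5
    GameState(1, 10, 55, 870): 2.1,  # first-and-10 after converting
    GameState(3, 10, 60, 900): 0.6,  # third-and-10
    GameState(4, 5, 55, 870): -0.2,  # fourth-and-5 after coming up short
}


def expected_points_added(before, after, ep_table):
    """EPA is the change in expected points caused by the play."""
    return ep_table[after] - ep_table[before]


# The same 5-yard gain in two different contexts.
epa_on_second_and_5 = expected_points_added(
    GameState(2, 5, 60, 900), GameState(1, 10, 55, 870), TOY_EXPECTED_POINTS)
epa_on_third_and_10 = expected_points_added(
    GameState(3, 10, 60, 900), GameState(4, 5, 55, 870), TOY_EXPECTED_POINTS)
print(epa_on_second_and_5, epa_on_third_and_10)
```

With these toy values, the 5-yard gain that moves the chains comes out positive and the 5-yard gain that sets up fourth down comes out negative, even though a box score counts them as the same five yards.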
Division of credit
EPA provides the context for every play and also holds the key to separating the quarterback’s impact from his teammates’. For all plays in which a quarterback is involved (passes, rushes, sacks, penalties, fumbles, etc.), the team-level EPA is calculated and then divided among a quarterback and his teammates. In other words, was the play successful and how much of that success is a result of a quarterback’s skill?
For example, Rodgers’ longest completion against the Redskins was a 34-yarder to James Jones in the second quarter, but he could have gained those yards through the air or on a short screen that was broken for a long gain. He also could have completed the pass when under duress or thrown it from a clean pocket. In all of those scenarios, Rodgers’ level of skill differs, and the credit he receives for the 34-yard gain (or in this case, plus-2.0 EPA) should differ as well.
That means on completed passes, the EPA is divided among the quarterback, his receivers and the offensive line based on how far the ball travels in the air, what percentage of the yards were gained after the catch (compared to how many yards after catch are expected) and whether the quarterback was under pressure. This division of credit is based on statistical analysis of thousands upon thousands of NFL plays. In this sense, QBR knows that Cousins was helped by his receiver, who gained more yards after the catch than expected given where he caught the ball, but hurt by his offensive line.
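A rough sketch of that kind of split follows. This is not ESPN's model: the rule and the 20 percent clean-pocket share for the offensive line are assumptions chosen only to show the shape of the idea, and the air-yard and yards-after-catch splits are hypothetical. Only the 34-yard gain and the plus-2.0 EPA come from the Rodgers example above.

```python
def split_completion_epa(epa, air_yards, yards_after_catch, expected_yac,
                         under_pressure, clean_pocket_line_share=0.2):
    """Divide a completion's EPA among QB, receiver and offensive line.

    Illustrative rule (an assumption, not the real QBR weights):
    - the receiver is credited for yards after catch beyond expectation;
    - the rest belongs to the passing effort, and the line takes a fixed
      slice of it only when the pocket was clean.
    """
    gain = air_yards + yards_after_catch
    if air_yards < 0 or yards_after_catch < 0 or gain <= 0:
        raise ValueError("sketch handles positive gains with non-negative parts")

    receiver_share = max(yards_after_catch - expected_yac, 0.0) / gain
    passing_share = 1.0 - receiver_share
    line_share = 0.0 if under_pressure else clean_pocket_line_share * passing_share
    qb_share = passing_share - line_share
    return {
        "quarterback": epa * qb_share,
        "receiver": epa * receiver_share,
        "offensive_line": epa * line_share,
    }


# Three hypothetical ways to produce the same 34-yard, +2.0 EPA completion.
deep_clean = split_completion_epa(2.0, 30, 4, 4, under_pressure=False)
deep_pressured = split_completion_epa(2.0, 30, 4, 4, under_pressure=True)
screen_clean = split_completion_epa(2.0, 0, 34, 8, under_pressure=False)
print(deep_clean["quarterback"], deep_pressured["quarterback"],
      screen_clean["quarterback"])
```

Under these assumptions the quarterback earns the most for the deep throw under pressure, less for the same throw from a clean pocket, and far less for the screen the receiver broke for a long gain.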
The details of every play (air yards, drops, pressures, etc.) are charted by a team of trained analysts in the ESPN Stats & Information Group. Every play of every game is tracked by at least two different analysts to provide the most accurate representation of how each play occurred.
Before moving on to the next play, QBR asks one more question: Did this play come in garbage time?
As we know, amassing yards and points in a blowout does not tell you too much about a quarterback’s true skill. When the game is out of reach, which is measured by a team’s win probability at the start of the play, a quarterback receives less credit than on an otherwise “normal” play. Unlike the initial version of QBR released in 2011, plays are no longer up-weighted for “clutch situations,” but we felt it was important to keep the down-weighting feature.
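One way to picture the down-weighting is as a weight that depends on win probability at the snap. The sketch below is a guess at the general shape, not the published rule: the 5 percent threshold, the 0.3 floor and the linear ramp are all assumptions, and `play_weight` is a hypothetical name.

```python
def play_weight(offense_win_prob, threshold=0.05, floor=0.3):
    """Weight for a play given the offense's win probability at the snap.

    Assumed shape: full weight while the game is in doubt, sliding down
    toward `floor` as either team's win probability approaches zero.
    The weight never exceeds 1.0, so nothing is up-weighted.
    """
    if not 0.0 <= offense_win_prob <= 1.0:
        raise ValueError("win probability must be between 0 and 1")
    closeness = min(offense_win_prob, 1.0 - offense_win_prob)
    if closeness >= threshold:
        return 1.0
    return floor + (1.0 - floor) * closeness / threshold


for win_prob in (0.50, 0.90, 0.99, 1.00):
    print(win_prob, play_weight(win_prob))
```

A play at 50 percent or 90 percent win probability keeps full weight here, while a play with the game all but decided counts for less.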
Efficiency stat, not a value stat
This process of determining the EPA, dividing credit among the QB and his teammates and then determining the weight of the play occurs for every play in which a quarterback is involved. The quarterback’s credited values on all of these plays are then added together and divided by the total number of clutch-weighted plays to produce a per-play measure of QB efficiency.
That last piece is important! QBR is an efficiency stat similar to yards per play or yards per attempt. Therefore, Cousins might have provided more total value than Rodgers because he was involved in more plays, but on a per-play basis, Rodgers was significantly more efficient.
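The difference between total value and efficiency is easy to see with a weighted average. The sketch below assumes each play has already been reduced to a pair of numbers, the EPA credited to the quarterback and the play's weight; the two sample quarterbacks and their numbers are invented.

```python
def total_value(plays):
    """Sum of weighted, quarterback-credited EPA over all plays."""
    return sum(qb_epa * weight for qb_epa, weight in plays)


def efficiency(plays):
    """Weighted EPA per weighted play."""
    total_weight = sum(weight for _, weight in plays)
    if total_weight == 0:
        raise ValueError("no weighted plays to average over")
    return total_value(plays) / total_weight


# Invented data: (quarterback-credited EPA, play weight) for each play.
low_volume_qb = [(0.40, 1.0)] * 30
high_volume_qb = [(0.25, 1.0)] * 60

print(total_value(low_volume_qb), efficiency(low_volume_qb))
print(total_value(high_volume_qb), efficiency(high_volume_qb))
```

The high-volume quarterback piles up more total value, but the low-volume quarterback does more with each play, and the per-play number is the one that feeds QBR.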
Finally, the per-play measure of efficiency is translated to a number on a 0-to-100 scale to produce a player’s Total QBR. The scaling process is a fairly standard logistic regression that produces a number that is easier to grasp. An average quarterback will have a QBR around 50, and a Pro Bowl-level player will have a QBR around 75 for the season. On a game level, however, a QBR of 75 means that holding all other factors constant (defense, offensive teammates, etc.), a quarterback’s team would be expected to win about 75 percent of the time, given that level of QB play.
Although QBR is not always a perfect reflection of a quarterback’s performance, it does solve most of the problems of traditional stats and bring the differences between Rodgers’ and Cousins’ performances into sharper focus.
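For readers who want to see what a logistic scale looks like, here is a minimal sketch. The steepness value is a placeholder, not the fitted coefficient, and the sketch assumes that an efficiency of zero corresponds to an average quarterback; `scale_to_qbr` is a hypothetical name.

```python
import math


def scale_to_qbr(per_play_efficiency, steepness=4.0):
    """Map per-play efficiency onto a 0-to-100 scale with a logistic curve.

    `steepness` is an assumed placeholder; a real version would be fit
    to game outcomes. Zero efficiency maps to exactly 50 by construction.
    """
    return 100.0 / (1.0 + math.exp(-steepness * per_play_efficiency))


for value in (-0.3, 0.0, 0.3):
    print(value, round(scale_to_qbr(value), 1))
```

The curve flattens at both ends, so the output stays between 0 and 100 no matter how extreme a single game is, and equal gains in efficiency matter most near the middle of the scale.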