Putting It All Together
Taking a step back, the goal is to compare two strategies for the defense. The first is to play conventionally and force a stop and a FG attempt, hoping either that the kick fails or that there is enough time to answer a made kick with a counter-score. The second is to intentionally allow a TD immediately and use the time remaining to respond with a counter-TD.
So far, we have estimates for the key inputs:
-When the team on defense would get the ball back
-The probability of failure on the FG attempt
-The probability of responding to a made FG with a score
-The probability of responding to an allowed TD with a TD
The allow-the-TD strategy is the simpler one to value. We can account for the time of the intentional TD play and plot the probability of responding with another TD as a function of time. However, there is one wrinkle. If the team on defense is ahead by only one point, the offense would be smart to go for the two-point conversion following a TD, allowed or not. If the conversion succeeds, a response TD only ties. If it fails, the offense is no worse off than if it had kicked the extra point, because a response TD wins either way. The offense therefore has nothing to lose by going for two.
For when the team on defense is ahead by two points:
wp[allow TD] = p(scoring own TD in response)
But for when the team on defense is ahead by one point, and the offense would go for the two point conversion to take a 7-point lead:
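wp[allow TD] = p(scoring own TD in response) * [p(2-pt conv fails) + 0.5 * p(2-pt conv succeeds)]
The response TD is needed in both cases. It wins outright if the conversion failed, and it only ties if the conversion succeeded, which I count as a 0.5 chance of winning.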
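The two allow-TD cases can be sketched in a few lines of Python. This is only an illustration, not the model behind the charts: the function and parameter names are made up here, a tie is assumed to be worth a 50% chance of winning, and the 47% conversion rate is the value chosen later in this article. It also assumes a response TD is still required after a successful conversion, in which case it only ties.

```python
def wp_allow_td(p_td_response, lead, p_two_pt=0.47, p_win_if_tied=0.5):
    """Defense's win probability after intentionally allowing a TD.

    p_td_response: chance the defense's own offense answers with a TD
    lead:          the defense's lead before the allowed TD (1 or 2 points)
    p_two_pt:      chance the offense converts its two-point try (assumed)
    p_win_if_tied: value of a tie, assumed here to be a coin flip
    """
    if not 0.0 <= p_td_response <= 1.0:
        raise ValueError("p_td_response must be a probability")
    if lead == 2:
        # Up by two: the article treats any response TD as a win.
        return p_td_response
    if lead == 1:
        # Up by one: the offense goes for two. A response TD wins if the
        # conversion failed and only ties if it succeeded.
        return p_td_response * ((1.0 - p_two_pt) + p_win_if_tied * p_two_pt)
    raise ValueError("this sketch only covers leads of 1 or 2 points")


if __name__ == "__main__":
    # Hypothetical input: a 20% chance of answering with a TD.
    print(round(wp_allow_td(0.20, lead=2), 3))
    print(round(wp_allow_td(0.20, lead=1), 3))
```

With a hypothetical 20% chance of a response TD, the two-point-lead case is simply 0.20, while the one-point-lead case drops to about 0.153 because of the conversion risk.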
The value of forcing the FG is slightly more complicated because it combines the possibility of a failed FG with the possibility of responding to a made FG with another score. Informally:
wp[force FG] = p(FG fail) + p(scoring | made FG)
This is a total probability computation, much like we do for typical 4th down decisions. The probability of scoring is a function of time remaining, and the probability of a failed FG is a function of field position. The calculation becomes:
wp[force FG] = p(FG fail) * 1 + p(scoring | made FG) * (1 - p(FG fail))
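The same total-probability calculation, written out as a small Python function. The names and the sample inputs are hypothetical; in the analysis itself the failure probability comes from field position and the response probability from time remaining, neither of which is modeled here.

```python
def wp_force_fg(p_fg_fail, p_score_after_made_fg):
    """Defense's win probability when it forces a stop and a FG attempt.

    p_fg_fail:             chance the FG attempt fails (defense wins outright)
    p_score_after_made_fg: chance the defense answers a made FG with a score
    """
    for p in (p_fg_fail, p_score_after_made_fg):
        if not 0.0 <= p <= 1.0:
            raise ValueError("inputs must be probabilities between 0 and 1")
    # Branch 1: the kick fails, which counts as a win (weight 1).
    # Branch 2: the kick is good, and the defense must respond with a score.
    return p_fg_fail * 1.0 + p_score_after_made_fg * (1.0 - p_fg_fail)


if __name__ == "__main__":
    # Hypothetical inputs: 10% miss rate, 25% chance of a response score.
    print(round(wp_force_fg(0.10, 0.25), 3))
```

For those made-up inputs the result is 0.10 + 0.25 * 0.90 = 0.325.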
Final Result with a Prominent Example
There are too many variables to show in a single illustration, so presenting the results required some creativity. There are timeouts, field position, time remaining, plus the result variable, win probability. To simplify things, the results are broken out into separate graphs for each possible number of timeouts remaining. Also, field position is represented by multiple lines on each graph, with each color denoting a 5-yard increment.
I'm going to go out of order so I can illustrate the results with a prominent example from Super Bowl 46 between the Giants and Patriots. Up by 2 points, the Patriots defense took the field to stop a final Giants drive that started on the New York 12 with 3:46 to play. It took only three plays for the Giants to make it inside New England's 35.
The graph below shows the win probability for defenses with two timeouts remaining. The horizontal axis is time remaining at the 1st down snap. The vertical axis represents the wp for the various situations described in the curves. The black line is the wp for allowing an intentional TD. The colored lines are the wp for forcing the FG attempt and each one represents the field position at the 1st down snap. Wherever the black "allow TD" line is higher, an immediate offensive TD would be preferable to forcing a FG.
You'll notice two abrupt vertical inclines in the colored curves for the force-FG option. The leftward one reflects how quickly the probability of responding to a made FG with a score rises as more time remains. The second is due to the two minute warning. The force-FG curves are irregular because the time at which the defense would get the ball back is irregular. The allow-TD curve is smooth because the defense would get the ball back almost immediately.
The Giants had three first downs inside FG range. The first (1) was at 2:52 at the NE 34. The second (2) was at the NE 18 immediately following the two minute warning. The third (3) was at 1:09 at the NE 7.
As the chart shows, 1st down #1 was well above the choke-hold zone. The probability of winning by forcing a stop and a FG attempt was greater than for trying to match an intentionally allowed TD in that situation. However, 1st down #2 was barely outside the choke-hold zone. My original analysis had suggested the TD be allowed at this point, but that's partly because retaining two timeouts on defense in that situation is so uncommon that the general Win Probability model discounted it. If NE had only one timeout left, it would have been a no-brainer to allow the TD on 1st down #2 (see below). The other reason is that the Giants played unconventionally and with abandon, passing the ball aggressively even inside FG range.
Curiously, Patriots coach Bill Belichick did not call a timeout between 1st down #2 and 1st down #3. Not until after a 1-yard gain on 1st and goal from the 7 did Belichick call his second timeout. On the very next play, Ahmad Bradshaw was (by most accounts) allowed to score the TD. Ultimately, the Patriots got the ball with 57 sec to play and one timeout remaining.
Had NE called a timeout prior to 1st down #3 and the identical events unfolded--NYG scoring a TD on their subsequent 2nd down--NE would have had an estimated 20% chance of winning instead of the 6% or so they had when they actually took possession. It's possible Belichick was hoping that somehow time would run out on the Giants. But it's more likely that, with two timeouts in his pocket, Belichick chose to wait to see how the first down play turned out before deciding to use them. If his defense held, he would use one, but if his defense allowed a conversion, he would wait to see how the subsequent first down play went. I think this was a mistake because at any time the very next play could be a touchdown, and he'd rather have the extra 39 seconds than an extra timeout on offense.
Here are the resulting charts for when an immediate TD is preferable to forcing a FG. (Suitable for lamination, coaches!) With no timeouts remaining, the situation is very dire, and there is a relatively large window for preferring to allow a TD (or for taking a knee). The solid black line is the wp for when the team on defense leads by two points. The dashed black line is the wp for when the team on defense leads by one point. Note: I chose a value of 47% for the chance that the offense converts its two-point try when the defense leads by one.
As a reminder, wherever the first down situation is above the appropriate black line, the preferred option is to force the stop and FG attempt. Wherever the situation is below the appropriate black line, the preferred option is to allow the TD.
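The comparison rule can also be turned around to ask how likely a response TD has to be before allowing the TD beats forcing the FG. The sketch below is a rough illustration with made-up function names and inputs. It assumes a tie is worth 0.5 and that a response TD is still needed after a successful two-point conversion, and it uses the 47% conversion rate noted above.

```python
def breakeven_td_response(p_fg_fail, p_score_after_made_fg, lead, p_two_pt=0.47):
    """Smallest p(response TD) at which allowing the TD is at least as good
    as forcing the FG. A result above 1.0 means allowing the TD never pays.
    """
    # Value of forcing the FG: a miss wins, a make requires a response score.
    wp_force = p_fg_fail + p_score_after_made_fg * (1.0 - p_fg_fail)
    if lead == 2:
        # Up by two: wp[allow TD] is just p(response TD).
        factor = 1.0
    elif lead == 1:
        # Up by one: a response TD wins if the two-point try failed and
        # ties (assumed worth 0.5) if it succeeded.
        factor = (1.0 - p_two_pt) + 0.5 * p_two_pt
    else:
        raise ValueError("this sketch only covers leads of 1 or 2 points")
    return wp_force / factor


if __name__ == "__main__":
    # Hypothetical inputs: 10% miss rate, 25% chance of answering a made FG.
    for lead in (2, 1):
        print(lead, round(breakeven_td_response(0.10, 0.25, lead), 3))
```

Under these assumptions the bar is higher with a one-point lead than with a two-point lead, since the two-point try discounts every response TD by the same factor (0.765 at a 47% conversion rate).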
With a single timeout, the window gets smaller as the defense's ability to respond to a made FG comes into play.
Here is the chart for two timeouts remaining, which we saw earlier in the example from Super Bowl 46.
With all three timeouts available to the defense, the immediate TD is almost never preferable to forcing the FG. There's just a tiny window with about a minute left and the ball inside the 15.
There's more work to be done. As pointed out by a commenter, if the offense misses its FG attempt but still has timeouts and time on the clock, the probability of winning by making a stop would be lower than I've estimated here. We also want to know the numbers for when the game is tied, or when the defense is up by three.
This is why football is uniquely compelling. In what other sport would it be better to allow your opponent to achieve a major score? When would you prefer that your opponent score a goal in hockey or soccer or lacrosse? When would you ever want your opponent to hit a three-pointer? What about baseball or cricket? Sure, you'd prefer to walk in one run to save four runs, but that's intuitive, the same way a football defense would normally prefer to give up 3 points instead of 7.
This may be the most complex, most challenging, and most counter-intuitive analysis I've done. Some of the assumptions made in this analysis could use refinement, but I think we've got our arms around the problem, and we have a framework for further research. We also have a clear way of presenting the results so that a coach can look them up quickly in the heat of battle.