Let's examine the leverage of each challenge using the Win Probability (WP) model.
The first challenge was on a spot on a NYJ 3rd down. Late in the 3rd qtr, the refs spotted the ball just short of a 1st down at the BUF 38, creating a 4th and 1. Ryan challenged the spot and lost. Ryan eventually went for it on 4th down (which was smart), but the Jets were stuffed. A successful challenge would have given NYJ a 0.87 WP. A failed challenge would leave NYJ with a 0.84 WP based on going for it. That's a leverage of 0.03 WP. Every bit of WP matters, so it's nothing to sneeze at. But remember that challenges come with a cost.
(I also looked at the punt option, which was worth 0.82 WP for the Jets, so at best the leverage was 0.05 WP.)
The second challenge came immediately afterward, once BUF had gained possession. BUF completed a 23-yd pass on 1st down, setting up a 1st and 10 at the NYJ 40. Ryan unsuccessfully challenged the completion. The result of the play was a 0.27 WP for BUF, but a 2nd and 10 from their own 37 would have meant a 0.21 WP. That's a leverage of 0.06 WP for the second challenge.
What about the fumble that wasn't? A successful challenge would have given the Jets a 1st and 10 on the BUF 41, up by 8 with 13:00 to play. That's worth 0.89 WP. An unsuccessful challenge wasn't so bad either. The Bills were called for holding, forcing a 2nd and 20 from their own 10, which gives the Jets a 0.87. Surprisingly that's only a leverage of 0.02 WP.
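The leverage arithmetic for the three situations can be sketched in a few lines. The WP figures are the ones quoted above; the dictionary layout and function name are only illustrative, and the second challenge assumes the Jets' WP is simply 1 minus BUF's WP.

```python
# WP figures are the ones quoted in the post, from the Jets' point of view.
# Assumption: NYJ WP = 1 - BUF WP for the second challenge (ties ignored).
challenges = {
    "spot on 3rd down": {"won": 0.87, "lost": 0.84},
    "23-yd completion": {"won": 1 - 0.21, "lost": 1 - 0.27},
    "fumble that wasn't": {"won": 0.89, "lost": 0.87},
}

def leverage(won, lost):
    """WP gained by winning the challenge instead of losing it."""
    return round(won - lost, 2)

leverages = {name: leverage(**wp) for name, wp in challenges.items()}

# List the situations from largest to smallest leverage.
for name, value in sorted(leverages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f} WP")
```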
So of the three challenge situations, real and potential, the one that seemed biggest was actually the smallest. The biggest was the big 23-yd pass play that put BUF in NYJ territory. In full disclosure, I set out to prove how stupid it was for Ryan to squander his challenges by pointing to how big an impact a turnover would have been in that situation. But it wasn't as it seemed. Here's why I was wrong:
1. There was still a lot of time left in the game. A turnover would not have been nearly as fatal as we might think.
2. BUF's penalty on the play meant that the alternative to a fumble recovery was still really crappy for BUF. A 2nd and 20 from a team's own 10, down by 8 in the 4th quarter, is not a good place to be.
3. Running out of challenges only looks like a really big mistake in retrospect. BUF went on to score a TD and a 2-pt conversion on that drive. Outcome bias can affect us all, even stat wonks.
4. The Jets were already up by 8, meaning BUF would have to score one more TD than the Jets, plus a 2-pt conversion...just to tie. Realistically, the worst-case scenario is a 0.50 WP for the Jets in regulation.
5. When the WP estimate gets close to 1 or 0 for a team, there isn't much further you can move the needle. You can never get more than a 1.00 WP, so events that would swing the WP wildly when the game is closer to 0.50 WP naturally can't have as big an impact when it's not so close. It's just the way probability works (or is it?).
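The ceiling effect in point 5 is easy to put in numbers. This is only a sketch: the `headroom` helper is a made-up name, and 0.87 is the Jets' WP quoted above for the failed-fumble-challenge scenario.

```python
def headroom(wp):
    """Largest possible WP gain for a team currently at `wp` (WP is capped at 1.00)."""
    return round(1.0 - wp, 2)

# Jets at 0.87 WP: no single event can add more than this.
print(headroom(0.87))

# A team in a toss-up game has far more room to gain.
print(headroom(0.50))
```

At 0.87 WP even the best possible outcome adds no more than 0.13, while the same event in a 0.50 game could be worth up to 0.50.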
I still think that Ryan's second challenge was unwise. Despite having the biggest leverage, it was a) his final challenge and b) his second timeout in a one-score game. And because he lost his first challenge, winning the second would not have granted him a third.
In this case, the numbers don't condemn Ryan's decisions. But I think it's a good demonstration of the potential for analyzing challenge decisions using the WP model. Ultimately, it's a very tough problem because we have to estimate the 'potential' value of saving the challenge, which varies greatly based on game situation.
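One way to frame that trade-off is a break-even calculation: how likely does a challenge have to be to succeed before its leverage outweighs its cost? The sketch below is a simplification, not the WP model itself. It assumes the whole cost (lost timeout, lost future challenge) is paid only when the challenge fails, and `HYPOTHETICAL_COST` is a placeholder value, not an estimate from the analysis above.

```python
def break_even_probability(leverage, cost_if_lost):
    """Chance of winning the challenge at which the expected WP gain is zero.

    Solves p * leverage - (1 - p) * cost_if_lost = 0 for p.
    """
    return cost_if_lost / (leverage + cost_if_lost)

# Placeholder only: the real cost depends on the game situation and is unknown here.
HYPOTHETICAL_COST = 0.01

# Leverage values are the ones computed in the post.
for name, lev in [("spot", 0.03), ("completion", 0.06), ("fumble", 0.02)]:
    p = break_even_probability(lev, HYPOTHETICAL_COST)
    print(f"{name}: worth challenging if the chance of winning exceeds {p:.2f}")
```

The larger the leverage relative to the cost, the lower the chance of success needed to justify throwing the flag.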