Chapter 9

When Consistency Isn't the Whole Story

Recognizing overdispersion in player data

It was Week 12 of the 2023 NFL season, and Vinny, a sharp bettor who had been successfully using Poisson models for touchdown props, was reviewing his results from the previous month. Something wasn't adding up.

He had identified a backup running back who was getting consistent goal-line carries. The player averaged 0.85 touchdowns per game over his last 12 appearances, and Vinny had been betting his touchdown props using a Poisson distribution with λ = 0.85.

The math looked clean. According to Poisson, with λ = 0.85:

| Outcome | Poisson Probability |
| --- | --- |
| P(0 TDs) | 42.7% |
| P(1 TD) | 36.3% |
| P(2 TDs) | 15.4% |
| P(3+ TDs) | 5.6% |
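
As a minimal sketch of where those figures come from, the Poisson probabilities can be computed directly from the formula P(k) = e^(-λ) λ^k / k!. The helper name `poisson_pmf` is illustrative and not from the original text. The 3+ bucket is computed as the complement of the first three outcomes, so its last digit can differ slightly from a table that was rounded to sum to 100%.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events for a Poisson variable with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam = 0.85  # projected touchdowns per game from the example

# Exact probabilities for 0, 1 and 2 touchdowns
probs = {k: poisson_pmf(k, lam) for k in range(3)}
# Everything else (3 or more) is the complement
probs["3+"] = 1.0 - sum(probs.values())

for outcome, p in probs.items():
    print(f"P({outcome} TDs) = {p:.1%}")
```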

But when Vinny pulled up the player's actual game log, the pattern didn't match.

The Problem: Actual Distribution vs. Predicted

Over those same 12 games, the running back had posted:

| Outcome | Actual Frequency | Poisson Predicted |
| --- | --- | --- |
| 0 TDs | 50% (6 games) | 42.7% |
| 1 TD | 25% (3 games) | 36.3% |
| 2 TDs | 16.7% (2 games) | 15.4% |
| 3+ TDs | 8.3% (1 game) | 5.6% |

The observed mean (10 touchdowns in 12 games, about 0.83 per game) sat right at his 0.85 projection, so the projection itself was sound. But the shape of the distribution was wrong.

Warning

The player had more games with 0 touchdowns (50% actual vs. 42.7% predicted) and more games with 2+ touchdowns (25% actual vs. 21% predicted) than Poisson suggested. The middle outcome, exactly 1 TD, happened less often than expected (25% actual vs. 36.3% predicted).

Discovering Overdispersion

Vinny calculated the sample variance from the game log: 1.06. The log's mean was 0.83 (10 touchdowns in 12 games, close to his 0.85 projection). The variance-to-mean ratio (VMR) was:

VMR = 1.061 / 0.833 ≈ 1.27

Poisson assumes VMR = 1.0, meaning variance equals the mean. This player's outcomes were overdispersed—more spread out than Poisson predicts.
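
The VMR check is easy to reproduce. The sketch below rebuilds the 12-game log from the frequency table (6, 3, 2 and 1 games with 0, 1, 2 and 3 touchdowns) and assumes the sample variance with an n - 1 denominator, which the text does not state explicitly. Variable names are illustrative.

```python
from statistics import mean, variance

# Frequency table from the game log: {touchdowns: number of games}
game_counts = {0: 6, 1: 3, 2: 2, 3: 1}

# Expand the table into one entry per game
game_log = [tds for tds, games in game_counts.items() for _ in range(games)]

sample_mean = mean(game_log)
sample_var = variance(game_log)  # sample variance (n - 1 denominator), an assumption
vmr = sample_var / sample_mean

print(f"games={len(game_log)} mean={sample_mean:.3f} variance={sample_var:.3f} VMR={vmr:.2f}")
print("overdispersed" if vmr > 1.0 else "not overdispersed")
```

With only 12 games the estimate is noisy, so a VMR modestly above 1.0 is a prompt to look closer rather than proof on its own.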

Key Insight

By using Poisson, Vinny had been systematically underestimating the probability of extreme outcomes (0 TDs or 3+ TDs) while overestimating the probability of middle outcomes (1 TD). He was leaving money on the table.

Enter the Negative Binomial Distribution

The Negative Binomial distribution is Poisson's more flexible cousin. Where Poisson assumes variance equals the mean, Negative Binomial allows variance to exceed the mean.

It is well suited to modeling players whose performance is more volatile than average: players who either boom or bust, with less middle ground.

What Makes Negative Binomial Different

| Distribution | Variance Assumption | Best For |
| --- | --- | --- |
| Poisson | Variance = Mean (VMR = 1.0) | Consistent players |
| Negative Binomial | Variance > Mean (VMR > 1.0) | Boom-or-bust players |

The Negative Binomial uses the same μ (or λ) parameter for the expected rate, but adds a new parameter, r, that controls how spread out the distribution is. The result is a distribution with fatter tails: higher probabilities for extreme outcomes, lower probabilities for middle outcomes.
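
To make the role of r concrete, here is one common way to write the distribution, the mean-dispersion parameterization. This is an assumption: the text has not yet said which parameterization it uses, and other conventions define r differently.

```latex
\begin{align}
P(X = k) &= \frac{\Gamma(k + r)}{k!\,\Gamma(r)}
            \left(\frac{r}{r + \mu}\right)^{r}
            \left(\frac{\mu}{r + \mu}\right)^{k},
            \qquad k = 0, 1, 2, \dots \\
\operatorname{Var}(X) &= \mu + \frac{\mu^{2}}{r} \\
\mathrm{VMR} &= \frac{\operatorname{Var}(X)}{\mu} = 1 + \frac{\mu}{r}
\end{align}
```

Under this form, a small r means a large VMR and fatter tails. As r grows without bound, the VMR approaches 1.0 and the distribution converges to Poisson.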

Key Insight

The Negative Binomial distribution is your tool for modeling boom-or-bust players. When a player's variance exceeds their mean (VMR > 1.0), Poisson will systematically misprice extreme outcomes. Negative Binomial corrects this by allowing the distribution to spread out more than Poisson does.
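
The following sketch compares the two distributions on the 12-game log from the opening example. It assumes the mean-dispersion form of the Negative Binomial (variance = μ + μ²/r) and a simple method-of-moments estimate of r. Both are illustrative choices, not necessarily the estimation procedure used later in the chapter, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def poisson_pmf(k: int, mu: float) -> float:
    """P(X = k) for a Poisson variable with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def negbin_pmf(k: int, mu: float, r: float) -> float:
    """P(X = k) for a Negative Binomial with mean mu and dispersion r.

    Assumed parameterization: variance = mu + mu**2 / r.
    Computed in log space so non-integer r is handled via the gamma function.
    """
    log_p = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


# 12-game log from the example: six 0s, three 1s, two 2s, one 3
game_log = [0] * 6 + [1] * 3 + [2] * 2 + [3] * 1

mu = mean(game_log)
var = variance(game_log)   # sample variance (n - 1 denominator)
r = mu**2 / (var - mu)     # method-of-moments estimate; only valid when var > mu

print(f"mu={mu:.3f} variance={var:.3f} r={r:.2f}")
for k in range(4):
    print(f"k={k}: Poisson={poisson_pmf(k, mu):.1%}  NegBin={negbin_pmf(k, mu, r):.1%}")
```

Both distributions share the same mean here, so any difference in the printed probabilities comes purely from the extra spread that r allows.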

Why This Matters for Betting

The market often prices all players as if they follow Poisson (VMR = 1.0), creating opportunities when you can identify the boom-or-bust types.

Consider two players with comparable scoring rates but very different game-to-game patterns:

Player A: Consistent Goal-Line Back

  • Game log: 1, 1, 0, 1, 1, 0, 1, 1, 0, 1
  • Mean = 0.70, Variance = 0.23
  • VMR = 0.33 (below 1.0, so no overdispersion) → Use Poisson

Player B: Boom-or-Bust Backup

  • Game log: 2, 0, 0, 3, 0, 0, 2, 0, 3, 0
  • Mean = 1.00, Variance = 1.78
  • VMR = 1.78 → Use Negative Binomial

Both players score touchdowns, but their patterns are completely different. A market that applies the same Poisson assumption to both leaves value on the table for the sharp bettor who recognizes the difference.
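
The comparison above can be checked with a few lines of code. This is a sketch that assumes the sample variance (n - 1 denominator), which matches the figures quoted for both players. The `vmr` helper and the 1.0 cutoff for the label are illustrative.

```python
from statistics import mean, variance


def vmr(game_log: list[int]) -> float:
    """Variance-to-mean ratio using the sample variance (n - 1 denominator)."""
    return variance(game_log) / mean(game_log)


logs = {
    "Player A": [1, 1, 0, 1, 1, 0, 1, 1, 0, 1],
    "Player B": [2, 0, 0, 3, 0, 0, 2, 0, 3, 0],
}

for name, log in logs.items():
    ratio = vmr(log)
    label = "overdispersed" if ratio > 1.0 else "not overdispersed"
    print(f"{name}: mean={mean(log):.2f} variance={variance(log):.2f} VMR={ratio:.2f} ({label})")
```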

Chapter Preview

In this chapter, we'll explore:

  1. What the Negative Binomial distribution is and how its two parameters work
  2. How to detect overdispersion using the Variance-to-Mean Ratio (VMR)
  3. How to estimate the dispersion parameter (r) from historical data
  4. How to calculate probabilities using Excel's NEGBINOM.DIST function
  5. Sport-specific applications across NFL, NBA, MLB, and NHL
  6. Practical workflows for integrating Negative Binomial into your betting process

📝 Exercise

Instructions

Review the following game logs and identify which player is likely overdispersed (would benefit from Negative Binomial modeling).

  • Player X TD totals over 10 games: 1, 0, 1, 1, 0, 1, 0, 1, 1, 0
  • Player Y TD totals over 10 games: 0, 0, 3, 0, 0, 0, 2, 0, 0, 3

Which player is overdispersed?