Detecting Overdispersion: The Variance-to-Mean Ratio
The decision to use Negative Binomial instead of Poisson comes down to one question: Is the data overdispersed?
Overdispersion means the variance in the data is larger than the mean. Poisson assumes they're equal (VMR = 1.0). When VMR is significantly greater than 1.0, Poisson will underestimate the probability of extreme outcomes and overestimate the probability of middle outcomes.
Step 1: Calculate the Variance-to-Mean Ratio (VMR)
Variance-to-Mean Ratio (VMR)
VMR = σ² / μ = Variance / Mean
In Excel: =VAR.S(range)/AVERAGE(range)
For a player's last 15-20 games, calculate:
| Metric | Excel Formula |
|---|---|
| Mean (μ) | =AVERAGE(range) |
| Variance (σ²) | =VAR.S(range) |
| VMR | =VAR.S(range)/AVERAGE(range) |
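For readers working outside Excel, the same three quantities can be sketched in Python with only the standard library. The function name `vmr` and the game log below are illustrative assumptions, not part of the original worksheet. `statistics.variance` divides by n-1, so it corresponds to VAR.S.

```python
from statistics import mean, variance

def vmr(counts):
    """Return (mean, sample variance, variance-to-mean ratio) for a game log."""
    if len(counts) < 2:
        raise ValueError("need at least two games to compute a sample variance")
    mu = mean(counts)
    if mu == 0:
        raise ValueError("mean is zero, so VMR is undefined")
    var = variance(counts)  # sample variance (n-1), the analogue of Excel's VAR.S
    return mu, var, var / mu

# Hypothetical 10-game touchdown log, used only for illustration
games = [1, 0, 2, 0, 0, 3, 1, 0, 0, 1]
mu, var, ratio = vmr(games)
print(f"mean={mu:.2f}  variance={var:.2f}  VMR={ratio:.2f}")
```

The guard clauses cover the two cases where the ratio cannot be computed: a single game (no sample variance) and a log of all zeros (division by zero).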
Tip
Use VAR.S (sample variance, divides by n-1) rather than VAR.P (population variance). For betting purposes, your historical games are a sample used to estimate the underlying process, so sample variance is the appropriate choice.
Step 2: Interpret VMR
Use these thresholds to guide your distribution choice:
| VMR Range | Interpretation | Recommended Model |
|---|---|---|
| VMR ≈ 1.0 | Data fits Poisson well | Use Poisson |
| VMR = 1.0-1.3 | Acceptable Poisson fit | Use Poisson |
| VMR = 1.3-1.8 | Moderate overdispersion | Consider Negative Binomial |
| VMR > 1.8 | Strong overdispersion | Use Negative Binomial |
Key Insight
The 1.3 threshold is your decision point. Below 1.3, Poisson is good enough and simpler. Above 1.3, the added complexity of Negative Binomial is worth it because the distribution differences become material for betting.
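The threshold table can be expressed as a small lookup function. This is a sketch under two assumptions that the table does not state: boundary values (exactly 1.3 or 1.8) fall into the lower band, and any VMR below 1.0 is routed to Poisson. The function name `recommend_model` is hypothetical.

```python
def recommend_model(vmr):
    """Map a variance-to-mean ratio to the model suggested by the thresholds above."""
    if vmr <= 1.3:
        # Assumption: values below 1.0 and the 1.3 boundary itself go to Poisson
        return "Use Poisson"
    if vmr <= 1.8:
        # Assumption: the 1.8 boundary falls in the "consider" band
        return "Consider Negative Binomial"
    return "Use Negative Binomial"

for value in (0.9, 1.25, 1.5, 2.4):  # illustrative VMR values
    print(f"VMR {value:.2f}: {recommend_model(value)}")
```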
Real Examples: Three Player Profiles
Let's analyze three players with similar touchdown averages but very different consistency profiles:
Running Back A: Consistent Goal-Line Back
Game Log (10 games): 1, 1, 0, 1, 1, 0, 1, 1, 0, 1
| Metric | Value |
|---|---|
| Mean | 0.70 |
| Variance | 0.23 |
| VMR | 0.33 |
Note
VMR = 0.33 → Use Poisson (very consistent)
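This player is actually underdispersed: more consistent than Poisson would predict. He's reliable, scoring in 7 of 10 games with no multi-TD explosions. Negative Binomial cannot represent a variance below the mean, so Poisson remains the appropriate choice, though it will overstate both his chance of 0 TDs and his chance of 2+ TDs.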
Running Back B: Boom-or-Bust Backup
Game Log (10 games): 2, 0, 0, 3, 0, 0, 2, 0, 3, 0
| Metric | Value |
|---|---|
| Mean | 1.00 |
| Variance | 1.78 |
| VMR | 1.78 |
Warning
VMR = 1.78 → Use Negative Binomial (boom-or-bust)
Classic boom-or-bust profile. This player either disappears (6 games with 0 TDs) or explodes (4 games with 2-3 TDs). Poisson would badly underestimate his probability of 0 TDs and his probability of 3+ TDs.
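To see the size of the gap, the sketch below compares Poisson and Negative Binomial probabilities for Running Back B's game log. It assumes a method-of-moments fit for the Negative Binomial (size r = μ² / (σ² − μ), success probability p = μ / σ²), which is one simple way to fit the distribution and not necessarily the approach used elsewhere in this guide. All function names are illustrative.

```python
import math
from statistics import mean, variance

def poisson_pmf(k, mu):
    """P(X = k) for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def neg_binomial_pmf(k, r, p):
    """P(X = k) for a Negative Binomial with size r and success probability p."""
    log_coef = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coef + r * math.log(p) + k * math.log(1 - p))

games = [2, 0, 0, 3, 0, 0, 2, 0, 3, 0]  # Running Back B's game log
mu, var = mean(games), variance(games)

# Method-of-moments fit (an assumption of this sketch); only valid when var > mu
r = mu**2 / (var - mu)
p = mu / var

def summarize(pmf):
    """Return (P(0 TDs), P(3+ TDs)) for a pmf taking k as its only argument."""
    return pmf(0), 1 - sum(pmf(k) for k in range(3))

pois_p0, pois_p3 = summarize(lambda k: poisson_pmf(k, mu))
nb_p0, nb_p3 = summarize(lambda k: neg_binomial_pmf(k, r, p))
obs_p0 = sum(g == 0 for g in games) / len(games)
obs_p3 = sum(g >= 3 for g in games) / len(games)

print(f"{'':18}{'P(0 TD)':>9}{'P(3+ TD)':>10}")
print(f"{'Observed':18}{obs_p0:>9.3f}{obs_p3:>10.3f}")
print(f"{'Poisson':18}{pois_p0:>9.3f}{pois_p3:>10.3f}")
print(f"{'Negative Binomial':18}{nb_p0:>9.3f}{nb_p3:>10.3f}")
```

On this log, Poisson puts roughly 0.37 on zero TDs and 0.08 on three or more, while the fitted Negative Binomial gives roughly 0.48 and 0.12, closer to the observed 0.60 and 0.20. Ten games is a small sample, so the observed frequencies themselves are noisy.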
Wide Receiver C: Consistent Red Zone Target
Game Log (10 games): 1, 0, 1, 1, 0, 1, 0, 1, 1, 0
| Metric | Value |
|---|---|
| Mean | 0.60 |
| Variance | 0.27 |
| VMR | 0.44 |
Note
VMR = 0.44 → Use Poisson (consistent)
Another consistent player with low variance. Poisson is the right choice here—no need for the added complexity of Negative Binomial.
The Pattern to Remember
| Player Profile | Typical VMR | Distribution Choice |
|---|---|---|
| Very consistent (role players, reliable targets) | < 0.8 | Poisson |
| Average consistency | 0.8 - 1.3 | Poisson |
| Moderately volatile | 1.3 - 1.5 | Either (lean NB) |
| Boom-or-bust | 1.5 - 2.5 | Negative Binomial |
| Extremely volatile | > 2.5 | Negative Binomial |
Common Overdispersion Causes
Understanding why a player is overdispersed helps you project whether it will continue:
Game Script Variance
- Blowouts create feast-or-famine TD opportunities
- Backup RBs only get meaningful touches in garbage time
Opportunity Clustering
- Some games: 8 targets; other games: 3 targets
- Power play time varies dramatically
Role Volatility
- Players who shuttle between starter and bench
- Matchup-specific usage patterns
Hot Hand Effects
- Shooters who get hot and keep firing
- Creates clustering of makes/misses
Tip
If you can identify why a player is boom-or-bust, you can better assess whether it's likely to continue. A backup RB whose opportunity depends on game script will remain volatile. A shooter recovering from injury might stabilize as he finds his rhythm.
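A short simulation can illustrate how game-script variance produces overdispersion. The sketch below compares a player whose expected TDs are the same every game with one whose expected TDs switch between a low and a high rate. The rates (0.3 and 2.3), the 70/30 split, and the seed are hypothetical values chosen so both players average 0.9 expected TDs per game.

```python
import math
import random
from statistics import mean, variance

def poisson_sample(rate, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

rng = random.Random(42)  # fixed seed so the run is repeatable
n_games = 5000

# Steady role: the same expected TDs (0.9) every game
steady = [poisson_sample(0.9, rng) for _ in range(n_games)]

# Script-dependent role (hypothetical rates): 70% of games at 0.3, 30% at 2.3,
# which also averages 0.9 expected TDs per game
volatile = [
    poisson_sample(2.3 if rng.random() < 0.3 else 0.3, rng)
    for _ in range(n_games)
]

steady_vmr = variance(steady) / mean(steady)
volatile_vmr = variance(volatile) / mean(volatile)
print(f"steady    mean={mean(steady):.2f}  VMR={steady_vmr:.2f}")
print(f"volatile  mean={mean(volatile):.2f}  VMR={volatile_vmr:.2f}")
```

Both players have about the same mean, but the steady player's VMR stays near 1 while the script-dependent player's VMR lands well above it. Each individual game is still Poisson; the overdispersion comes entirely from the rate changing between games.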
Quick VMR Calculation Template
Here's a ready-to-use Excel template structure:
| Column or Cell | Content | Formula |
|---|---|---|
| A | Game dates | (manual entry) |
| B | TD counts (0, 1, 2, etc.) | (manual entry) |
| C1 | "Mean" label | — |
| D1 | Mean value | =AVERAGE(B:B) |
| C2 | "Variance" label | — |
| D2 | Variance value | =VAR.S(B:B) |
| C3 | "VMR" label | — |
| D3 | VMR value | =D2/D1 |
| C4 | "Model" label | — |
| D4 | Recommendation | =IF(D3>1.3,"Neg Binomial","Poisson") |
📝 Exercise
Instructions
Calculate VMR for the following player and determine which distribution to use.
Player Data (15 games): 0, 0, 3, 0, 2, 0, 0, 0, 3, 1, 0, 2, 0, 0, 3
Calculate the mean, variance, and VMR.