Chapter 10

Why Distribution Choice Matters

Why one model doesn't fit all props

In previous chapters, we introduced the Poisson distribution for modeling count-based props (touchdowns, receptions, strikeouts) and the Negative Binomial for handling overdispersion (variance larger than the mean). But there's another structural feature in real prop markets that neither of these handles well:

  • Excess zeros
  • And more importantly, different kinds of zeros

A backup running back might post 0 receptions not because he ran routes and failed to catch the ball, but because he was used as a blocker and never had a chance. A hitter might record 0 RBIs not because he failed with runners on base, but because he never had a plate appearance with runners in scoring position.

These are structural zeros—outcomes generated by a different process than the count process itself.

Key Insight

For props near zero, choosing the wrong distribution can completely flip your edge calculation—even when your projected mean is correct. Distribution choice is not a technical detail; it's a core part of finding value.

The Four Distributions in Your Toolkit

By now, you should have four distributions available (VMR is the Variance-to-Mean Ratio):

Distribution | Best For | Key Characteristic
Normal | High-volume continuous stats (yards, points) | Symmetric, allows negatives
Poisson | Consistent count-based events (VMR ≈ 1) | Variance = Mean
Negative Binomial | Boom-or-bust count events (VMR > 1.3) | Allows overdispersion
Zero-Inflated/Hurdle | Props with structural zeros | Separates opportunity from production
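
A quick way to see which row of the table a prop might belong to is to compute the VMR and compare the observed share of zeros with the share a Poisson with the same mean would imply. The sketch below is only illustrative: the function name `zero_diagnostics` and the game log are hypothetical, and using the sample variance is an assumption.

```python
import math
from statistics import mean, variance

def zero_diagnostics(counts):
    """Summarize a game log: mean, VMR, observed zero share, Poisson-implied zero share."""
    m = mean(counts)
    # Sample variance divided by the mean (assumption: sample, not population, variance)
    vmr = variance(counts) / m if m > 0 else float("nan")
    observed_zero = sum(1 for c in counts if c == 0) / len(counts)
    # A Poisson with mean m puts probability exp(-m) on zero
    poisson_zero = math.exp(-m)
    return {
        "mean": m,
        "vmr": vmr,
        "observed_zero": observed_zero,
        "poisson_zero": poisson_zero,
    }

# Hypothetical 10-game log for a count prop (not real data)
game_log = [0, 0, 3, 0, 2, 0, 4, 0, 0, 3]
stats = zero_diagnostics(game_log)
for key, value in stats.items():
    print(f"{key:14s} {value:.3f}")
```

In this made-up log the mean is 1.2, the VMR is about 2.2, and 60% of games are zeros against roughly 30% implied by a Poisson. That gap in zeros is the kind of pattern that points toward the last row of the table rather than a plain Poisson.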

The Critical Question: Where Does the Mean Come From?

When you project a player's expected output, you need to understand the source of that mean:

  1. Is it from opportunity changes? (More playing time, more targets, more at-bats)
  2. Is it from production changes? (Better efficiency when given opportunity)

Warning

Your mean projection must be paired with a story: is the mean rising because opportunity changes (π) or because conditional production changes (λ)? These two stories produce very different Over probabilities.
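
To make the two stories concrete, here is a minimal sketch using a zero-inflated Poisson. It assumes the outcome is zero with probability `1 - p_opportunity` and Poisson(`lam`) otherwise, so the mean is `p_opportunity * lam`. The names `p_opportunity` and `lam` and all numbers are hypothetical, and the mapping of π to the opportunity side is an assumption (some write-ups use π for the structural-zero probability instead).

```python
import math

def mean_of(p_opportunity, lam):
    """Mean of a zero-inflated Poisson under the assumed parameterization."""
    return p_opportunity * lam

def p_at_least_one(p_opportunity, lam):
    """P(Y >= 1): the player gets an opportunity AND the Poisson count is nonzero."""
    return p_opportunity * (1.0 - math.exp(-lam))

# Hypothetical baseline: 50% chance of opportunity, 2.0 expected when involved
baseline = (0.50, 2.0)   # mean 1.0
# Story A: the mean rises to 1.5 because opportunity rises
story_a = (0.75, 2.0)
# Story B: the mean rises to 1.5 because conditional production rises
story_b = (0.50, 3.0)

for label, params in [("baseline", baseline), ("story A", story_a), ("story B", story_b)]:
    print(f"{label:9s} mean={mean_of(*params):.2f}  P(Y>=1)={p_at_least_one(*params):.3f}")
```

Both stories end at a mean of 1.5, but under these assumed inputs the opportunity story gives P(Y ≥ 1) of about 64.8% while the production story gives about 47.5%.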

A Preview: The Stafford Case Study

Throughout this chapter, we'll use one extended example to demonstrate why distribution choice matters so much: Matthew Stafford's rushing yards prop.

The Market:

  • Over 0.5 rushing yards at +134 (risk $100 to win $134)
  • Under 0.5 rushing yards at -180 (risk $180 to win $100)

The Projected Mean: 1.41 rushing yards

The Question: Should you bet the Over?

Here's what happens when we fit four different distributions, all forced to match the same mean of 1.41:

Model | P(Y ≥ 1) | EV on Over ($100 bet)
Poisson | 75.6% | +$76.90
Normal | 62.4% | +$46.02
ZIP | 34.7% | -$18.80
Hurdle | 35.3% | -$17.40

Market break-even: 42.7%

Key Insight

Same mean. Same projection. Completely opposite betting recommendations. Poisson and Normal suggest massive Over value. ZIP and Hurdle suggest the Over is negative EV. That's not a small discrepancy—it's a completely different bet.
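
The Poisson row can be checked by hand, since a Poisson is fully determined by its mean: P(Y ≥ 1) = 1 - e^(-1.41). The sketch below reproduces only that row. The Normal, ZIP, and Hurdle rows depend on extra parameters (a standard deviation or a zero share) that this section does not state, so they are not reconstructed here. The variable names are illustrative.

```python
import math

mean_yards = 1.41                     # projected mean from the case study
market_break_even = 100 / (100 + 134)  # Over priced at +134

# For a Poisson, P(Y = 0) = exp(-mean), so P(Y >= 1) is the complement
p_over_poisson = 1.0 - math.exp(-mean_yards)

print(f"Poisson P(Y >= 1): {p_over_poisson:.1%}")   # 75.6%
print(f"Market break-even: {market_break_even:.1%}")  # 42.7%
```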

Understanding the Break-Even Calculation

Before diving deeper into each model, let's establish how we calculate break-even probability:

Break-Even Probability (Positive Odds)

p_BE = 100 / (100 + odds)
Excel: =100/(100+A1)

For +134 odds:

  • Break-even = 100 / (100 + 134) = 100 / 234 ≈ 42.7%

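What this means: If your model says P(Over) > 42.7%, the Over is +EV; below 42.7%, the Over is -EV. That does not automatically make the Under +EV, because the Under has its own break-even: at -180 it needs P(Under) > 180/280 ≈ 64.3%, which means P(Over) < 35.7%.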
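
The break-even arithmetic can be wrapped in a small helper. This is a sketch, not part of the chapter's toolkit: the function name `break_even_probability` is hypothetical, and the negative-odds branch uses the standard American-odds conversion, |odds| / (|odds| + 100), which this section does not spell out.

```python
def break_even_probability(american_odds):
    """Win probability at which a bet at the given American odds has zero EV."""
    if american_odds > 0:
        # Positive odds: risk 100 to win `american_odds`
        return 100 / (100 + american_odds)
    # Negative odds: risk |odds| to win 100
    risk = -american_odds
    return risk / (risk + 100)

over_be = break_even_probability(134)    # Over 0.5 at +134
under_be = break_even_probability(-180)  # Under 0.5 at -180

print(f"Over break-even:  {over_be:.1%}")
print(f"Under break-even: {under_be:.1%}")
print(f"Sum of both:      {over_be + under_be:.1%}")
```

The two break-evens sum to more than 100%. The excess is the bookmaker's margin, which is why there is a band of probabilities where neither side is +EV.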

Understanding the EV Calculation

For a $100 stake at +134 odds:

  • If you win (probability p): you profit $134
  • If you lose (probability 1-p): you lose your $100 stake

Expected Value

EV = p × profit - (1-p) × stake
Excel: =A1*134-(1-A1)*100

We'll plug in different values of p from each model to see how distribution choice affects profitability.
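
As a sketch of that plug-in step, the snippet below applies the EV formula to the four P(Y ≥ 1) values from the Stafford preview table. The function name `expected_value` and the dictionary layout are illustrative. The probabilities are the rounded figures from the table, so results can differ from an unrounded calculation by a cent or so.

```python
def expected_value(p_win, profit_if_win, stake):
    """EV = p * profit - (1 - p) * stake, in the same currency units as the stake."""
    return p_win * profit_if_win - (1 - p_win) * stake

# P(Over 0.5) from each model, as reported in the preview table
model_probs = {"Poisson": 0.756, "Normal": 0.624, "ZIP": 0.347, "Hurdle": 0.353}

# $100 stake on the Over at +134 pays $134 profit on a win
evs = {name: expected_value(p, profit_if_win=134, stake=100)
       for name, p in model_probs.items()}

for name, ev in evs.items():
    print(f"{name:8s} EV on Over: {ev:+.2f}")
```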

Why This Chapter Matters

The rest of this chapter will teach you:

  1. How to detect structural zeros using the Variance-to-Mean Ratio (VMR) and observed zero frequencies
  2. When to use Zero-Inflated Poisson (ZIP) for "no-opportunity" scenarios
  3. When to use Hurdle models for "any vs. none" decisions
  4. Sport-specific guidance for choosing the right distribution
  5. Common mistakes that cause bettors to systematically misprice props near zero

Tip

The goal isn't to memorize formulas—it's to develop intuition for when your baseline model is likely to fail and which alternative to reach for.


📝 Exercise

Instructions

Warm-Up Exercise: Identifying Structural Zeros

For each scenario below, identify whether the zeros are likely to be:

  • A) Poisson zeros (player had opportunity but produced zero)
  • B) Structural zeros (player had no meaningful opportunity)

  1. A backup running back has 0 receptions. He was used exclusively as a pass blocker and never ran a route.

  2. A wide receiver has 0 touchdowns. He had 8 targets, 5 catches, and 67 yards, but no red zone opportunities.

  3. A pocket quarterback has 0 rushing yards. The game script never required him to scramble and no designed runs were called.