Compare distributions on real prop data

Common Mistakes and the Distribution Selection Workflow

You now have a complete statistical toolkit: Normal, Poisson, Negative Binomial, ZIP, and Hurdle. But having the tools isn't enough—you need to avoid the common mistakes that cause even experienced bettors to systematically misprice props.

The Four Most Common Mistakes

Mistake #1: Using Poisson Because It's Convenient

Poisson is the default for many bettors because it's simple and widely taught. But if the stat has many structural zeros or allows negative outcomes, Poisson can wildly misprice the Over near 0.5.

Example: Pocket QB Rushing Yards

  • Poisson predicts P(Y=0) = e^(-1.41) = 24.4%
  • Actual observed rate: 65%
  • Result: Poisson implies P(Over 0.5) = 75.6% against an observed 35%, overstating P(Over) by 40+ percentage points
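
The arithmetic behind this example can be reproduced in a few lines. This is a minimal sketch that uses only the two numbers quoted above (mean 1.41, observed zero rate 65%); the variable names are illustrative.

```python
import math

lam = 1.41            # projected mean from the example above
observed_zero = 0.65  # observed share of zero games from the example above

poisson_zero = math.exp(-lam)      # P(Y = 0) under Poisson
poisson_over = 1 - poisson_zero    # P(Over 0.5) under Poisson
observed_over = 1 - observed_zero  # observed rate of clearing 0.5

# Difference between the model and reality, in percentage points
gap_points = (poisson_over - observed_over) * 100

print(f"Poisson P(Y=0):      {poisson_zero:.1%}")
print(f"Poisson P(Over 0.5): {poisson_over:.1%}")
print(f"Observed P(Over 0.5): {observed_over:.1%}")
print(f"Overstatement: {gap_points:.1f} percentage points")
```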

Warning

If you see a prop near 0.5 and your Poisson model says "massive Over value," stop and check if you're seeing structural zeros. You might be walking into a trap.

Mistake #2: Using Normal Because It "Allows Negatives"

When bettors realize Poisson can't handle negative outcomes (like negative rushing yards from kneel-downs), they often switch to Normal. This helps—but it doesn't solve the spike-at-zero problem.

The Issue: Normal spreads probability smoothly across the range. But real data for props like QB rushing yards isn't smooth—it has a massive spike at Y ≤ 0.

Example: Stafford Rushing Yards

  • Normal predicts P(Over 0.5) = 62.4%
  • Actual historical rate: ~35%
  • Result: Normal still overestimates by ~27 percentage points

Key Insight

Allowing negatives helps, but it doesn't solve the spike-at-zero problem. Normal can still overstate P(Y ≥ 1) when the true data-generating process has a large structural pile at Y ≤ 0.
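
One way to see why Normal struggles here: the Normal curve is symmetric around its mean, so any projection above the line gives P(Over) above 50%, whatever the spread. The sketch below illustrates that with hypothetical inputs. The mean of 1.0 and the three σ values are assumptions for illustration, not figures from the Stafford example.

```python
import math

def normal_cdf(x, mu, sigma):
    """Normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_p_over(line, mu, sigma):
    """P(Y > line) under a Normal(mu, sigma) model."""
    return 1.0 - normal_cdf(line, mu, sigma)

# Hypothetical inputs, not taken from the chapter
mu = 1.0
for sigma in (1.5, 3.0, 6.0):
    p = normal_p_over(0.5, mu, sigma)
    print(f"sigma={sigma}: P(Over 0.5) = {p:.1%}")
```

Every σ keeps P(Over 0.5) above 50% because the mean sits above the line, so no choice of σ can reproduce an observed rate near 35%.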

Mistake #3: Holding π Fixed Without Realizing What It Implies

When using ZIP or Hurdle, your mean projection must be consistent with your π and λ assumptions. If you raise the mean but keep π constant, you're implicitly pushing all the increase into conditional production.

Two Stories for the Same Mean Increase:

| Story | π (opportunity) | λ (production) | P(Over 0.5) |
|---|---|---|---|
| "More opportunity" | Decreases | Same | Increases significantly |
| "Better production" | Same | Increases | Barely changes |

Example:

  • Original: π = 0.65, λ = 2.83, Mean = (1 - 0.65) × 2.83 = 0.99, P(Over 0.5) = 1 - π = 35%
  • Projection: Mean rises to 1.41

Story A (More opportunity):

  • New π = 0.50, λ = 2.82, Mean = 1.41
  • P(Over 0.5) = 50% ✓ Major improvement

Story B (Better production when active):

  • Same π = 0.65, λ = 4.03, Mean = 1.41
  • P(Over 0.5) = 35% ✗ Almost no change
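
The two stories can be checked numerically. The sketch below recomputes the mean and P(Over 0.5) for each parameter pair above. The percentages quoted in the stories match the Hurdle formula (1 - π); the ZIP figure is printed alongside for comparison and comes out slightly lower. Function names are illustrative.

```python
import math

def model_mean(pi, lam):
    """Mean implied by the parameters: (1 - pi) * lam."""
    return (1 - pi) * lam

def hurdle_p_over_half(pi):
    """Hurdle: any positive outcome clears 0.5."""
    return 1 - pi

def zip_p_over_half(pi, lam):
    """ZIP: must avoid the structural zero AND the Poisson zero."""
    return (1 - pi) * (1 - math.exp(-lam))

scenarios = {
    "Original": (0.65, 2.83),
    "Story A (more opportunity)": (0.50, 2.82),
    "Story B (better production)": (0.65, 4.03),
}

for name, (pi, lam) in scenarios.items():
    print(
        f"{name}: mean={model_mean(pi, lam):.2f}, "
        f"Hurdle P(Over 0.5)={hurdle_p_over_half(pi):.1%}, "
        f"ZIP P(Over 0.5)={zip_p_over_half(pi, lam):.1%}"
    )
```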

Warning

If you keep π fixed while raising the mean, you're asserting: "The chance of a positive outcome stays the same, but the average positive-game outcome increases." That might be wrong for this matchup.

Mistake #4: Forgetting the Bet's Threshold

For 0.5 lines, the distribution of positive outcomes matters much less than P(positive at all). Bettors often focus on getting the mean exactly right while ignoring where the probability mass sits around zero.

The Reality:

  • Over 0.5 cares about P(Y ≥ 1), not E[Y]
  • Two models with identical means can have opposite P(Over 0.5) values
  • For 0.5 thresholds, getting π right matters more than getting λ right
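
To make the second point concrete, here is a small sketch comparing a plain Poisson and a Hurdle model that share roughly the same mean. It reuses the π = 0.65, λ = 2.83 parameters from Mistake #3; pairing them with a Poisson at the same mean is an illustration rather than a fitted comparison.

```python
import math

# Hurdle parameters from the Mistake #3 example
pi, lam = 0.65, 2.83
hurdle_mean = (1 - pi) * lam   # about 0.99
hurdle_over = 1 - pi           # P(Over 0.5) under Hurdle

# A plain Poisson forced to the same mean
poisson_mean = hurdle_mean
poisson_over = 1 - math.exp(-poisson_mean)  # P(Over 0.5) under Poisson

print(f"Shared mean: {hurdle_mean:.2f}")
print(f"Poisson P(Over 0.5): {poisson_over:.1%}")
print(f"Hurdle  P(Over 0.5): {hurdle_over:.1%}")
```

Same mean, but the two models land on opposite sides of 50%.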

Key Insight

Most errors in zero-heavy props come from modeling the wrong event—size of outcomes instead of probability of clearing the threshold.


The Complete Distribution Selection Workflow

Here's your step-by-step process for any prop:

Step 1: Classify the Prop Type

| Question | Answer | Action |
|---|---|---|
| Is this high-volume continuous data? (50+ yards, 15+ points) | Yes | Use Normal |
| Is this low-volume count data? (0, 1, 2, 3 typical outcomes) | Yes | Continue to Step 2 |

Step 2: Calculate VMR from Historical Data

Excel Formula:
=VAR.S(A1:A20)/AVERAGE(A1:A20)

| VMR Result | Interpretation | Distribution |
|---|---|---|
| VMR < 0.8 | Underdispersed | Poisson (conservative) |
| VMR = 0.8-1.3 | Good Poisson fit | Poisson |
| VMR = 1.3-1.8 | Moderate overdispersion | Negative Binomial |
| VMR > 1.8 | Strong overdispersion | Negative Binomial |
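
For readers working outside Excel, the same check can be sketched in Python. `statistics.variance` is the sample variance, which corresponds to VAR.S. The sample data is the 15-game log from the exercise at the end of this chapter. The table's ranges share their endpoints (1.3 and 1.8), so assigning exactly 1.3 to Negative Binomial and exactly 1.8 to the moderate band is an assumption made here.

```python
from statistics import mean, variance

def vmr(values):
    """Variance-to-mean ratio using the sample variance (like VAR.S)."""
    return variance(values) / mean(values)

def classify_vmr(ratio):
    """Map a VMR to the table's bands (boundary handling is an assumption)."""
    if ratio < 0.8:
        return "Underdispersed: Poisson (conservative)"
    if ratio < 1.3:
        return "Good Poisson fit: Poisson"
    if ratio <= 1.8:
        return "Moderate overdispersion: Negative Binomial"
    return "Strong overdispersion: Negative Binomial"

# 15-game reception log from the chapter exercise
games = [0, 2, 0, 0, 3, 0, 1, 0, 0, 2, 0, 0, 1, 0, 4]

ratio = vmr(games)
print(f"Mean = {mean(games):.2f}, VMR = {ratio:.2f}")
print(classify_vmr(ratio))
```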

Step 3: Check for Excess Zeros

Compare observed zero rate to Poisson prediction:

Poisson-predicted P(X=0) = e^(-λ) = EXP(-AVERAGE(A1:A20))
Observed P(X=0) = COUNTIF(A1:A20,0)/COUNT(A1:A20)
Gap = Observed - Predicted

| Gap | Interpretation | Action |
|---|---|---|
| Gap < 10 points | No excess zeros | Use distribution from Step 2 |
| Gap ≥ 10 points | Excess zeros present | Continue to Step 4 |
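
The zero-gap check translates directly from the spreadsheet formulas. The sketch below is a minimal version with an illustrative function name; it uses the 15-game log from the chapter exercise as sample input and the 10-point threshold from the table.

```python
import math

def zero_gap(values):
    """Return (observed zero rate, Poisson-predicted zero rate, gap in points)."""
    lam = sum(values) / len(values)          # sample mean as the Poisson rate
    predicted = math.exp(-lam)               # Poisson P(X = 0)
    observed = sum(1 for v in values if v == 0) / len(values)
    return observed, predicted, (observed - predicted) * 100

# 15-game reception log from the chapter exercise
games = [0, 2, 0, 0, 3, 0, 1, 0, 0, 2, 0, 0, 1, 0, 4]

observed, predicted, gap = zero_gap(games)
print(f"Observed zeros:  {observed:.1%}")
print(f"Poisson predicts: {predicted:.1%}")
print(f"Gap: {gap:.1f} points")
print("Excess zeros present" if gap >= 10 else "No excess zeros")
```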

Step 4: Identify the Zero Source

| Question | If Yes → | If No → |
|---|---|---|
| Are there games where the player had NO OPPORTUNITY? | Use ZIP | Use Hurdle |
| Examples | Blocker-only games, DNPs, scripted out by game plan | Player always active, stat just rare |

Step 5: Estimate Parameters

For ZIP:

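π = (Games with no opportunity, i.e. structural zeros) / (Total games)
λ = Mean / (1 - π)

Count only the no-opportunity games toward π. Some zeros also come from the Poisson component, so counting every zero game would overstate π.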

For Hurdle:

π = (Games with zero) / (Total games)
λ = Mean of positive games only
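
A minimal sketch of both estimators follows. For ZIP, π is passed in explicitly so the caller decides which games count toward it. The function names are illustrative, and the sample data is the 15-game log from the chapter exercise, where 8 games were flagged as low-snap.

```python
def hurdle_params(values):
    """Hurdle: pi = share of non-positive games, lam = mean of positive games."""
    positives = [v for v in values if v > 0]
    if not positives:
        raise ValueError("no positive games: lam cannot be estimated")
    pi = (len(values) - len(positives)) / len(values)
    lam = sum(positives) / len(positives)
    return pi, lam

def zip_params(values, pi):
    """ZIP: given pi, back out lam so that (1 - pi) * lam equals the sample mean."""
    if not 0 <= pi < 1:
        raise ValueError("pi must be in [0, 1)")
    sample_mean = sum(values) / len(values)
    return pi, sample_mean / (1 - pi)

# 15-game reception log from the chapter exercise
games = [0, 2, 0, 0, 3, 0, 1, 0, 0, 2, 0, 0, 1, 0, 4]

h_pi, h_lam = hurdle_params(games)
z_pi, z_lam = zip_params(games, pi=8 / 15)   # 8 low-snap games out of 15

print(f"Hurdle: pi={h_pi:.2f}, lam={h_lam:.2f}")
print(f"ZIP:    pi={z_pi:.2f}, lam={z_lam:.2f}")
```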

Step 6: Validate Mean Consistency

Ensure your parameters produce the correct projected mean:

ZIP/Hurdle Mean = (1 - π) × λ

Does this equal your projection? If not, adjust.
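
"Adjust" can mean moving either parameter, and the choice is exactly the opportunity-versus-production question from Mistake #3. The sketch below solves Mean = (1 - π) × λ for each parameter in turn, using the numbers from that example; the helper names are illustrative.

```python
def lam_for_target(target_mean, pi):
    """Hold pi fixed and solve (1 - pi) * lam = target_mean for lam."""
    return target_mean / (1 - pi)

def pi_for_target(target_mean, lam):
    """Hold lam fixed and solve (1 - pi) * lam = target_mean for pi."""
    return 1 - target_mean / lam

target = 1.41  # projected mean from the Mistake #3 example

# "Better production" story: pi stays at 0.65, lam absorbs the change
print(f"pi fixed at 0.65  -> lam = {lam_for_target(target, 0.65):.2f}")

# "More opportunity" story: lam stays at 2.82, pi absorbs the change
print(f"lam fixed at 2.82 -> pi  = {pi_for_target(target, 2.82):.2f}")
```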

Step 7: Calculate P(Over) and EV

For 0.5 threshold:

  • ZIP: P(Over 0.5) = (1 - π) × (1 - e^(-λ))
  • Hurdle: P(Over 0.5) = 1 - π

EV Calculation:

EV = P(Over) × profit - (1 - P(Over)) × stake
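
A minimal sketch of the EV formula for American odds, per 1 unit staked. The odds conversion is the standard one (+120 pays 1.20 per unit, -150 pays 100/150 per unit). The probabilities are the Story A and Story B values from Mistake #3, priced at +120 purely for illustration.

```python
def profit_per_unit(american_odds):
    """Profit on a winning 1-unit stake at the given American odds."""
    if american_odds > 0:
        return american_odds / 100
    return 100 / abs(american_odds)

def ev_per_unit(p_over, american_odds):
    """EV = P(Over) * profit - (1 - P(Over)) * stake, with stake = 1."""
    return p_over * profit_per_unit(american_odds) - (1 - p_over) * 1.0

def break_even(american_odds):
    """Win probability at which EV is exactly zero."""
    return 1 / (1 + profit_per_unit(american_odds))

odds = 120
print(f"Break-even at +{odds}: {break_even(odds):.1%}")
for label, p in (("Story A", 0.50), ("Story B", 0.35)):
    print(f"{label}: P(Over)={p:.0%}, EV per unit = {ev_per_unit(p, odds):+.2f}")
```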

Summary: The Right Model for Near-Zero Props

| Model | When to Use | Key Formula |
|---|---|---|
| Normal | High-volume continuous (yards, points) | Z = (line - μ) / σ |
| Poisson | Count data, VMR ≈ 1, no excess zeros | P(X=0) = e^(-λ) |
| Negative Binomial | Count data, VMR > 1.3 | Var = μ + μ²/r |
| ZIP | Structural zeros (no opportunity games) | P(Y=0) = π + (1-π)e^(-λ) |
| Hurdle | Any vs. none (player active, stat rare) | P(Y=0) = π |

Key Insight

For near-zero props, the right model is the one that gets P(Y ≤ 0) and P(Y ≥ 1) right—not just the mean. Choosing the right distribution is not optional; it's a core part of accurate edge estimation.


Chapter Summary

You now have a complete toolkit for selecting the right distribution:

  1. Structural zeros occur when a player is active but has effectively no opportunity to accumulate the stat.

  2. ZIP models two processes: structural zero vs. Poisson production, and is ideal when you can identify "no-opportunity" games.

  3. Hurdle models the "any vs. none" decision first, then models positive outcomes separately.

  4. The Stafford case study shows that matching the same mean across distributions can still produce opposite EV conclusions on a 0.5 line.

  5. Choosing the right distribution is not optional—it's a core part of accurate edge estimation.


📝 Exercise

Instructions

Final Exercise: Complete Workflow Practice

Walk through the complete distribution selection workflow for this prop:

Prop: Backup WR Over 0.5 Receptions at +120

Data (last 15 games): 0, 2, 0, 0, 3, 0, 1, 0, 0, 2, 0, 0, 1, 0, 4

Note: In 8 of the 15 games, this WR played fewer than 10 snaps (likely decoy or special teams only).

Step 1: Calculate the mean and VMR. What do you find?

Step 2: Check for excess zeros. Observed zeros = 9/15 = 60%. Poisson predicts e^(-0.87) ≈ 42%. What's the gap?

Step 3: Given that 8 of 15 games had fewer than 10 snaps (decoy/special teams), which model is most appropriate?

Step 4: Estimate π for the ZIP model. How do you calculate it?

Step 5: With π = 0.53 and Mean = 0.87, calculate λ. Then calculate P(Over 0.5) and determine if the bet is +EV at +120 odds.


Key Takeaways

Key Insight

The Bottom Line: For props near zero thresholds, your edge depends on modeling opportunity correctly—not on "being better at averages." If two models with the same mean disagree on EV, the question isn't which mean is right. The question is which model correctly captures the probability of clearing the threshold.

Your Distribution Selection Checklist

  • Always calculate VMR first: it takes 30 seconds and prevents the most common errors

  • Check for excess zeros: compare the observed zero rate to e^(-λ)

  • Ask "why are there zeros?": structural (no opportunity) vs. count process (active but didn't produce)

  • Match your story to your parameters: is the mean changing because of opportunity (π) or production (λ)?

  • Remember the threshold: for 0.5 lines, P(positive at all) matters more than the mean

Keep building your intuition by tracking your projections against outcomes. The sharps who win consistently aren't the ones with perfect models—they're the ones who know when their model is likely to fail and reach for the right alternative.