Common Mistakes and the Distribution Selection Workflow
You now have a complete statistical toolkit: Normal, Poisson, Negative Binomial, ZIP, and Hurdle. But having the tools isn't enough—you need to avoid the common mistakes that cause even experienced bettors to systematically misprice props.
The Four Most Common Mistakes
Mistake #1: Using Poisson Because It's Convenient
Poisson is the default for many bettors because it's simple and widely taught. But if the stat has many structural zeros or allows negative outcomes, Poisson can wildly misprice the Over near 0.5.
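Example: Pocket QB Rushing Yards
- Projected mean: 1.41, so Poisson predicts P(Y=0) = e^(-1.41) ≈ 24.4% and P(Over 0.5) ≈ 75.6%
- Observed rate of games at zero or below: 65%, so observed P(Over 0.5) ≈ 35%
- Result: Poisson overstates P(Over 0.5) by about 40 percentage points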
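The arithmetic in this example can be checked with a short Python sketch. The mean of 1.41 is read off the exponent in the example; the function and variable names are illustrative, not part of any library.

```python
import math

def poisson_zero_prob(mean: float) -> float:
    """P(Y = 0) under a Poisson model with the given mean."""
    return math.exp(-mean)

mean = 1.41            # mean implied by e^(-1.41) in the example
observed_zero = 0.65   # observed share of games at zero

poisson_over = 1.0 - poisson_zero_prob(mean)   # Poisson P(Over 0.5)
observed_over = 1.0 - observed_zero            # empirical P(Over 0.5)
overstatement = poisson_over - observed_over

print(f"Poisson P(Over 0.5):  {poisson_over:.1%}")
print(f"Observed P(Over 0.5): {observed_over:.1%}")
print(f"Overstatement:        {overstatement:.1%}")
```

If the overstatement is this large on your own data, treat it as a signal to check for structural zeros before trusting the Over.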
Warning
If you see a prop near 0.5 and your Poisson model says "massive Over value," stop and check if you're seeing structural zeros. You might be walking into a trap.
Mistake #2: Using Normal Because It "Allows Negatives"
When bettors realize Poisson can't handle negative outcomes (like negative rushing yards from kneel-downs), they often switch to Normal. This helps—but it doesn't solve the spike-at-zero problem.
The Issue: Normal spreads probability smoothly across the range. But real data for props like QB rushing yards isn't smooth—it has a massive spike at Y ≤ 0.
Example: Stafford Rushing Yards
- Normal predicts P(Over 0.5) = 62.4%
- Actual historical rate: ~35%
- Result: Normal still overestimates by ~27 percentage points
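To see why a Normal model struggles here, it helps to compute P(Over) directly from the Z-score. The sketch below uses only the standard library; the inputs are hypothetical placeholders, not the Stafford parameters, which this section does not list.

```python
import math

def normal_over_prob(line: float, mu: float, sigma: float) -> float:
    """P(Y > line) under Normal(mu, sigma), via the standard normal CDF."""
    z = (line - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - cdf

# Hypothetical inputs for illustration only.
p_over = normal_over_prob(line=0.5, mu=1.5, sigma=3.0)
print(f"Normal P(Over 0.5): {p_over:.1%}")
```

Whenever the mean sits above the line, a Normal model returns P(Over) above 50%, no matter how large σ is. It has no way to place a 65% pile at Y ≤ 0 while keeping the mean positive.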
Key Insight
Allowing negatives helps, but it doesn't solve the spike-at-zero problem. Normal can still overstate P(Y ≥ 1) when the true data-generating process has a large structural pile at Y ≤ 0.
Mistake #3: Holding π Fixed Without Realizing What It Implies
When using ZIP or Hurdle, your mean projection must be consistent with your π and λ assumptions. If you raise the mean but keep π constant, you're implicitly pushing all the increase into conditional production.
Two Stories for the Same Mean Increase:
| Story | π (zero probability; falls as opportunity rises) | λ (production when active) | P(Over 0.5) |
|---|---|---|---|
| "More opportunity" | Decreases | Same | Increases significantly |
| "Better production" | Same | Increases | Barely changes |
Example:
- Original: π = 0.65, λ = 2.83, Mean = 0.99, P(Over 0.5) = 1 - π = 35% (Hurdle form)
- Projection: Mean rises to 1.41
Story A (More opportunity):
- New π = 0.50, λ = 2.82, Mean = 1.41
- P(Over 0.5) = 50% ✓ Major improvement (up 15 points from 35%)
Story B (Better production when active):
- Same π = 0.65, λ = 4.03, Mean = 1.41
- P(Over 0.5) = 35% ✗ No change from the original
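The two stories can be compared side by side with a minimal sketch that applies the Hurdle and ZIP formulas for a 0.5 line to the (π, λ) pairs from the example. The function names and the `scenarios` dictionary are illustrative.

```python
import math

def mixture_mean(pi: float, lam: float) -> float:
    """Mean implied by the parameters: (1 - pi) * lambda."""
    return (1.0 - pi) * lam

def hurdle_over(pi: float) -> float:
    """P(Y >= 1) under Hurdle: any positive game clears a 0.5 line."""
    return 1.0 - pi

def zip_over(pi: float, lam: float) -> float:
    """P(Y >= 1) under ZIP: not a structural zero and Poisson count >= 1."""
    return (1.0 - pi) * (1.0 - math.exp(-lam))

scenarios = {
    "original": (0.65, 2.83),
    "story_a": (0.50, 2.82),   # more opportunity
    "story_b": (0.65, 4.03),   # better production when active
}

for name, (pi, lam) in scenarios.items():
    print(f"{name:9s} mean={mixture_mean(pi, lam):.2f} "
          f"hurdle={hurdle_over(pi):.1%} zip={zip_over(pi, lam):.1%}")
```

Both stories land on a mean of about 1.41, but only Story A moves P(Over 0.5) by a meaningful amount. Under ZIP the figures come out slightly below the Hurdle ones, because some active games still produce a Poisson zero.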
Warning
If you keep π fixed while raising the mean, you're asserting: "The chance of a positive outcome stays the same, but the average positive-game outcome increases." That might be wrong for this matchup.
Mistake #4: Forgetting the Bet's Threshold
For 0.5 lines, the distribution of positive outcomes matters much less than P(positive at all). Bettors often focus on getting the mean exactly right while ignoring where the probability mass sits around zero.
The Reality:
- Over 0.5 cares about P(Y ≥ 1), not E[Y]
- Two models with identical means can have opposite P(Over 0.5) values
- For 0.5 thresholds, getting π right matters more than getting λ right
Key Insight
Most errors in zero-heavy props come from modeling the wrong event—size of outcomes instead of probability of clearing the threshold.
The Complete Distribution Selection Workflow
Here's your step-by-step process for any prop:
Step 1: Classify the Prop Type
| Question | Answer | Action |
|---|---|---|
| Is this high-volume continuous data? (50+ yards, 15+ points) | Yes | Use Normal |
| Is this low-volume count data? (0, 1, 2, 3 typical outcomes) | Yes | Continue to Step 2 |
Step 2: Calculate VMR from Historical Data
Excel Formula:
=VAR.S(A1:A20)/AVERAGE(A1:A20)
| VMR Result | Interpretation | Distribution |
|---|---|---|
| VMR < 0.8 | Underdispersed | Poisson (conservative) |
| VMR = 0.8-1.3 | Good Poisson fit | Poisson |
| VMR = 1.3-1.8 | Moderate overdispersion | Negative Binomial |
| VMR > 1.8 | Strong overdispersion | Negative Binomial |
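If you keep your game logs outside Excel, the same calculation can be sketched in Python. `statistics.variance` is the sample variance, the counterpart of VAR.S. The `classify_vmr` helper and the sample game log are hypothetical, and the handling of the exact boundary values 0.8 and 1.3 is an assumption, since the table does not say which side they fall on.

```python
from statistics import mean, variance

def vmr(values):
    """Variance-to-mean ratio using the sample variance (like Excel's VAR.S)."""
    m = mean(values)
    if m == 0:
        raise ValueError("VMR is undefined when the mean is zero")
    return variance(values) / m

def classify_vmr(ratio):
    """Map a VMR to the table's suggestion (boundary handling is assumed)."""
    if ratio < 0.8:
        return "Poisson (conservative)"
    if ratio <= 1.3:
        return "Poisson"
    return "Negative Binomial"

# Hypothetical game log for illustration.
sample = [0, 1, 2, 1, 0, 3, 1, 2, 0, 1]
ratio = vmr(sample)
print(f"VMR = {ratio:.2f} -> {classify_vmr(ratio)}")
```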
Step 3: Check for Excess Zeros
Compare observed zero rate to Poisson prediction:
Poisson-predicted P(X=0) = e^(-λ) = EXP(-AVERAGE(A1:A20))
Observed P(X=0) = COUNTIF(A1:A20,0)/COUNT(A1:A20)
Gap = Observed - Predicted
| Gap | Interpretation | Action |
|---|---|---|
| Gap < 10 percentage points | No excess zeros | Use distribution from Step 2 |
| Gap ≥ 10 percentage points | Excess zeros present | Continue to Step 4 |
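The gap check translates directly into code. The sketch below assumes a plain list of game outcomes; the sample data is hypothetical, and the gap is expressed as a fraction, so the 10-point threshold becomes 0.10.

```python
import math

def excess_zero_gap(values):
    """Observed zero rate minus the Poisson-predicted zero rate e^(-mean)."""
    n = len(values)
    observed = sum(1 for v in values if v == 0) / n
    predicted = math.exp(-sum(values) / n)
    return observed - predicted

# Hypothetical game log for illustration.
sample = [0, 0, 0, 3, 0, 0, 2, 0, 0, 5]
gap = excess_zero_gap(sample)
needs_step_4 = gap >= 0.10   # 10 percentage points

print(f"Gap = {gap:.1%}, continue to Step 4: {needs_step_4}")
```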
Step 4: Identify the Zero Source
| Question | If Yes → | If No → |
|---|---|---|
| Are there games where the player had NO OPPORTUNITY? | Use ZIP | Use Hurdle |
| Examples | Blocker-only games, DNPs, scripted out by game plan | Player always active, stat just rare |
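Steps 1 through 4 amount to a small decision tree. One possible sketch, assuming you have already computed the VMR and the zero gap (as a fraction), is shown below. The function name and its arguments are hypothetical, and "Poisson (conservative)" is folded into "Poisson" for brevity.

```python
def select_distribution(is_high_volume: bool, vmr: float, zero_gap: float,
                        has_no_opportunity_games: bool) -> str:
    """Walk Steps 1-4 of the workflow and return a model name."""
    # Step 1: high-volume continuous stats go straight to Normal.
    if is_high_volume:
        return "Normal"
    # Steps 3-4: excess zeros override the VMR-based choice.
    if zero_gap >= 0.10:
        return "ZIP" if has_no_opportunity_games else "Hurdle"
    # Step 2: otherwise let dispersion decide.
    if vmr > 1.3:
        return "Negative Binomial"
    return "Poisson"

print(select_distribution(False, 1.9, 0.18, True))
```

The judgment call that code cannot make for you is the last argument: deciding whether a zero came from no opportunity still requires looking at snap counts, roles, and game plans.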
Step 5: Estimate Parameters
For ZIP:
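π = (No-opportunity games, i.e. structural zeros) / (Total games)
Use the structural-zero share here, not the raw zero rate: ZIP's P(Y=0) = π + (1-π)e^(-λ) already adds the zeros that come from the Poisson part.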
λ = Mean / (1 - π)
For Hurdle:
π = (Games with zero) / (Total games)
λ = Mean of positive games only
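These estimates are simple enough to script. The sketch below takes π as an input for ZIP and solves the mean identity for λ; for Hurdle it reads both parameters off the game log. The function names and the sample log are hypothetical, and these are quick moment-style estimates rather than a full maximum-likelihood fit.

```python
def zip_lambda(mean: float, pi: float) -> float:
    """Solve Mean = (1 - pi) * lambda for lambda."""
    if not 0.0 <= pi < 1.0:
        raise ValueError("pi must be in [0, 1)")
    return mean / (1.0 - pi)

def hurdle_params(values):
    """Return (pi, lambda): the zero share and the mean of positive games."""
    positives = [v for v in values if v > 0]
    if not positives:
        raise ValueError("need at least one positive game")
    pi = sum(1 for v in values if v == 0) / len(values)
    lam = sum(positives) / len(positives)
    return pi, lam

# ZIP: the Story A numbers from Mistake #3.
print(f"ZIP lambda = {zip_lambda(1.41, 0.50):.2f}")

# Hurdle: hypothetical game log for illustration.
sample = [0, 0, 2, 0, 4, 0, 0, 3, 0, 0]
pi, lam = hurdle_params(sample)
print(f"Hurdle pi = {pi:.2f}, lambda = {lam:.2f}")
```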
Step 6: Validate Mean Consistency
Ensure your parameters produce the correct projected mean:
ZIP/Hurdle Mean = (1 - π) × λ
Does this equal your projection? If not, adjust.
Step 7: Calculate P(Over) and EV
For 0.5 threshold:
- ZIP: P(Over 0.5) = (1 - π) × (1 - e^(-λ))
- Hurdle: P(Over 0.5) = 1 - π
EV Calculation:
EV = P(Over) × profit - (1 - P(Over)) × stake
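To turn the EV formula into a number, you need the profit implied by the odds. The sketch below assumes American odds and a stake of 1 unit; the helper names are hypothetical, and the 50% probability and -110 price are illustrative inputs rather than figures from a specific prop.

```python
def american_profit(odds: int, stake: float = 1.0) -> float:
    """Profit on a winning bet at American odds."""
    if odds == 0:
        raise ValueError("American odds cannot be zero")
    if odds > 0:
        return stake * odds / 100.0
    return stake * 100.0 / abs(odds)

def expected_value(p_over: float, odds: int, stake: float = 1.0) -> float:
    """EV = P(Over) * profit - (1 - P(Over)) * stake."""
    return p_over * american_profit(odds, stake) - (1.0 - p_over) * stake

def break_even_prob(odds: int) -> float:
    """Win probability at which EV is exactly zero."""
    return 1.0 / (1.0 + american_profit(odds))

p_over = 0.50   # illustrative model probability
odds = -110     # illustrative price
print(f"EV per unit staked: {expected_value(p_over, odds):+.3f}")
print(f"Break-even P(Over): {break_even_prob(odds):.1%}")
```

Comparing your P(Over) to the break-even probability is often the quickest read: the bet is +EV only when your model probability is above it.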
Summary: The Right Model for Near-Zero Props
| Model | When to Use | Key Formula |
|---|---|---|
| Normal | High-volume continuous (yards, points) | Z = (line - μ) / σ |
| Poisson | Count data, VMR ≈ 1, no excess zeros | P(X=0) = e^(-λ) |
| Negative Binomial | Count data, VMR > 1.3 | Var = μ + μ²/r |
| ZIP | Structural zeros (no opportunity games) | P(Y=0) = π + (1-π)e^(-λ) |
| Hurdle | Any vs. none (player active, stat rare) | P(Y=0) = π |
Key Insight
For near-zero props, the right model is the one that gets P(Y ≤ 0) and P(Y ≥ 1) right—not just the mean. Choosing the right distribution is not optional; it's a core part of accurate edge estimation.
Chapter Summary
You now have a complete toolkit for selecting the right distribution:
- Structural zeros occur when a player is active but has effectively no opportunity to accumulate the stat.
- ZIP models two processes: structural zero vs. Poisson production, and is ideal when you can identify "no-opportunity" games.
- Hurdle models the "any vs. none" decision first, then models positive outcomes separately.
- The Stafford case study shows that matching the same mean across distributions can still produce opposite EV conclusions on a 0.5 line.
- Choosing the right distribution is not optional; it's a core part of accurate edge estimation.
📝 Exercise
Instructions
Final Exercise: Complete Workflow Practice
Walk through the complete distribution selection workflow for this prop:
Prop: Backup WR Over 0.5 Receptions at +120
Data (last 15 games): 0, 2, 0, 0, 3, 0, 1, 0, 0, 2, 0, 0, 1, 0, 4
Note: In 8 of the 15 games, this WR played fewer than 10 snaps (likely decoy or special teams only).
Step 1: Calculate the mean and VMR. What do you find?
Step 2: Check for excess zeros. Observed zeros = 9/15 = 60%. Poisson predicts e^(-0.87) ≈ 42%. What's the gap?
Step 3: Given that 8 of 15 games had fewer than 10 snaps (decoy/special teams), which model is most appropriate?
Step 4: Estimate π for the ZIP model. How do you calculate it?
Step 5: With π = 0.53 and Mean = 0.87, calculate λ. Then calculate P(Over 0.5) and determine if the bet is +EV at +120 odds.
Key Takeaways
Key Insight
The Bottom Line: For props near zero thresholds, your edge depends on modeling opportunity correctly—not on "being better at averages." If two models with the same mean disagree on EV, the question isn't which mean is right. The question is which model correctly captures the probability of clearing the threshold.
Your Distribution Selection Checklist
✅ Always calculate VMR first — it takes 30 seconds and prevents the most common errors
✅ Check for excess zeros — compare observed to e^(-λ)
✅ Ask "why are there zeros?" — structural (no opportunity) vs. count process (active but didn't produce)
✅ Match your story to your parameters — is the mean changing because of opportunity (π) or production (λ)?
✅ Remember the threshold — for 0.5 lines, P(positive at all) matters more than the mean
Keep building your intuition by tracking your projections against outcomes. The sharps who win consistently aren't the ones with perfect models—they're the ones who know when their model is likely to fail and reach for the right alternative.