Common Mistakes and the Distribution Selection Workflow

You now have a complete statistical toolkit: Normal, Poisson, Negative Binomial, ZIP (Zero-Inflated Poisson), and Hurdle. But having the tools isn't enough: you also need to avoid the common mistakes that cause even experienced bettors to systematically misprice props.

The Four Most Common Mistakes

Mistake #1: Using Poisson Because It's Convenient

Poisson is the default for many bettors because it's simple and widely taught. But if the stat has many structural zeros or allows negative outcomes, Poisson can wildly misprice the Over near 0.5.

Example: Pocket QB Rushing Yards

With a mean of λ = 1.41, Poisson predicts P(Y=0) = e^(-1.41) = 24.4%
Actual observed zero rate: 65%
Result: Poisson puts P(Over 0.5) at 75.6% versus about 35% observed, an overstatement of 40+ percentage points
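
The arithmetic behind this example can be checked in a few lines. This is a minimal sketch that reuses the two numbers quoted above (mean 1.41, observed zero rate 65%); the variable names are illustrative only.

```python
import math

lam = 1.41            # Poisson mean from the example above
observed_zero = 0.65  # observed share of games at zero, from the example

poisson_zero = math.exp(-lam)        # P(Y = 0) under Poisson
poisson_over = 1.0 - poisson_zero    # P(Y >= 1), i.e. Over 0.5
observed_over = 1.0 - observed_zero  # empirical rate of clearing 0.5

# Size of the overstatement, in percentage points
gap_points = (poisson_over - observed_over) * 100

print(f"Poisson P(Y=0):      {poisson_zero:.1%}")
print(f"Poisson P(Over 0.5): {poisson_over:.1%}")
print(f"Overstatement:       {gap_points:.1f} percentage points")
```

The gap comes entirely from the zero bucket: Poisson has no way to put 65% of its mass at zero while keeping a mean of 1.41.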

Warning

If you see a prop near 0.5 and your Poisson model says "massive Over value," stop and check if you're seeing structural zeros. You might be walking into a trap.

Mistake #2: Using Normal Because It "Allows Negatives"

When bettors realize Poisson can't handle negative outcomes (like negative rushing yards from kneel-downs), they often switch to Normal. This helps—but it doesn't solve the spike-at-zero problem.

The Issue: Normal spreads probability smoothly across the range. But real data for props like QB rushing yards isn't smooth—it has a massive spike at Y ≤ 0.

Example: Stafford Rushing Yards

Normal predicts P(Over 0.5) = 62.4%
Actual historical rate: ~35%
Result: Normal still overestimates by ~27 percentage points

Key Insight

Allowing negatives helps, but it doesn't solve the spike-at-zero problem. Normal can still overstate P(Y ≥ 1) when the true data-generating process has a large structural pile at Y ≤ 0.

Mistake #3: Holding π Fixed Without Realizing What It Implies

When using ZIP or Hurdle, your mean projection must be consistent with your π and λ assumptions. If you raise the mean but keep π constant, you're implicitly pushing all the increase into conditional production.

Two Stories for the Same Mean Increase:

Story	π (opportunity)	λ (production)	P(Over 0.5)
"More opportunity"	Decreases	Same	Increases significantly
"Better production"	Same	Increases	Barely changes

Example:

Original: π = 0.65, λ = 2.83, Mean = 0.99
Projection: Mean rises to 1.41

Story A (More opportunity):

New π = 0.50, λ = 2.82, Mean = 1.41
P(Over 0.5) = 1 - π = 50% under Hurdle (about 47% under ZIP) ✓ Major improvement

Story B (Better production when active):

Same π = 0.65, λ = 4.03, Mean = 1.41
P(Over 0.5) = 1 - π = 35% under Hurdle (about 34% under ZIP, up from about 33%) ✗ Almost no change
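
To see both stories side by side, the sketch below evaluates the original parameters and the two projections under the Hurdle and ZIP formulas from this chapter. The function names are illustrative, and λ is treated as the conditional mean in both models, matching Mean = (1 - π) × λ.

```python
import math

def model_mean(pi, lam):
    """Overall mean implied by pi and lam: (1 - pi) * lam."""
    return (1.0 - pi) * lam

def hurdle_over(pi):
    """P(Y >= 1) under Hurdle: the chance of clearing the hurdle at all."""
    return 1.0 - pi

def zip_over(pi, lam):
    """P(Y >= 1) under ZIP: not a structural zero AND a Poisson count >= 1."""
    return (1.0 - pi) * (1.0 - math.exp(-lam))

stories = {
    "Original": (0.65, 2.83),
    "Story A":  (0.50, 2.82),  # more opportunity: pi falls, lam about the same
    "Story B":  (0.65, 4.03),  # better production: pi fixed, lam rises
}

for name, (pi, lam) in stories.items():
    print(f"{name:9s} mean={model_mean(pi, lam):.2f}  "
          f"Hurdle={hurdle_over(pi):.1%}  ZIP={zip_over(pi, lam):.1%}")
```

Both projections reach the same mean of about 1.41, but only Story A moves P(Over 0.5) materially under either model.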

Warning

If you keep π fixed while raising the mean, you're asserting: "The chance of a positive outcome stays the same, but the average positive-game outcome increases." That might be wrong for this matchup.

Mistake #4: Forgetting the Bet's Threshold

For 0.5 lines, the distribution of positive outcomes matters much less than P(positive at all). Bettors often focus on getting the mean exactly right while ignoring where the probability mass sits around zero.

The Reality:

Over 0.5 cares about P(Y ≥ 1), not E[Y]
Two models with identical means can have opposite P(Over 0.5) values
For 0.5 thresholds, getting π right matters more than getting λ right

Key Insight

Most errors in zero-heavy props come from modeling the wrong event—size of outcomes instead of probability of clearing the threshold.

The Complete Distribution Selection Workflow

Here's your step-by-step process for any prop:

Step 1: Classify the Prop Type

Question	Answer	Action
Is this high-volume continuous data? (50+ yards, 15+ points)	Yes	Use Normal
Is this low-volume count data? (0, 1, 2, 3 typical outcomes)	Yes	Continue to Step 2

Step 2: Calculate VMR from Historical Data

Excel Formula:
=VAR.S(A1:A20)/AVERAGE(A1:A20)

VMR Result	Interpretation	Distribution
VMR < 0.8	Underdispersed	Poisson (conservative)
VMR = 0.8-1.3	Good Poisson fit	Poisson
VMR = 1.3-1.8	Moderate overdispersion	Negative Binomial
VMR > 1.8	Strong overdispersion	Negative Binomial
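
If you work outside Excel, the same calculation can be sketched in Python. The game log below is hypothetical, made up purely for illustration. The handling of the exact cutoffs (0.8 and 1.3) is an assumption, since the table does not say which side the boundaries fall on.

```python
from statistics import mean, variance

def vmr(values):
    """Variance-to-mean ratio using the sample variance (same as Excel's VAR.S)."""
    m = mean(values)
    if m == 0:
        raise ValueError("VMR is undefined when the mean is zero")
    return variance(values) / m

def suggest_distribution(ratio):
    """Map a VMR to the table above. Boundary handling is an assumption."""
    if ratio < 0.8:
        return "Poisson (conservative)"
    if ratio <= 1.3:
        return "Poisson"
    return "Negative Binomial"

# Hypothetical 10-game log, for illustration only
sample_games = [1, 0, 2, 1, 3, 0, 1, 2, 1, 0]

ratio = vmr(sample_games)
print(f"VMR = {ratio:.2f} -> {suggest_distribution(ratio)}")
```

With only 10 to 20 games the VMR is itself a noisy estimate, so treat a value close to a cutoff as a prompt to look closer rather than a verdict.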

Step 3: Check for Excess Zeros

Compare observed zero rate to Poisson prediction:

Poisson-predicted P(X=0) = e^(-λ) = EXP(-AVERAGE(A1:A20))
Observed P(X=0) = COUNTIF(A1:A20,0)/COUNT(A1:A20)
Gap = Observed - Predicted

Gap	Interpretation	Action
Gap < 10 percentage points	No excess zeros	Use distribution from Step 2
Gap ≥ 10 percentage points	Excess zeros present	Continue to Step 4
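
The zero check can be sketched the same way. The game log here is hypothetical and the function name is illustrative; the 10-point threshold is the one from the table above.

```python
import math

def excess_zero_gap(values):
    """Return (observed zero rate, Poisson-predicted zero rate, gap in points)."""
    n = len(values)
    lam = sum(values) / n                             # Poisson mean from the sample
    predicted = math.exp(-lam)                        # Poisson P(X = 0)
    observed = sum(1 for v in values if v == 0) / n   # share of games at zero
    return observed, predicted, (observed - predicted) * 100

# Hypothetical 10-game log, for illustration only
sample_games = [0, 0, 3, 0, 0, 2, 0, 4, 0, 0]

observed, predicted, gap = excess_zero_gap(sample_games)
print(f"Observed zeros:  {observed:.1%}")
print(f"Poisson predicts: {predicted:.1%}")
print(f"Gap: {gap:.1f} points ->",
      "excess zeros, go to Step 4" if gap >= 10 else "no excess zeros")
```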

Step 4: Identify the Zero Source

Question	If Yes →	If No →
Are there games where the player had NO OPPORTUNITY?	Use ZIP	Use Hurdle
Examples: Blocker-only games, DNPs, scripted out by game plan		Player always active, stat just rare

Step 5: Estimate Parameters

For ZIP:

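π = (Games with zero or negative) / (Total games)
λ = Mean / (1 - π)

Note: this treats every observed zero as structural, so it is an upper bound on π. ZIP also produces zeros from its Poisson part, since P(Y=0) = π + (1-π)e^(-λ). If you can identify no-opportunity games directly (for example, from snap counts), use their share of total games as π instead.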

For Hurdle:

π = (Games with zero) / (Total games)
λ = Mean of positive games only
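
The estimates above can be sketched as two small functions. The function names, the optional `structural_games` argument, and the game log are all hypothetical, introduced here for illustration. By default the ZIP estimate counts every game at or below zero toward π, as in the formula above.

```python
def estimate_zip(values, structural_games=None):
    """Return (pi, lam) for ZIP.

    By default pi is the share of games at or below zero. Pass
    structural_games to use an identified count of no-opportunity
    games instead (a hypothetical option, not from the text).
    """
    n = len(values)
    if structural_games is None:
        structural_games = sum(1 for v in values if v <= 0)
    pi = structural_games / n
    if pi >= 1.0:
        raise ValueError("every game is a zero; lam cannot be estimated")
    lam = (sum(values) / n) / (1.0 - pi)   # lam = Mean / (1 - pi)
    return pi, lam

def estimate_hurdle(values):
    """Return (pi, lam) for Hurdle: zero share and mean of positive games."""
    n = len(values)
    positives = [v for v in values if v > 0]
    if not positives:
        raise ValueError("no positive games; lam cannot be estimated")
    pi = sum(1 for v in values if v == 0) / n
    lam = sum(positives) / len(positives)
    return pi, lam

# Hypothetical 10-game log, for illustration only
sample_games = [0, 0, 3, 0, 0, 2, 0, 4, 0, 0]

print("ZIP (all zeros):      ", estimate_zip(sample_games))
print("ZIP (5 no-opp games): ", estimate_zip(sample_games, structural_games=5))
print("Hurdle:               ", estimate_hurdle(sample_games))
```

When every zero is counted toward π, the ZIP and Hurdle estimates of λ coincide on non-negative data. They separate once π is tied to identified no-opportunity games.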

Step 6: Validate Mean Consistency

Ensure your parameters produce the correct projected mean:

ZIP/Hurdle Mean = (1 - π) × λ

Does this equal your projection? If not, adjust π or λ until it does, and be deliberate about which one you move (see Mistake #3).
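
There are two ways to restore consistency, and they correspond to the two stories in Mistake #3. The sketch below reuses that example (π = 0.65, λ = 2.83, projected mean 1.41); the helper names are illustrative.

```python
def implied_mean(pi, lam):
    """Mean implied by the parameters: (1 - pi) * lam."""
    return (1.0 - pi) * lam

def lam_for_mean(target_mean, pi):
    """Hold pi fixed and solve for lam (the 'better production' story)."""
    return target_mean / (1.0 - pi)

def pi_for_mean(target_mean, lam):
    """Hold lam fixed and solve for pi (the 'more opportunity' story)."""
    return 1.0 - target_mean / lam

pi, lam, projection = 0.65, 2.83, 1.41

print(f"Implied mean now:      {implied_mean(pi, lam):.2f}")
print(f"Keep pi, new lam:      {lam_for_mean(projection, pi):.2f}")
print(f"Keep lam, new pi:      {pi_for_mean(projection, lam):.2f}")
```

Both adjustments hit the same mean, so the mean alone cannot tell you which one is right. That choice has to come from your read of the matchup.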

Step 7: Calculate P(Over) and EV

For 0.5 threshold:

ZIP: P(Over 0.5) = (1 - π) × (1 - e^(-λ))
Hurdle: P(Over 0.5) = 1 - π

EV Calculation:

EV = P(Over) × profit - (1 - P(Over)) × stake
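
For American odds, the EV formula can be sketched as follows. The function names, the +120 price, and the 50% probability are illustrative assumptions, not a recommendation. `american_profit` uses the standard conversion from American odds to profit on a winning bet.

```python
def american_profit(odds, stake=100.0):
    """Profit on a winning bet at American odds (e.g. +120 or -110)."""
    if odds == 0:
        raise ValueError("American odds cannot be zero")
    if odds > 0:
        return stake * odds / 100.0
    return stake * 100.0 / abs(odds)

def expected_value(p_over, odds, stake=100.0):
    """EV = P(Over) * profit - (1 - P(Over)) * stake."""
    profit = american_profit(odds, stake)
    return p_over * profit - (1.0 - p_over) * stake

def break_even_probability(odds):
    """Win probability at which EV is exactly zero."""
    profit = american_profit(odds, 1.0)
    return 1.0 / (1.0 + profit)

# Illustrative inputs: Hurdle P(Over 0.5) = 1 - pi with pi = 0.50, priced at +120
p_over = 1.0 - 0.50
ev = expected_value(p_over, 120)
print(f"EV per $100 staked: {ev:+.2f}")
print(f"Break-even probability at +120: {break_even_probability(120):.1%}")
```

Comparing your modeled P(Over) with the break-even probability is often the quickest sanity check: the bet is +EV only when the model's probability is above it.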

Summary: The Right Model for Near-Zero Props

Model	When to Use	Key Formula
Normal	High-volume continuous (yards, points)	Z = (line - μ) / σ
Poisson	Count data, VMR ≈ 1, no excess zeros	P(X=0) = e^(-λ)
Negative Binomial	Count data, VMR > 1.3	Var = μ + μ²/r
ZIP	Structural zeros (no opportunity games)	P(Y=0) = π + (1-π)e^(-λ)
Hurdle	Any vs. none (player active, stat rare)	P(Y=0) = π

Key Insight

For near-zero props, the right model is the one that gets P(Y ≤ 0) and P(Y ≥ 1) right—not just the mean. Choosing the right distribution is not optional; it's a core part of accurate edge estimation.

Chapter Summary

You now have a complete toolkit for selecting the right distribution:

Structural zeros occur when a player is active but has effectively no opportunity to accumulate the stat.
ZIP models two processes: structural zero vs. Poisson production, and is ideal when you can identify "no-opportunity" games.
Hurdle models the "any vs. none" decision first, then models positive outcomes separately.
The Stafford case study shows that matching the same mean across distributions can still produce opposite EV conclusions on a 0.5 line.
Choosing the right distribution is not optional—it's a core part of accurate edge estimation.

📝 Exercise

Instructions

Final Exercise: Complete Workflow Practice

Walk through the complete distribution selection workflow for this prop:

Prop: Backup WR Over 0.5 Receptions at +120
Data (last 15 games): 0, 2, 0, 0, 3, 0, 1, 0, 0, 2, 0, 0, 1, 0, 4
Note: In 8 of the 15 games, this WR played fewer than 10 snaps (likely decoy or special teams only).

Step 1: Calculate the mean and VMR. What do you find?

Step 2: Check for excess zeros. Observed zeros = 9/15 = 60%. Poisson predicts e^(-0.87) ≈ 42%. What's the gap?

Step 3: Given that 8 of 15 games had fewer than 10 snaps (decoy/special teams), which model is most appropriate?

Step 4: Estimate π for the ZIP model. How do you calculate it?

Step 5: With π = 0.53 and Mean = 0.87, calculate λ. Then calculate P(Over 0.5) and determine if the bet is +EV at +120 odds.

Key Takeaways

Key Insight

The Bottom Line: For props near zero thresholds, your edge depends on modeling opportunity correctly—not on "being better at averages." If two models with the same mean disagree on EV, the question isn't which mean is right. The question is which model correctly captures the probability of clearing the threshold.

Your Distribution Selection Checklist

✅ Always calculate VMR first — it takes 30 seconds and prevents the most common errors

✅ Check for excess zeros — compare observed to e^(-λ)

✅ Ask "why are there zeros?" — structural (no opportunity) vs. count process (active but didn't produce)

✅ Match your story to your parameters — is the mean changing because of opportunity (π) or production (λ)?

✅ Remember the threshold — for 0.5 lines, P(positive at all) matters more than the mean

Keep building your intuition by tracking your projections against outcomes. The sharps who win consistently aren't the ones with perfect models—they're the ones who know when their model is likely to fail and reach for the right alternative.
