Back to Correlation, Covariance & Same-Game Parlays
Chapter 11

Understanding Covariance and Correlation

Understanding the relationship between two variables

Understanding Covariance and Correlation

Now that you understand why correlation matters for SGPs, let's learn how to measure it. This lesson covers the only two statistical concepts you need: covariance and correlation coefficient (r).

Covariance: Direction of Movement

Covariance tells you whether two stats tend to be above (or below) their averages in the same games.

CovarianceMeaning
PositiveThey move together (both above or both below average)
NegativeThey trade off (when one is up, the other is down)
ZeroNo linear relationship

The Problem with Covariance

Covariance is useful for direction, but it has a fatal flaw: the number depends on the scale of your data.

  • Covariance between passing yards (200-400 range) and receiving TDs (0-2 range) will be small
  • Covariance between two yardage stats will be large

This makes raw covariance hard to interpret. Enter correlation.

Correlation (r): Direction + Strength on a Clean Scale

Correlation rescales covariance to a unit-free number between -1 and +1.

Correlation Coefficient

r = Cov(X,Y) / (σₓ × σᵧ)
Excel: =CORREL(range1, range2)

Interpreting r Values

r ValueInterpretationSGP Implication
+1.0Perfect positiveLegs always hit together
+0.4 to +0.7Strong positiveBook shortens payout significantly
+0.1 to +0.3Weak positiveSmall adjustment to independence
0No relationshipIndependence formula works
-0.1 to -0.3Weak negativeSmall boost to payout
-0.4 to -0.7Strong negativeBook lengthens payout significantly
-1.0Perfect negativeLegs never hit together

Key Insight

In prop betting, you almost never see ±1. What matters is whether dependence is weak enough to ignore or strong enough that the payout will be aggressively shortened.

The Excel Formulas You Need

You don't need to compute covariance and correlation by hand. Excel does it for you.

Primary Formula: Correlation

=CORREL(range1, range2)

This is your default tool. It's the fastest and least error-prone way to measure relationships from paired game logs.

Supporting Formulas

PurposeExcel Formula
Correlation=CORREL(A2:A20, B2:B20)
Sample Covariance=COVARIANCE.S(A2:A20, B2:B20)
Sample Std Dev=STDEV.S(A2:A20)

Tip

Use =CORREL() as your default. It handles the covariance-to-correlation conversion automatically.

Stats Correlation vs. Leg Correlation (The Trap)

Here's a crucial distinction that trips up many bettors:

Raw-Stat Correlation

  • Correlates actual numbers (e.g., passing TD counts and receiving TD counts)
  • Answers: Do these stats move together?

Leg Correlation

  • Correlates binary events (Over/Under hit or miss)
  • Answers: Do these events hit together?

They are related but not identical.

Warning

When pricing SGPs, you're betting on events (hit/miss). If you use r, make sure it matches the event definition (Over/Under, Yes/No), not just the underlying stat.

Example: The Difference Matters

Suppose you correlate:

  • Williams passing TDs (0, 1, 2, 3, 4...)
  • Loveland receiving TDs (0, 1, 2...)

You get r = +0.47 (strong positive).

But the SGP legs are:

  • Williams Over 1.5 TDs (binary: hit or miss)
  • Loveland Over 0.5 TDs (binary: hit or miss)

The correlation between the binary indicators could be different from +0.47.

Best practice: Use conditional probability from game logs when possible (covered in Lesson 3).

Worked Example: Calculating Correlation in Excel

Let's calculate correlation for the Williams-Loveland pair using real game data.

Step 1: Set Up Your Data

GameWilliams Pass TDsLoveland Rec TDs
110
220
340
410
500
600
732
810
900
1031

Step 2: Calculate Correlation

In Excel, with Williams TDs in cells A2:A11 and Loveland TDs in B2:B11:

=CORREL(A2:A11, B2:B11)

Result: r ≈ +0.72 (strong positive correlation)

Step 3: Interpret the Result

A correlation of +0.72 means:

  • When Williams throws more TDs, Loveland tends to score more TDs
  • The relationship is strong
  • The book will significantly shorten SGP payouts for this pair

Understanding Correlation Strength for Betting

Here's a practical guide for how correlation affects SGP pricing:

Correlation RangeEffect on SGPYour Action
r > 0.4Book shortens payout heavilyNeed strong positive r assumption to find value
r = 0.1 to 0.4Moderate shorteningMay find edges if book overestimates r
r ≈ 0Near independence pricingStandard parlay math applies
r = -0.1 to -0.4Book lengthens payoutMay find edges if book underestimates negative r
r < -0.4Significant payout boostRare—look for mispriced negative correlation

Key Insight

Correlation is the scoreboard. It doesn't tell you why the relationship exists, but it tells you how strongly the game environment ties the outcomes together.

Practice Exercise

📝 Exercise

Instructions

You have 10 games of data for two props:

Prop A (QB passing TDs): [2, 1, 3, 2, 0, 2, 3, 1, 2, 4]

Prop B (WR receiving TDs): [1, 0, 1, 1, 0, 0, 2, 0, 1, 1]

  1. Enter the data into Excel
  2. Compute r with =CORREL(A2:A11, B2:B11)
  3. Classify the relationship as weak/moderate/strong

Key Takeaways

  1. Covariance tells you direction; correlation tells you direction AND strength
  2. Use =CORREL() in Excel—it's your primary tool
  3. Correlation ranges from -1 to +1, with values above ±0.4 being "strong"
  4. Watch out for the stats vs. legs trap—SGPs bet on binary events, not raw numbers
  5. Strong positive correlation (r > 0.4) means books shorten payouts significantly

Note

Coming Up Next: Now that you can measure correlation, we'll learn how to actually price 2-leg SGPs using conditional probability from game logs.