Scottfree Sports Historical Odds Data — Schema
Document version: 2026-05-08
Supersedes: schema_20260507.md
Coverage snapshot: 2026-05-08 (the field-coverage table below is a snapshot; see "Live coverage" at the bottom for current per-CSV numbers)
This document describes the columns of the per-sport {sport}_game_scores_1g_*.csv files. Six sports are covered: MLB, NBA, NFL, NHL, NCAAB, NCAAF. Each row is one game.
Changes since schema_20240930
- Removed: index (was column 1), with no replacement.
- Retained: time_est from the raw game-score input. Product rows are uniquely identified by (date, time_est, away_team, home_team). This is required for MLB doubleheaders and for reconciling nightly product outputs back to raw game-score inputs.
- Added (all sports), at columns 17–21: open_home_money_line, open_away_money_line, open_home_point_spread, open_away_point_spread, open_over_under. These are opening lines captured before any in-day movement. See the coverage matrix below for sport-specific availability.
- Removed: game_type. The MLB-only field was removed so every sport ships the same customer-facing schema.
- Added: season-to-date ATS record fields for the home and away teams: home_wins_ats, away_wins_ats, home_losses_ats, away_losses_ats, home_ties_ats, away_ties_ats.
- Added: home-team-keyed opponent ATS history fields: opponent_wins_ats, opponent_losses_ats, opponent_ties_ats.
Column count: 146 for every sport.
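As a quick sanity check on the composite key described above, the sketch below counts duplicate (date, time_est, away_team, home_team) tuples. It is only an illustration on a made-up in-memory sample: the time_est format and the team names are assumptions, not values taken from the product, and duplicate_keys is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Composite key named in the schema notes above.
KEY = ("date", "time_est", "away_team", "home_team")


def duplicate_keys(rows):
    """Return every key tuple that appears more than once in the given rows."""
    counts = Counter(tuple(row[k] for k in KEY) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Made-up sample standing in for a real CSV: a doubleheader between the
# same two teams on the same date, distinguished only by time_est.
sample = io.StringIO(
    "date,time_est,away_team,home_team\n"
    "2025-07-04,13:05,Team A,Team B\n"
    "2025-07-04,19:05,Team A,Team B\n"
)
rows = list(csv.DictReader(sample))
dupes = duplicate_keys(rows)  # empty list when the key is unique
```

Dropping time_est from KEY would make the two doubleheader rows collide, which is the case the retained column is meant to handle.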
Field Reference
Identifiers, scores, and lines (1–16)
- season — The league season the game took place in. Format depends on the sport: 2007-08 for cross-year seasons (NBA, NHL, NCAAB), single year for within-year seasons (MLB, NFL, NCAAF).
- date — Game date in YYYY-MM-DD.
- time_est — Scheduled game time in Eastern time. Together with date, away_team, and home_team, this uniquely identifies a game.
- away_team — Team name as reported by the odds feed.
- away_score — Final points scored by the away team.
- away_point_spread — Closing point spread offered for the away team (positive = underdog, negative = favorite).
- away_point_spread_line — Closing odds (American) attached to the away spread bet.
- away_money_line — Closing moneyline (American) for the away team to win outright.
- home_team — Team name as reported by the odds feed.
- home_score — Final points scored by the home team.
- home_point_spread — Closing point spread offered for the home team.
- home_point_spread_line — Closing odds (American) attached to the home spread bet.
- home_money_line — Closing moneyline (American) for the home team to win outright.
- over_under — Closing over/under (total) line.
- over_line — Closing odds (American) for the over.
- under_line — Closing odds (American) for the under.
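The money-line and spread-line columns above are quoted in American odds. One common way to compare such prices is to convert each to an implied probability. The helper below is a minimal sketch of that standard conversion; the function name and the example prices are illustrative and not part of the product.

```python
def implied_probability(american_odds: float) -> float:
    """Convert an American odds price to its implied win probability."""
    if american_odds < 0:
        # Favorite: risk |odds| to win 100.
        return -american_odds / (-american_odds + 100.0)
    # Underdog: risk 100 to win `odds`.
    return 100.0 / (american_odds + 100.0)


# Example prices (made up): home -150, away +130.
home = implied_probability(-150)  # 150 / 250
away = implied_probability(130)   # 100 / 230
# The two probabilities sum to more than 1; the excess is the bookmaker margin.
overround = home + away - 1.0
```

Because of that margin, the raw implied probabilities are best treated as upper bounds unless they are normalized across both sides of the market.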
Opening lines (17–21) — added 2026
These columns capture the first observed market price of the day, before any in-day line movement. They support line-movement research. Coverage is non-uniform by sport × market — see the coverage matrix below before relying on these for any specific sport.
- open_home_money_line — Opening moneyline for the home team (American odds).
- open_away_money_line — Opening moneyline for the away team (American odds).
- open_home_point_spread — Opening point spread for the home team.
- open_away_point_spread — Opening point spread for the away team.
- open_over_under — Opening over/under (total) line.
NaN convention: when the opener for a given market was not captured for a given game, the value is empty (NaN in pandas, null in JSON). Coverage percentages in the matrix below show the share of rows that have a non-null value.
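A minimal sketch of how the empty-value convention might be handled when measuring line movement with the standard library. It assumes the opener is compared against the closing home_point_spread column; to_float and spread_move are hypothetical helper names, and the sample values are made up.

```python
import csv
import io


def to_float(value):
    """Map an empty CSV cell to None and anything else to a float."""
    if value is None or value.strip() == "":
        return None
    return float(value)


def spread_move(row):
    """Closing minus opening home spread, or None when either side is missing."""
    opener = to_float(row["open_home_point_spread"])
    close = to_float(row["home_point_spread"])
    if opener is None or close is None:
        return None
    return close - opener


# Made-up sample: the second game has no captured opener.
sample = io.StringIO(
    "home_point_spread,open_home_point_spread\n"
    "-3.5,-2.5\n"
    "4.0,\n"
)
moves = [spread_move(row) for row in csv.DictReader(sample)]
```

Returning None rather than 0 keeps "no opener captured" distinct from "the line did not move".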
Date breakdown (22–25)
- year — Calendar year extracted from date.
- month — Calendar month extracted from date.
- day — Day of month extracted from date.
- dayofweek — Day of week (e.g. Monday, Tuesday) extracted from date.
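The four date-breakdown columns can be reproduced from date alone. The sketch below assumes dayofweek holds the English day name, as the example in the list suggests; date_breakdown is a hypothetical helper, not the pipeline's own code.

```python
from datetime import date

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def date_breakdown(game_date: str) -> dict:
    """Derive year, month, day and dayofweek from a YYYY-MM-DD string."""
    d = date.fromisoformat(game_date)
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "dayofweek": DAY_NAMES[d.weekday()],  # weekday(): Monday is 0
    }


parts = date_breakdown("2026-05-08")
```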
Game-level outcomes (26–35)
- total_points — Sum of home_score and away_score.
- point_margin_game — home_score - away_score (positive = home win).
- won_on_points — 1 if home team won outright, else 0.
- lost_on_points — 1 if home team lost outright, else 0.
- cover_margin_game — point_margin_game + home_point_spread. Positive = home covered the spread.
- won_on_spread — 1 if home team covered the spread.
- lost_on_spread — 1 if home team failed to cover the spread.
- overunder_margin — total_points - over_under.
- over — 1 if total_points > over_under, else 0.
- under — 1 if total_points < over_under, else 0.
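The outcome columns above follow directly from the scores and closing lines. The sketch below recomputes them for a single game. One assumption is labeled in the code: on an exact push (cover_margin_game equal to 0) both spread flags are set to 0, which the field descriptions do not state explicitly. game_outcomes is a hypothetical helper.

```python
def game_outcomes(home_score, away_score, home_point_spread, over_under):
    """Recompute the game-level outcome fields from scores and closing lines."""
    total_points = home_score + away_score
    point_margin_game = home_score - away_score
    cover_margin_game = point_margin_game + home_point_spread
    overunder_margin = total_points - over_under
    return {
        "total_points": total_points,
        "point_margin_game": point_margin_game,
        "won_on_points": int(point_margin_game > 0),
        "lost_on_points": int(point_margin_game < 0),
        "cover_margin_game": cover_margin_game,
        # Assumption: a push (margin exactly 0) counts as neither won nor lost.
        "won_on_spread": int(cover_margin_game > 0),
        "lost_on_spread": int(cover_margin_game < 0),
        "overunder_margin": overunder_margin,
        "over": int(overunder_margin > 0),
        "under": int(overunder_margin < 0),
    }


# Made-up game: home wins 24-20 as a 3.5-point favorite, total set at 47.5.
result = game_outcomes(24, 20, -3.5, 47.5)
```

In this example the home team wins by 4, covers by 0.5, and the game finishes 3.5 points under the total.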
Team season state and streaks (36–101)
Home and away feature pairs are adjacent in the CSV. Season, streak, and rolling-window features are computed using only games prior to the current row.
- home_wins — Home team's outright wins in the current season before this game.
- away_wins — Away team's outright wins in the current season before this game.
- home_losses — Home team's outright losses in the current season before this game.
- away_losses — Away team's outright losses in the current season before this game.
- home_ties — Home team's ties in the current season before this game.
- away_ties — Away team's ties in the current season before this game.
- home_days_since_first_game — Days since the home team's first game of the season.
- away_days_since_first_game — Days since the away team's first game of the season.
- home_days_since_previous_game — Days since the home team's previous game.
- away_days_since_previous_game — Days since the away team's previous game.
- home_point_win_streak — Current outright win streak length for the home team.
- away_point_win_streak — Current outright win streak length for the away team.
- home_point_loss_streak — Current outright loss streak length for the home team.
- away_point_loss_streak — Current outright loss streak length for the away team.
- home_point_margin_game — Current game point margin from the home team's perspective.
- away_point_margin_game — Current game point margin from the away team's perspective.
- home_point_margin_season — Home team's season-to-date point margin before this game.
- away_point_margin_season — Away team's season-to-date point margin before this game.
- home_point_margin_season_avg — Home team's average season-to-date point margin.
- away_point_margin_season_avg — Away team's average season-to-date point margin.
- home_point_margin_streak — Home team's point margin over its current streak.
- away_point_margin_streak — Away team's point margin over its current streak.
- home_point_margin_streak_avg — Home team's average point margin over its current streak.
- away_point_margin_streak_avg — Away team's average point margin over its current streak.
- home_point_margin_ngames — Home team's point margin over the rolling N-game window.
- away_point_margin_ngames — Away team's point margin over the rolling N-game window.
- home_point_margin_ngames_avg — Home team's average point margin over the rolling N-game window.
- away_point_margin_ngames_avg — Away team's average point margin over the rolling N-game window.
- home_cover_win_streak — Current ATS cover streak length for the home team.
- away_cover_win_streak — Current ATS cover streak length for the away team.
- home_cover_loss_streak — Current ATS non-cover streak length for the home team.
- away_cover_loss_streak — Current ATS non-cover streak length for the away team.
- home_cover_margin_game — Current game cover margin from the home team's perspective.
- away_cover_margin_game — Current game cover margin from the away team's perspective.
- home_cover_margin_season — Home team's season-to-date cover margin before this game.
- away_cover_margin_season — Away team's season-to-date cover margin before this game.
- home_cover_margin_season_avg — Home team's average season-to-date cover margin.
- away_cover_margin_season_avg — Away team's average season-to-date cover margin.
- home_cover_margin_streak — Home team's cover margin over its current ATS streak.
- away_cover_margin_streak — Away team's cover margin over its current ATS streak.
- home_cover_margin_streak_avg — Home team's average cover margin over its current ATS streak.
- away_cover_margin_streak_avg — Away team's average cover margin over its current ATS streak.
- home_cover_margin_ngames — Home team's cover margin over the rolling N-game window.
- away_cover_margin_ngames — Away team's cover margin over the rolling N-game window.
- home_cover_margin_ngames_avg — Home team's average cover margin over the rolling N-game window.
- away_cover_margin_ngames_avg — Away team's average cover margin over the rolling N-game window.
- home_total_points — Points scored by the home team in this game.
- away_total_points — Points scored by the away team in this game.
- home_overunder_margin — Current game total margin from the home team's perspective.
- away_overunder_margin — Current game total margin from the away team's perspective.
- home_over_streak — Consecutive games where the home team's games went over.
- away_over_streak — Consecutive games where the away team's games went over.
- home_under_streak — Consecutive games where the home team's games went under.
- away_under_streak — Consecutive games where the away team's games went under.
- home_overunder_season — Home team's season-to-date over/under margin.
- away_overunder_season — Away team's season-to-date over/under margin.
- home_overunder_season_avg — Home team's average season-to-date over/under margin.
- away_overunder_season_avg — Away team's average season-to-date over/under margin.
- home_overunder_streak — Home team's over/under margin over its current total streak.
- away_overunder_streak — Away team's over/under margin over its current total streak.
- home_overunder_streak_avg — Home team's average over/under margin over its current total streak.
- away_overunder_streak_avg — Away team's average over/under margin over its current total streak.
- home_overunder_ngames — Home team's over/under margin over the rolling N-game window.
- away_overunder_ngames — Away team's over/under margin over the rolling N-game window.
- home_overunder_ngames_avg — Home team's average over/under margin over the rolling N-game window.
- away_overunder_ngames_avg — Away team's average over/under margin over the rolling N-game window.
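The "only games prior to the current row" rule can be illustrated with a running tally that is read before it is updated. This is a simplified sketch for home_wins and away_wins only. It assumes the input holds one season in chronological order, and add_prior_wins is a hypothetical helper rather than the pipeline's implementation.

```python
from collections import defaultdict


def add_prior_wins(games):
    """Attach each team's outright win count as it stood before each game.

    `games` must be one season's games in chronological order.
    """
    wins = defaultdict(int)
    out = []
    for game in games:
        # Read the tallies first, so the current result never leaks in.
        out.append(dict(game,
                        home_wins=wins[game["home_team"]],
                        away_wins=wins[game["away_team"]]))
        # Only now update the tallies with this game's result.
        if game["home_score"] > game["away_score"]:
            wins[game["home_team"]] += 1
        elif game["away_score"] > game["home_score"]:
            wins[game["away_team"]] += 1
    return out


# Made-up three-game sample between two teams.
season = [
    {"home_team": "A", "away_team": "B", "home_score": 5, "away_score": 3},
    {"home_team": "B", "away_team": "A", "home_score": 2, "away_score": 4},
    {"home_team": "A", "away_team": "B", "home_score": 1, "away_score": 2},
]
rows = add_prior_wins(season)
```

The same read-then-update order applies to streak and rolling-window features, which is what keeps them usable as pre-game model inputs.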
ATS record features (102–107)
These are season-to-date Against The Spread records for each team prior to the current game.
- home_wins_ats — Home team's ATS wins in the current season before this game.
- away_wins_ats — Away team's ATS wins in the current season before this game.
- home_losses_ats — Home team's ATS losses in the current season before this game.
- away_losses_ats — Away team's ATS losses in the current season before this game.
- home_ties_ats — Home team's ATS pushes in the current season before this game.
- away_ties_ats — Away team's ATS pushes in the current season before this game.
Home-vs-away deltas (108–140)
Each delta_* field is home_* - away_* for the corresponding home/away pair, useful as a single matchup-strength feature.
- delta_wins
- delta_losses
- delta_ties
- delta_days_since_first_game
- delta_days_since_previous_game
- delta_point_win_streak
- delta_point_loss_streak
- delta_point_margin_game
- delta_point_margin_season
- delta_point_margin_season_avg
- delta_point_margin_streak
- delta_point_margin_streak_avg
- delta_point_margin_ngames
- delta_point_margin_ngames_avg
- delta_cover_win_streak
- delta_cover_loss_streak
- delta_cover_margin_game
- delta_cover_margin_season
- delta_cover_margin_season_avg
- delta_cover_margin_streak
- delta_cover_margin_streak_avg
- delta_cover_margin_ngames
- delta_cover_margin_ngames_avg
- delta_total_points
- delta_overunder_margin
- delta_over_streak
- delta_under_streak
- delta_overunder_season
- delta_overunder_season_avg
- delta_overunder_streak
- delta_overunder_streak_avg
- delta_overunder_ngames
- delta_overunder_ngames_avg
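A short sketch of the home-minus-away rule for a few of the pairs listed above. The suffix list is deliberately partial and the row values are made up. Treating a missing input as a missing delta is an assumption, and add_deltas is a hypothetical helper.

```python
# A few of the suffixes shared by the home_*, away_* and delta_* columns.
SUFFIXES = ("wins", "losses", "point_margin_season_avg")


def add_deltas(row, suffixes=SUFFIXES):
    """Return a copy of `row` with delta_<suffix> = home_<suffix> - away_<suffix>."""
    out = dict(row)
    for suffix in suffixes:
        home = row.get("home_" + suffix)
        away = row.get("away_" + suffix)
        # Assumption: if either side is missing, the delta is missing too.
        out["delta_" + suffix] = None if home is None or away is None else home - away
    return out


row = {
    "home_wins": 10, "away_wins": 7,
    "home_losses": 4, "away_losses": 8,
    "home_point_margin_season_avg": 2.5, "away_point_margin_season_avg": -1.0,
}
with_deltas = add_deltas(row)
```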
Opponent ATS history (141–143)
These fields are keyed off the home team and the current opponent across the whole dataset, prior to the current row.
- opponent_wins_ats — Home team's historical ATS wins against this away opponent before this game.
- opponent_losses_ats — Home team's historical ATS losses against this away opponent before this game.
- opponent_ties_ats — Home team's historical ATS pushes against this away opponent before this game.
Daily-mean lag features (144–146)
These columns are intentionally lagged by one day to prevent target leakage when training models on a per-row basis.
- won_on_points_daily_mean_lag1 — Mean of won_on_points across all games on the previous calendar day.
- won_on_spread_daily_mean_lag1 — Mean of won_on_spread across all games on the previous calendar day.
- over_daily_mean_lag1 — Mean of over across all games on the previous calendar day.
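The one-day lag can be sketched as a per-date mean looked up at date minus one day. This reading takes "previous calendar day" literally, so a day with no games yields a missing value. That interpretation is an assumption, and daily_mean_lag1 is a hypothetical helper, not the pipeline's code.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_mean_lag1(rows, column):
    """For each row, return the mean of `column` over the previous calendar day."""
    by_day = defaultdict(list)
    for row in rows:
        by_day[row["date"]].append(row[column])
    means = {day: sum(values) / len(values) for day, values in by_day.items()}

    lagged = []
    for row in rows:
        previous = (date.fromisoformat(row["date"]) - timedelta(days=1)).isoformat()
        # None when no games were played on the previous calendar day.
        lagged.append(means.get(previous))
    return lagged


# Made-up sample: two games on Jan 1, one on Jan 2, none on Jan 3, one on Jan 4.
games = [
    {"date": "2025-01-01", "won_on_points": 1},
    {"date": "2025-01-01", "won_on_points": 0},
    {"date": "2025-01-02", "won_on_points": 1},
    {"date": "2025-01-04", "won_on_points": 0},
]
lag = daily_mean_lag1(games, "won_on_points")
```

Because each row only sees the prior day's aggregate, the current row's own outcome never contributes to its feature value.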
Opener-Coverage Matrix
The five opening-line columns are sourced from a one-time historical archive (where available per sport × market) plus forward capture starting 2026-04-18. Coverage is uneven across sports because no single free or commercial archive covers all six leagues uniformly. The table below is a snapshot from 2026-05-08 — see "Live coverage" below for current per-CSV numbers.
| Sport | Rows | Date range | open_home_ml | open_away_ml | open_home_spread | open_away_spread | open_over_under |
|---|---|---|---|---|---|---|---|
| MLB | 28,566 | 2014-03-22 -> 2026-05-08 | 97.6% | 97.6% | 64.4% | 64.4% | 97.6% |
| NBA | 24,275 | 2007-10-30 -> 2026-05-08 | 0.2% | 0.2% | 97.8% | 97.8% | 98.0% |
| NFL | 5,164 | 2007-09-06 -> 2026-02-08 | 88.0% | 88.0% | 97.9% | 97.9% | 98.6% |
| NHL | 15,852 | 2014-10-08 -> 2026-05-08 | 99.0% | 99.0% | 63.1% | 63.1% | 99.1% |
| NCAAB | 85,033 | 2007-11-05 -> 2026-04-06 | 0.0% | 0.0% | 67.2% | 67.2% | 57.1% |
| NCAAF | 18,892 | 2007-08-30 -> 2026-01-31 | 0.0% | 0.0% | 66.8% | 66.8% | 63.4% |
How to read this:
- High coverage (≥85%): usable as a primary feature.
- Medium coverage (50–85%): usable, but the model must tolerate NaN. MLB / NHL point-spread openers and NCAAB / NCAAF point-spread and over/under openers fall here. For the college sports, the archive covers most pre-2026 rows but not all conferences or games.
- Low coverage (<5%): forward-only, captured starting 2026-04-18. NBA / NCAAB / NCAAF moneyline openers fall here. Treat these as forward-looking features only; do not use them for training on pre-2026 history.
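The thresholds above can be applied mechanically when deciding which opener columns to feed a model. The sketch below encodes them as written. The 5–50% band is not defined in this document, so it is returned as "unclassified", and coverage_tier is a hypothetical helper. The percentages are copied from the snapshot table.

```python
def coverage_tier(coverage_pct: float) -> str:
    """Map a coverage percentage to the tiers described above."""
    if coverage_pct >= 85.0:
        return "high"
    if coverage_pct >= 50.0:
        return "medium"
    if coverage_pct < 5.0:
        return "low"
    # The document does not define a tier between 5% and 50%.
    return "unclassified"


# Values from the 2026-05-08 snapshot table.
snapshot = {
    ("MLB", "open_home_money_line"): 97.6,
    ("MLB", "open_home_point_spread"): 64.4,
    ("NBA", "open_home_money_line"): 0.2,
}
tiers = {key: coverage_tier(pct) for key, pct in snapshot.items()}
```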
Why coverage varies:
- MLB / NHL / NFL: a long-running historical odds archive covers moneyline + total back to ~2008.
- NBA: archive covers spread + total back to 2006; moneyline was never archived publicly, so we capture it forward only.
- NCAAB / NCAAF: no public archive of moneylines exists for college sports. Spread + over/under openers were back-filled from a historical-archive vendor; moneyline capture starts at 2026-04-18.
Live coverage
Each refresh of the product writes a coverage sidecar JSON next to the CSV. Customers can use that sidecar to verify current numbers before each refresh.
{sport}_data_coverage_latest.json
Schema:
    {
      "league": "MLB",
      "row_count": 28566,
      "first_game_date": "2014-03-22",
      "last_game_date": "2026-05-08",
      "openers": {
        "open_home_money_line": {
          "non_null": 27879,
          "coverage_pct": 97.59,
          "first_non_null_date": "2014-03-22"
        },
        ...
      },
      "generated_at": "2026-05-08T22:14:38+00:00"
    }
Use the generated_at timestamp to confirm the coverage matches the CSV you downloaded — the two are written atomically per refresh.
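One way to use the sidecar is to recompute its numbers from the downloaded CSV and compare. The sketch below does this on made-up in-memory data. It relies only on the sidecar fields shown above, and it compares coverage_pct with a small tolerance because the rounding rule is not documented here. check_sidecar is a hypothetical helper.

```python
import csv
import io
import json


def check_sidecar(csv_text, sidecar_text, tolerance=0.01):
    """Return the names of sidecar entries that disagree with the CSV."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    sidecar = json.loads(sidecar_text)
    problems = []
    if sidecar["row_count"] != len(rows):
        problems.append("row_count")
    for column, stats in sidecar["openers"].items():
        # An empty cell is the documented "opener not captured" value.
        non_null = sum(1 for row in rows if row[column] != "")
        pct = 100.0 * non_null / len(rows) if rows else 0.0
        if stats["non_null"] != non_null or abs(stats["coverage_pct"] - pct) > tolerance:
            problems.append(column)
    return problems


# Made-up four-row CSV with one missing opener, and a matching sidecar.
csv_text = "date,open_over_under\n2026-05-05,8.5\n2026-05-06,\n2026-05-07,9.0\n2026-05-08,7.5\n"
sidecar_text = json.dumps({
    "row_count": 4,
    "openers": {"open_over_under": {"non_null": 3, "coverage_pct": 75.0}},
})
problems = check_sidecar(csv_text, sidecar_text)  # empty list when consistent
```

A non-empty result suggests the CSV and sidecar came from different refreshes, which the generated_at timestamp can help confirm.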
Refresh policy
- Subscribers to Scottfree Sports Data ($19/mo) can refresh up to 8 times per UTC calendar month via the /api/v1/scottfree-sports-data/refresh endpoint, which returns a short-lived signed download URL.
- The underlying CSVs are regenerated daily at 12:15 PM ET by the pipeline-product-daily job, after the morning odds refresh and repair window. A refresh fetched after that job completes reflects all finalized games through the previous day.
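Since the quota resets per UTC calendar month, a client may want to keep its own tally before calling the endpoint. The sketch below is a purely local counter under that assumption. How the server counts refreshes is not documented here, and refreshes_left is a hypothetical helper.

```python
from datetime import datetime, timezone

MONTHLY_LIMIT = 8  # refreshes per UTC calendar month, per the policy above


def refreshes_left(refresh_times, now):
    """Remaining refreshes in the UTC calendar month that contains `now`.

    Both arguments must be timezone-aware datetimes.
    """
    now_utc = now.astimezone(timezone.utc)
    used = 0
    for moment in refresh_times:
        moment_utc = moment.astimezone(timezone.utc)
        if (moment_utc.year, moment_utc.month) == (now_utc.year, now_utc.month):
            used += 1
    return max(0, MONTHLY_LIMIT - used)


# Made-up refresh history: the first entry falls in the previous UTC month.
history = [
    datetime(2026, 4, 30, 23, 59, tzinfo=timezone.utc),
    datetime(2026, 5, 1, 0, 0, tzinfo=timezone.utc),
    datetime(2026, 5, 8, 17, 0, tzinfo=timezone.utc),
]
left = refreshes_left(history, datetime(2026, 5, 9, 12, 0, tzinfo=timezone.utc))
```

Converting to UTC before comparing months matters near month boundaries, where a local-time timestamp can land in a different UTC month.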
Contact
Questions: scottfree.analytics@scottfreellc.com. Documentation: https://scottfreellc.github.io/alphapy-sports