Pre-registration

2026-27 NFL forward proof.

The 2026-27 NFL season is VAR's first publicly verifiable forward test of the production model. This document locks in advance what we will measure, what counts as success, and what counts as a failed forward test. It is signed by the commit hash below and frozen as of 2026-05-14.

Signed: 2026-05-14
Commit: see git history at /victoryar repo
HVP version: 2026-04-30
Why this exists

Pre-registration is what separates analytics from astrology.

The Honest Validation Protocol (HVP) rule 3 requires that any model, threshold, or filter affecting live betting decisions have its success and failure criteria fixed before the test data is looked at. Running an analysis and picking the threshold that maximizes claimed performance afterward is in-sample fitting, and the result is unreliable.

NFL backtest performance was measured on three independent walk-forward test seasons (2023, 2024, 2025). The 2026-27 regular season and playoffs are the first set of games the model has never seen and was not trained on. The numbers from this season are the forward proof that matters.

For institutional buyers, academic citers, and journalists: this document is the contract. Everything below is locked. If any of it changes after Week 1 kickoff, the changelog at the bottom of this page will record the date, the reason, and the new signing commit, with the original copy preserved in git history.

Scope

Two production tiers under test.

VAR's NFL model runs across five production tiers internally. For this forward test we are publicly pre-committing to the two highest-confidence tiers: PRIME spread and PRIME_TOT. Both have passed HVP validation across three walk-forward test seasons and carry the same lower-bound discipline.

| Tier | Market | Filter | What it means |
| --- | --- | --- | --- |
| PRIME spread | NFL game spread (against the spread) | \|edge\| ≥ 6 | The model's spread projection differs from the closing line by at least 6 points |
| PRIME_TOT | NFL game total (over/under) | \|edge\| ≥ 7 | The model's total projection differs from the closing line by at least 7 points |

The other three production tiers (STRONG, PLAYABLE, PLAYABLE_SPREAD) remain operational internally but are not part of this public forward test. If they prove durable through the 2026-27 season, they will be promoted into a future pre-registration document.
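The pre-committed filters above reduce to a single absolute-difference check per market. The sketch below illustrates that rule; the function names and signatures are illustrative, not VAR's internal code.

```python
# Sketch of the pre-registered tier filters. Names and signatures are
# illustrative; only the thresholds (6 and 7 points) come from the document.

PRIME_SPREAD_MIN_EDGE = 6.0  # |model spread - closing spread| threshold
PRIME_TOT_MIN_EDGE = 7.0     # |model total - closing total| threshold

def qualifies_prime_spread(model_spread: float, closing_spread: float) -> bool:
    """True if the spread edge clears the PRIME spread filter."""
    return abs(model_spread - closing_spread) >= PRIME_SPREAD_MIN_EDGE

def qualifies_prime_tot(model_total: float, closing_total: float) -> bool:
    """True if the total edge clears the PRIME_TOT filter."""
    return abs(model_total - closing_total) >= PRIME_TOT_MIN_EDGE
```

For example, a model spread of -3.5 against a closing line of -10.0 is a 6.5-point edge and qualifies; a projected total of 48.0 against a closing total of 42.5 is only a 5.5-point edge and does not.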

HVP-validated backtest

The numbers being tested forward.

Both tiers were validated 2026-04-30 against three independent walk-forward test seasons (2023, 2024, 2025), each held out from training. The lower bound of the 95% Beta-Binomial credible interval is the figure we plan and size against. Point estimates are reported for transparency, not as commitments.

| Tier | Accuracy | CI lower (95%) | CI upper (95%) | ROI | n |
| --- | --- | --- | --- | --- | --- |
| PRIME spread | 62.83% | 56.36% | 68.87% | +19.95% | 226 |
| PRIME_TOT | 63.5% | 56.2% | 70.2% | +21.0% | 178 |

Break-even at a -110 closing line: 52.4%.

Per-season PRIME spread results: 2023 60.32% (n=63), 2024 65.85% (n=82), 2025 61.73% (n=81). Per-season consistency was a pre-registered requirement of the backtest itself; no individual test season was below 60%.
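The headline figures can be sanity-checked from the published counts. At 62.83% of n=226, PRIME spread had 142 wins and 84 losses; at -110 pricing each win returns 10/11 of a unit, which reproduces the +19.95% ROI exactly. The credible-interval sketch below assumes a uniform Beta(1, 1) prior, since the exact prior used by HVP is not stated here, so its lower bound is close to, but not guaranteed to match, the published 56.36%.

```python
import random

def beta_ci_lower(wins: int, losses: int, q: float = 0.025,
                  draws: int = 200_000, seed: int = 0) -> float:
    """Lower bound of the central 95% credible interval for the true win
    rate, estimated by Monte Carlo from the Beta posterior under a
    uniform Beta(1, 1) prior (an assumption; HVP's prior is not stated here).
    """
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(wins + 1, losses + 1) for _ in range(draws))
    return samples[int(q * draws)]

def roi_at_minus_110(wins: int, losses: int) -> float:
    """Flat-stake ROI at -110 pricing: a win returns 10/11 of a unit."""
    return (wins * 10 / 11 - losses) / (wins + losses)

BREAK_EVEN = 110 / 210  # risk 110 to win 100 -> 52.38% needed to break even

# PRIME spread backtest: 62.83% of n=226 -> 142 wins, 84 losses.
print(f"CI lower ~ {beta_ci_lower(142, 84):.4f}")     # close to the published 56.36%
print(f"ROI      = {roi_at_minus_110(142, 84):+.2%}")  # +19.95%, matching the table
```

The break-even constant also recovers the 52.4% threshold cited throughout: 110/210 ≈ 0.5238.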

See /performance for the full methodology and per-season breakdowns. The Honest Validation Protocol governing these measurements is version 2026-04-30.

Forward projection

Expected sample sizes for 2026-27.

The backtest figures above span three regular seasons. Scaled to a single season, the model is expected to surface roughly 75 PRIME spread picks (226 across three test seasons) and roughly 60 PRIME_TOT picks (178 across three test seasons).

These are estimates derived from the per-season counts above and may move within a normal envelope: the filter is fixed, but the number of games that clear it depends on market line dynamics the model does not control. Final n will be reported on the live tracking page after the conclusion of each NFL week.

Success criterion

What counts as the forward test passing.

For each tier, the season passes if, at the end of the season (Week 18 of the regular season plus playoffs), the lower bound of the 95% Beta-Binomial credible interval is at or above the break-even threshold of 52.4%. This is the same bar internal HVP audits hold historical claims to.

Point-estimate parity with the backtest is the secondary goal but not the primary commitment. Realized CLV haircuts, regime drift, and ordinary single-season variance can move the point estimate while the lower bound holds. The lower bound is what we cite.

Failure criteria

What counts as the forward test failing.

Three pre-registered checkpoints. Each tier is evaluated independently. Responses escalate with severity.

| Checkpoint | Condition | Response |
| --- | --- | --- |
| Week 4 · yellow | CI lower bound < 50% at n ≥ 10 (per tier) | Public yellow flag on /performance. 24-hour postmortem published explaining the early divergence. No tier change. |
| Week 9 · amber | CI lower bound < 52.4% at n ≥ 20 (per tier) | Public amber flag. Full postmortem with HVP rule-by-rule audit. The tier may be flagged "in calibration" on the leaderboard pending the full-season result. |
| Week 18 · red | CI lower bound < 52.4% at season end | The claim is withdrawn from the public leaderboard. Full forward-test failure postmortem published within 7 days. Tier is moved to "in calibration" until a new HVP-passing backtest justifies re-promotion. |

If both tiers trigger the same flag in the same window, that is a systemic signal (model-wide drift, ingestion fault, regime shift) and the postmortem will treat it as such rather than as two independent events.

Publication commitments

What we will publish, on what schedule.

Discipline commitments

What we will not do during this season.

Audit trail

How to verify this document.

This page is served from the open-source victoryar Next.js application. The signing commit hash above is the git commit that first published this content. Subsequent edits are visible in the repository commit log.

The Honest Validation Protocol referenced throughout is version 2026-04-30. A public companion page is forthcoming at /methodology/hvp. Until then, the eight HVP rules summarized in this document are accurate representations of the standard.

Questions or citation requests: @xVictoryarx on X.

Changelog