Honest Validation Protocol
VAR's eight-rule canonical spec against which every published win-rate claim, ROI claim, and live-tier promotion is validated. The audit-defensible standard the public numbers have to clear. Version 2026-04-30. Published at /methodology/protocol.
The Honest Validation Protocol (HVP) is the eight-rule canonical spec VAR uses to ship audit-defensible claims under real-money exposure. Each rule exists because the platform caught itself breaking it at least once. The rules are: (1) walk-forward validation across at least three independent test seasons; (2) a Beta-Binomial 95% credible interval, with the lower bound as the planning figure; (3) ship gates pre-registered before any look at test data; (4) independent-sample discipline, so n counts betting decisions rather than Monte Carlo realizations; (5) an empirical, per-market closing-line-value (CLV) haircut rather than a heuristic one; (6) block-bootstrap bankroll simulation at the slate level; (7) memory hygiene on superseded claims; (8) end-to-end verification against the production code path. The protocol is uniform across leagues; the resulting quantitative claims are tighter at higher-rigor leagues and bounded with explicit disclosure at lower-rigor ones.
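Rule 1 anchors the other seven and is easy to get subtly wrong, so a minimal sketch helps. Everything below is hypothetical scaffolding, not VAR's implementation: the `fit` and `evaluate` callables and the season-keyed input stand in for whatever the real pipeline uses.

```python
def walk_forward(games_by_season, fit, evaluate, min_test_seasons=3):
    """Train on every season strictly before each test season; never pool them."""
    seasons = sorted(games_by_season)
    results = []
    for i in range(1, len(seasons)):
        train = [g for s in seasons[:i] for g in games_by_season[s]]
        model = fit(train)  # refit from scratch at every step
        results.append(evaluate(model, games_by_season[seasons[i]]))
    if len(results) < min_test_seasons:
        raise ValueError("rule 1: aggregate only across >= 3 independent test seasons")
    return results  # one result per independent test season
```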
- Most published sports analytics claims would not survive a serious audit. The protocol is the structural answer to that: a public, versioned spec that defines what a claim has to clear before it ships.
- Real-money exposure forces the validation discipline to be honest. A model that ships under HVP has had its tier thresholds pre-registered, its credible intervals cited, and its drift checkpoints scheduled before the live deployment began.
- The protocol is the spec the pre-registration is signed against. A reader who wants to verify that VAR's NFL forward test is audit-defensible should read the protocol first, then the pre-registration as the live application.
- Methodology-transparent disclosure is the moat that licensed-data consolidation can't acquire its way past. The protocol is the artifact that operationalizes that posture.
Apply each of the eight rules to a candidate published claim in order: (1) is the result aggregated across at least three independent walk-forward test seasons? (2) does the citation carry the Beta-Binomial 95% credible interval's lower bound? (3) was the tier threshold pre-registered before the test data was touched? (4) is the reported sample size counted in independent betting decisions, not Monte Carlo realizations? (5) has the CLV haircut been measured empirically per market? (6) are bankroll-growth estimates drawn from slate-level block bootstrap? (7) is there a superseded-claim ledger? (8) does the audit run against the same code path that ships predictions? A claim that fails any of the eight does not ship.
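As a sketch, the gate is a plain conjunction: every rule is a boolean and a single False blocks the ship. The field names below are hypothetical stand-ins for whatever evidence ledger backs a claim.

```python
from dataclasses import dataclass, fields

@dataclass
class HVPEvidence:
    walk_forward_3_seasons: bool       # rule 1
    credible_interval_cited: bool      # rule 2
    gates_preregistered: bool          # rule 3
    n_is_independent_decisions: bool   # rule 4
    clv_haircut_empirical: bool        # rule 5
    slate_level_block_bootstrap: bool  # rule 6
    superseded_claim_ledger: bool      # rule 7
    production_code_path_audit: bool   # rule 8

def ships(evidence: HVPEvidence) -> bool:
    # No partial credit: failing any one rule blocks the claim.
    return all(getattr(evidence, f.name) for f in fields(evidence))
```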
VAR's PRIME spread NFL tier passes HVP as follows: (1) walk-forward across the 2023, 2024, and 2025 test seasons; (2) cited as 62.83% with a 95% credible interval of [56.36%, 68.87%]; (3) threshold pre-registered on /methodology/2026-27-predictions before any 2026-27 game; (4) n=226 independent games; (5) CLV haircut measured empirically on prior-season graded bets; (6) slate-level block bootstrap for bankroll projections; (7) the prior 63.2% ATS headline marked superseded in BRAND.md with the date and reason; (8) the audit code path and the live tracker share the same production tier.
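The cited interval is reproducible from the two numbers in that example. Assuming a uniform Beta(1, 1) prior, which the protocol text does not pin down, 142 wins in 226 games gives roughly the published bounds:

```python
from scipy.stats import beta

n, wins = 226, 142                       # 142/226 = 62.83%
a, b = 1 + wins, 1 + (n - wins)          # posterior under an assumed Beta(1, 1) prior
lo, hi = beta.ppf([0.025, 0.975], a, b)  # 95% credible interval
print(f"posterior mean {a / (a + b):.2%}, interval [{lo:.2%}, {hi:.2%}]")
# ~[56.4%, 68.9%], in line with the cited [56.36%, 68.87%]. Under rule 2 the
# planning figure is the lower bound, not the 62.83% point estimate.
```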
- Passing seven of eight rules and shipping anyway. Each rule exists to close a specific failure mode; partial compliance leaves the claim exposed to whichever mode the missed rule covers (rule 6's, for example, is sketched after this list).
- Citing the protocol without applying it. A claim that says 'validated under HVP' but does not present the per-rule evidence is asserting compliance rather than demonstrating it.
- Treating the protocol as fixed forever. Substantive revisions are appended as new dated versions; the protocol itself is append-only. The current version (2026-04-30) is active until a new dated version is published.
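The failure mode behind rule 6 is worth making concrete: bets graded on the same slate share injury news, weather, and market conditions, so resampling individual bets i.i.d. understates drawdown risk. A minimal slate-level block bootstrap, with the data layout and stake sizing hypothetical:

```python
import random
from collections import defaultdict

def bankroll_paths(bets, stake_frac=0.02, n_paths=10_000, seed=0):
    """bets: iterable of (slate_date, return_per_unit_staked) pairs. Resampling
    whole slates with replacement preserves same-day correlation."""
    rng = random.Random(seed)
    slates = defaultdict(list)
    for date, ret in bets:
        slates[date].append(ret)
    blocks = list(slates.values())
    paths = []
    for _ in range(n_paths):
        bankroll = 1.0
        for block in rng.choices(blocks, k=len(blocks)):
            for ret in block:  # fixed fraction of bankroll staked per bet
                bankroll *= 1.0 + stake_frac * ret
        paths.append(bankroll)
    return paths  # quote quantiles of these paths, not the mean
```

Resampling at the bet level instead would treat a twelve-bet Sunday as twelve independent draws and systematically thin the tails.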
Where does the protocol live?
The canonical spec is at /methodology/protocol with the version date and an append-only changelog. The plaintext mirror for AI ingestion is at /llms-full.txt. Both are public and stable.
Why eight rules and not fewer?
Each of the eight closes a specific, observed failure mode in sports analytics validation. Removing any rule re-opens the corresponding failure mode. The list is the minimum that produces audit-defensible claims under real-money exposure, not an exhaustive inventory of everything that would be ideal.
How does HVP apply at lower-rigor leagues?
The protocol is uniform; the data available is not. Public data and aggregate-check sources are dense at NBA and NFL, thinner at WNBA and CBB, and sparsest at CFB and Women's College Basketball. The framework applies the same way at every level; the resulting quantitative claims get tighter at higher-rigor levels, and methodology-transparent disclosure of the rigor ceiling matters more at lower-rigor levels, not less.
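The same Beta-Binomial machinery from rule 2 shows why thinner data forces wider bounds. The sample sizes below are hypothetical, not league figures; the observed rate is held near 60% throughout:

```python
from scipy.stats import beta

for n in (900, 450, 226, 110):  # hypothetical graded-bet counts
    wins = round(0.60 * n)
    lo, hi = beta.ppf([0.025, 0.975], 1 + wins, 1 + (n - wins))
    print(f"n={n:>4}: 95% interval [{lo:.1%}, {hi:.1%}], width {hi - lo:.1%}")
# Width shrinks like 1/sqrt(n): halving it takes roughly four times the sample.
# That is the arithmetic behind wider, explicitly disclosed bounds at lower-rigor leagues.
```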