Does This Strategy Have a Real Edge, or Just Luck?
Before any signal reaches deployment, it must pass four independent statistical gates. Each gate addresses a different false-discovery risk, from data leakage and overfitting to pure randomness.
Gate Conditions
All four must pass simultaneously. Failure on any single check blocks signal deployment.
No lookahead bias detected across 2,847 signals. T+1 fill delay enforced. 45-day 13F filing lag applied.
98.5% CUSIP resolution rate via SEC exchange tickers + EDGAR company_tickers_exchange.json fallback.
SHA-256 checksums match across 3 independent seed=42 runs. Deterministic HDBSCAN + Gaussian HMM confirmed.
PBO 23.4% across C(16,8)=12,870 CSCV combinations. Well below 40% overfitting threshold. Bailey et al. (2016).
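The two timing rules in the lookahead gate (T+1 fill delay, 45-day 13F filing lag) can be illustrated with a minimal Python sketch. It assumes the lag is counted in calendar days from the quarter end and that a fill happens on the first trading session strictly after the holdings become usable. The function name `earliest_fill_date` and the trading-calendar argument are hypothetical, not taken from the pipeline.

```python
from datetime import date, timedelta

FILING_LAG_DAYS = 45  # 13F filing lag quoted in the gate text


def earliest_fill_date(quarter_end, trading_days):
    """Return the first session on which a 13F-derived signal may be filled.

    Assumption: holdings as of `quarter_end` become usable 45 calendar days
    later, and the fill happens on the next trading session after that (T+1).
    """
    available = quarter_end + timedelta(days=FILING_LAG_DAYS)
    for session in sorted(trading_days):
        if session > available:  # strictly after: never fill on day T itself
            return session
    raise ValueError("no trading session after the availability date")


sessions = [date(2023, 5, 12), date(2023, 5, 15), date(2023, 5, 16)]
fill = earliest_fill_date(date(2023, 3, 31), sessions)
```

Any signal whose recorded fill date is earlier than this value would count as lookahead under these assumptions.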
The observed Sharpe ratio is biased upward when you have tested multiple configurations. The Deflated Sharpe Ratio (DSR) applies four simultaneous penalties (multiple testing, skewness, fat tails, and serial correlation) to produce a conservative, publication-grade estimate. A DSR above 1.0 means the edge survives all adjustments.
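For orientation, here is a hedged sketch of the Deflated Sharpe Ratio as published by Bailey and López de Prado, which may differ from the variant used on this page. It covers the multiple-testing, skewness and kurtosis adjustments only; no serial-correlation correction is shown. In that formulation the DSR is a probability between 0 and 1, so the 1.0 threshold quoted above presumably refers to a differently scaled statistic. All function and parameter names are illustrative.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(n_trials, sr_variance):
    """Expected maximum Sharpe among n_trials skill-less configurations."""
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    nd = NormalDist()
    return math.sqrt(sr_variance) * (
        (1 - EULER_GAMMA) * nd.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * nd.inv_cdf(1 - 1 / (n_trials * math.e))
    )


def deflated_sharpe(sr, n_obs, skew, kurt, n_trials, sr_variance):
    """Probability that the true Sharpe exceeds the multiple-testing benchmark.

    sr          : per-period (non-annualised) observed Sharpe ratio
    n_obs       : number of return observations
    skew, kurt  : sample skewness and (non-excess) kurtosis of returns
    n_trials    : number of configurations tried
    sr_variance : variance of the Sharpe ratios across those trials
    """
    sr0 = expected_max_sharpe(n_trials, sr_variance)
    denom = math.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    z = (sr - sr0) * math.sqrt(n_obs - 1) / denom
    return NormalDist().cdf(z)


dsr_few = deflated_sharpe(0.1, 1000, 0.0, 3.0, n_trials=10, sr_variance=0.01)
dsr_many = deflated_sharpe(0.1, 1000, 0.0, 3.0, n_trials=100, sr_variance=0.01)
```

With everything else held fixed, raising the number of trials raises the benchmark `sr0` and lowers the result, which is the multiple-testing penalty in action.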
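CSCV (combinatorially symmetric cross-validation) splits the backtest into 16 equal partitions and evaluates all C(16,8) = 12,870 ways of assigning 8 partitions to in-sample and the remaining 8 to out-of-sample. For each, it asks: does the in-sample best strategy also rank above the median out-of-sample? PBO (probability of backtest overfitting) is the fraction of combinations where it does not. Below 40% = acceptable.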
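The CSCV procedure described above can be sketched in plain Python. This is a minimal illustration, not the page's implementation: it assumes Sharpe ratio as the performance metric, takes one return series per candidate configuration, and drops any observations that do not divide evenly into partitions. The names `pbo_cscv` and `returns_by_strategy` are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def sharpe(returns):
    sd = stdev(returns)
    return mean(returns) / sd if sd > 0 else 0.0


def pbo_cscv(returns_by_strategy, n_parts=16):
    """Fraction of IS/OOS splits where the in-sample winner ranks at or
    below the out-of-sample median (logit of relative rank <= 0)."""
    n_obs = len(returns_by_strategy[0])
    size = n_obs // n_parts  # remainder observations are dropped
    blocks = [range(i * size, (i + 1) * size) for i in range(n_parts)]
    n_strats = len(returns_by_strategy)
    overfit, total = 0, 0
    for in_sample in combinations(range(n_parts), n_parts // 2):
        chosen = set(in_sample)
        is_idx = [t for b in range(n_parts) if b in chosen for t in blocks[b]]
        oos_idx = [t for b in range(n_parts) if b not in chosen for t in blocks[b]]
        is_perf = [sharpe([r[t] for t in is_idx]) for r in returns_by_strategy]
        oos_perf = [sharpe([r[t] for t in oos_idx]) for r in returns_by_strategy]
        best = max(range(n_strats), key=is_perf.__getitem__)
        # Out-of-sample rank of the in-sample winner: 1 = worst, n_strats = best
        rank = sum(p <= oos_perf[best] for p in oos_perf)
        omega = rank / (n_strats + 1)
        logit = math.log(omega / (1 - omega))
        overfit += int(logit <= 0)
        total += 1
    return overfit / total


# Toy check: one configuration dominates in every partition, so PBO is 0.
toy = [[0.02, 0.03] * 4, [-0.01, 0.01] * 4, [-0.02, 0.01] * 4]
toy_pbo = pbo_cscv(toy, n_parts=4)
```

With `n_parts=16` the loop visits `math.comb(16, 8)` = 12,870 splits, matching the count quoted above.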
Monte Carlo Robustness Tests
Three independent null-hypothesis tests, each with N=1,000 simulations. Each tests a different null, but all ask the same underlying question: "Could this result have been generated by chance?" All three must return p < 0.05.
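The three nulls are not specified here, so the following sketch shows only the general shape of one such test. It uses a sign-randomisation null (each period's return is equally likely to be positive or negative), which is an assumption chosen for illustration and not necessarily one of the three tests on this page. The function name and the seed are likewise illustrative.

```python
import random
from statistics import mean, stdev


def sharpe(returns):
    sd = stdev(returns)
    return mean(returns) / sd if sd > 0 else 0.0


def sign_flip_p_value(returns, n_sims=1000, seed=42):
    """One-sided Monte Carlo p-value for the observed Sharpe ratio.

    Null (assumed for illustration): return magnitudes are real, but
    their signs are coin flips, so there is no directional edge.
    """
    rng = random.Random(seed)
    observed = sharpe(returns)
    hits = 0
    for _ in range(n_sims):
        simulated = [r if rng.random() < 0.5 else -r for r in returns]
        if sharpe(simulated) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_sims + 1)


strong = [0.01 + 0.001 * (i % 5) for i in range(60)]  # consistently positive
noise = [0.01, -0.01] * 30                            # no directional edge
p_strong = sign_flip_p_value(strong)
p_noise = sign_flip_p_value(noise)
```

With N=1,000 simulations the smallest attainable p-value under this estimator is 1/1001, comfortably below the 0.05 cutoff.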
Walk-Forward Validation
10 expanding-window folds (2010-2024). Each fold trains on all prior data and tests on one unseen year.
Unlike a single backtest, walk-forward validation tests whether the strategy generalises across time. Each fold uses only data that would have been available at the time, so it simulates actually trading year by year. Stable Sharpe across all folds is strong evidence against regime-specific overfitting.
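The fold layout can be sketched as follows. The split is inferred, not stated: with data from 2010 to 2024 and 10 one-year test folds, this sketch assumes the test years are 2015 through 2024 and that every fold trains on all years from 2010 up to the year before its test year. The function name `expanding_folds` is hypothetical.

```python
def expanding_folds(first_year=2010, last_year=2024, n_folds=10):
    """Return (train_years, test_year) pairs for an expanding window."""
    first_test = last_year - n_folds + 1
    if first_test <= first_year:
        raise ValueError("not enough history for the requested folds")
    return [
        (list(range(first_year, test_year)), test_year)
        for test_year in range(first_test, last_year + 1)
    ]


folds = expanding_folds()
for train_years, test_year in folds:
    # Training data always ends before the test year begins
    print(f"train {train_years[0]}-{train_years[-1]} -> test {test_year}")
```

Because the training window only ever grows forward in time, no fold can see data from its own test year or later.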