
Phase 11b — Pennant strategy backtest across 5 detection scenarios

Stock-only backtest of the production pennant strategy applied to each of the five detection criteria scenarios from Phases 11a / 11a-2 / 11a-3. Detection events came from the cached parquets under ab_test/; no detection runs were re-executed. No production code, config, or data was modified.

[Figure: equity curves for the five scenarios]

1. Setup

Five scenarios — Baseline (current production: pennant 5–15, flagpole 1–10), V1 (pennant 10–20, flagpole 1–5), V2 (pennant 7–17, flagpole 1–5), V3 (pennant 6–17, flagpole 1–3), V4 (pennant 6–17, flagpole 1–2) — were each simulated independently. Each scenario started with $10,000 cash, took $500 nominal per trade at the anchor close, sold half on a +15 % move, trailed the remainder by 3 % from the in-trade high, hard-stopped at −7 % from entry, and time-stopped after 30 trading days. The regime gate skipped entries on days where SPY < SPY 200-SMA and VIX > 35 (240 such days in the calendar). Friction: $0.50 per fill commission, ±5 bp slippage. Cash was held un-invested; fractional shares allowed; concurrency capped only by available cash. Detection events were taken in chronological order; if cash on the entry date was < $500 the candidate was skipped ("no_cash"), not back-filled.
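The exit rules above can be sketched for a single trade as follows. This is a minimal illustration, not the contents of backtest.py: the names `Fill` and `simulate_exits` are hypothetical, and it assumes that all rules are evaluated on daily closes, that the 3 % trail applies only to the half that remains after the +15 % partial sale, and that the hard stop takes priority when several rules fire on the same day. Commission and slippage are left out.

```python
from dataclasses import dataclass


@dataclass
class Fill:
    day: int         # trading days since entry (1 = first close after the anchor)
    fraction: float  # share of the original position sold in this fill
    price: float
    reason: str


def simulate_exits(entry, closes, target=0.15, trail=0.03, stop=0.07, max_days=30):
    """Walk the daily closes after entry and return the exit fills for one trade."""
    fills = []
    half_sold = False
    high = entry  # in-trade high, tracked on closes (assumption)
    for day, px in enumerate(closes[:max_days], start=1):
        high = max(high, px)
        remaining = 0.5 if half_sold else 1.0
        # Hard stop: -7 % from entry closes whatever is still held.
        if px <= entry * (1 - stop):
            fills.append(Fill(day, remaining, px, "hard_stop"))
            return fills
        if not half_sold and px >= entry * (1 + target):
            # First +15 % close: sell half, keep the rest on a trail.
            fills.append(Fill(day, 0.5, px, "target"))
            half_sold = True
        elif half_sold and px <= high * (1 - trail):
            # Remainder gives back 3 % from the in-trade high.
            fills.append(Fill(day, 0.5, px, "trail"))
            return fills
        # Time stop: close whatever is left on trading day 30.
        if day == max_days:
            remaining = 0.5 if half_sold else 1.0
            fills.append(Fill(day, remaining, px, "time_stop"))
            return fills
    return fills  # price data ended before the position was fully closed
```

For example, `simulate_exits(100.0, [102.0, 116.0, 120.0, 115.0])` sells half at 116 on day 2 and the remainder at 115 on day 4, once the close has fallen more than 3 % below the in-trade high of 120.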

All five scenarios share the same strategy mechanics, prices, calendar, and regime gate — the only difference is the detection parameters that produced the event population.
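One way to picture that separation is a shared strategy block plus a per-scenario detection block. The sketch below is illustrative only: the class and key names are hypothetical rather than taken from the production config, and the unit of the pennant and flagpole ranges (presumably bars) is not stated in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionParams:
    pennant_min: int
    pennant_max: int
    flagpole_min: int
    flagpole_max: int


# The only thing that differs between scenarios (values from section 1).
SCENARIOS = {
    "Baseline": DetectionParams(5, 15, 1, 10),
    "V1": DetectionParams(10, 20, 1, 5),
    "V2": DetectionParams(7, 17, 1, 5),
    "V3": DetectionParams(6, 17, 1, 3),
    "V4": DetectionParams(6, 17, 1, 2),
}

# Shared by every scenario (values from section 1; key names are hypothetical).
STRATEGY = {
    "starting_cash": 10_000.0,
    "trade_size": 500.0,
    "target_pct": 0.15,
    "trail_pct": 0.03,
    "stop_pct": 0.07,
    "max_holding_days": 30,
    "commission_per_fill": 0.50,
    "slippage_bp": 5,
}
```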

2. End-of-year equity

| Year | Baseline | V1 | V2 | V3 | V4 |
|---|---|---|---|---|---|
| 2007 start | $10,000 | $10,000 | $10,000 | $10,000 | $10,000 |
| 2007 EOY | $10,956 | $9,988 | $10,687 | $10,306 | $10,320 |
| 2008 EOY | $7,864 | $8,831 | $9,426 | $9,106 | $9,476 |
| 2009 EOY | $9,640 | $9,915 | $10,939 | $8,249 | $9,959 |
| 2010 EOY | $11,936 | $12,042 | $12,662 | $10,584 | $11,873 |
| 2011 EOY | $10,896 | $12,362 | $11,575 | $10,278 | $11,614 |
| 2012 EOY | $11,441 | $14,533 | $12,352 | $11,129 | $11,833 |
| 2013 EOY | $18,692 | $20,071 | $18,989 | $17,391 | $16,766 |
| 2014 EOY | $19,978 | $20,398 | $20,382 | $18,291 | $17,898 |
| 2015 EOY | $19,413 | $20,280 | $20,620 | $17,612 | $17,995 |
| 2016 EOY | $21,353 | $21,036 | $23,630 | $19,702 | $19,301 |
| 2017 EOY | $25,242 | $25,010 | $27,536 | $22,403 | $21,323 |
| 2018 EOY | $23,851 | $23,711 | $26,683 | $21,732 | $21,433 |
| 2019 EOY | $23,874 | $23,927 | $26,818 | $20,664 | $20,556 |
| 2020 EOY | $29,101 | $25,870 | $28,795 | $23,098 | $20,919 |
| 2021 EOY | $39,232 | $31,005 | $35,856 | $29,278 | $26,185 |
| 2022 EOY | $33,488 | $28,418 | $32,553 | $26,605 | $24,725 |
| 2023 EOY | $33,089 | $28,206 | $33,035 | $26,840 | $25,060 |
| 2024 EOY | $36,010 | $27,667 | $32,107 | $28,545 | $25,336 |
| 2025 EOY | $39,865 | $31,565 | $36,779 | $31,334 | $28,504 |
| 2026 YTD (2026-05-11) | $40,398 | $30,858 | $35,651 | $31,240 | $28,615 |

3. Headline metrics

| Scenario | Final equity | Total return | CAGR | Max DD | Sharpe | Trades | Skipped no-cash | Win rate |
|---|---|---|---|---|---|---|---|---|
| Baseline | $40,398 | +304 % | 7.50 % | −38.1 % | 0.543 | 8,842 | 6,573 | 43.7 % |
| V1 | $30,858 | +209 % | 6.01 % | −20.4 % | 0.609 | 4,533 | 591 | 45.1 % |
| V2 | $35,651 | +257 % | 6.80 % | −21.4 % | 0.617 | 6,108 | 1,272 | 44.7 % |
| V3 | $31,240 | +212 % | 6.08 % | −32.3 % | 0.547 | 5,200 | 1,070 | 45.0 % |
| V4 | $28,615 | +186 % | 5.59 % | −21.2 % | 0.580 | 4,101 | 511 | 45.4 % |
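For readers who want to re-derive the return columns, the sketch below shows the usual definitions of total return, CAGR, and maximum drawdown. It is an assumption that backtest.py uses these exact conventions; in particular, the start date of 2007-01-01 and the 365.25-day year are guesses, so the CAGR lands near, not exactly on, the reported figure. Sharpe is omitted because the report does not state the return frequency or risk-free rate behind it.

```python
from datetime import date


def total_return(start_equity, end_equity):
    """Simple return over the whole period."""
    return end_equity / start_equity - 1.0


def cagr(start_equity, end_equity, start, end):
    """Compound annual growth rate using calendar days / 365.25 (assumption)."""
    years = (end - start).days / 365.25
    return (end_equity / start_equity) ** (1.0 / years) - 1.0


def max_drawdown(equity):
    """Most negative peak-to-trough decline in an equity series, as a fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# Baseline row: $10,000 -> $40,398; the start date is assumed, not documented.
baseline_total = total_return(10_000, 40_398)
baseline_cagr = cagr(10_000, 40_398, date(2007, 1, 1), date(2026, 5, 11))
```

Note that `max_drawdown` needs the daily equity series (backtest_equity_<scenario>.parquet) to reproduce the Max DD column; applied to year-end values only, it will understate the drawdown.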

The regime gate blocked 24–106 entries per scenario (0.5–1.2 % of candidates), so it has only a small effect at this timescale.
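The gate condition itself is simple enough to sketch. The function names below are hypothetical, and two details are assumptions: that the 200-SMA is computed on SPY closes, and that no entries are blocked before 200 observations exist.

```python
def sma(values, window):
    """Simple moving average; None until `window` observations are available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def regime_blocked(spy_close, spy_sma200, vix_close, vix_limit=35.0):
    """True when entries are skipped: SPY below its 200-SMA AND VIX above 35."""
    if spy_sma200 is None:
        return False  # assumption: gate inactive until the average is defined
    return spy_close < spy_sma200 and vix_close > vix_limit
```

Because both conditions must hold on the same day, the gate only fires in stressed markets, which is consistent with the small number of blocked entries.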

4. Interpretation

Absolute return: Baseline wins, then V2. Baseline finishes at $40,398 vs V2's $35,651, with V3 and V1 close together around $31k ($31,240 and $30,858) and V4 last at $28,615. Baseline simply gets to take more trades (8,842 filled trades vs V2's 6,108 and V4's 4,101), and with fixed $500 sizing that adds up to more raw return. Baseline's lead over V4 is about $11.8k ($40,398 vs $28,615), of which roughly $9.5k traces directly to taking 4,741 more entries.

Risk-adjusted: V2 wins, then V1. Baseline pays for its higher volume with the worst drawdown in the field (−38.1 %, vs V1's −20.4 % and V2's −21.4 %) and the lowest Sharpe (0.543). V2 captures most of Baseline's compounding (CAGR 6.80 % vs 7.50 %) with a drawdown a little over half as deep, and its Sharpe of 0.617 is the best of the five. V1 is close behind at 0.609 with the tightest drawdown (−20.4 %). V3 roughly matches Baseline's Sharpe (0.547 vs 0.543) but has a noticeably worse drawdown (−32.3 %) than V2 or V4.

Cash utilization tells the story. Baseline's 6,573 no-cash skips (42 % of candidates!) show capital is constantly fully deployed — buying every $500 slot it can. V2 skips only 1,272 (17 %), V4 skips just 511 (11 %). The variant detectors send fewer candidates so cash sits idle more often; that idle cash is the drag against Baseline on absolute return, but it's also the buffer that produces the smaller drawdowns.
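The skip percentages can be approximately reproduced from the headline table. The sketch below divides no-cash skips by filled trades plus no-cash skips; it leaves out the regime-blocked candidates because their per-scenario counts are not listed here, so the denominators are slightly smaller than the true candidate counts.

```python
# (filled trades, no-cash skips) per scenario, from the headline metrics table.
COUNTS = {
    "Baseline": (8842, 6573),
    "V1": (4533, 591),
    "V2": (6108, 1272),
    "V3": (5200, 1070),
    "V4": (4101, 511),
}


def no_cash_share(filled, skipped):
    """Fraction of cash-eligible candidates that were skipped for lack of cash."""
    return skipped / (filled + skipped)


for name, (filled, skipped) in COUNTS.items():
    print(f"{name:9s} {no_cash_share(filled, skipped):6.1%}")
```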

Year-by-year, the ranking flips repeatedly. In 2008 the order was V4 > V2 > V3 > V1 > Baseline: the variants survived the crash with less capital deployed in falling stocks. By 2012, V1 led the field (+45 % vs Baseline +14 %). V2 held the year-end lead from 2015 through 2019 (its stronger pole-quality screen helped through the choppy mid-2010s). Baseline moved back to the front at the end of 2020 and has stayed there since, as raw trade count mattered more than per-trade quality. V3 fell behind V1 in 2009, when it lost ground while the other scenarios gained, and trailed it at every year-end through 2023 before drawing roughly level from 2024 on. V4 led at the end of 2008 but has been last or second to last at every year-end since 2013, consistent with the Phase 11a-3 finding that flagpole.max = 2 trims too aggressively.
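The year-end ordering can be recomputed directly from the end-of-year equity table in section 2. The sketch below hard-codes that table (the 2026 row is the 2026-05-11 year-to-date mark); the names `EOY` and `ranking` are illustrative and do not come from backtest.py.

```python
NAMES = ("Baseline", "V1", "V2", "V3", "V4")

# Year-end equity in dollars, copied from the section 2 table.
EOY = {
    2007: (10956, 9988, 10687, 10306, 10320),
    2008: (7864, 8831, 9426, 9106, 9476),
    2009: (9640, 9915, 10939, 8249, 9959),
    2010: (11936, 12042, 12662, 10584, 11873),
    2011: (10896, 12362, 11575, 10278, 11614),
    2012: (11441, 14533, 12352, 11129, 11833),
    2013: (18692, 20071, 18989, 17391, 16766),
    2014: (19978, 20398, 20382, 18291, 17898),
    2015: (19413, 20280, 20620, 17612, 17995),
    2016: (21353, 21036, 23630, 19702, 19301),
    2017: (25242, 25010, 27536, 22403, 21323),
    2018: (23851, 23711, 26683, 21732, 21433),
    2019: (23874, 23927, 26818, 20664, 20556),
    2020: (29101, 25870, 28795, 23098, 20919),
    2021: (39232, 31005, 35856, 29278, 26185),
    2022: (33488, 28418, 32553, 26605, 24725),
    2023: (33089, 28206, 33035, 26840, 25060),
    2024: (36010, 27667, 32107, 28545, 25336),
    2025: (39865, 31565, 36779, 31334, 28504),
    2026: (40398, 30858, 35651, 31240, 28615),  # YTD as of 2026-05-11
}


def ranking(year):
    """Scenario names ordered from highest to lowest equity for that year."""
    row = dict(zip(NAMES, EOY[year]))
    return sorted(NAMES, key=row.get, reverse=True)


for year in sorted(EOY):
    print(year, " > ".join(ranking(year)))
```

Here `ranking(2008)` returns V4, V2, V3, V1, Baseline, and `ranking(2012)[0]` is V1, matching the orderings quoted above.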

Win rate alone is not the discriminator. All five scenarios sit between 43.7 % and 45.4 % — a 1.7 pp spread. V4 has the highest win rate but the lowest CAGR, because its trades pay less per win (consistent with §6 of the Phase 11a-3 report: V4's 30-day endpoint mean is the only one below baseline). The lever moving the equity curve here is trade count × average profit, not win-rate.

The decision the numbers frame is whether to optimise for absolute compounding (Baseline) or for risk-adjusted compounding (V2). V2 gives up ~$4.7k in final equity over 19 years (~$250/year) in exchange for cutting peak-to-trough drawdown nearly in half. El Don decides.


Artifacts under ab_test/: backtest.py, plot_backtest.py, backtest.log, backtest_summary.json, backtest_eoy.csv, plus backtest_trades_<scenario>.parquet and backtest_equity_<scenario>.parquet for all 5 scenarios.

Chart: charts/pennant_strategy_backtest_equity_curves_2026-05-11.png (300 dpi).