2026-04-29 Forge Finding

Test Fixtures Must Look Different From Production

From session S8-2026-04-29 · task T8-1 · pillar P11 Technical Craft × P12 Production Discipline · analogy FINANCE-KNIGHT-CAPITAL

#forge-finding
#test-isolation
#knight-capital
#naming
#production-discipline

The lesson

Test fixtures must be unambiguously separate from production targets. When the fixture and the target share the same name, namespace, or folder, the next person can’t tell which is which — and production ends up running on test assumptions.

The labels on the surfaces matter more than the documentation describing them. A tired engineer at 3am looking at a path or flag should be able to tell which is which without reading any code.

Analogy — RAILWAY: Switch Lever Differentiation

A switch routing trains to the maintenance track must be physically distinct from the production siding. Old Southern Pacific yards in Texas used the same lever shape for both; tired switchmen at 3am routed revenue trains onto the wrong track. The fix wasn’t more training — it was a different lever shape. The fixture and the target had to look different from outside, before anyone touched anything.

Same principle inside a binary, inside an env-var file, inside a folder structure: if a tired engineer at 3am can’t tell the test thing from the real thing without reading documentation, the labeling is broken.

How it landed in T8-1

Knight Capital Group, August 1, 2012. A routine deployment pushed new code to seven of eight servers; the eighth still carried a residual flag from a 2003 test setup that re-activated old test code internally named “Power Peg.” For 45 minutes, Power Peg flooded NYSE with millions of orders at unfavorable prices. Knight lost $440M in those 45 minutes, was acquired by Getco a week later, and ceased to exist as an independent firm.

Root cause: test code (Power Peg) and production code lived in the same binary using the same flag namespace. Nobody could tell from outside which was supposed to be running where. The names didn’t tell you — both lived in the same module. The flags didn’t tell you — both used the same env variable. The deployment didn’t catch it — eight servers, eight chances to leave one bit out of sync, and one was enough to ruin the firm.

T8-1 moved the journal entries that mattered into their own folder nobody could mistake for the test scratchpad. The synthetic 5-entry test vault used to verify the Mirror end-to-end during S7 stays around for future verification work without polluting the real diary. The labels on the folders (workhorse-vault for production, mentor-mirror-test-vault for fixtures) make it impossible to cross the streams.

A five-minute change. The discipline it demonstrates is what stays.

The wider pattern

T8-1 extends two prior lessons: S2-T2-2 (test helper never reads VAULT_PATH; uses fs.mkdtemp under os.tmpdir() only) and S6-T6-2 (synthetic test vault for Type B kept separate from any real vault). Where those covered the fixture side, T8-1 closes the production-target side. Now both ends of the test/production boundary are correctly labeled.

The cost wasn’t the test code itself. The cost was that the test code was indistinguishable from production code at the place where it mattered: at runtime, on a deployment.

The lesson

Analogy — RAILWAY: Switch Lever Differentiation

How it landed in T8-1

The wider pattern

Sources