When the Brake Handle Isn't Connected to the Brakes
- #forge
- #building-in-public
- #testing
- #architecture
The task was supposed to be verification, not a fix-forward. Yesterday’s session had shipped a record button on the dashboard. The state machine had cleared every static check. The integration tests passed. Today’s job was to drive it in a real browser, hit the backend with real audio, and watch the vault write a real file. Instead, the first speech upload hung for one hundred and twenty seconds and returned an empty body. The second hung the same. The third surfaced something interesting in the server log: a sixty-second timeout that had been declared, written down in a wrapper called withTimeout, threaded through every Whisper and Anthropic call site, and never once actually connected to the underlying network request.
The wrapper looked correct in isolation. Create an AbortController. Set a setTimeout to fire its abort method after sixty seconds. Wrap the SDK promise. If the controller signals abort, throw a typed timed-out error. Pre-existing code that nobody had touched in a while because it never had to fail. The bug was the line that wasn’t there. The OpenAI SDK accepts a per-request signal as the second argument to its create method. The Anthropic SDK does the same. Neither call site passed it. The controller fired its abort after sixty seconds. The signal was a real AbortSignal. Nothing was subscribed to it. The SDK kept going until the next layer of timeouts noticed something was wrong, which on this stack meant Express’s own one-hundred-and-twenty-second request timeout finally cut the connection. From outside, it looked like the timeout was configured for two minutes. From inside, it looked like the timeout was configured for sixty seconds and just didn’t work. Both were wrong. The truth was that the sixty-second timeout was a comment in the code that didn’t reach the wire.
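The shape of that failure fits in a few lines. This is a reconstruction, not the project's code: it assumes the old wrapper awaited an already-started promise and only translated the error afterwards, which is consistent with a request that hung well past sixty seconds. The names withTimeoutUnwired, TimedOutError, and fakeSdkCall are hypothetical, and fakeSdkCall stands in for a real SDK request that would honor a signal if one were passed.

```typescript
// Hypothetical typed error, standing in for the post's "typed timed-out error".
class TimedOutError extends Error {
  constructor(ms: number) {
    super(`operation timed out after ${ms} ms`);
    this.name = "TimedOutError";
  }
}

// Hypothetical stand-in for an SDK call. It resolves after `durationMs` and
// stops early only if the caller actually passes a signal in its options.
function fakeSdkCall(
  durationMs: number,
  options?: { signal?: AbortSignal },
): Promise<string> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve("done"), durationMs);
    options?.signal?.addEventListener(
      "abort",
      () => {
        clearTimeout(timer);
        reject(new Error("aborted"));
      },
      { once: true },
    );
  });
}

// Assumed pre-fix shape: the wrapper owns a controller, but the promise it
// receives was started elsewhere and never saw the controller's signal.
async function withTimeoutUnwired<T>(promise: Promise<T>, ms: number): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms); // the handle pull
  try {
    return await promise; // nothing here is subscribed to controller.signal
  } catch (err) {
    if (controller.signal.aborted) throw new TimedOutError(ms);
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```

Calling withTimeoutUnwired(fakeSdkCall(80), 20) resolves with the call's normal result after roughly eighty milliseconds. The abort fires at twenty, and nothing is listening.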
Two other defects came out of the same verification run. An empty Whisper transcript, on a silent recording, was being passed straight to Anthropic, which rejected it with a four-hundred because the messages API does not accept empty user content. The four-hundred surfaced as a five-hundred to the dashboard. A non-empty transcript, on a normal recording, was getting back a JSON-shaped response from Anthropic that occasionally arrived wrapped in markdown code fences despite the system prompt asking for raw JSON. JSON.parse threw. The catch wrapped that into another five-hundred. None of the three were regressions in yesterday’s diff. All three were in code that had been there for weeks and had simply never been exercised against the live SDK. The grep checks couldn’t see them. The unit tests couldn’t see them, because the SDKs were mocked. Only a real browser hitting a real server hitting real APIs could surface them, and a real browser hitting a real server is what verification is for.
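Both of these defects can be guarded with small pure functions. The sketch below is one way to do it, not necessarily how the fixes were written: isEmptyTranscript, stripCodeFence, and parseModelJson are hypothetical names, and the fence handling assumes the model wraps the whole response in at most one code fence.

```typescript
// Three backticks, built at runtime so this example never contains a literal fence.
const FENCE = "`".repeat(3);

// True when a transcript has nothing worth sending to the model,
// for example the whitespace-only result of a silent recording.
function isEmptyTranscript(transcript: string): boolean {
  return transcript.trim().length === 0;
}

// Removes one surrounding markdown code fence, with or without a
// language tag such as "json" on the opening line.
function stripCodeFence(raw: string): string {
  let text = raw.trim();
  const fenced =
    text.length >= 2 * FENCE.length && text.startsWith(FENCE) && text.endsWith(FENCE);
  if (fenced) {
    text = text.slice(FENCE.length, text.length - FENCE.length);
    const firstNewline = text.indexOf("\n");
    if (firstNewline !== -1) {
      const firstLine = text.slice(0, firstNewline).trim();
      // Drop the opening line only if it is blank or a bare language tag.
      if (/^[\w-]*$/.test(firstLine)) text = text.slice(firstNewline + 1);
    }
  }
  return text.trim();
}

// Parses model output that is expected to be JSON but may arrive fenced.
// Still throws on malformed JSON, so callers decide how to report that.
function parseModelJson(raw: string): unknown {
  return JSON.parse(stripCodeFence(raw));
}
```

A handler could check isEmptyTranscript before calling the model and answer with a client-facing error or an empty result, then route the model's text through parseModelJson instead of calling JSON.parse directly.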
When the Brake Handle Isn’t Connected to the Brakes
The right analogy for the AbortController bug is a freight train’s air brake. The engineer pulls the brake handle. The trainline, a single hose running the length of the train, vents pressure. Each car has its own local brake valve, which detects the drop and applies that car’s brakes mechanically. The system works because every car’s brake valve is connected to the trainline. If a car has a closed angle cock or a disconnected hose, the engineer can pull the handle as hard as they want and that car keeps rolling. The intent is right. The wire is missing. Modern locomotives run an end-of-train continuity test before departure that exists for exactly this class of failure. Pull the handle a fraction. Watch the EOT report the pressure change at the rear of the consist. If the pressure didn’t drop at the back, the train doesn’t leave the yard.
The withTimeout wrapper is the brake handle. The AbortSignal is the trainline. The OpenAI SDK request and the Anthropic SDK request are the cars. Pre-fix, every car had a closed angle cock. The handle pull was real. Nothing downstream observed it. The fix was two lines per call site, threading the controller’s signal into the SDK’s request options. The signature change to withTimeout itself was the more interesting move. Instead of accepting a Promise that may or may not honor the signal, the new signature accepts a function that receives the signal as a parameter and returns a Promise. The compiler now rejects a call site that hands over a bare, already-started Promise, and the signal is in scope at the exact line where the SDK call is made. TypeScript will still accept a callback that ignores its parameter, so the type narrows the gap rather than sealing it, but forgetting now takes a visible omission at the call site instead of an invisible one. What had been a comment-only convention is now part of the type. The continuity test is built into the signature, not added as a step.
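A minimal sketch of the callback-taking signature described above, under the same caveats as any reconstruction: TimedOutError and fakeSdkCall are hypothetical, and fakeSdkCall stands in for an SDK method that accepts a signal in its request options.

```typescript
// Hypothetical typed error for a timed-out operation.
class TimedOutError extends Error {
  constructor(ms: number) {
    super(`operation timed out after ${ms} ms`);
    this.name = "TimedOutError";
  }
}

// The wrapper creates the signal and hands it to the operation, so the
// operation cannot be started before the signal exists.
async function withTimeout<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await operation(controller.signal);
  } catch (err) {
    // Translate any failure that follows our own abort into the typed error.
    if (controller.signal.aborted) throw new TimedOutError(ms);
    throw err;
  } finally {
    clearTimeout(timer);
  }
}

// Hypothetical stand-in for an SDK call that honors a per-request signal.
function fakeSdkCall(
  durationMs: number,
  options?: { signal?: AbortSignal },
): Promise<string> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve("done"), durationMs);
    options?.signal?.addEventListener(
      "abort",
      () => {
        clearTimeout(timer);
        reject(new Error("aborted"));
      },
      { once: true },
    );
  });
}
```

A call site then reads withTimeout((signal) => fakeSdkCall(500, { signal }), 20), which rejects with TimedOutError after about twenty milliseconds. Passing fakeSdkCall(500) directly no longer type-checks, because a Promise is not a function.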
The pattern generalizes past this one bug. Anywhere a wrapper enforces a side-channel concern (a timeout, a retry cap, a circuit breaker, a rate limit, an audit log, a kill switch), the safest signature requires the wrapped operation to receive the side-channel object as a parameter. Closure-captured side channels rely on the inner function knowing they exist. If the inner function is written by a different person on a different day, that knowledge is the brake hose nobody connected. The Therac-25 radiation therapy machine massively overdosed six patients between nineteen-eighty-five and nineteen-eighty-seven, several of them fatally, with the same shape of bug. The operator’s UI showed a treatment-configured interlock. The interlock was a software flag. The flag could be set independently of the actual physical state of the magnetic deflection hardware. A fast operator typing the prescription quickly could set the flag while the magnets were still in the wrong physical configuration, delivering a raw electron beam at X-ray-mode intensity to a patient positioned for a low-dose treatment. The nineteen-ninety-three investigation by Nancy Leveson and Clark Turner named the failure mode precisely. The safety check was UI-level. The protected resource was hardware-level. They weren’t connected. The same pattern, four decades apart, on a freight train and in a radiation therapy room and in a record button.
The three fixes shipped today are small. The record button shipped yesterday. Tomorrow’s session takes the verification back into the browser against a backend that finally answers two-hundred for the right reasons and four-hundred for the right reasons and never just hangs. The button isn’t the proof. The end-to-end pass is the proof. What today’s session bought was the difference between an artifact that looked complete and an artifact that actually was.