
Taming AI’s Known‑Unknowns: What Great Risk Shops Teach Us

  • Writer: vinit sahni
  • Sep 27
  • 3 min read

Every AI program has a moment when someone asks, “What are we missing?” The uneasy answer: not mysteries from outer space, but known‑unknowns hiding in plain sight. We can name them—model updates, tool scope changes, new retrieval sources, a fresh sub‑processor two hops downstream, evals that quietly go stale—but most teams don’t have a way to govern those changes at the speed they arrive. That gap between “this works” and “we’re allowed to use it” is trust latency, and it’s where momentum dies.

 

If you’ve ever watched a high‑velocity trading shop manage risk, you’ve seen a different posture. Firms like D.E. Shaw and Citadel don’t wait for the quarterly memo; they run live. They assume uncertainty, map exposures to the things that actually move, and act automatically when assumptions break. They don’t try to predict the exact future tick; they manage sensitivities, set sane limits, and keep a hand on the kill switch. That discipline translates cleanly to AI.

 

Start with look‑through. Markets don’t stare at the ticker; they look at the underlyings—rates, spreads, liquidity. In AI, the underlyings are your models, tools and their permissions, retrieval sources, data classes, runtimes, and sub‑processors. Write them down. Make an “assumptions ledger” that says, out loud: our yes depends on this model version, these tool scopes, those retrieval sources, no training use of this data, and precisely these sub‑processors. Now changes have something to bump into.
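
In code, the ledger can be as plain as a small record per approval. Here’s a minimal sketch; every field name and value is a hypothetical example, not a prescribed schema:

```python
# A minimal, illustrative assumptions ledger for one approval.
# Field names and values are hypothetical examples, not a schema.
approval_assumptions = {
    "model_version": "model-x-2025-08-01",   # the exact version our "yes" was based on
    "tool_scopes": {"calendar.read"},        # the tool permissions we signed off on
    "retrieval_sources": {"internal-wiki"},  # content allowed into retrieval
    "data_classes": {"internal-docs"},       # data classes in scope
    "training_use_allowed": False,           # no training use of this data
    "sub_processors": {"vendor-a"},          # precisely these sub-processors
}
```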

Next, measure what moves you. In trading, risk teams watch the drivers—volatility spikes, correlation shifts, liquidity vanishing at the wrong moment. In AI, the drivers are dull but deadly: version deltas, scope escalations, new content entering retrieval, boundary shifts in sensitive data, or a supplier adding a sub‑processor. You don’t need to forecast the exploit; you need to catch the mover.
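
One way to catch the movers is to diff today’s observed state against the ledger. A rough sketch, reusing the hypothetical ledger above and an equally hypothetical “observed” snapshot:

```python
# What monitoring reports today; as above, names and values are invented.
observed = {
    "model_version": "model-x-2025-09-27",               # the supplier bumped the model
    "tool_scopes": {"calendar.read", "email.send"},      # a scope escalation
    "retrieval_sources": {"internal-wiki", "crm-notes"}, # new content entering retrieval
    "sub_processors": {"vendor-a", "vendor-b"},          # a supplier added a sub-processor
}

def detect_movers(assumed: dict, observed: dict) -> list[tuple[str, str]]:
    """Name the deltas between what the approval assumed and what we now see."""
    movers = []
    if observed["model_version"] != assumed["model_version"]:
        movers.append(("model_update", observed["model_version"]))
    for scope in observed["tool_scopes"] - assumed["tool_scopes"]:
        movers.append(("scope_escalation", scope))
    for source in observed["retrieval_sources"] - assumed["retrieval_sources"]:
        movers.append(("new_retrieval_source", source))
    for sub in observed["sub_processors"] - assumed["sub_processors"]:
        movers.append(("new_sub_processor", sub))
    return movers

movers = detect_movers(approval_assumptions, observed)
# e.g. [("model_update", "model-x-2025-09-27"), ("scope_escalation", "email.send"),
#       ("new_retrieval_source", "crm-notes"), ("new_sub_processor", "vendor-b")]
```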

 

Then wire triggers to those movers. In markets, a limit breach or a widening spread might auto‑hedge or cut risk. In AI, a model update above a threshold should automatically run task‑relevant evals; a new tool scope should flip you to sandboxed mode; a new retrieval source should narrow access until evidence catches up. No committees, just a time‑boxed check aimed at the delta. Green: proceed. Amber: proceed with mitigations. Red: pause until the gap closes. Hours, not weeks.
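
Expressed as code, the wiring is a small rule table plus a disposition function. A sketch under the same hypothetical names, with an arbitrary 95% pass‑rate bar standing in for whatever threshold actually fits the task:

```python
# Map each kind of mover to a targeted, time-boxed response.
# Action names and the threshold below are illustrative assumptions.
RULES = {
    "model_update":         "run_task_relevant_evals",
    "scope_escalation":     "switch_to_sandboxed_mode",
    "new_retrieval_source": "narrow_access_until_evidence",
    "new_sub_processor":    "request_supplier_evidence",
}

def actions_for(movers: list[tuple[str, str]]) -> list[str]:
    """Unknown movers default to a pause rather than a silent pass."""
    return [RULES.get(kind, "pause_for_review") for kind, _ in movers]

def disposition(eval_pass_rate: float | None, mitigations_in_place: bool) -> str:
    """Translate the targeted check into green / amber / red."""
    if eval_pass_rate is None:
        return "red"               # the check has not run yet: pause
    if eval_pass_rate >= 0.95:     # arbitrary bar for this sketch
        return "green"             # proceed
    if mitigations_in_place:
        return "amber"             # proceed with mitigations
    return "red"                   # pause until the gap closes

print(actions_for([("model_update", "model-x-2025-09-27")]))        # ['run_task_relevant_evals']
print(disposition(eval_pass_rate=0.91, mitigations_in_place=True))  # amber
```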

 

Keep a portfolio view. Traders care about concentration—too much exposure to a single issuer, theme, or factor. AI portfolios hide similar clustering: many suppliers depending on the same model family, the same vector store, the same data broker. If a single failure mode hits that shared dependency, your whole strategy wobbles. Make concentration visible on purpose—then diversify or raise the bar where it counts.
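
A concentration check can be a dozen lines: count how many suppliers share each dependency. The portfolio below is invented for the sketch:

```python
from collections import Counter

# Which suppliers lean on which shared dependencies (all names invented).
portfolio = {
    "supplier-a": {"model-family-x", "vectorstore-y"},
    "supplier-b": {"model-family-x", "vectorstore-y"},
    "supplier-c": {"model-family-x", "data-broker-z"},
}

def concentration(portfolio: dict[str, set[str]]) -> Counter:
    """How many suppliers wobble if one shared dependency fails?"""
    return Counter(dep for deps in portfolio.values() for dep in deps)

for dependency, count in concentration(portfolio).most_common():
    if count > 1:  # a single failure mode here hits more than one supplier
        print(f"{dependency}: shared by {count} suppliers")
# model-family-x: shared by 3 suppliers
# vectorstore-y: shared by 2 suppliers
```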

 

And make it contextual. Markets price risk by use and impact; position size shrinks as stakes rise. Do the same in AI. The same supplier can be green for internal knowledge search and amber for customer decisions. Risk isn’t a moral judgment; it’s fit for purpose.
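
In practice that means a rating keyed by supplier and use, not by supplier alone. A tiny sketch with made‑up names:

```python
# The same supplier carries a different disposition per use (values invented).
RATINGS = {
    ("supplier-a", "internal_knowledge_search"): "green",
    ("supplier-a", "customer_decisions"): "amber",
}

def rating(supplier: str, use_case: str) -> str:
    """Risk is fit for purpose; unknown combinations default to the cautious answer."""
    return RATINGS.get((supplier, use_case), "red")

print(rating("supplier-a", "internal_knowledge_search"))  # green
print(rating("supplier-a", "customer_decisions"))         # amber
```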

 

Finally, value cadence over heroics. Great risk shops don’t celebrate big saves; they prevent drama with small, constant adjustments. In AI, that means evals that stay fresh for the tasks that matter, approvals tied to current assumptions, and overrides that are precise enough to avoid alarm fatigue. If you can’t sketch the control on a whiteboard in five minutes, it probably won’t fire when you need it.


Here’s a tiny Tuesday vignette. A supplier bumps their model at 9:14. Your system notices the delta, remembers that yesterday’s approval assumed the prior version, and runs the evals that matter for one customer‑facing flow. A behavior regresses, so that flow flips to amber with a scope restriction. By lunch, the supplier ships a fix; evals pass; green again. No 200‑page PDF. No drama. Just governance at the speed of change.
