How to Map a Feature's Failure Modes in 30 Minutes
Most teams discover how an AI feature breaks after a customer does. This is a 30-minute ritual that surfaces the failures first. You walk out with a one-page failure signature.
Table of contents
Here's a deal. Give me thirty minutes and you'll walk out with a one-page failure signature for an AI feature: the specific ways it breaks, ranked by how likely and how damaging each one is.
Most teams never spend those thirty minutes. They ship, and the failure modes introduce themselves later, through a support ticket, on a bad day, in front of a customer. Let's not do that.
What you'll produce: a one-page table listing each way the feature fails, the real input that triggers it, what the user sees, and a likelihood × harm score. Pin it to the launch ticket. That's the deliverable.
A quick note on where this comes from
"Failure modes" isn't a new-AI phrase. It's borrowed from reliability engineering, where FMEA — Failure Mode and Effects Analysis — has helped aerospace and car companies not kill people since the 1940s. The method is brutally simple: list every way a thing can fail, score each by likelihood and severity, fix the worst ones first.
Marily Nika adapted that instinct for AI features, and her one-line version is the whole point: push the model into its failure modes before your users do. What follows is FMEA, shrunk to thirty minutes and pointed at a probabilistic feature.
What you need
A feature (shipped or about to ship), the actual product open in front of you, and a blank doc. That's it. Don't invite the whole team for the first pass — it's faster solo, and you socialize the output afterward.
Step 1: Write the intended behavior (5 min)
Before you can name how it fails, write down what "working" means, in one or two plain sentences. No hedging.
When a user pastes a customer email, the feature drafts a reply that's accurate to our policy, matches our tone, and needs only light editing.
That's your reference line. Every failure is just a deviation from it.
Step 2: Brainstorm failures against the messy world (10 min)
Now list the ways it deviates. The trick — and it really is the trick — is to imagine the messy input, not the clean demo one. Walk these four prompts and write whatever each surfaces:
- Ambiguous input. The email is vague, contradictory, or asks two things at once.
- Out-of-domain input. It's about something the feature was never meant to handle.
- Adversarial input. Someone's actively trying to make it say something dumb.
- Scale and edge. What breaks on the 10,000th call that didn't on the 10th? Rare names, other languages, huge inputs, empty inputs?
Don't filter yet. Quantity wins here. Ten rough failure modes beat three polished ones.
Step 3: Force them to actually happen (10 min)
This is the step everyone skips, and it's the one that pays. Go to the live feature and try to trigger each failure. Feed it the ambiguous email. Paste the out-of-domain question. Send the empty input. Be a little cruel about it.
Two things happen, and both are useful. Some predicted failures don't reproduce — cross them off. And the feature fails in ways you never predicted, which are the most valuable findings of the whole exercise, because those are the exact ones that would've reached a customer. This is chaos engineering, scaled down to a coffee break.
Keep the receipts. Paste the exact input and the bad output next to each failure mode. A failure with a screenshot gets fixed. A failure described in the abstract gets debated, then forgotten, then shipped.
Step 4: Rank by likelihood and harm (5 min)
Score each confirmed failure on two axes — low / medium / high:
- Likelihood: how often will real usage hit this?
- Harm: how bad is it when they do? A clumsy sentence is low. A confidently wrong refund policy is high.
High-likelihood, high-harm failures are launch blockers. High-harm but rare ones need a guardrail or a quality floor. Low-low ones go on a list and you move on with your life. Now you're deciding with a map instead of a feeling.
The artifact: your failure signature
One page, one table. Steal this:
| Failure mode | Trigger (real input) | What the user sees | Likelihood | Harm | Action | |---|---|---|---|---|---| | Wrong policy answer | "can I get a refund after 40 days" | Confident, incorrect "yes" | Med | High | Block + guardrail | | Tone too casual | angry complaint email | Chirpy reply, reads as dismissive | High | Med | Prompt fix before launch | | Empty on non-English | input in Spanish | Returns a blank draft | Low | Med | Backlog |
That table is the work. It turns "the AI feels kind of risky" into a specific, rankable, fixable list anyone on the team can act on.
Why this builds a skill, not just a doc
Run it on every feature for a month and something shifts. You start predicting in Step 2 the failures you used to only discover in Step 3. That gap closing is literally the new product sense developing in real time.
It's the same muscle behind the kill test: spend a few deliberate minutes attacking your own work, and you walk into the launch knowing exactly where the bodies are buried. Thirty minutes now, or a 2 a.m. escalation later. I know which one I'd pick.
Next: turn these failures into a real eval systemRelated posts
Product Sense Now Means Knowing How Your Model Fails
The old product sense was empathy for users. The new one is fluency in failure. Meta added it to the PM interview loop. Here's what it actually is and how to build it.
The Kill Test: Spend Ten Minutes Trying to Kill Your Own Feature.
Four counter-evidence checks to run before any roadmap pitch. If the idea survives, you walk into the review bulletproof. If it dies, it cost you ten minutes instead of a quarter.