This week I went back into a two-month-old post about testing and added a Monte Carlo simulator next to one sentence. One sentence. The post had been live since April, people had read it, and the line I kept thinking about was this one: a 3% flake rate is not a 3% problem.
I believed it when I wrote it. I had watched my own CI turn red on changes that touched nothing. But I could feel how a reader files that sentence: under exaggeration, next to every other blogger rounding up for effect. The math behind it is a one-liner, 0.97^40 ≈ 0.30, and that's exactly the problem. Nobody feels an exponent. You can print the formula in bold, you can graph it, and the reader's eyes slide over it at reading speed, which is the one speed where compounding never looks scary.
So instead of defending the sentence with two more paragraphs, I built it a machine.
The sentence nobody believed
Here's the machine. Each dot is a test; each test independently flakes at the rate on the left. Set the suite to 40 tests and watch the no-retry bar. Then drag the suite bigger, the way real suites only ever get.
That collapse from 97% green to roughly 30% is the exponent, felt. No paragraph I write gets you there, because reading is a spectator activity and the claim only lands when you're the one who made the bar fall. The retry bars then sneak in the second, more uncomfortable point: retries pull the build back to green without making a single test less flaky. The reader discovers that by dragging, before I've said it. An argument you arrive at yourself needs no defending.
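For anyone who would rather run the arithmetic than drag it, here is a minimal sketch of the same model. It is not the component's source. It assumes what the text above states (every test flakes independently at the same rate) and adds one assumption of its own: a retried test fails the build only if every attempt flakes. All function names are hypothetical, and the small seeded generator is only there to make runs repeatable.

```typescript
// Illustrative sketch only: hypothetical names, not the site's simulator code.

/** Small seeded PRNG (mulberry32) so repeated runs give the same numbers. */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Closed form: probability that every test in the suite ends up green. */
function greenProbability(flakeRate: number, tests: number, retries = 0): number {
  // Assumption: a test fails the build only if all of its attempts flake.
  const perTestFail = Math.pow(flakeRate, retries + 1);
  return Math.pow(1 - perTestFail, tests);
}

/** Monte Carlo estimate of the same quantity over `builds` simulated builds. */
function simulateGreenRate(
  flakeRate: number,
  tests: number,
  retries: number,
  builds: number,
  rand: () => number,
): number {
  let green = 0;
  for (let b = 0; b < builds; b++) {
    let buildPassed = true;
    for (let t = 0; t < tests && buildPassed; t++) {
      let testPassed = false;
      // One initial attempt plus up to `retries` more; stop at the first pass.
      for (let attempt = 0; attempt <= retries && !testPassed; attempt++) {
        testPassed = rand() >= flakeRate;
      }
      buildPassed = testPassed;
    }
    if (buildPassed) green++;
  }
  return green / builds;
}

const rand = mulberry32(42);
const exactNoRetry = greenProbability(0.03, 40); // 0.97^40, about 0.296
const exactOneRetry = greenProbability(0.03, 40, 1); // (1 - 0.03^2)^40, about 0.965
const simNoRetry = simulateGreenRate(0.03, 40, 0, 20000, rand);
const simOneRetry = simulateGreenRate(0.03, 40, 1, 20000, rand);

console.log("no retry :", exactNoRetry.toFixed(3), "simulated", simNoRetry.toFixed(3));
console.log("one retry:", exactOneRetry.toFixed(3), "simulated", simOneRetry.toFixed(3));
```

Note that `flakeRate` is 0.03 in both calls. The one-retry build is green far more often, and not a single test got any less flaky.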
This is the entire idea behind the component library on this site. Not decoration. Not "engagement". A specific tool for a specific failure mode: the claim that is true, that matters, and that prose physically cannot carry.
Fifteen years of prior art
None of this is my idea, and pretending otherwise would be silly when the lineage is this easy to trace.
- 2011: Explorable Explanations + Ladder of Abstraction. Bret Victor names the genre in one essay and performs it in a second.
- 2016: Distill launches. Machine-learning research published as interactive articles.
- 2017: The Pudding. A publication built on the premise that some essays are played, not read.
- 2020: Communicating with Interactive Articles. Distill surveys the evidence: interaction measurably improves engagement and understanding.
- 2021: Distill goes on hiatus. Producing at that standard was exhausting for everyone involved. The cost is real.
- 2026: This post. An ordinary engineering blog tries to close the gap.
The clearest statement is Bret Victor's Explorable Explanations from 2011: a reactive document, he wrote, lets the reader "play with the author's assumptions and analyses, and see the consequences". His Up and Down the Ladder of Abstraction is the same argument performed instead of stated — you learn the design of a steering algorithm by scrubbing through every level of it. Fifteen years later both essays still read like a to-do list the web mostly ignored.
Mostly, not entirely. Distill spent five years publishing machine-learning research as interactive articles, and its 2020 piece Communicating with Interactive Articles is the best survey of why this works that I know of. It walks through the evidence that interaction improves engagement and understanding, connects it to active learning research, and is itself interactive, so the argument demonstrates itself while you read it. Distill went on hiatus in 2021, partly because producing articles at that standard was exhausting for everyone involved. That detail belongs in this post, not in a footnote at the end: the cost is real and I'll get to it. Nicky Case collects the whole genre at explorabl.es, and The Pudding built a publication on the premise that some essays need to be played, not read.
What strikes me about that list is what's missing from it: ordinary engineering blogs. The genre lives in journalism, in ML research, in standalone art projects. Personal technical blogs — the place where someone explains worktrees or flaky tests to a colleague — still mostly paste screenshots. I don't think that's because bloggers disagree with Victor. I think it's because a screenshot costs nothing and a simulator doesn't.
A screenshot shows the reader what you saw. A component lets them check whether you're right.
What a component actually costs
I'll put numbers on the cost, because the honest version of this post admits it's the whole reason the genre is thin.
This site currently runs 30 components. The four newest — the flaky-suite simulator above, a step-through git graph, an agent-loop player, a draggable easing-curve editor — took a full working session to build, and that was the cheap part. The expensive part came after: every one of them needs keyboard navigation, reduced-motion behavior, dark mode, a Spanish version of every label including the ones generated client-side, and re-initialization when the site's page transitions swap the DOM out from under them. None of that is visible in a demo GIF, and all of it is the difference between a component and a toy.
And then there's the part I'd skip if this post were marketing. The dot grid in that simulator — the one part whose entire purpose is being seen — shipped invisible. Astro scopes component styles with an attribute selector, the dots are created from a client script, client-created nodes don't carry the attribute, so every dot rendered at height zero. The bars worked, the verdict text worked, the math was right, and the centerpiece was a hole. I only caught it days later when a Playwright screenshot came back with an empty band in the middle of the figure. The fix was one :global() wrapper, two minutes. Finding it required actually looking at the rendered page instead of the code, which I had been confidently not doing. I won't pretend that was elegant. I built a machine for making things visible and didn't notice it was invisible.
Here is the entire fix:
    - .flaky-dot {
    + /* Dots are created from the client script, so they don't carry
    +    Astro's scope attribute; style them through :global. */
    + .flaky-grid :global(.flaky-dot) {
        width: 12px;
        height: 12px;
        border-radius: 3px;
        opacity: 0;
        animation: fkPop 0.25s var(--ease-spring) forwards;
      }
That bug earns its place here because it's the strongest argument for the whole approach that I have. I had read that component's code several times and believed it worked. Reading was not enough — for me, the author, on my own code. That is exactly the reader's position on every claim you publish, and it's why "just describe it well" keeps losing to "let them run it".
The rule that keeps it honest
Thirty components is also how you end up with a blog that looks like a component vendor's demo reel, so there's one rule that decides whether something gets built: a component exists only when a specific post needs it for a claim prose couldn't carry. The git graph exists because the worktrees post tells a story about three branches not colliding, and "not colliding" is a sequence, not a picture. The easing editor exists because a post about game feel was asking readers to take four bezier numbers on faith.
Every component documents that reason on its own page, in a section called "why this exists", in both languages. The whole library runs live at /lab — same styles, same dark mode, same transitions a post gets, nothing mocked. If a component's "why" paragraph ever reads as "it looked cool", it shouldn't have been built.
The corollary cuts the other way too, and it's the part I have to keep relearning: most claims don't need a component. The majority of every post on this site is still paragraphs, because most sentences are doing fine on their own. The simulator above exists for the one sentence per post that readers file under exaggeration. Spending a working session on that sentence only makes sense if you actually want to be believed — which is, I suspect, the real filter, and a less comfortable one than build cost.
I know which side of it I'm on. The dots were invisible for three days, and reading my own code had me convinced they weren't. You're in that exact position with this post right now. Scroll back up and drag the slider.