
Feature flags are the interface between AI speed and human judgment. Build at AI speed, release at human speed.
Nicholas Khami
Engineering Manager, Mintlify

A 3-hour overnight outage caused by a single unguarded query led to eight standing kill switches, which turn future incidents from all-night click-ops sessions into a single toggle.
Flag-related pull requests grew from 5 a month to 80–100, and 18 of 25 engineers now create flags, with a 23% cleanup rate keeping the count healthy rather than just growing.
Because Unleash is API-first, coding agents like Claude Code can build and wire experiments in about an hour, while the rollout decision stays human. Build at AI speed, release at human speed.
Mintlify helps companies build developer documentation sites, and those sites are visited by around 40 million people a month. With a team of roughly 25 engineers behind that reach, the blast radius of any mistake is enormous. Mintlify adopted Unleash in October 2025, but the moment that turned feature flags from a nice-to-have into a discipline came at 12:38 one morning in May.
This is how a single overnight outage reshaped how the whole engineering org ships.
On May 20, at 12:38am Pacific, Nick Khami’s phone went off with a P0. Every MongoDB replica was down, which took search, the dashboard, the WYSIWYG editor, the docs update pipeline, and every AI feature with it.
The cause was almost mundane. A marketing leaderboard page ranked how agent-ready popular documentation sites were, and every page load enqueued a refresh job for each site, each running a heavy aggregation query. Someone refreshing or scraping that page over and over enqueued tens of thousands of jobs. At peak there were more than 32,000 aggregation queries in flight, and the database tipped over. The team got back up at 1:45am, then went down again at 2am. In total, the outage ran for about three hours overnight.
Recovery was the kind of thing no one wants in a runbook: a code fix to remove the offending queries was force-merged at 2am with no real review, while Nick restarted containers by hand in the AWS console and a teammate emergency-scaled the database cluster. The whole episode pointed at one root problem.
“With AI, we can write and ship code incredibly fast. But we can’t stop anything without FeatureOps. Our only controls that night were git push and the AWS console.” — Nick Khami, Mintlify
The leaderboard jobs lived on the back end, and Mintlify had only adopted Unleash on the front end so far. A back-end flag would have made the fix a toggle instead of an all-nighter.
“Deploys were our release mechanism, our rollback mechanism, and our incident response, all the same lever. That’s slow, and it overloads what should be a simple action.” — Nick Khami, Mintlify
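A back-end guard of the kind described above might look like the following sketch. It is not Mintlify's code: the flag name `leaderboard-refresh`, the `FlagClient` interface, and the in-memory queue are all hypothetical, and a real service would back `isEnabled` with its flag SDK rather than the static map used here.

```typescript
// Minimal shape a flag SDK client would satisfy (hypothetical for this sketch).
interface FlagClient {
  isEnabled(flagName: string): boolean;
}

// In-memory stand-in so the sketch runs without a flag server.
function staticFlags(states: Record<string, boolean>): FlagClient {
  return {
    // Unknown flags evaluate to off.
    isEnabled: (flagName: string): boolean => states[flagName] ?? false,
  };
}

// Enqueue one refresh job per site, but only while the flag is on.
// Returns the number of jobs that were queued.
function enqueueLeaderboardRefresh(
  flags: FlagClient,
  siteIds: string[],
  queue: string[],
): number {
  if (!flags.isEnabled("leaderboard-refresh")) {
    return 0; // flag off: the caller can still render the page, no jobs are queued
  }
  for (const siteId of siteIds) {
    queue.push(`refresh:${siteId}`);
  }
  return siteIds.length;
}
```

With a guard like this, turning the flag off stops new aggregation work at the source without a deploy, which is the difference between a toggle and a 2am force-merge.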
Mintlify had chosen Unleash the way good tooling usually gets chosen. An engineer, Catherine, had used it at Pitch and vouched for it.
“That’s how most good tooling decisions happen. Someone you trust has lived with it before. The best marketing for developer products is almost always word of mouth.” — Nick Khami, Mintlify
What they were after was specific: a control plane they could automate against, rather than a seat-priced dashboard. They wanted something API-first, with open-source roots, and pricing that made sense for a small team. That API-first quality turned out to matter far more than they expected.
After May, Mintlify extended flags across the back end too: how jobs are queued and coordinated, API route changes, new query types. Adoption climbed, and the team treated that as the real signal of success. New flags went from a handful a month to over 20, flag-related pull requests went from about 5 a month to between 80 and 100, and 18 of roughly 25 engineers started creating flags.
“FeatureOps has to be a team sport. If only one or two people drive all the usage, you’re not set up for nearly as much success as when it’s everyone.” — Nick Khami, Mintlify
The team also made gradual rollouts the rule and on/off flags the exception, moving features from 0% to 100% over several hours, with constraints by customer segment, by usage, and by organization ID. And they took archiving seriously, because the alternative is not FeatureOps at all.
“If your flag count only goes up, you don’t actually have FeatureOps. You just have tech debt with a UI.” — Nick Khami, Mintlify
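To make the gradual-rollout idea concrete, here is a minimal sketch of how a percentage rollout with a segment constraint can be evaluated. The `EvalContext` fields, the `RolloutRule` shape, and the FNV-1a hash are assumptions made for illustration; Unleash performs this evaluation itself with its own hashing, so this shows the mechanism rather than the product's implementation.

```typescript
// Context the application passes when evaluating a flag (field names are hypothetical).
interface EvalContext {
  orgId: string;
  segment: string;
}

interface RolloutRule {
  percentage: number; // 0..100, raised step by step during a rollout
  allowedSegments: string[]; // empty array means no segment constraint
}

// FNV-1a 32-bit hash mapped to a bucket in 0..99. A simple stand-in chosen for
// this sketch; it is not the hash a flag service uses internally.
function bucketFor(flagName: string, orgId: string): number {
  let hash = 0x811c9dc5;
  const input = `${flagName}:${orgId}`;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash % 100;
}

// An organization sees the feature if it passes the constraint and its
// bucket falls below the current rollout percentage.
function isEnabledFor(flagName: string, rule: RolloutRule, ctx: EvalContext): boolean {
  if (rule.allowedSegments.length > 0 && rule.allowedSegments.indexOf(ctx.segment) === -1) {
    return false;
  }
  return bucketFor(flagName, ctx.orgId) < rule.percentage;
}
```

Because the bucket depends only on the flag name and the organization ID, raising `percentage` from 0 toward 100 only ever adds organizations. Nobody who already has the feature loses it partway through the rollout.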
Nine days after the incident, an engineer shipped a flag that pauses the update pipeline so jobs queue safely instead of overwhelming the database. That was the first of eight standing kill switches Mintlify now runs, covering updates, the caching layer, billing circuit breakers, and parts of the worker like indexing and GitHub sync.
“The next time I’m up at midnight, it’s just a toggle to flip, instead of an all-nighter click-opsing AWS and the MongoDB Atlas dashboard.” — Nick Khami, Mintlify
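A pause-style kill switch differs from a plain off switch in that work is held, not dropped. The sketch below shows one way a worker could honor such a switch. The `Job` type, the `isPaused` callback, and the batch size are hypothetical, and in practice `isPaused` would be backed by a flag lookup.

```typescript
interface Job {
  id: string;
}

// Process up to `batchSize` jobs unless the pause switch is on.
// Returns how many jobs were handled in this pass.
function drainOnce(
  queue: Job[],
  isPaused: () => boolean,
  handle: (job: Job) => void,
  batchSize: number,
): number {
  if (isPaused()) {
    return 0; // paused: every job stays queued for a later pass
  }
  let processed = 0;
  while (processed < batchSize && queue.length > 0) {
    const job = queue.shift();
    if (job === undefined) {
      break;
    }
    handle(job);
    processed += 1;
  }
  return processed;
}
```

Checking the switch at the start of every pass means a toggle takes effect on the next batch, and unpausing simply resumes from whatever is still in the queue.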
Mintlify uses coding agents like Claude Code and Codex across most of its pull requests, and that changed the math. Writing software stopped being the constraint, so each daily deploy carried far more change, and the old deploy-as-release cadence broke down.
“The question is no longer ‘can we build it.’ It’s ‘who should we build this for, and is right now the right time?’ That’s a flag decision, not a deploy decision.” — Nick Khami, Mintlify
Because Unleash is API-first, the agents can operate it directly. Nick gave Claude Code a skill describing how to use Unleash and a development-environment API key, then had it run a full A/B test on whether an editor panel should sit left or right. The agent built the feature, created the flag, set up a 50/50 rollout sticky on organization IDs, wired it through the Next.js app, and attached the assignment to PostHog analytics. The human pulled the lever to go to production, and a near-even split landed within 48 hours.
“This used to take a week of standing up experimentation infrastructure every time. Now it’s an hour of me going back and forth with Claude. Experimentation has gone from a bespoke event to almost everything we ship.” — Nick Khami, Mintlify
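For a sense of what an agent operating Unleash directly involves, the sketch below builds (but does not send) the two requests such a setup would plausibly need: one to create the flag and one to attach a 50% rollout that is sticky on an organization field. The endpoint paths and payload fields are written from memory of the Unleash Admin API and should be checked against the official reference before use. The project, environment, flag name, and `orgId` context field are hypothetical.

```typescript
interface ApiRequest {
  method: "POST";
  path: string;
  body: Record<string, unknown>;
}

// Build the Admin API calls for a 50% rollout sticky on one context field.
// Paths and fields are assumptions to verify, not a confirmed contract.
function buildExperimentRequests(
  project: string,
  environment: string,
  flagName: string,
  stickinessField: string,
): ApiRequest[] {
  const base = `/api/admin/projects/${encodeURIComponent(project)}/features`;
  const strategyPath =
    `${base}/${encodeURIComponent(flagName)}` +
    `/environments/${encodeURIComponent(environment)}/strategies`;
  return [
    // 1. Create the flag itself.
    { method: "POST", path: base, body: { name: flagName, type: "experiment" } },
    // 2. Attach a gradual-rollout strategy at 50%, sticky on the given field.
    {
      method: "POST",
      path: strategyPath,
      body: {
        name: "flexibleRollout",
        parameters: { rollout: "50", stickiness: stickinessField, groupId: flagName },
      },
    },
  ];
}
```

Keeping the request construction separate from sending makes the division of labor explicit: an agent holding only a development-environment API key can send these against development, while enabling the flag in production remains a human action.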
The flag is what makes that speed safe, because it lets agents move fast while a human keeps the exposure decision.
“Feature flags are the interface between AI speed and human judgment. Build at AI speed, release at human speed.” — Nick Khami, Mintlify
Mintlify’s advice maps cleanly onto how they now work. Build levers as flags before you need them, and create a standing kill switch for every risky feature. If your incident runbook says “redeploy” anywhere, you are one bad night away from click-opsing your infrastructure, so replace it with a flag. Make flags the default with gradual rollouts on everything, and archive aggressively so the count goes both up and down. And give your agents the Unleash API for the development environment, letting them build the experiments while the rollout, the on switch, and the archiving stay human decisions.
“It’s one control plane accessible to both humans and AI. That’s very powerful for our team.” — Nick Khami, Mintlify