Unleash

Google to Engineers: “It’s Time to Make Feature Flags Mandatory”

On June 12th, a single policy change inside Google Cloud triggered a global outage that lasted more than three hours.

The cause? A new quota check introduced a null pointer exception in a core backend service. The blast radius was enormous: 503 errors across Google APIs, outages in Artifact Registry, BigQuery, Cloud Run, and disruptions for businesses relying on GCP around the world.

In Google’s own words: “The issue with this change was that it did not have appropriate error handling nor was it feature flag protected.”

It was the kind of incident that quietly costs millions: millions in SLA penalties, and millions more in lost productivity and lost trust.

Google’s own postmortem summed it up in one devastating sentence:

“If this had been flag protected, the issue would have been caught in staging.”

Let that sink in. The difference between staging and global failure was one missing feature flag.

DevOps Isn’t Enough to Prevent Downtime. You Need FeatureOps.

This wasn’t a DevOps failure. In fact, the rollout itself was a technical success: code deployed smoothly and consistently across global regions. The problem was that the teams had no runtime control once it was live.

The rollback process tells the story clearly:

  • The root cause was identified within 10 minutes. 
  • A code rollback was prepared and deployed within 40 minutes. 
  • But the full incident lasted 3 to 4 hours, due to systemic recovery delays and backlog clearing. 

In other words: DevOps got the code out fast. But it couldn’t bring the system back fast.

That’s why FeatureOps is emerging as a necessary pillar of enterprise reliability. It extends DevOps with runtime control, separating deployment from release, and allowing for surgical rollback of behavior without waiting for redeployments to propagate.
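To make "separating deployment from release" concrete, here is a minimal sketch of a flag-guarded code path. It is not Google's code or the Unleash SDK: `FlagStore`, `new_quota_check`, `allow_request`, and the flag name `new-quota-check` are all hypothetical, and failing open on a bad policy is one possible design choice rather than a universal rule.

```python
from typing import Optional


class FlagStore:
    """Hypothetical in-memory flag store; a real system would sync flags from a flag service."""

    def __init__(self) -> None:
        self._flags: dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str) -> bool:
        # Unknown flags count as off, so newly deployed code stays dark by default.
        return self._flags.get(name, False)


def new_quota_check(policy: Optional[dict]) -> bool:
    # Deliberately fragile (hypothetical): raises TypeError when policy is None
    # and KeyError when the "limit" field is missing.
    return policy["limit"] > 0


def allow_request(flags: FlagStore, policy: Optional[dict]) -> bool:
    # Deployed but not released: the new check only runs when the flag is on.
    if not flags.is_enabled("new-quota-check"):
        return True  # existing behavior
    try:
        return new_quota_check(policy)
    except (TypeError, KeyError):
        # Assumed design choice: fail open on malformed policy data
        # instead of crashing the request path.
        return True
```

With the flag off, the new check never executes, even though it is deployed. Turning it off again at runtime restores the old behavior without waiting for a redeploy.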

Feature Flags Are No Longer Optional

If there’s any doubt that Google uses feature flags, the postmortem puts it to rest: they do. But not for this particular change. Why?

Because the logic in question was inside a “critical binary,” and likely not seen as a traditional “feature”. And that’s exactly the problem.

In 2025, every backend system is a user-facing system, whether it has a UI or not. Users depend on stability, and if a backend bug takes down your API, they don’t care if it’s visible or not. Everything is a feature when uptime is the product.

It’s like requiring 2FA for frontend access but allowing engineers to SSH into backend systems without audit. Or treating frontend deploys as sacred while backend services get pushed manually. That sounds absurd, but that’s the current state of flag usage in many enterprises.

Google now says it will enforce feature flag coverage for all critical binaries, with a default-off posture. That’s not a suggestion; it’s a mandate grounded in operational reality.

Google: “We will enforce all changes to critical binaries to be feature flag protected and disabled by default.”
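What might "disabled by default" look like in practice? A rough sketch, assuming a hypothetical `Flag` record with a per-region allow list (the field names and region names are illustrative, not taken from any real flag system):

```python
from dataclasses import dataclass, field


@dataclass
class Flag:
    name: str
    enabled: bool = False  # default-off: a new flag releases nothing
    regions: set[str] = field(default_factory=set)  # regions explicitly opted in


def is_enabled(flag: Flag, region: str) -> bool:
    # The global switch must be on AND the region must be opted in.
    # Flipping `enabled` back to False acts as a kill switch everywhere.
    return flag.enabled and region in flag.regions
```

A rollout then becomes a sequence of small, reversible steps: enable the flag for a staging region first, widen the region set gradually, and set `enabled` back to `False` if anything looks wrong.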

Make Runtime Safety Policy, Not Preference

Any organization with software critical enough to cause material risk (outages, SLA penalties, security exposure) should be enforcing this same standard.

  • Feature flags are not just an A/B test tool. 
  • They’re not just for frontend UX. 
  • They are runtime safety infrastructure. 

And when the cost of failure runs into the millions, the cost of enterprise-grade feature management is trivial in comparison.

This is the moment for CIOs and engineering leaders to act. Enforce flags on every backend and frontend change. Set your PR checks accordingly. And adopt runtime controls that let you contain risk before it reaches customers.
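As one illustration of a PR check, the sketch below rejects a change that touches a critical path unless its added lines reference a flag lookup. Everything here is an assumption for the example: the path prefixes, the `is_enabled(` marker convention, and the `check_pr` helper itself. A production check would likely parse code rather than match strings.

```python
# Hypothetical conventions for this sketch only.
CRITICAL_PREFIXES = ("services/quota/", "services/auth/")
FLAG_MARKER = "is_enabled("


def added_lines(diff: str) -> list[str]:
    # Keep lines added by a unified diff, skipping the "+++ b/file" headers.
    return [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]


def check_pr(changed_files: list[str], diff: str) -> bool:
    """Return True if the PR passes the flag-coverage check."""
    touches_critical = any(f.startswith(CRITICAL_PREFIXES) for f in changed_files)
    if not touches_critical:
        return True  # non-critical changes are out of scope for this check
    # Critical change: require at least one added line that consults a flag.
    return any(FLAG_MARKER in line for line in added_lines(diff))
```

A string match like this is easy to satisfy by accident, so treat it as a starting point for a policy conversation, not as proof that a change is actually flag protected.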

Because if Google can go down from one missing flag, so can anyone. Get started with Unleash or talk to our team about what it looks like to adopt FeatureOps at scale.
