Spec-driven feature flags: How Kiro and Unleash turn requirements into safe releases
Most feature flags get created too late. A developer finishes a risky change, opens a pull request, and someone in the review asks: “Shouldn’t this be behind a flag?” The code gets reworked. The flag gets added as an afterthought. Naming is inconsistent. The rollout plan is vague.
Now imagine the same scenario with an AI coding assistant. The assistant writes the code faster than ever, but it still has no idea which changes are risky, what your team’s flagging policies are, or whether a flag for this feature already exists. The result: more code, same governance gap, less time to catch it.
What if risk were identified before any code was written?
That is the idea behind Kiro’s spec-driven approach to AI development. And when you connect it to your feature flag system, something interesting happens: the flag becomes part of the plan, not a patch applied after the fact.
Specs identify risk. Flags act on it.
Kiro is an AI-powered IDE built by AWS. Its distinguishing feature is structured development through specifications. Instead of jumping straight from a prompt to generated code, Kiro guides teams through three phases: requirements, design, and implementation tasks.
During the requirements phase, a developer describes what they want to build. Kiro generates structured requirements using the EARS notation (Easy Approach to Requirements Syntax), breaking the idea into user stories with clear acceptance criteria. The design phase analyzes the codebase and produces architecture decisions, data models, and sequence diagrams. Finally, Kiro generates implementation tasks, sequenced by dependencies.
This structure changes when risk becomes visible. A task that modifies payment processing or authentication flows is not buried in a 400-line diff. It is identified explicitly in the design phase, with a description of what it touches and why.
That signal is exactly what a feature flag system needs.
The Unleash MCP server connects to Kiro through MCP (Model Context Protocol), an open standard for AI tool integration. When Kiro begins implementing a high-risk task, the Unleash MCP server evaluates whether the change needs a feature flag. If it does, the server creates the flag, enforces naming conventions, and generates framework-specific wrapping code. All of this happens during task implementation, not after the code is reviewed.
The difference is timing. In traditional workflows, feature flags are often treated as a code-review artifact. With Kiro and Unleash, they are a requirements artifact. Risk surfaces in the spec. The flag follows automatically.
Why timing matters
In June 2025, a code change that shipped without feature flag protection triggered a global Google Cloud outage lasting over three hours. Google’s postmortem was blunt: if the change had been flag-protected, the issue would have been caught in staging. Recovery would have taken seconds, not hours.
This is not unusual. Teams deploy risky changes without flags all the time. Not because they do not believe in feature flags, but because the decision to flag a change happens too late in the process. By the time someone realizes a flag is needed, the code is already written, tested, and queued for deployment. Adding a flag at that point feels like rework.
With AI assistants accelerating the pace of development, this problem gets worse. The DORA State of AI-Assisted Software Development report found that delivery stability tends to decrease as AI usage increases. More code ships faster. But the governance practices that keep software stable have not kept up.
Kiro’s specs change the equation. If you know at the requirements stage that a task is high-risk, you can plan for the flag before the first line of code. No rework. No afterthought. The flag is part of the implementation from the start.
How the integration works
The Unleash MCP server exposes nine tools to Kiro. Four of them handle the core workflow:
evaluate_change analyzes a code change and determines whether it should be behind a feature flag. The tool considers the type of modification, the files involved, and the risk level. A documentation update probably does not need a flag. A payment integration almost certainly does.
detect_flag checks whether a suitable flag already exists in Unleash. This prevents duplication. If your team already has a flag for payment processing, Kiro will suggest reusing it rather than creating a new one.
create_flag generates a new feature flag with proper naming, typing, and documentation. The Unleash MCP server enforces your team’s conventions. No more ad-hoc names cluttering the codebase.
wrap_change produces framework-specific code that guards the new feature behind the flag. Whether you work in React, Django, Go, Spring Boot, or any of the languages Unleash supports, the wrapping code follows the patterns in your codebase.
Supporting tools handle the rest of the lifecycle: set_flag_rollout configures rollout strategies, get_flag_state retrieves flag metadata, toggle_flag_environment controls per-environment activation, remove_flag_strategy cleans up strategies, and cleanup_flag guides safe removal of stale flags. Feature flags are temporary by design. The Unleash MCP server treats the full lifecycle, from creation to cleanup, as part of the workflow.
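The four core tools compose into a simple decision flow. The sketch below is illustrative only: the `FlagTools` interface, its method signatures, and the result shapes are assumptions made for this example, not the Unleash MCP server’s actual tool schemas. In practice Kiro invokes these tools over MCP rather than through a TypeScript interface.

```typescript
// Hypothetical result shape; the real MCP tool schemas may differ.
interface ChangeEvaluation {
  needsFlag: boolean;
  suggestedName: string;
}

// Hypothetical wrapper around the four core tools described above.
interface FlagTools {
  evaluateChange(description: string): Promise<ChangeEvaluation>;
  detectFlag(suggestedName: string): Promise<string | null>; // existing flag name, or null
  createFlag(flagName: string): Promise<void>;
  wrapChange(flagName: string): Promise<string>; // guard-code snippet
}

// Evaluate first, reuse an existing flag if one fits, create only when needed,
// then ask for wrapping code. Resolves to null when no flag is required.
async function planFlag(tools: FlagTools, description: string): Promise<string | null> {
  const evaluation = await tools.evaluateChange(description);
  if (!evaluation.needsFlag) return null;

  const existing = await tools.detectFlag(evaluation.suggestedName);
  const flagName = existing ?? evaluation.suggestedName;
  if (existing === null) {
    await tools.createFlag(flagName);
  }
  return tools.wrapChange(flagName);
}
```

The ordering is the point: detection runs before creation so that an existing flag is reused instead of duplicated, and nothing is created at all when the evaluation says no flag is needed.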
Kiro’s four layers of automation
What makes the Kiro integration different from other AI assistants is not just the Unleash MCP server. It is how Kiro’s native features layer on top of it.
Steering files encode your policies. Kiro reads steering files from .kiro/steering/ and applies them automatically. You can write a file that says: “All changes in the payments domain require feature flags. Use kill-switch type for external provider integrations. Follow the naming convention payments-{feature}-{variant}.” Set the steering file to activate whenever files in src/payments/ are touched, and Kiro will follow those rules without being prompted. Different domains get different policies. The rules travel with the codebase.
Hooks trigger evaluation automatically. Kiro hooks fire on IDE events: file saves, file creations, manual triggers. A File Save hook on src/payments/**/*.ts can automatically call the Unleash MCP server’s evaluate_change tool whenever a developer saves code in the payments directory. The developer does not need to remember to ask. The hook catches the change. If a flag is needed, Kiro offers to create it right there.
Powers bundle everything for distribution. Kiro Powers package MCP tools, steering files, and hook definitions into a single installable unit. You can build an Unleash Power that includes the Unleash MCP server configuration, your team’s FeatureOps conventions, and domain-specific hooks. New team members install it with one click. New projects get the same policies without manual setup. Powers activate dynamically based on keywords in the conversation, so they load only when relevant and do not clutter the context window.
Specs surface risk before code exists. This is the foundational layer. When a spec identifies a task as high-risk during the design phase, Kiro can call evaluate_change before writing any implementation code. The flag becomes part of the task, not a follow-up item. This is the key difference from other AI assistants, where flag evaluation usually only happens after code has been generated.
These layers stack. A developer creates a spec for a Stripe payment integration. The design phase identifies it as high-risk. When Kiro starts implementing the task, steering files provide naming conventions. The Unleash MCP server creates the flag and wraps the code. A hook ensures that any subsequent saves in the payments directory are evaluated. The Unleash Power distributes this entire setup across the organization.
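A steering policy such as the payments-{feature}-{variant} convention can also be checked mechanically. The following is a minimal sketch, assuming names are lowercase and that feature and variant are each a single alphanumeric segment. The helper name and those constraints are illustrative assumptions, not rules enforced by Kiro or Unleash.

```typescript
// Assumed reading of the convention "payments-{feature}-{variant}":
// three lowercase alphanumeric segments joined by hyphens.
const PAYMENTS_FLAG_PATTERN = /^payments-[a-z0-9]+-[a-z0-9]+$/;

// Hypothetical helper: returns true when a flag name fits the convention.
function isValidPaymentsFlagName(flagName: string): boolean {
  return PAYMENTS_FLAG_PATTERN.test(flagName);
}
```

Under these assumptions, payments-stripe-integration passes while a name like stripeFlag or payments-stripe does not.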
What this looks like in practice
A developer creates a spec: “Integrate Stripe as a payment provider for the checkout flow.” Kiro generates requirements, then a design. The design identifies two high-risk tasks: the payment processing integration and the webhook handler.
As Kiro begins implementing the payment task, it calls evaluate_change. The Unleash MCP server confirms the change needs a flag and suggests payments-stripe-integration. Kiro calls detect_flag to check for duplicates. None found. It creates the flag and wraps the new code:
    if (unleash.isEnabled("payments-stripe-integration", context)) {
        return stripeService.processPayment(request);
    } else {
        return legacyPaymentService.process(request);
    }
Now the flag exists in Unleash, disabled by default. The team tests in staging, enables for internal users, then rolls out gradually to production. If error rates spike, the flag is disabled instantly. No deployment required.
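Because the guard is an ordinary conditional, both paths can be exercised in tests before the flag is ever enabled. The sketch below assumes a simplified `FlagChecker` interface whose `isEnabled` method mirrors the call above without the context argument. The payment functions and the in-memory stub are hypothetical stand-ins, not part of any Unleash SDK.

```typescript
// Simplified interface mirroring the isEnabled call shown above (assumed shape).
interface FlagChecker {
  isEnabled(flagName: string): boolean;
}

// Hypothetical stand-ins for the Stripe and legacy payment services.
const payWithStripe = (amountCents: number): string => `stripe:${amountCents}`;
const payWithLegacy = (amountCents: number): string => `legacy:${amountCents}`;

// The guarded code path: new behavior only when the flag is on.
function processPayment(flags: FlagChecker, amountCents: number): string {
  return flags.isEnabled("payments-stripe-integration")
    ? payWithStripe(amountCents)
    : payWithLegacy(amountCents);
}

// In-memory stub so tests can force either branch without an Unleash instance.
function stubFlags(enabled: Set<string>): FlagChecker {
  return { isEnabled: (flagName) => enabled.has(flagName) };
}
```

With an empty stub the legacy path runs, which matches the disabled-by-default state of a newly created flag.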
Two weeks later, the feature is stable at 100%. A developer tells Kiro: “Clean up the payments-stripe-integration flag.” Kiro calls cleanup_flag, which returns every file and line where the flag is used, along with guidance tailored to your codebase’s language, framework, and conventions. Kiro removes the conditional code, preserves the Stripe path, and runs the test suite. The flag is deleted from Unleash. No technical debt left behind.
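The kind of file-and-line report described above can be approximated with a plain text search. This is only a sketch of the idea, assuming source files are already loaded into memory as strings. It is not how cleanup_flag is implemented, and the `FlagUsage` type and `findFlagUsages` function are hypothetical.

```typescript
interface FlagUsage {
  file: string;
  line: number; // 1-based line number
}

// Scan each file line by line and record every line that mentions the flag name.
function findFlagUsages(sources: Record<string, string>, flagName: string): FlagUsage[] {
  const usages: FlagUsage[] = [];
  for (const file of Object.keys(sources)) {
    const contents = sources[file] ?? "";
    contents.split("\n").forEach((text, index) => {
      if (text.indexOf(flagName) !== -1) {
        usages.push({ file, line: index + 1 });
      }
    });
  }
  return usages;
}
```

A literal search like this misses flag names built dynamically, which is one reason the removal step is followed by a test run.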
The full release picture
Creating flags during implementation is just the beginning. Unleash provides the infrastructure for the entire release lifecycle.
Rollout strategies let you target specific users, regions, or percentages of traffic. Start with internal users, expand to beta testers, then gradually increase to 100%. Release templates define reusable rollout patterns with automatic pauses if error rates spike.
Impact Metrics collect production signals (request rates, error counts, latency) directly from your application via the Unleash SDK. These metrics tie directly to feature flags, so you can see exactly how a new feature affects your system. Automated progression advances the rollout when metrics are healthy and pauses when they are not.
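The progression logic described here can be pictured as a small state transition. The sketch below is an illustration under stated assumptions, not Unleash’s release template implementation: the stage percentages, the `maxErrorRate` threshold, and the `nextRollout` function are all hypothetical.

```typescript
interface RolloutState {
  stageIndex: number; // index into ROLLOUT_STAGES
  paused: boolean;
}

// Hypothetical rollout stages, as percentages of traffic.
const ROLLOUT_STAGES = [1, 10, 50, 100];

// Advance one stage when the observed error rate is healthy; pause in place otherwise.
function nextRollout(state: RolloutState, errorRate: number, maxErrorRate: number): RolloutState {
  if (errorRate > maxErrorRate) {
    return { stageIndex: state.stageIndex, paused: true };
  }
  const lastIndex = ROLLOUT_STAGES.length - 1;
  return { stageIndex: Math.min(state.stageIndex + 1, lastIndex), paused: false };
}
```

Pausing keeps the current exposure rather than rolling back, so whether to disable the flag entirely remains a separate decision.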
Today, the Unleash MCP server focuses on flag creation, wrapping, and cleanup. We are building toward tighter integration with release templates and Impact Metrics, so that Kiro can configure rollout strategies and respond to production signals within the same workflow.
Getting started
Setting up the integration requires a Kiro workspace, Node.js 18+, and an Unleash instance with a personal access token.
Create .kiro/settings/mcp.json in your project:
{
  "mcpServers": {
    "unleash": {
      "command": "npx",
      "args": ["-y", "@unleash/mcp@latest", "--log-level", "error"],
      "env": {
        "UNLEASH_BASE_URL": "${UNLEASH_BASE_URL}",
        "UNLEASH_PAT": "${UNLEASH_PAT}",
        "UNLEASH_DEFAULT_PROJECT": "${UNLEASH_DEFAULT_PROJECT}"
      },
      "autoApprove": ["get_flag_state", "detect_flag", "evaluate_change"]
    }
  }
}
Commit this file to version control. Credentials come from environment variables, so secrets stay out of the repository. The autoApprove property lets Kiro invoke read-only tools without asking for confirmation each time. Before the server connects, enable MCP in Kiro’s settings and approve the environment variables when prompted. Kiro requires explicit approval before expanding variables in MCP configs.
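Since the configuration reads three environment variables, a small preflight check can catch a missing value before the MCP server starts. This is an optional sketch: the `missingUnleashEnv` helper is hypothetical, and treating all three variables as required is an assumption made for the example. It takes the environment as a plain object so it does not depend on any particular runtime API.

```typescript
// Variable names taken from the mcp.json shown above.
const REQUIRED_UNLEASH_VARS = ["UNLEASH_BASE_URL", "UNLEASH_PAT", "UNLEASH_DEFAULT_PROJECT"];

// Returns the names of required variables that are unset or blank.
function missingUnleashEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_UNLEASH_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js script this could be called as `missingUnleashEnv(process.env)`, failing fast when the returned list is not empty.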
The same MCP server also works with Claude Code and GitHub Copilot. The Kiro integration adds specs, steering files, hooks, and Powers on top, but the underlying MCP server is shared across all platforms.
For the full setup guide, including steering file templates, hook examples, and a complete Unleash Power definition, see our Kiro integration documentation.
From requirements to safe releases
The pattern we see across the industry is clear: AI-assisted development is accelerating code production but not the governance practices that keep software stable. Feature flags are the control layer. But their value depends on when they enter the workflow.
If flags are an afterthought added during code review, they are likely to be incomplete and inconsistent. If they are part of the plan from the requirements stage, they become a natural part of how software is built.
Kiro’s spec-driven model makes this possible. Specs identify risk. Steering files encode policies. Hooks automate evaluation. Powers distribute the setup. And the Unleash MCP server provides the tools that tie it all together.
The result is a workflow where AI writes the code, governance is enforced automatically, and releases are safe by default. Not because developers remember to follow the rules, but because the rules are built into the process.