
GitHub Copilot Meets Progressive Delivery: A Guide for Engineering Leaders

The main promise of AI coding assistants is speed. GitHub Copilot and its peers are delivering on that promise, as three out of four developers now use them daily. Boilerplate disappears. Refactors happen in seconds. Features that once took days can be prototyped in hours.

But speed without control creates a different kind of problem.

The latest DORA State of AI-Assisted Software Development report found that as AI usage increases, delivery stability tends to decrease. Code ships faster, but bugs, inconsistencies, and security vulnerabilities ship with it. For engineering leaders, the question is not whether to adopt AI assistants. It is how to adopt them without sacrificing the stability your users depend on.

The answer is progressive delivery. And the key is connecting GitHub Copilot directly to your feature flag system.

The Control Gap in AI-Assisted Development

When a developer accepts a GitHub Copilot suggestion, it enters their working code with a single keystroke. The speed is the point. But that speed can outpace understanding. A developer might accept a suggestion that works but introduces subtle issues they did not catch, because they did not write the code themselves. Code review helps before changes reach production, but reviewers often assume the author understood what they committed. When AI generates the code, that assumption breaks down.

Consider what can go wrong:

  • A developer asks GitHub Copilot to implement a new payment flow. Copilot generates working code, but it does not know your team’s conventions for error handling. The code ships. An edge case causes failures in production. Rolling back requires a new deployment.
  • Another developer on a different team asks Copilot for help with a similar feature. Copilot generates a slightly different implementation. Now you have two payment flows with inconsistent behavior. Neither follows your established patterns.
  • A third developer creates a feature flag to guard a risky change. They name it “temp_flag_v2” because they are in a hurry. The flag works, but three months later nobody remembers what it controls. The codebase accumulates technical debt.

These scenarios share a common thread: new code bypassing the governance practices that keep software stable.

Progressive Delivery: The Missing Control Layer

Progressive delivery is the practice of releasing changes incrementally, with control mechanisms at each stage. Feature flags are the core enabling technology. They provide exactly what AI-generated code needs: a way to test, contain, and roll back changes without slowing development.

When you wrap AI-generated code in a feature flag, you gain several capabilities:

  • Instant rollback: If something goes wrong, you disable the flag. No deployment required. The problematic code path is immediately inactive, as the sketch after this list shows.
  • Gradual rollout: Instead of exposing all users to new code at once, you can release to 5% of traffic, monitor for issues, then expand. If errors spike, you pause the rollout automatically.
  • Environment control: Enable the feature in staging for testing while keeping it disabled in production. Validate that AI-generated code works correctly before real users see it.
  • Cleanup guidance: Once a feature is fully rolled out and stable, remove the flag and its associated code paths. Feature flags are temporary by design.
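
To make the first and third of these concrete: enabling or disabling a flag per environment is a single call to Unleash's Admin API (or a click in the Unleash UI), with no deployment involved. Here is a minimal sketch, assuming a hypothetical flag named new-payment-flow and your instance's base URL and access token in environment variables:

const headers = { Authorization: process.env.UNLEASH_PAT };

// Environment control: enable the flag in staging for validation.
await fetch(
  `${process.env.UNLEASH_BASE_URL}/admin/projects/default/features/new-payment-flow/environments/staging/on`,
  { method: 'POST', headers }
);

// Instant rollback: disable the flag in production. The new code path
// goes inactive immediately; no redeploy required.
await fetch(
  `${process.env.UNLEASH_BASE_URL}/admin/projects/default/features/new-payment-flow/environments/production/off`,
  { method: 'POST', headers }
);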

The challenge is getting AI assistants to actually follow progressive delivery practices. Left to their own devices, they will create flags with arbitrary names, duplicate existing toggles, or skip flags entirely for changes that need them.


This is not theoretical. In June 2025, a missing feature flag caused a global Google Cloud outage lasting over three hours. Google’s own postmortem was blunt: “If this had been flag protected, the issue would have been caught in staging.” The deployment itself succeeded; the recovery took hours. If a team with Google’s engineering resources can be burned by a missing flag, teams moving fast with AI-assisted code are at least as exposed.

Bringing Progressive Delivery into Copilot

We built the Unleash MCP server to solve this problem. MCP stands for Model Context Protocol, an open standard that lets AI assistants interact with external tools and services by calling them through a uniform interface. The Unleash MCP server exposes feature flag management capabilities and battle-tested guidance directly to GitHub Copilot.

For example, when a developer works with Copilot in VS Code, the MCP server provides tools that help the AI assistant follow your FeatureOps practices:

  • evaluate_change analyzes a code change and determines whether it should be behind a feature flag. The tool considers the type of change, the files involved, and the risk level. A CSS tweak might not need a flag. A payment integration almost certainly does.
  • detect_flag checks whether a suitable feature flag already exists. This prevents duplication. If your team already has a flag for payment processing, Copilot will suggest reusing it rather than creating a new one.
  • create_flag generates a new feature flag with proper naming, typing, and documentation. The MCP server enforces your conventions. No more “temp_flag_v2” names cluttering the codebase.
  • wrap_change produces language- and framework-specific code snippets that guard the new feature. Whether you are working in React, Django, Go, Rust, or any of the languages and frameworks supported by Unleash, the wrapping code follows your conventions and matches patterns already in your codebase.


The workflow becomes natural. A developer can explicitly ask Copilot to evaluate whether a change needs a flag, or you can configure project-level rules (via .github/copilot-instructions.md or AGENTS.md) that instruct VS Code to call evaluate_change automatically whenever new features are implemented. Either way, Copilot checks for duplicates, creates the flag if needed, and generates the wrapping code. The AI assistant is no longer improvising. It follows your team’s established practices automatically.
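
For the automatic route, the instruction file can be short. Here is a sketch of what .github/copilot-instructions.md might contain; the exact wording is up to your team, and the tool names are the ones the Unleash MCP server exposes:

# Feature flag policy

- Before implementing a new feature or risky change, call the Unleash MCP server's evaluate_change tool to decide whether it needs a flag.
- If a flag is needed, call detect_flag first to reuse an existing one; only call create_flag when no suitable flag exists.
- Use wrap_change to generate the guarding code so it matches our conventions.
- Flag names are kebab-case and descriptive; never placeholders like "temp" or "v2".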

What This Looks Like in Practice

Imagine a developer adding Stripe payment processing to an e-commerce application. They ask Copilot:

“Evaluate whether the new Stripe integration should be behind a feature flag.”

Copilot calls the evaluate_change tool with a description of the change. The MCP server recognizes this as a high-risk modification and recommends a flag. It suggests the name “stripe-payment-integration” following your kebab-case convention.

Copilot then calls detect_flag to check for existing flags. No duplicates found. It proceeds to create_flag, generating a release-type flag with proper metadata.

Finally, Copilot calls wrap_change. The MCP server detects that this is a Node.js project using Express and returns appropriate wrapping code:


const unleash = require('unleash-client');
// Assumes unleash.initialize() was called once at application startup.

app.post('/checkout', async (req, res) => {
  // userId in the context keeps percentage rollouts sticky per user.
  const context = { userId: req.user.id };

  if (unleash.isEnabled('stripe-payment-integration', context)) {
    // New Stripe payment flow
    const result = await stripeService.processPayment(req.body);
    return res.json(result);
  } else {
    // Existing payment flow
    const result = await legacyPaymentService.process(req.body);
    return res.json(result);
  }
});

The developer reviews the code, accepts it, and commits. The feature flag is automatically created in Unleash, disabled by default. The team can now test the Stripe integration in staging, roll it out gradually to production, and disable it instantly if issues arise.

All of this happened within the developer’s normal workflow. No context switching. No manual flag creation in the Unleash UI. The AI assistant followed governance rules without the developer needing to remember them.

The Full Progressive Delivery Picture

Creating feature flags is just the first step. The real power of progressive delivery comes from combining AI-assisted development with automated release controls.

Unleash provides additional capabilities that extend the value of this integration:

  • Rollout strategies let you target specific users, regions, or percentages of traffic. Start with internal users, expand to beta testers, then gradually increase to 100% (see the sketch after this list).
  • Release templates define reusable rollout patterns. A standard release might progress from 10% to 50% to 100% over three days, with automatic pauses if error rates spike.
  • Impact Metrics collect production signals directly from your application. Request rates, error counts, latency percentiles. These metrics tie directly to feature flags, so you can see exactly how a new feature affects your system.
  • Automated progression and safeguards use these metrics to make release decisions automatically. If errors stay below your threshold, advance to the next rollout milestone. If latency spikes, pause and alert the team.
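
Rollout configuration lives server-side, so adjusting exposure never touches application code. As an illustration, starting the Stripe flag at 10% of production traffic is one Admin API call. This is a sketch, with the project, URL, and token as placeholders for your setup:

// Add Unleash's gradual rollout strategy to the flag's production environment.
await fetch(
  `${process.env.UNLEASH_BASE_URL}/admin/projects/default/features/stripe-payment-integration/environments/production/strategies`,
  {
    method: 'POST',
    headers: {
      Authorization: process.env.UNLEASH_PAT,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      name: 'flexibleRollout',
      parameters: {
        rollout: '10',        // percent of traffic
        stickiness: 'userId', // consistent per-user bucketing
        groupId: 'stripe-payment-integration',
      },
    }),
  }
);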

Today, the Unleash MCP server focuses on flag creation and wrapping. We are actively building toward tighter integration with release templates and Impact Metrics. The vision is a workflow where AI assistants not only create flags but also configure rollout strategies and monitor production health.

You can read more about this direction in our blog post on automated FeatureOps with Impact Metrics.

Governance Without Friction

A common concern with adding governance to AI-assisted development is that it will slow things down. If developers have to jump through hoops to create feature flags, they will skip the process entirely.

The Unleash MCP server addresses this by making governance invisible. Developers interact with Copilot the same way they always have. The MCP server runs in the background, enforcing rules and providing guidance. There is no extra UI to learn, no separate workflow to follow.

Platform teams retain control over the policies. Naming conventions, flag types, rollout rules, and metrics thresholds are defined in Unleash. The MCP server inherits these settings and applies them automatically. AI assistants cannot exceed the privileges granted to their tokens.

This approach also preserves Unleash’s privacy architecture. Feature flag evaluations still happen locally within your applications, using our SDKs or Unleash Enterprise Edge. No user data is sent to the Unleash server. The MCP server only handles flag management. It does not change where or how flags are evaluated.

Getting Started

GitHub Copilot runs in several environments, and the Unleash MCP server works with all of them. VS Code is the most common setup, so we will use it as an example here.

The same MCP server also integrates with GitHub Copilot in JetBrains IDEs (IntelliJ, PyCharm, WebStorm), Visual Studio, and Neovim. Configuration differs slightly between environments, but the core workflow remains the same.

To set up the integration in VS Code, add a configuration file to your project:


{
  "servers": {
    "unleash-mcp": {
      "name": "Unleash MCP",
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@unleash/mcp@latest", "--log-level", "error"],
      "envFile": "~/.unleash/mcp.env"
    }
  }
}

Save this as .vscode/mcp.json in your repository. The envFile points to a shared configuration file outside your project directory, keeping your Unleash credentials secure and out of version control. This also makes it easy to reuse the same credentials across multiple projects and AI assistants.

Your ~/.unleash/mcp.env file should contain:


# Base URL of your Unleash instance's API
UNLEASH_BASE_URL=https://your-unleash-instance.getunleash.io/api
# Personal access token the MCP server uses for flag management
UNLEASH_PAT=your-personal-access-token
# Project where new flags are created unless otherwise specified
UNLEASH_DEFAULT_PROJECT=default

VS Code will start the MCP server automatically and make Unleash tools available to Copilot.

For detailed instructions and environment-specific setup guides, see our GitHub Copilot integration guide or the GitHub repository.

The Path Forward for Engineering Leaders

AI coding assistants are transforming how software gets built. The productivity gains are too significant to ignore. But speed without control is a false economy. Shipping faster only matters if what you ship actually works.

Progressive delivery provides the control layer that AI-generated code needs. The Unleash MCP server brings progressive delivery directly into the GitHub Copilot workflow. Developers get the speed of AI assistance. Organizations get the delivery stability they require. And users get software that works reliably.

The DORA research shows that AI adoption correlates with stability drops. But correlation is not causation. Teams that combine AI-assisted development with progressive delivery practices can break that pattern. You can ship faster and more safely.

That is the opportunity in front of engineering leaders today. Not choosing between AI and stability, but achieving both through progressive delivery.
