How to automate feature flag configuration with Terraform
Infrastructure as Code (IaC) and feature management are two powerful methodologies that occasionally work at cross-purposes. Terraform is designed for stability, predictability, and slow, deliberate changes to infrastructure state. Feature flags are designed for speed, runtime control, and enabling product owners to toggle functionality without a deployment. When you try to manage one using the tool designed for the other, you introduce friction that can either slow down your release velocity or cause drift in your infrastructure state.
However, automating feature flag configuration with Terraform is often necessary for strict compliance environments, disaster recovery, or maintaining a “single pane of glass” for system configuration. The challenge lies in drawing the boundary: deciding what belongs in your Terraform files and what belongs in your runtime dashboard.
TL;DR
- Managing feature flags in Terraform implies treating them as long-lived configuration rather than ephemeral release toggles.
- Two primary providers exist: the official Unleash provider for instance scaffolding (projects, environments) and the community-owned provider for flag logic.
- State drift is the primary risk; if you toggle a flag in the UI, the next Terraform apply will revert it unless you use lifecycle ignore rules.
- The hybrid model (automating project creation and access control via Terraform while leaving flag states to the API/UI) usually offers the best balance of speed and governance.
The tension between IaC and feature management
Before implementing Terraform for feature flags, you must resolve the architectural conflict between static infrastructure and dynamic runtime control.
Terraform operates on a plan-and-apply model. A change requires a pull request, a code review, a CI pipeline execution, and a state lock. Standard IaC pipelines run for minutes or hours depending on organization velocity, which is too slow for a hotfix or an instant toggle.
Feature flags, conversely, are often used for emergency kill switches or gradual rollouts that need immediate feedback. If a feature causes a production incident, you cannot wait 20 minutes for a Terraform pipeline to flip a boolean value.
Therefore, effective automation requires categorizing your flags. Long-lived flags, such as those used for architectural configuration, circuit breakers, or managing permanent entitlement access, could be good candidates for Terraform. Short-lived release flags intended to live for a few weeks during a rollout are poor candidates for IaC, as they introduce unnecessary churn to your state files.
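One way to make this categorization explicit is to keep a small catalog of flags and let Terraform manage only the long-lived entries. The sketch below is illustrative rather than prescriptive: the catalog, the long_lived field, and the local names are hypothetical, and the unleash_feature attributes simply mirror the community-provider examples used elsewhere in this article.

```hcl
locals {
  # Hypothetical flag catalog. The long_lived field is an illustrative
  # convention, not something the provider knows about.
  flags = {
    "billing-system-kill-switch" = { type = "kill-switch", long_lived = true }
    "new-checkout-flow"          = { type = "release", long_lived = false }
  }

  # Keep only the long-lived flags; short-lived release flags stay in the UI/API.
  terraform_managed_flags = {
    for name, flag in local.flags : name => flag if flag.long_lived
  }
}

# One resource instance per long-lived flag.
# Attribute names follow this article's examples; verify them against the
# provider documentation for the version you pin.
resource "unleash_feature" "managed" {
  for_each = local.terraform_managed_flags

  name       = each.key
  project_id = "finance-platform"
  type       = each.value.type
}
```

With this layout, moving a flag in or out of Terraform management is a one-field change that shows up clearly in code review.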
Choosing the right Terraform provider
If you use Unleash, you have two distinct paths for automation, each serving a different scope of the feature flag lifecycle. It is common to use both providers side-by-side, each declared under its own local name in your configuration, to handle different aspects of the system.
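Because both providers share the type name "unleash", running them side-by-side means giving one of them a compound local name in required_providers and selecting it explicitly with the provider meta-argument. The following is a minimal sketch under stated assumptions: the official provider's registry address (Unleash/unleash), its base_url and authorization arguments, the unleash_project attributes, and all variable names are assumptions to check against the provider documentation.

```hcl
terraform {
  required_providers {
    # Community provider keeps the default local name, so unleash_feature
    # resources resolve to it without extra wiring.
    unleash = {
      source  = "philips-labs/unleash"
      version = "2.0.1"
    }
    # Official provider under a compound local name (hypothetical name;
    # the source address is an assumption).
    unleash-admin = {
      source = "Unleash/unleash"
    }
  }
}

variable "unleash_url" {
  type = string
}

variable "unleash_api_token" {
  type      = string
  sensitive = true
}

provider "unleash" {
  api_url    = "${var.unleash_url}/api"
  auth_token = var.unleash_api_token
}

provider "unleash-admin" {
  base_url      = var.unleash_url       # assumed argument name
  authorization = var.unleash_api_token # assumed argument name
}

# Resources for the official provider must name it explicitly, because the
# "unleash_" prefix would otherwise resolve to the community provider.
resource "unleash_project" "finance" {
  provider = unleash-admin

  id   = "finance-platform"   # assumed attribute
  name = "Finance Platform"   # assumed attribute
}
```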
The official Unleash provider
The official Unleash Terraform provider focuses on instance configuration and governance. It is built to automate the “container” that holds your flags rather than the flags themselves.
Use this provider to manage:
- Projects and environments
- API tokens and access provisioning
- User roles and permissions (RBAC)
- Segments and strategy definitions
This aligns with the philosophy that the platform configuration should be code, but the flag state should be runtime data. Using the official provider ensures that when a new team is onboarded, they immediately have the correct project structure, segments, and API keys provisioned without manual intervention.
The community provider
To define actual feature toggles, strategies, and variants within Terraform code, you will likely need the philips-labs/unleash provider. The community built this provider to bridge the gap between infrastructure and application logic before the official provider expanded its capabilities.
Its unleash_feature and unleash_feature_v2 resources let you define the flag name, description, type, and enabling strategies directly in HCL (HashiCorp Configuration Language). While the official provider handles the “house,” the Philips-Labs provider handles the “furniture.”
Implementing feature flags in Terraform
To manage the lifecycle of a flag via Terraform, you first configure the provider to authenticate with your Unleash instance. You need an API token with appropriate permissions for the project you are managing.
Provider configuration
terraform {
  required_providers {
    unleash = {
      source  = "philips-labs/unleash"
      version = "2.0.1"
    }
  }
}

# Authenticate against the Unleash API; the token is passed in as a variable.
provider "unleash" {
  api_url    = "https://app.unleash-hosted.com/hosted/api"
  auth_token = var.unleash_api_token
}
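The provider block above references var.unleash_api_token, which needs a matching variable declaration. A minimal sketch, assuming you supply the value from outside the repository (the description text is illustrative):

```hcl
# Declares the token used by the provider block.
# sensitive = true keeps the value out of plan and apply output.
variable "unleash_api_token" {
  description = "API token Terraform uses to authenticate with Unleash"
  type        = string
  sensitive   = true
}
```

Terraform reads the value from the TF_VAR_unleash_api_token environment variable if it is set, which lets a CI runner inject the token without committing it. Note that sensitive only redacts CLI output; the value can still end up in the state file, so the state backend needs its own access controls.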
Defining a feature flag resource
Once the provider is configured, you can define a feature flag. The resource requires a name, a project context, and a type.
# Long-lived operational flag: a good candidate for Terraform management.
resource "unleash_feature" "billing_circuit_breaker" {
  name        = "billing-system-kill-switch"
  description = "Kill switch for the external billing API integration"
  project_id  = "finance-platform"
  type        = "kill-switch"
  archived    = false
}
Declaring the resource ensures the flag exists in the specified project, but existence does not equal state: a flag that exists is not necessarily enabled in the production environment.
Managing strategies and environments
To enable the flag using Terraform, you must define strategies. Complexity increases here because you are defining logic rules in HCL that teams traditionally manage in a UI.
The unleash_feature_v2 resource (currently experimental) allows you to nest environments and strategies directly.
resource "unleash_feature_v2" "new_checkout_flow" {
  name       = "new-checkout-flow"
  project_id = "ecommerce-core"
  type       = "release"

  environments {
    name    = "production"
    enabled = true

    # Gradual rollout: 50% of traffic, sticky per userId.
    strategies {
      name = "flexibleRollout"
      parameters = {
        rollout    = "50"
        stickiness = "userId"
        groupId    = "new-checkout-flow"
      }
    }
  }
}
By defining the strategy (flexibleRollout) and the parameters (50%), you are codifying the rollout rules. If you run terraform apply, this configuration becomes the source of truth.
Handling state drift and conflicts
The most common operational failure in managing flags via Terraform is state drift. This occurs when the reality of your application state (in Unleash) diverges from the defined intent in your configuration files (in Terraform).
Analyzing a drift scenario
Imagine a scenario where a DevOps engineer defines the flag above with a 50% rollout. Later that day, a Product Manager logs into the Unleash dashboard and increases the rollout to 100% because the release is going well. The application is now serving the new feature to everyone.
Two days later, the DevOps engineer runs a terraform apply for unrelated changes, perhaps adding a database index or updating a security group. Terraform refreshes its state and detects a discrepancy. It sees that new_checkout_flow is defined as 50% in the code, but reads 100% from the API.
The output will look something like this:
  ~ resource "unleash_feature_v2" "new_checkout_flow" {
      ~ environments {
          ~ strategies {
              ~ parameters = {
                  ~ "rollout" = "100" -> "50"
                }
            }
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.
If the engineer does not carefully review this plan, they will unwittingly revert the rollout back to 50%. This “correction” can cause significant confusion for users who suddenly lose access to a feature, or worse, re-introduce a bug that was fixed by a hotfix toggle change in the UI.
Avoiding reversions with lifecycle rules
To prevent Terraform from overwriting manual changes made in the dashboard, you can use the lifecycle meta-argument to ignore specific fields. This instructs Terraform to initialize the resource but stop tracking specific attributes for future updates.
resource "unleash_feature_v2" "new_checkout_flow" {
  name = "new-checkout-flow"
  # ... other config ...

  lifecycle {
    # Ignoring the environments block also covers the enabled status and
    # the strategies nested inside it.
    ignore_changes = [
      environments
    ]
  }
}
Applying this configuration creates the feature flag if it is missing but ignores subsequent changes to the strategies or enabled status. You effectively treat Terraform as a catalog for flag definitions (names, types, descriptions) while ceding control of the toggle state to application operators.
Best practices for automation
If you choose to proceed with Terraform for feature flags, establish governance rules to prevent velocity bottlenecks.
Use Terraform for ‘guardrail’ flags only
Resist the urge to manage every ephemeral experiment in Terraform. Limit IaC management to:
- Kill Switches: Permanent flags intended to disable third-party integrations during outages.
- Permissioning: Flags that act as entitlements (e.g., enabling “Beta Features” for internal users).
- Operational Toggles: Flags that control backend configurations like cache timing or log verbosity.
Short-lived feature flags should be created via the API or CLI during the build process and archived once the rollout is complete.
Validate compliance with configuration-as-code
For regulated industries, Terraform offers a distinct advantage: a paper trail for auditors. By defining project access and environment configuration in code, you create a verifiable history of who has access to production toggles.
For example, Prudential used this configuration-as-code approach to onboard over 1,000 developers while keeping auditors satisfied. The key is automating the permissions that surround the flags, rather than requiring a pull request for every toggle change.
Adopt a GitOps workflow for permanence
When you manage long-lived flags in Terraform, you inherently adopt a GitOps workflow. Changes to a circuit breaker or global kill switch undergo the same peer review and version control process as your database schemas. This adds latency, but for critical stability flags that latency is a feature rather than a bug: no single user can accidentally disable a critical billing integration without a second pair of eyes. Platform teams should enforce this workflow for any flag that is expected to live longer than 40 days.
It is worth noting that Unleash provides this same review-and-approval discipline natively through change requests. Change requests require designated approvers to sign off before any flag modification reaches production, creating an audit trail without leaving the Unleash UI. If your primary goal is compliance rather than infrastructure-level version control, built-in change requests may be sufficient on their own. The GitOps approach via Terraform becomes most valuable when you need flag governance tightly coupled with your broader infrastructure pipeline, for example when a flag change must ship alongside a database migration or network policy update.
Automating project scaffolding (the hybrid approach)
A highly effective pattern is to use the official Unleash provider to manage the ecosystem around the flags, rather than the flags themselves.
In this model, Terraform is responsible for:
- Project Creation: Ensuring every team has a dedicated project.
- Environment Consistency: Ensuring dev, staging, and prod exist with correct API keys in every project.
- RBAC: Automating user groups and assigning permissions.
This ensures that when a developer creates a new flag in the UI, the environment is secure and compliant by default. You verify compliance at the infrastructure level, but you enable velocity at the feature level.
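In a purely hybrid setup, only the official provider is needed, since flags themselves stay in the UI or API. The sketch below shows per-team project creation and nothing more. It rests on assumptions that should be checked against the provider documentation: the registry address Unleash/unleash, the base_url and authorization arguments, and the id and name attributes of unleash_project. The team list and variable names are hypothetical.

```hcl
terraform {
  required_providers {
    unleash = {
      source = "Unleash/unleash" # assumed registry address of the official provider
    }
  }
}

variable "unleash_url" {
  type = string
}

variable "unleash_admin_token" {
  type      = string
  sensitive = true
}

provider "unleash" {
  base_url      = var.unleash_url         # assumed argument name
  authorization = var.unleash_admin_token # assumed argument name
}

locals {
  # Hypothetical team list: project ID => display name.
  teams = {
    payments = "Payments"
    checkout = "Checkout"
  }
}

# One dedicated project per team; onboarding a team is a one-line change.
resource "unleash_project" "team" {
  for_each = local.teams

  id   = each.key   # assumed attribute
  name = each.value # assumed attribute
}
```

Environments, API tokens, and role assignments would follow the same for_each pattern, keyed on the same team map, so that every project is provisioned identically.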
Clean up your state
Standardize on a “destroy” process. If you manage a flag in Terraform, deleting its resource block should archive the flag in Unleash. The philips-labs provider includes an archive_on_destroy setting (defaulting to true). Ensure your team understands that removing the resource archives the flag, which effectively turns it off for all users if the code path has not been cleaned up yet.
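Two patterns can make the destroy behavior harder to trigger by accident. The sketch below is an illustration, not the provider's documented recipe: it sets archive_on_destroy explicitly so reviewers see the consequence of deletion, and it uses Terraform's removed block (available from Terraform 1.7) to stop managing a flag without archiving it. The legacy_toggle address is hypothetical, and the unleash_feature attributes follow this article's earlier examples.

```hcl
# Pattern 1: state the default explicitly, so a reviewer deleting this block
# sees that the flag will be archived in Unleash.
resource "unleash_feature" "billing_circuit_breaker" {
  name               = "billing-system-kill-switch"
  project_id         = "finance-platform"
  type               = "kill-switch"
  archive_on_destroy = true
}

# Pattern 2: hand a previously managed flag back to the UI/API.
# The resource block for legacy_toggle (hypothetical) has been deleted from
# the configuration; this block tells Terraform to drop it from state
# without destroying, and therefore without archiving, the real flag.
removed {
  from = unleash_feature.legacy_toggle

  lifecycle {
    destroy = false
  }
}
```

After the next apply has removed the flag from state, the removed block itself can be deleted.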
Integrating configuration into the lifecycle
The goal of using Terraform with feature flags isn’t just to use a specific tool, but to bring order to how features are released. Whether you manage the flags directly or just the underlying access controls, the integration should reduce total risk. Unleash acts as a FeatureOps control plane that sits comfortably alongside your specific infrastructure choices.
By using Terraform to enforce governance (managing projects, environments, and access controls), you establish the guardrails your organization needs while letting product teams manage the daily rollout of features through the Unleash UI or API. A hybrid approach ensures your infrastructure remains immutable and auditable while your software delivery remains agile.
FAQs about Terraform feature flags
Can I manage Unleash feature flag strategies using Terraform?
Yes, using the community philips-labs/unleash provider, you can define strategies, variants, and environment states. However, the official Unleash provider does not support managing individual feature flags or strategies, focusing instead on system configuration like projects and API tokens.
How do I prevent Terraform from overwriting changes made in the Unleash UI?
You should use the lifecycle { ignore_changes = […] } block in your Terraform resource definition. By ignoring fields like environments or strategies, Terraform will create the flag if it is missing but will not revert manual changes made to the rollout percentage or status in the UI.
What is the difference between the official Unleash provider and the community provider?
The official Unleash provider is designed for admin-level configuration, such as creating projects, environments, and managing RBAC access controls. The Philips-Labs community provider includes resources for defining and managing the feature flags themselves, including their strategies and activation status.
Should all feature flags be managed in Terraform?
No, it is generally recommended to only manage long-lived, operational flags (like kill switches or global permissions) in Terraform. Short-lived release flags that change frequently are better managed through the Unleash UI or API to avoid state file bloat and deployment delays.
How does managing flags in Terraform affect audit logs?
When you manage flags via Terraform, the “actor” in the audit logs will usually be the service account or API token associated with the Terraform runner. This can sometimes obscure which specific human triggered the change, unlike changes made in the UI which are attributed to the logged-in user.