Feature flags for high-traffic sites

Melinda Fekete

Documentation Lead

May 19, 2026

The industry treats feature flags as the emergency brake for high-traffic events like Black Friday. You flip a switch to degrade non-essential services and disable heavy database queries. This keeps the core application running while millions of users hit your checkout flow simultaneously.

But if the flag evaluation itself needs a network hop, that emergency brake can become a performance bottleneck. When the central flag server struggles under sudden load, the application can stall waiting for a response, which risks exhausting connection pools and degrading the user experience. The tool meant to save the system can inadvertently contribute to its instability.

Scaling feature flags is an architecture problem, not a marketing claim. High traffic needs local evaluation, where Software Development Kits (SDKs) evaluate flags in-process and edge nodes handle client-side checks. The evaluation engine must move to the application boundary so checks scale linearly with the application without adding network hops.

TL;DR

Remote polling for flag updates overwhelms gateways and authentication services under load.
To solve this, local evaluation downloads the ruleset into memory to eliminate network round-trips entirely.
Edge architecture provides fail-static resilience if upstream connections drop.
Evaluating flags locally keeps sensitive user context data within your application boundary.
Finally, batching metrics at the edge reduces database transactions and infrastructure costs.

The architectural fork: local versus remote evaluation

When an application makes an API call to a central server for every flag check, latency compounds. A single page load might check 10 different feature flags to render the correct user interface, determine pricing tiers, and route backend requests. If each check requires a separate network request, every one of those round-trips is added to the response time. The architecture fundamentally limits how fast the application can respond to users.

The remote polling penalty

High-traffic systems expose the limits of remote evaluation architectures. Periodic polling for feature flags increases the load on gateways and authentication services. Every client device or server instance asks the central control plane for updates. During a traffic spike, this polling behavior can saturate your own infrastructure with internal traffic. The system spends more compute cycles verifying flag states than serving actual customer requests.

Sequential evaluation compounds latency. A service might wait 50 milliseconds for a flag response before proceeding to the next logic branch. Five sequential checks add a quarter-second of delay to the user experience. In high-frequency trading, real-time bidding systems, or e-commerce events, that delay can result in direct revenue loss. Each network hop also holds a connection from the pool and can incur DNS resolution and TLS handshake overhead every time the application evaluates a flag.
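The arithmetic is easy to check. The sketch below uses the 50-millisecond figure from the paragraph above as an assumed per-check cost; the function name is illustrative only.

```python
# Assumed cost of one remote flag check, taken from the example above.
REMOTE_CHECK_MS = 50


def sequential_flag_latency_ms(num_checks: int, per_check_ms: float = REMOTE_CHECK_MS) -> float:
    """Added latency when each remote check waits for the previous one to finish."""
    return num_checks * per_check_ms


five_checks_ms = sequential_flag_latency_ms(5)    # 250 ms, a quarter-second
ten_checks_ms = sequential_flag_latency_ms(10)    # 500 ms for a page that checks 10 flags
```

Running checks in parallel would shorten the wall-clock delay, but it would not reduce the number of requests hitting the flag server.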

The local evaluation model

In the local evaluation model, the SDK downloads the full set of flag rules into memory and evaluates them locally. The application makes zero network calls for the actual evaluation. The ruleset lives alongside the application code.

When the code reaches a feature flag, the SDK checks the local memory structure. The response time drops from milliseconds to microseconds. The central flag server only handles the initial ruleset download and background updates. Following best practices means decoupling the read path from the write path. The application reads from local memory, while the SDK writes telemetry data back asynchronously.
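A minimal sketch of this read/write split might look like the following. It is not the Unleash SDK; the class name, the `fetch_ruleset` callable, and the flag names are hypothetical stand-ins for whatever your SDK provides.

```python
import threading
from typing import Callable, Dict


class LocalFlagStore:
    """Keeps the ruleset in process memory so reads never touch the network."""

    def __init__(self, fetch_ruleset: Callable[[], Dict[str, bool]]):
        # Hypothetical callable that downloads the ruleset from the flag server.
        self._fetch_ruleset = fetch_ruleset
        self._lock = threading.Lock()
        self._flags: Dict[str, bool] = {}

    def refresh(self) -> None:
        """Write path: run once at startup and then from a background timer."""
        new_flags = self._fetch_ruleset()
        with self._lock:
            self._flags = new_flags

    def is_enabled(self, name: str, default: bool = False) -> bool:
        """Read path: a dictionary lookup in local memory."""
        with self._lock:
            return self._flags.get(name, default)


fetch_calls = []


def fake_fetch() -> Dict[str, bool]:
    # Stand-in for the single network call to the flag server.
    fetch_calls.append(1)
    return {"new-checkout": True, "heavy-recommendations": False}


store = LocalFlagStore(fake_fetch)
store.refresh()
results = [store.is_enabled("new-checkout") for _ in range(1000)]
```

In this sketch, a thousand flag checks cost one fetch. The number of evaluations and the number of network calls are no longer tied together.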

Pushing evaluation to the edge

While local evaluation works for server-side applications where you control the environment, you can’t send the entire ruleset to a browser or mobile app without exposing internal business logic and unreleased features to the public internet.

Client-side constraints

Mobile devices and web browsers have limited memory and processing power. Downloading a large configuration payload degrades the initial load time of the application. Worse, sending the full ruleset exposes targeting logic that might reveal upcoming product launches or internal testing parameters.

The solution is moving the evaluation engine to the network edge. Edge nodes sit between the client device and the central flag server. They hold the ruleset and perform the evaluation geographically close to the user. The client makes a single request to the edge node, and the edge node returns only the evaluated results for that user.
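To make the idea concrete, here is a rough sketch of what an edge node does with a client request. The ruleset shape, flag names, and `evaluate_for_client` function are assumptions for illustration, not the Unleash Edge API.

```python
from typing import Any, Dict

# Hypothetical ruleset. It lives only on the edge node and is never sent to clients.
RULESET: Dict[str, Dict[str, Any]] = {
    "new-checkout": {"allowed_regions": {"EU", "US"}},
    "beta-search": {"allowed_regions": {"US"}},
}


def evaluate_for_client(context: Dict[str, Any]) -> Dict[str, bool]:
    """Evaluate every flag for one user and return only the resulting booleans."""
    region = context.get("region")
    return {
        name: region in rule["allowed_regions"]
        for name, rule in RULESET.items()
    }


response = evaluate_for_client({"user_id": "u-123", "region": "EU"})
```

The client receives a flat map of flag names to results. The targeting rules themselves, here the region lists, stay on the edge node.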

Resilience through caching

Edge architecture provides resilience for high-traffic systems. High reliability needs layered resilience mechanisms, including fail-static behavior and cold-start configuration bundles. If the delivery pipeline fails, services must continue operating. A container booting up during a network partition must know its flag state to function correctly.

Teams use Unleash Enterprise Edge as a caching layer that starts and operates even without an active connection to the upstream server. Unleash Enterprise Edge is also distributed across 35+ cloud regions worldwide for low-latency global evaluation. If the central control plane goes down during a traffic spike, the edge nodes continue serving flag evaluations using their cached configurations. The application remains stable. The edge nodes act as a buffer, absorbing the traffic spike and protecting the primary database from connection exhaustion.
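Fail-static behavior and cold-start bundles can be sketched in a few lines. This is a simplified illustration rather than how Unleash Enterprise Edge is implemented; `FailStaticCache`, `fetch_upstream`, and the bootstrap JSON are hypothetical.

```python
import json
from typing import Callable, Dict, Optional


class FailStaticCache:
    """Serves the last known flag state when the upstream server is unreachable."""

    def __init__(
        self,
        fetch_upstream: Callable[[], Dict[str, bool]],
        bootstrap_json: Optional[str] = None,
    ):
        self._fetch_upstream = fetch_upstream
        # Cold-start bundle: flag state shipped with the deployment, used before
        # the first successful sync.
        self._flags: Dict[str, bool] = json.loads(bootstrap_json) if bootstrap_json else {}

    def sync(self) -> bool:
        """Try to refresh from upstream; on network failure keep the cached state."""
        try:
            self._flags = self._fetch_upstream()
            return True
        except OSError:  # ConnectionError and TimeoutError are subclasses of OSError
            return False

    def is_enabled(self, name: str, default: bool = False) -> bool:
        return self._flags.get(name, default)


def upstream_down() -> Dict[str, bool]:
    # Simulates a network partition during startup.
    raise ConnectionError("upstream unreachable")


cache = FailStaticCache(upstream_down, bootstrap_json='{"new-checkout": true}')
synced = cache.sync()
```

Even though the sync fails, the container boots with a known flag state and keeps answering evaluations from it.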

Benchmarking the local evaluation model

Network bandwidth is the wrong constraint to prioritize. When you shift to local evaluation, the constraint moves from network latency to local compute resources. You must understand the hardware requirements of your feature flag infrastructure and plan accordingly.

Resource budgeting

You must budget CPU time and memory allocations for feature flags. Storing a large flag configuration in memory requires deliberate capacity planning. A ruleset with thousands of targeting conditions consumes memory. The application must parse this JSON payload and store it in a data structure for lookups.

The memory footprint grows as the organization adds more flags. You must monitor the size of the configuration payload and set hard limits. The SDK must evaluate the ruleset without blocking the main execution thread. For example, a Go application handling a large configuration struct in memory must manage garbage collection carefully to avoid latency spikes during evaluation.
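One way to enforce a hard limit is to check the payload size before parsing it. The sketch below assumes a 5 MB budget, which is an arbitrary illustrative number, and the function and exception names are hypothetical.

```python
import json

# Assumed budget for illustration; pick a limit that fits your own instances.
MAX_PAYLOAD_BYTES = 5 * 1024 * 1024


class PayloadTooLargeError(ValueError):
    """Raised when the flag configuration exceeds the memory budget."""


def load_ruleset(raw: bytes, max_bytes: int = MAX_PAYLOAD_BYTES) -> dict:
    """Reject oversized payloads before spending CPU and memory on parsing."""
    if len(raw) > max_bytes:
        raise PayloadTooLargeError(
            f"ruleset is {len(raw)} bytes, limit is {max_bytes} bytes"
        )
    return json.loads(raw)


small_payload = json.dumps({"new-checkout": True}).encode("utf-8")
ruleset = load_ruleset(small_payload)
```

Rejecting the update and continuing with the previous ruleset is usually safer than letting an unexpectedly large payload trigger memory pressure on every instance at once.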

Linear scalability

Despite the memory requirements, the performance ceiling rises dramatically with local evaluation. Server-side SDKs scale linearly with the application because they evaluate flags locally on the server. Adding more application instances automatically adds more evaluation capacity. The central server doesn’t need to scale proportionally with the application traffic.

A single small Unleash instance can evaluate 7.5 trillion flags per day. In a published Locust test, 5 Unleash Edge instances on 1 CPU core each served 10,000 concurrent users at more than 11,000 requests per second with zero failures. Review the Unleash scalability benchmarks to see how local evaluation removes the artificial limits imposed by remote API rate limiting. The application scales based on its own compute resources, allowing teams to handle traffic events without provisioning additional database clusters.

Controlling infrastructure costs at scale

While local evaluation shifts the performance bottleneck to local compute, it also fundamentally changes the financial cost equation. Pricing models based on individual flag evaluations or monthly active users penalize high-traffic applications. When every check triggers a network call and a database read, infrastructure costs scale linearly with traffic. A successful marketing campaign that drives millions of users to your site can result in an unexpected bill from your feature flag vendor.

This pricing volatility leads some teams to hardcode flags or limit their usage, defeating the purpose of the tool. Practitioners often refer to this as a pricing rugpull, where the cost of the tool starts to outweigh its utility because the vendor charges per evaluation or per monthly active user.

Batching metrics at the edge

Batching metrics at the edge breaks this cost curve. Codeium deployed an edge read-replica between their SDK instances and the main server to batch metric writes. Batching metric writes decreased the number of transactions per second in their Postgres database by more than 100x. The edge node collects telemetry data from thousands of local evaluations and sends a single aggregated payload to the central server. The database only processes the summary, not the raw event stream.
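The aggregation step can be sketched as a counter that is flushed on an interval. This is an illustration of the general technique, not Codeium's or Unleash's implementation; `MetricBatcher`, the `send` callable, and the payload shape are assumptions.

```python
from collections import Counter
from typing import Callable, Dict, List


class MetricBatcher:
    """Counts evaluations in memory and sends one aggregated payload per flush."""

    def __init__(self, send: Callable[[Dict[str, Dict[str, int]]], None]):
        self._send = send  # hypothetical: one write to the central server
        self._counts: Counter = Counter()

    def record(self, flag: str, enabled: bool) -> None:
        """Called on every evaluation; no network or database work happens here."""
        self._counts[(flag, enabled)] += 1

    def flush(self) -> None:
        """Called on a timer; turns the raw counts into a single summary."""
        if not self._counts:
            return
        payload: Dict[str, Dict[str, int]] = {}
        for (flag, enabled), count in self._counts.items():
            bucket = payload.setdefault(flag, {"yes": 0, "no": 0})
            bucket["yes" if enabled else "no"] += count
        self._counts.clear()
        self._send(payload)


sent: List[Dict[str, Dict[str, int]]] = []
batcher = MetricBatcher(sent.append)
for i in range(10_000):
    batcher.record("new-checkout", enabled=(i % 4 != 0))
batcher.flush()
```

Ten thousand evaluations produce one write carrying two counters, instead of ten thousand individual events.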

Cost predictability under traffic spikes

Wayfair replaced a homegrown system with Unleash to handle peak loads. On a typical busy day, the platform manages over 20,000 requests per second at sub-5ms latency. The transition cost one-third as much as their previous in-house solution, saving millions of dollars annually. Read the Wayfair feature flag case study to understand how decoupling evaluation from the central database reduces total cost of ownership. You pay for the management plane, not the raw volume of evaluations.

The privacy advantage of local evaluation

Beyond performance and infrastructure costs, keeping evaluation local solves a compliance risk. High-traffic applications often rely on targeting rules. You might want to roll out a new checkout flow only to users in specific geographic regions who also have a premium subscription and a history of high-value purchases. That requires evaluating flags against detailed user profiles.

Sensitive context data and remote evaluation

The evaluation context provides ambient information used as the basis for targeting. This context often contains personally identifiable information, session identifiers, and customer data. In a remote evaluation model, the application must send this sensitive context data across the network to the third-party flag server for every check. Sending user data to external vendors complicates General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA) compliance, and it typically requires extensive legal review and data processing agreements.

Targeting that stays inside your boundary

Local evaluation solves this privacy risk as a byproduct of solving the performance bottleneck. Unleash server-side SDKs periodically fetch toggle configurations and store them in memory. The evaluation happens privately within your systems. No context data is sent back to the central server. The SDK applies the ruleset to the local context object, determines the flag state, and discards the context. You achieve high-performance targeting without expanding your data processing footprint or exposing sensitive information to third parties.
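As an illustration, the checkout targeting described earlier (region, premium subscription, purchase history) can be evaluated entirely in-process. The data classes, field names, and thresholds below are hypothetical and do not reflect the Unleash SDK's context model.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class UserContext:
    """Sensitive profile data; it never leaves the application process."""
    user_id: str
    region: str
    is_premium: bool
    lifetime_spend: float


@dataclass(frozen=True)
class CheckoutRule:
    """Targeting rule downloaded from the flag server; contains no user data."""
    regions: FrozenSet[str]
    require_premium: bool
    min_lifetime_spend: float


def is_enabled(rule: CheckoutRule, ctx: UserContext) -> bool:
    """Apply the rule to the local context and return only a boolean."""
    return (
        ctx.region in rule.regions
        and (ctx.is_premium or not rule.require_premium)
        and ctx.lifetime_spend >= rule.min_lifetime_spend
    )


rule = CheckoutRule(regions=frozenset({"DE", "FR"}), require_premium=True, min_lifetime_spend=1000.0)
german_vip = UserContext("u-1", "DE", is_premium=True, lifetime_spend=1200.0)
us_vip = UserContext("u-2", "US", is_premium=True, lifetime_spend=5000.0)

show_to_german_vip = is_enabled(rule, german_vip)
show_to_us_vip = is_enabled(rule, us_vip)
```

The rule travels to the application; the user profile does not travel anywhere.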

Managing toggle debt

The advantages of adopting feature flags — runtime control, targeted exposure, progressive rollouts, kill switches, and instant rollback — far outweigh the technical debt that can accumulate over time. That said, managing the debt remains important. Local evaluation scales the infrastructure, but if teams leave flags in the codebase for hundreds of days, the application can degrade. Fortunately, technical debt can now be cleaned up using Unleash’s stale flag lifecycle visibility and the Unleash MCP server, which fetches lists of stale flags so developers can prompt cleanup as part of their normal workflow.

Feature toggle removals can lag behind additions in major open-source projects. A 2026 longitudinal study found that in some large codebases, toggles stick around for a median of around 700 days, and a small percentage become permanent parts of the code. These long-lived toggles often represent abandoned experiments and forgotten migrations.

The impact of stale flags

High-traffic systems benefit from managing lifecycles aggressively. Stale flags clutter the local configuration payload. A bloated ruleset consumes unnecessary memory and CPU cycles during evaluation. The SDK must parse a larger JSON file every time the configuration updates, which can slow down the initialization phase of new application instances. The network must transmit more data to every edge node and application instance, wasting bandwidth on dead code paths.

The performance benefits of local evaluation diminish if the ruleset grows without bound. Cleanup is easier to keep on top of when it is part of the workflow. The Unleash MCP server can fetch lists of stale flags ready for removal, but developers still need to review and prompt the cleanup of flags that reach 100 percent exposure. Clear deprecation policies help preserve those performance benefits. The infrastructure can handle the traffic, but the organization handles the code.
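A deprecation policy can be as simple as a rule that flags fully rolled-out toggles after a grace period. The sketch below is a hypothetical policy check, not the Unleash definition of a stale flag or the MCP server's API; the 40-day threshold and the record fields are assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class FlagRecord:
    name: str
    rollout_percent: int
    last_changed: date


def find_cleanup_candidates(
    flags: List[FlagRecord], today: date, max_age_days: int = 40
) -> List[str]:
    """Return flags at 100 percent exposure that have not changed within the grace period."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        flag.name
        for flag in flags
        if flag.rollout_percent == 100 and flag.last_changed <= cutoff
    ]


inventory = [
    FlagRecord("new-checkout", 100, date(2026, 1, 10)),   # fully rolled out long ago
    FlagRecord("beta-search", 25, date(2025, 11, 1)),     # old, but still rolling out
    FlagRecord("promo-banner", 100, date(2026, 5, 10)),   # fully rolled out recently
]
candidates = find_cleanup_candidates(inventory, today=date(2026, 5, 19))
```

A report like this only surfaces candidates. A developer still has to confirm that the old code path is dead before removing the flag.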

The network hop is the bottleneck

The emergency brake no longer takes down the system. Once evaluation moves in-process and to the edge, the per-check network hop to the central server disappears. The application evaluates flags in microseconds using local memory, while edge nodes absorb client-side traffic spikes without overwhelming the central database. The architecture scales with your application instances, keeping latency low and infrastructure costs predictable.

If your feature flag needs a network call to evaluate, it can become a vulnerability under heavy load. Local evaluation allows your application to scale based on its own compute resources. Caching layers protect execution paths and stabilize infrastructure costs.

FAQs about feature flags for high traffic

How do I budget memory for local flag configurations?

High-traffic systems must allocate CPU and memory for flag storage because large rulesets can impact performance. Practitioners should monitor the size of the configuration payload and set hard limits, as a 100 MB configuration stored on-device can slow down initialization and consume significant local resources.

Can I handle client-side evaluations without exposing internal rulesets?

Yes, by moving the evaluation engine to the network edge rather than the browser. Edge nodes geographically close to the user hold the full ruleset and perform the targeting logic privately. The client device makes a single request to the edge and receives only the final evaluated results, keeping sensitive targeting rules hidden from the public internet.

How does metric batching reduce database load?

Metric batching aggregates telemetry data from thousands of local evaluations into a single payload before sending it to the central server. For example, Codeium used an edge read-replica to batch writes, which reduced Postgres database transactions by more than 100x. This prevents high-traffic SDKs from overwhelming infrastructure with individual event streams.
