Understanding GrowthBook’s A/B Testing Features
GrowthBook is a warehouse-native A/B testing and feature flagging platform that enables teams to run experiments while maintaining control over their data. Built as an open-source solution, GrowthBook serves thousands of product teams who need reliable, scalable experimentation capabilities. This article explores GrowthBook’s A/B testing features in detail, providing marketing and marketing ops practitioners with a comprehensive understanding of the platform’s capabilities.
Core approach
GrowthBook structures A/B testing around two primary concepts: feature flags and inline experiments. This dual approach gives teams flexibility in how they implement and manage tests.
Feature flags serve as the foundation for most experiments in GrowthBook. These flags control which features users see and include conditional logic called “rules” that determine experiment participation. Each feature flag can have multiple rules that target different user segments or enable experiments for specific portions of traffic.
The platform uses deterministic hashing to assign users to variations. This means the same user will always receive the same variation as long as the experiment settings remain unchanged. You can run experiments across multiple pages or applications while maintaining consistent user experiences without storing additional state or cookies.
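The idea behind deterministic assignment can be sketched in a few lines. This is an illustrative example only, not GrowthBook's actual algorithm (its SDKs use their own hash function and bucketing scheme); it uses SHA-256 and the hypothetical names `assign_variation`, `user_id`, and `experiment_key`:

```python
import hashlib

def assign_variation(user_id: str, experiment_key: str, num_variations: int = 2) -> int:
    """Deterministically map a user to a variation.

    Hashing the user ID together with the experiment key means the
    same user always lands in the same bucket for a given experiment,
    with no stored state or cookies required.
    """
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10000  # 10,000 fine-grained buckets
    return bucket * num_variations // 10000

# The same inputs always produce the same assignment:
first = assign_variation("user-123", "checkout-test")
second = assign_variation("user-123", "checkout-test")
```

Because assignment is a pure function of the user ID and experiment key, any server or page that evaluates it independently reaches the same answer, which is what keeps cross-page experiences consistent.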
GrowthBook’s warehouse-native architecture means experiments work with your existing data infrastructure. Instead of requiring you to send event data to a third-party service, the platform connects directly to your data warehouse or analytics tools like Mixpanel and Google Analytics. This approach maintains data control while avoiding vendor lock-in.
The platform supports both server-side and client-side experimentation. Server-side experiments make assignment decisions on your backend, avoiding flicker issues and enabling complex tests that span multiple application components. Client-side experiments handle assignment in the browser, making them ideal for testing visual changes.
Setup and configuration
Creating an experiment in GrowthBook starts with defining your hypothesis and selecting your implementation method. You can build experiments using feature flags, inline code implementations, or the visual editor for front-end changes.
For feature flag experiments, you create a new feature and add an experiment rule. This rule defines the percentage of traffic to include, the targeting conditions, and the variations to test. The platform automatically handles user assignment based on your specified hashing attribute, typically a user ID or session identifier.
When setting up experiments, you define your target audience using targeting attributes. These attributes can include user properties, device information, geographic location, or any custom data points you pass to the SDK. The targeting system allows complex conditional logic to ensure experiments reach the right users.
Traffic allocation happens at the experiment rule level. You can set the overall percentage of users to include in the experiment, then split that traffic across your variations. GrowthBook supports uneven splits, so you might allocate 50% of experiment traffic to control and 25% each to two treatment variations.
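The two-step allocation described above (overall coverage, then weighted splits) can be sketched as follows. This is a simplified model under assumed names (`allocate`, `coverage`, `weights`), not GrowthBook's actual implementation:

```python
import hashlib

def _bucket(user_id: str, key: str) -> float:
    """Map a user to a stable float in [0, 1)."""
    digest = hashlib.sha256(f"{key}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) / 0x100000000

def allocate(user_id: str, key: str, coverage: float, weights: list[float]):
    """Return a variation index, or None if the user is excluded.

    coverage: share of all traffic included in the experiment (0-1)
    weights:  splits that traffic across variations (must sum to 1)
    """
    b = _bucket(user_id, key)
    if b >= coverage:
        return None          # user not in the experiment
    b /= coverage            # rescale to [0, 1) across included users
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if b < cumulative:
            return i
    return len(weights) - 1

# 50% to control, 25% each to two treatments, all traffic included:
v = allocate("user-42", "pricing-test", coverage=1.0, weights=[0.5, 0.25, 0.25])
```

Uneven splits like the 50/25/25 example are just different slice widths in the cumulative-weight walk.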
The platform includes safety features like mutual exclusion groups to prevent users from seeing conflicting experiments. You can also set up activation metrics to filter out users who were assigned to an experiment but never actually exposed to the treatment.
Targeting and personalization
GrowthBook’s targeting system operates through attributes you define and pass to the SDK. These attributes can be simple values like user ID or complex objects containing multiple properties. The platform evaluates targeting conditions locally within the SDK, ensuring fast performance without external API calls.
Targeting granularity extends beyond basic demographic splits. You can target based on user behavior patterns, subscription status, feature usage history, or any data points available in your application. The conditional logic supports operators like equals, not equals, greater than, less than, and pattern matching for strings.
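Local evaluation of such conditions amounts to checking operators against attribute values. The sketch below uses a MongoDB-style condition shape similar to what GrowthBook's targeting syntax looks like, but the `matches` helper and `OPS` table are illustrative names, not SDK APIs:

```python
import re

# Operator table: equals, not equals, comparisons, string pattern matching.
OPS = {
    "$eq":    lambda actual, expected: actual == expected,
    "$ne":    lambda actual, expected: actual != expected,
    "$gt":    lambda actual, expected: actual > expected,
    "$lt":    lambda actual, expected: actual < expected,
    "$regex": lambda actual, expected: re.search(expected, str(actual)) is not None,
}

def matches(attributes: dict, condition: dict) -> bool:
    """Evaluate a condition locally against user attributes, no API calls."""
    for attr, clause in condition.items():
        value = attributes.get(attr)
        for op, expected in clause.items():
            if not OPS[op](value, expected):
                return False
    return True

user = {"country": "DE", "purchases": 5, "plan": "pro"}
cond = {"purchases": {"$gt": 3}, "plan": {"$eq": "pro"}}
result = matches(user, cond)  # behavioral + subscription targeting combined
```

Because everything runs in-process on attributes you already passed in, evaluation is fast and requires no network round trip.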
Segmentation methods include both pre-experiment targeting and post-experiment analysis segments. Pre-experiment targeting ensures only relevant users enter the experiment, while analysis segments let you examine results across different user cohorts after the experiment runs.
The platform supports dynamic targeting that updates as user attributes change. For example, you might target users who have made more than three purchases, and users will enter or exit the experiment as they meet or no longer meet this criterion.

Geographic targeting works through IP-based location detection or custom location attributes. You can target by country, region, city, or timezone, enabling region-specific experiments or rolling out features gradually across different markets.
Experiment types
GrowthBook supports traditional A/B tests with two or more variations. There’s no technical limit on the number of variations you can test, though statistical considerations suggest keeping the number reasonable to maintain adequate sample sizes per variation.
The platform handles multivariate testing through feature flags that control multiple elements simultaneously. You can test combinations of different page elements by creating separate feature flags for each component and analyzing their interactions.
Split URL testing works through the visual editor or by using feature flags to control routing logic. You can direct different user segments to entirely different page versions while maintaining proper experiment tracking.
Feature rollouts represent another experiment type, where you gradually increase the percentage of users seeing a new feature while monitoring key metrics. This approach helps identify issues before full deployment while collecting performance data.
Holdout experiments let you measure the long-term impact of features by permanently excluding a small percentage of users from seeing certain improvements. This approach helps quantify the cumulative value of your optimization efforts.
Variation management happens through the GrowthBook interface for feature flag experiments or directly in code for inline experiments. The visual editor generates variation code automatically when you make changes through the WYSIWYG interface.
Data collection and tracking
GrowthBook’s warehouse-native approach means it works with your existing event tracking infrastructure. The platform supports direct connections to SQL databases including PostgreSQL, MySQL, BigQuery, Snowflake, Redshift, and others. It also integrates with analytics platforms like Google Analytics, Mixpanel, Amplitude, and Segment.
Event handling relies on your existing tracking implementation. When users interact with your application, events flow to your normal analytics destination. GrowthBook then queries this data to calculate experiment results, joining experiment assignment data with conversion events.
The platform includes built-in support for 15 common data sources with customizable data source configurations for unique setups. You define metrics by writing SQL queries that specify how to calculate success measures from your event data.
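The "metrics are SQL over your warehouse" model boils down to joining assignment records to later events. The runnable sketch below uses an in-memory SQLite database as a stand-in for a real warehouse; the table names and schema are invented for illustration:

```python
import sqlite3

# In-memory stand-in for a warehouse: one table of experiment
# assignments, one table of conversion events.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE assignments (user_id TEXT, experiment TEXT, variation TEXT, assigned_at TEXT);
    CREATE TABLE events (user_id TEXT, event TEXT, ts TEXT);
    INSERT INTO assignments VALUES
        ('u1','checkout','control','2024-01-01'),
        ('u2','checkout','treatment','2024-01-01'),
        ('u3','checkout','treatment','2024-01-01');
    INSERT INTO events VALUES
        ('u2','purchase','2024-01-02'),
        ('u3','purchase','2024-01-03');
""")

# A metric is just SQL: join assignments to conversions that happened
# after exposure, then aggregate per variation.
rows = con.execute("""
    SELECT a.variation,
           COUNT(DISTINCT a.user_id) AS users,
           COUNT(DISTINCT e.user_id) AS converters
    FROM assignments a
    LEFT JOIN events e
      ON e.user_id = a.user_id
     AND e.event = 'purchase'
     AND e.ts >= a.assigned_at
    WHERE a.experiment = 'checkout'
    GROUP BY a.variation
    ORDER BY a.variation
""").fetchall()
```

The same query shape works against PostgreSQL, BigQuery, or Snowflake; only the connection changes, which is the point of the warehouse-native design.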
Conversion tracking works by identifying events that occur after users are exposed to experiments. GrowthBook automatically handles attribution windows and can track multiple conversion events per experiment. The platform supports both conversion rate metrics and revenue-based calculations.
Data quality checks include Sample Ratio Mismatch (SRM) detection to identify assignment problems and other statistical tests to ensure result validity. These checks run automatically and alert you to potential data issues that could affect experiment conclusions.
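An SRM check is a chi-squared goodness-of-fit test of observed assignment counts against the configured split. A minimal two-variation sketch (the `srm_pvalue` name and 0.001 threshold are illustrative, though a cutoff around 0.001 is a common convention):

```python
import math

def srm_pvalue(observed: list, expected_weights: list) -> float:
    """Chi-squared test for Sample Ratio Mismatch, two variations.

    Compares observed assignment counts to the configured split.
    With two groups there is 1 degree of freedom, so the p-value
    is the chi-squared survival function: erfc(sqrt(chi2 / 2)).
    """
    total = sum(observed)
    chi2 = sum((o - total * w) ** 2 / (total * w)
               for o, w in zip(observed, expected_weights))
    return math.erfc(math.sqrt(chi2 / 2))

# A 50/50 split that drifted by 500 users: a tiny p-value means the
# assignment mechanism is probably broken and results are suspect.
p = srm_pvalue([10000, 10500], [0.5, 0.5])
```

A failing SRM check means the randomization itself is compromised, so no downstream metric comparison from that experiment can be trusted.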
Statistical analysis
GrowthBook offers multiple statistical engines to match different organizational preferences and requirements. The platform supports Bayesian, Frequentist, and Sequential testing approaches, with Bayesian statistics as the default method.
The Bayesian engine provides probability-based results that answer questions like “What’s the probability this variation is better?” This approach gives more intuitive results for business stakeholders and handles multiple metrics naturally without correction factors.
Frequentist analysis uses traditional hypothesis testing with p-values and confidence intervals. This approach aligns with academic statistics training and provides familiar significance testing frameworks.
Sequential testing enables peeking at results during the experiment without inflating false positive rates. This method adjusts for multiple comparisons automatically, letting you make decisions as soon as sufficient evidence accumulates.
The platform includes advanced techniques like CUPED (Controlled-experiment Using Pre-Experiment Data) to reduce variance and detect smaller effects. Multiple comparison corrections handle situations where you’re tracking many metrics simultaneously.
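The core of CUPED fits in a few lines: regress the experiment-period metric on its pre-experiment counterpart and subtract the predicted component. A minimal sketch (the `cuped_adjust` name is illustrative):

```python
import random

def cuped_adjust(post: list, pre: list) -> list:
    """CUPED: remove the part of the post-period metric explained by
    pre-experiment data. Variance drops; the mean is unchanged.
    """
    n = len(post)
    mean_pre = sum(pre) / n
    mean_post = sum(post) / n
    cov = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post)) / n
    var = sum((x - mean_pre) ** 2 for x in pre) / n
    theta = cov / var  # regression coefficient of post on pre
    return [y - theta * (x - mean_pre) for x, y in zip(pre, post)]

# Synthetic data where post-period spend correlates with pre-period spend:
rng = random.Random(1)
pre = [rng.gauss(100, 10) for _ in range(1000)]
post = [x + rng.gauss(5, 5) for x in pre]
adj = cuped_adjust(post, pre)
```

Because the subtracted term has zero mean by construction, treatment-effect estimates are unbiased while confidence intervals shrink, letting the same traffic detect smaller effects.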
Test duration recommendations appear in the interface based on your traffic levels and minimum detectable effect requirements. The platform calculates sample size requirements and provides guidance on when experiments have sufficient power to detect meaningful differences.
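The underlying sample-size arithmetic follows the standard normal-approximation formula for comparing two proportions. A rough sketch hard-coded to alpha = 0.05 (two-sided) and 80% power; the function name is hypothetical and GrowthBook's own calculations may differ in detail:

```python
import math

def sample_size_per_variation(baseline: float, mde: float) -> int:
    """Approximate per-variation sample size for a two-sided test.

    baseline: control conversion rate (e.g. 0.10)
    mde:      absolute minimum detectable effect (e.g. 0.01 = one point)
    Uses z_{alpha/2} = 1.96 and z_{beta} = 0.84, i.e. alpha = 0.05
    and 80% power.
    """
    z_alpha, z_beta = 1.96, 0.84
    p = baseline + mde / 2  # midpoint rate as a pooled approximation
    n = 2 * (z_alpha + z_beta) ** 2 * p * (1 - p) / mde ** 2
    return math.ceil(n)

# Detecting a 1-point lift on a 10% baseline needs roughly 15,000
# users per arm, which is why small effects require long experiments.
n = sample_size_per_variation(0.10, 0.01)
```

Halving the minimum detectable effect quadruples the required sample size, which is the main driver behind the duration guidance in the interface.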
Reporting and insights
GrowthBook’s results dashboard displays experiment performance across all tracked metrics. The interface shows statistical significance, confidence intervals, and effect sizes in a clear, accessible format that non-statisticians can understand.
Visualization includes standard metric charts showing performance over time, distribution plots for understanding variation in results, and segmentation views for examining results across user cohorts. The platform automatically generates charts that highlight important patterns in the data.
Results update automatically as new data flows into your warehouse. You can set refresh schedules to balance data freshness with system performance. The platform shows both cumulative results and daily snapshots to help identify trends.
Data export capabilities include CSV downloads for further analysis and API access for integrating results into other tools. The API provides programmatic access to experiment data, enabling automated reporting workflows.
Alert systems notify stakeholders when experiments reach statistical significance or when unusual patterns emerge in the data. You can configure notifications through email, Slack, or webhook integrations.
Collaboration and workflow
Role-based permissions control who can create, modify, and view experiments. GrowthBook includes roles for administrators, editors, and viewers, with granular permissions for different platform areas. You can restrict access to specific projects or environments based on team structure.
The platform includes commenting systems on experiments and features, enabling team discussions directly within the interface. These conversations create an audit trail of decision-making and help preserve institutional knowledge.
Version control tracks changes to experiments and feature configurations. You can see who made changes and when, with the ability to revert to previous configurations if needed. This versioning extends to metric definitions and data source configurations.
Integration with project management tools happens through API connections and webhook notifications. You can sync experiment status with tools like Jira, Notion, or custom internal systems. The API enables building custom workflows around experiment lifecycle management.
Unleash’s Approach to Experimentation
Unleash is a feature management and experimentation platform adopted by some of the largest enterprises in the world. It is built for full-stack experimentation because modern product teams need more than surface-level insights.
Most experimentation tools stop at frontend metrics like clicks and conversions. Unleash goes further, enabling teams to measure impact across the entire system: user experience, backend performance (latency, error rates, infrastructure costs), and even voice-of-customer signals. This ensures the “winning” variant isn’t just the one with the highest click-through rate, but the one that actually drives sustainable business outcomes.
A critical reason teams adopt Unleash is data ownership. Many platforms require sending raw event data into a black-box system, where success metrics are predefined and rigid. But product decisions are rarely that simple. As Architect Adam Kapos explained during his session at UnleashCon, his team:
“wanted to use our own data pipeline. We didn’t want a solution where we would basically need to send all our data to it and then it would tell us whether the A/B tests are successful. We wanted our data scientists to analyze our results from our data warehouse.”
That flexibility matters. Maybe your definition of retention isn’t just “opened the app on day seven,” but whether users engaged with a specific workflow, feature, or value-driving behavior. Maybe a backend optimization improves conversion but adds unacceptable infrastructure costs. Unleash makes it possible to test these scenarios with your own metrics, in your own warehouse, using the analytics tools you already trust.
This model reflects how high-performing teams actually experiment. Unleash handles the hard part — consistent user bucketing across frontend and backend components — while letting you stream impression data into any analytics platform, from Google Analytics and Mixpanel to Grafana and Snowflake. That way, your experiments remain reproducible, consistent, and analyzable with the definitions that matter to your business.
The result is experimentation that is:
- Holistic: Measure frontend adoption, backend reliability, and business outcomes together.
- Warehouse-first: Keep data in your pipelines so your data scientists can apply the right definitions of success.
- Safe: Progressive rollouts and kill switches ensure experiments never put stability at risk.
- Aligned: Product, engineering, and business teams share the same full-stack view of results.
Unleash turns experimentation into a strategic advantage: faster insights, safer releases, and smarter decisions — all across your stack.