From the community: Front-end testing and feature flags
Strategies for writing tests that give you confidence before you deploy your feature flags in production.
If you Google “testing” + “feature flags,” you’ll find many resources about “testing in production,” but far fewer about how to integrate feature flags into your existing automated testing stack at earlier stages of the software development life cycle.
This is a practical guide on how to write front-end tests for components that are consuming feature flags, directly or indirectly.
There are two underlying principles in our approach: tests should resemble the way our software is used, and developers should spend more time writing test logic than doing test setup.
The code examples below focus on the tech stack we’re using at Harbr*: TypeScript, React, Jest, Storybook and feature flags provided by Unleash.
Note: For some added context, Harbr is a white label platform that our customers heavily customize and use to interface with internal and external users.
Because of the diversity of technical environments and the commercial needs of our customers, it’s important for us to carefully control the deployment and availability of features across the various customer-branded Harbr platforms. Unleash is the best way for us to accomplish this.
React Component Tests
Unleash Proxy React Client uses React context to allow easy consumption of a feature flag anywhere in the app. A provider, ‘FlagProvider’, initializes the client that performs requests to the Unleash proxy server. For the basic scenario, a hook, ‘useFlag’, reads the values from React context and filters them by name.
The most obvious way of testing ‘useFlag’ would be to mock its implementation in every test that uses it. There are two issues with this approach: it leaves parts of the application uncovered (any layers between the provider and the hook), and it makes our tests liable to break whenever an implementation detail changes.
A better way is to wrap the entry point of our test with FlagProvider. To do this without repeating configuration, we create our own FeatureFlagProvider that wraps the original and internally injects configuration (clientKey, url, etc) as well as default values (bootstrap) for all our feature flags.
Those defaults serve as fallbacks in production and as a baseline for our tests and stories.
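A dependency-free sketch of how such a wrapper might assemble its configuration is shown below. The field names (url, clientKey, appName, bootstrap) follow the options of the Unleash proxy client, but the types are redeclared locally, and the flag names, URL, key and the buildFlagConfig helper are hypothetical placeholders rather than Harbr's actual code.

```typescript
// Local stand-ins for the toggle and config shapes of the Unleash proxy client,
// redeclared here so that the sketch has no dependencies.
interface FlagToggle {
  name: string;
  enabled: boolean;
  impressionData: boolean;
  variant: { name: string; enabled: boolean };
}

interface FlagConfig {
  url: string;
  clientKey: string;
  appName: string;
  bootstrap: FlagToggle[];
}

type FlagValues = Record<string, boolean>;

// Hypothetical flag names. These defaults act as the production fallback and
// as the baseline for tests and stories.
const DEFAULT_FLAGS: FlagValues = {
  "button-new-primary-color": false,
  "profile-form-autosave": true,
};

function toToggle(flagName: string, enabled: boolean): FlagToggle {
  return {
    name: flagName,
    enabled,
    impressionData: false,
    variant: { name: "disabled", enabled: false },
  };
}

// Builds the configuration a FeatureFlagProvider could hand to the underlying
// FlagProvider. `overrides` lets a single test or story change a default.
function buildFlagConfig(overrides: FlagValues = {}): FlagConfig {
  const merged: FlagValues = { ...DEFAULT_FLAGS, ...overrides };
  return {
    url: "https://unleash-proxy.example.com/proxy", // placeholder URL
    clientKey: "placeholder-client-key", // placeholder key
    appName: "example-app", // placeholder name
    bootstrap: Object.keys(merged).map((flagName) =>
      toToggle(flagName, merged[flagName])
    ),
  };
}
```

Keeping the defaults in one object means production, tests and stories all start from the same baseline, and a single call site can still opt into a different value.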
Now that our tests are wrapped with a provider, they call the Unleash proxy server. As with other API requests, we use msw.js to intercept them at the network level and mock their responses.
We can effectively disable the API by simulating a 304 response, which lets the test render from the default values supplied by the provider. Alternatively, we can simulate a successful response, in which case the values in the response supersede the provider’s defaults.
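The two cases can be modelled without any network code. The sketch below is a simplified assumption about the behavior described here, not msw or Unleash code; ProxyReply and resolveFlags are hypothetical names.

```typescript
type FlagValues = Record<string, boolean>;

// Simplified stand-in for what the mocked proxy endpoint returns.
type ProxyReply =
  | { status: 304 }
  | { status: 200; flags: FlagValues };

// Decides which flag values a component ends up seeing.
function resolveFlags(defaults: FlagValues, reply: ProxyReply): FlagValues {
  if (reply.status === 304) {
    // "Not modified": nothing new arrives, so the provider's defaults stay in place.
    return { ...defaults };
  }
  // Successful response: the mocked payload supersedes the defaults.
  return { ...reply.flags };
}
```

In this model a 304 keeps every test on the shared baseline, while a 200 hands full control of the flag values to the mocked response.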
The setup above will have to be repeated across many tests. In fact, if you’re writing integration tests in which there are layers upon layers of components under the entry point, you may have to repeat it across almost every test.
In a hypothetical scenario, if we wanted to consume a flag in one of the core components like a ‘Button’, we’d immediately have to modify the setup of more than 500 tests!
To prevent that from happening, we created a custom render method that wraps the component with all the providers we may need.
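The idea behind such a custom render can be sketched without React. In the model below, trees are plain strings and providers are functions that wrap them; the provider names are hypothetical. In a real code base the same shape would wrap React elements, for example through the wrapper option of React Testing Library's render (an assumption, since the article does not name the rendering library).

```typescript
// A provider is modelled as a function that wraps a rendered tree.
// Trees are plain strings here; in a React code base they would be elements.
type Provider = (children: string) => string;

// Hypothetical providers, for illustration only.
const themeProvider: Provider = (children) => `<Theme>${children}</Theme>`;
const routerProvider: Provider = (children) => `<Router>${children}</Router>`;
const featureFlagProvider: Provider = (children) =>
  `<FeatureFlags>${children}</FeatureFlags>`;

// Listed from outermost to innermost.
const ALL_PROVIDERS: Provider[] = [
  themeProvider,
  routerProvider,
  featureFlagProvider,
];

// Wraps the component under test in every provider, so individual tests never
// repeat the setup and stay ready for flags that are added later.
function customRender(ui: string, providers: Provider[] = ALL_PROVIDERS): string {
  // reduceRight applies the last provider first, making it the innermost wrapper.
  return providers.reduceRight<string>((tree, wrap) => wrap(tree), ui);
}
```

Because every test goes through customRender, adding a new provider later means touching one list instead of hundreds of test files.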
We use that custom render for all our integration tests, regardless of whether there’s a feature flag already consumed directly in a component, indirectly via a nested component, or not at all; we want our tests to be “feature flag ready”. Taking that one step further, we moved the msw.js request interceptor to a global lifecycle method in our setupTests.js file.
At this point, we are in a position to write an integration test without having to do any setup for feature flags. We can now do this without worrying whether one of the nested components is using or may use one in the future.
Testing multiple permutations
How do we test both states of a feature flag? Do we need to test all combinations of flags in all components?
The answer ultimately comes down to good component architecture. If the flag controls logic that is safely encapsulated within the component’s boundaries, then no, we don’t need to run all combinations for all components. We only need to test both states of the flag close to the component that consumes it directly.
To set the value of a flag to something other than the default, we can override the globally mocked request interceptors with local ones or override the default values we pass to custom render.
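A minimal model of that testing strategy, assuming a hypothetical flag that switches a button's color: both states are exercised on the button itself, while a parent that only nests the button is checked against the defaults. All names below are illustrative, and rendering is reduced to string building so that the sketch has no dependencies.

```typescript
type FlagValues = Record<string, boolean>;

// Hypothetical flag name and default value.
const DEFAULT_FLAGS: FlagValues = { "button-new-primary-color": false };

// Stand-in for the custom render's options: defaults plus per-test overrides.
function flagsFor(overrides: FlagValues = {}): FlagValues {
  return { ...DEFAULT_FLAGS, ...overrides };
}

// The only component that reads the flag directly.
function renderButton(flags: FlagValues): string {
  const color = flags["button-new-primary-color"] ? "teal" : "blue";
  return `<button class="${color}">Save</button>`;
}

// A parent that merely nests the button; it has no flag logic of its own.
function renderProfileForm(flags: FlagValues): string {
  return `<form>${renderButton(flags)}</form>`;
}

// Both states are exercised where the flag is consumed...
const buttonOff = renderButton(flagsFor());
const buttonOn = renderButton(flagsFor({ "button-new-primary-color": true }));

// ...while the parent is tested against the defaults only.
const formWithDefaults = renderProfileForm(flagsFor());
```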
Storybook & visual regression tests
Storybook is a very popular library that we use as internal component documentation, as an interactive development environment and as the base of our visual regression tests.
We write stories for our components, from building blocks like a Button to more complex ones like an InfoDialog, and even full pages that interact with the API, like a UserProfileEditForm.
The challenges we face here are similar to the ones in our tests and so is the solution. We wrap all our stories with the FeatureFlagProvider we created and intercept requests using msw.js. The only difference is that in this case we’re using Storybook’s decorators.
Similarly to how we test feature flags in Jest, whenever we want to write a story that renders the non-default flag value, we add a local decorator that supersedes the global one.
Taking a button as an example again, should we choose to put a change to its primary color behind a feature flag, we’ll add a story for the new color next to the old one. That will result in two screenshots in our visual regression tests for the button while every other component that’s using it will render only the one we chose as default.
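The way a local decorator supersedes the global one can be modelled in a few lines. This is not Storybook's API: stories render to strings, the flags argument stands in for React context, and the sketch assumes the local decorator ends up nested inside the global one, so its values are the closest to the story. The flag name and helpers are hypothetical.

```typescript
type FlagValues = Record<string, boolean>;

// A story renders to a string here; `flags` stands in for React context.
type Story = (flags: FlagValues) => string;
type Decorator = (story: Story) => Story;

// Models a provider: the values it supplies override whatever comes from further out.
function withFlags(values: FlagValues): Decorator {
  return (story) => (outer) => story({ ...outer, ...values });
}

// Global decorators wrap local ones, so the local provider sits closest to the story.
function decorate(
  story: Story,
  globalDecorators: Decorator[],
  localDecorators: Decorator[] = []
): Story {
  const inner = localDecorators.reduce<Story>((wrapped, d) => d(wrapped), story);
  return globalDecorators.reduce<Story>((wrapped, d) => d(wrapped), inner);
}

const buttonStory: Story = (flags) =>
  flags["button-new-primary-color"] ? "Button(teal)" : "Button(blue)";

// Applied to every story: the shared defaults.
const globalFlags = withFlags({ "button-new-primary-color": false });

// The existing story keeps the default; the new story opts into the flag.
const defaultStory = decorate(buttonStory, [globalFlags]);
const newColorStory = decorate(buttonStory, [globalFlags], [
  withFlags({ "button-new-primary-color": true }),
]);
```

Only the story with the local decorator sees the new value, which matches the two button screenshots described above; every other story continues to render with the default.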
Feature flags are a powerful tool that allows you to run experiments and tests in production, but they can make writing automated tests more cumbersome.
To integrate them in a way that maximizes the confidence you get from your tests while maintaining a seamless developer experience, consider the following:
- Avoid testing implementation details; mock the boundaries of your application instead
- Invest in your test setup with globally available, sensible defaults
- Override defaults only when testing components close to the ones that consume feature flags
About the Author
Ioannis Papadopoulos is a software developer with a background in electrical and computer engineering. He has spent more than a decade specializing in enterprise software applications and is particularly passionate about finding ways to increase development productivity and efficiency through the use of good software architecture and effective tooling.
Ioannis works at Harbr, a data commerce platform based in London. Harbr is a white label storefront and collaboration environment that its customers use to build and operate a data commerce business. Harbr platforms include DataHub by Moody’s Analytics and Discovery Platform from CoreLogic.