From the community: How my team uses Unleash to iterate on our AI code assistant


Codeium is a free AI-powered code assistant. I’m a software engineer on the team building it. We use Unleash to iterate quickly in one of the fastest moving spaces in tech today. 

At Codeium, we leverage generative large language models to provide a developer toolkit integrated directly into our users’ IDEs. This includes code autocompletion, natural language code search, powerful chat capabilities, automated refactoring, and more!

In the nascent and rapidly evolving AI space, it’s been crucial to have an agile and efficient development process. We need a data-driven approach to landing new features and changes.

How do we use Unleash?

This is right where Unleash comes in. 

Unleash has allowed us to experiment with new features through A/B tests. Our tests range from trying new code completion models to including different types of data in the input prompts.

For example, we developed a new fill-in-the-middle model that takes into account the code context both before and after the user’s cursor. We used Unleash to gradually roll out this feature to our user base. 

Combined with our internal metrics for evaluating model performance, this let us deploy the new model to a small fraction of our users at first. Once we verified that it performed better, we scaled the rollout until all users were migrated over.

This was all possible without deploying any incremental updates. Changing the rollout percentage was as simple as dragging a slider in the Unleash dashboard.
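To make a percentage rollout like this meaningful, the on/off decision has to be "sticky": the same user should get the same answer every time. Here is a simplified sketch of the idea. Unleash's actual gradual rollout strategy normalizes a murmur3 hash of the user ID and toggle name; the FNV-1a hash and function names below are stand-ins for illustration only.

```typescript
// Stand-in hash (FNV-1a). Unleash really uses a normalized murmur3 hash,
// but any stable hash demonstrates the mechanism.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map a (toggle, user) pair to a stable bucket in [1, 100].
// The same user always lands in the same bucket for a given toggle.
function rolloutBucket(userId: string, toggleName: string): number {
  return (fnv1a(`${toggleName}:${userId}`) % 100) + 1;
}

// The flag is on for a user when their bucket falls within the rollout
// percentage — the value the dashboard slider controls. Dragging the
// slider from 5 to 100 progressively admits more buckets without any
// user ever flipping back and forth.
function isEnabledFor(
  userId: string,
  toggleName: string,
  rolloutPercentage: number
): boolean {
  return rolloutBucket(userId, toggleName) <= rolloutPercentage;
}
```

Because bucket assignment is deterministic, raising the percentage only ever adds users to the experiment; nobody who already has the new model loses it mid-rollout.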

How is Unleash deployed?

In terms of deployment, Unleash was a great fit with our existing stack. 

We used the Unleash Helm Chart to spin up the necessary components in our Kubernetes cluster. 

The only external requirements were a Postgres database to back the Unleash server, plus a way to expose the server outside our cluster so clients could reach it.

We already had a Postgres instance and an ingress controller, so we only needed to create a new database and a new ingress rule.
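For a sense of what pointing the chart at an existing database looks like, here is an illustrative Helm values fragment. The hostnames and secret names are placeholders, not our actual configuration, and field names may differ between chart versions, so check the chart's own values.yaml before using anything like this.

```yaml
# Illustrative values for the Unleash Helm chart: disable the bundled
# Postgres and point the server at an existing instance instead.
postgresql:
  enabled: false

dbConfig:
  host: postgres.internal.example.com   # placeholder host
  port: 5432
  database: unleash
  useExistingSecret:
    name: unleash-db-credentials        # placeholder secret name
```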

How is Unleash integrated?

Integrating Unleash into the various components of our system was one of the more intricate parts of the setup, owing to how the system is structured.

Because our code assistant runs as an extension in the user’s IDE (e.g. VSCode, IntelliJ, NeoVim), we have code running both in the extension on the user’s machine and in our API server in our Kubernetes cluster.

Adding another layer of complexity, the extensions also spin up a separate binary known as the Codeium language server. It contains the core logic shared across all the different IDE extensions and runs alongside the extension on the client’s machine.

We have experiments running in all three of these components. We use the Unleash Node SDK in our VSCode extension and the Unleash Go SDK in both the client-side language server and the cloud API server.
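Regardless of component, the integration follows the same shape: one SDK client per process, and an `isEnabled` check with per-user context at each decision point. The sketch below shows that pattern standalone, so `FakeUnleash` is a stand-in for the real SDK client (the actual Node SDK entry point is `initialize()` from `unleash-client`); the toggle and function names are illustrative, not our production code.

```typescript
// Per-user context passed with each flag check, so percentage
// rollouts stay sticky per user.
interface UnleashContext {
  userId?: string;
}

// Stand-in for the SDK client. The real client fetches toggle state
// from the Unleash server in the background; here it is just a map.
class FakeUnleash {
  constructor(private enabledToggles: Map<string, boolean>) {}

  isEnabled(toggle: string, _ctx?: UnleashContext, fallback = false): boolean {
    // Unknown toggles fall back to a caller-supplied default, so a
    // missing flag never breaks the old code path.
    return this.enabledToggles.get(toggle) ?? fallback;
  }
}

// One client per component (extension, language server, API server).
const unleash = new FakeUnleash(new Map([["fim-model", true]]));

// Guard the experimental code path behind the flag; when the flag is
// off or unknown, keep serving the existing behavior.
function completionModel(userId: string): string {
  return unleash.isEnabled("fim-model", { userId }, false)
    ? "fill-in-the-middle"
    : "single-direction";
}
```

Keeping the check this thin means the rollout decision lives entirely in the Unleash dashboard; the components just ask.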


Unleash has undoubtedly helped us iterate faster and make informed decisions when shipping new features.

After a straightforward deployment and integration period, we had simple and centralized control of our experiments in a single dashboard. 

Unleash has enabled us to scale experiments up and down across our stack without having to release new software. This is especially valuable given the complexity of our system.

Up next: Insights on the challenges we faced with scaling, plus how we used Unleash Edge to overcome them. 

