
What does AI-assisted development look like in a big open source project?

With the advent of LLMs and AI coding assistants, we’re seeing more and more reports of companies claiming that “x% of our code is written by AI”. Most of those reports are pure claims, with no way to verify how, or to what extent, AI coding assistants are actually being used.

Do you use the assistant as an elaborate auto-completion engine? Or for building major features? Or maybe for fixing small bugs? Do you give it very detailed instructions to write code from, or does it need to figure things out from a brief one-sentence prompt? On top of that, most projects are not open source, so we can’t see the actual code, let alone detailed labelling of AI usage.

At Unleash, we’re in a unique position because a significant part of our code is open source. Our main repo is a full-stack TypeScript application that’s more than 10 years old, actively worked on by 10+ developers who make 30-50 commits every week and deploy to production multiple times per day.

Since our customers often ask us about our experience with AI coding assistants, we thought it would be useful to share our findings.

How we measured things

At the end of September 2025, we started labelling all our pull requests (PRs) with an AI usage labelling scheme, based on Addy Osmani’s Beyond Vibe Coding book.

We settled on the following notation, ordered from least to most AI involvement: all-human, mostly-human, half-half, mostly-ai, and all-ai.

It’s important to note that these AI usage labels reflect who designed the code and set the boundaries of the solution, not who typed the actual code.

If you write a very detailed problem description in English that almost matches the precision of the programming language itself, then we consider it mostly-human. At the other end of the spectrum, all-ai means you gave the AI a one-sentence prompt like “remove flag X” and it followed existing patterns to remove the flag. If you need to give it a few more hints, then it turns into mostly-ai. When you end up in a 30-minute pairing session with your coding assistant, going back and forth, then it’s half-half.

It’s also important to note that none of these categories means “vibe coding”. It’s all AI-assisted engineering, and even for all-ai solutions a human always reviews the final PR. Having these labels also sets expectations for the code reviewer.
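
For illustration, here is a minimal TypeScript sketch of the label set and a helper for pulling the label off a PR. The constant and function names are hypothetical; only the five label values come from the scheme above.

```typescript
// The five AI usage labels, ordered from least to most AI involvement.
// (Hypothetical encoding; only the label values themselves come from our scheme.)
const AI_USAGE_LABELS = [
  'all-human',
  'mostly-human',
  'half-half',
  'mostly-ai',
  'all-ai',
] as const;

type AiUsageLabel = (typeof AI_USAGE_LABELS)[number];

// Given the label names attached to a PR, return its AI usage label, if any.
function getAiUsageLabel(prLabels: string[]): AiUsageLabel | undefined {
  return AI_USAGE_LABELS.find((label) => prLabels.includes(label));
}
```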

Writing this article is itself part of the gamification, meant to encourage more developers to participate in labelling their PRs.

Finally, this report is not intended to measure individual developer performance or track productivity, since software development is, paradoxically, not about producing code.

Findings

Once each PR was labelled with AI usage, a conventional commit type, and an explicit author, we started asking the following questions:

  • What percentage of the PRs fall into each of the categories?
  • What type of work falls into each of the categories?
  • How does AI usage affect PR size?
  • What are AI usage archetypes among our developers?

Here are our answers from 3 months of analysis.

AI usage by assistance extent

In total, 285 PRs have been labelled for AI usage since late September 2025.
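
As a rough sketch of how such a distribution can be pulled together, here is what a small script against the GitHub REST API (via Octokit) might look like. The repo slug, cut-off date, and label names are illustrative assumptions, not the exact script we ran.

```typescript
import { Octokit } from '@octokit/rest';

const AI_USAGE_LABELS = ['all-human', 'mostly-human', 'half-half', 'mostly-ai', 'all-ai'];

async function printAiUsageDistribution(owner: string, repo: string, since: string) {
  const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

  // Fetch all closed PRs and keep the ones created on or after the cut-off date.
  const prs = await octokit.paginate(octokit.rest.pulls.list, {
    owner,
    repo,
    state: 'closed',
    per_page: 100,
  });
  const labelled = prs.filter((pr) => pr.created_at >= since);

  // Count PRs per AI usage label.
  const counts = new Map<string, number>();
  for (const pr of labelled) {
    const label = pr.labels
      .map((l) => l.name ?? '')
      .find((name) => AI_USAGE_LABELS.includes(name));
    if (label) counts.set(label, (counts.get(label) ?? 0) + 1);
  }

  const total = [...counts.values()].reduce((sum, n) => sum + n, 0);
  for (const label of AI_USAGE_LABELS) {
    const n = counts.get(label) ?? 0;
    console.log(`${label}: ${n} PRs (${((100 * n) / total).toFixed(1)}%)`);
  }
}

// Example invocation (hypothetical):
// printAiUsageDistribution('Unleash', 'unleash', '2025-09-25');
```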

AI Usage Distribution:

  • 42.8% all-human (122 PRs)
  • 14.7% mostly-human (42 PRs)
  • 22.8% half-half (65 PRs)
  • 16.5% mostly-ai (47 PRs)
  • 3.2% all-ai (9 PRs)

Key Findings:

  1. 57.2% of PRs use some form of AI assistance (163 out of 285).
  2. Collaboration over automation: when developers use AI, they prefer collaborating with it rather than fully delegating.

AI usage by work type

We use Conventional Commits to label the different types of work: features, fixes, refactoring, chores, and documentation.
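
Deriving the work type from a PR is mechanical when PR titles follow the Conventional Commits format (feat:, fix:, refactor:, chore:, docs:). A minimal sketch, with a hypothetical function name:

```typescript
// Map a Conventional Commits prefix in a PR title to a work type.
// Assumes squash-merged PRs with titles like "feat: add flag lifecycle stage"
// or "fix(frontend): broken tooltip". Hypothetical helper, for illustration only.
function workTypeFromTitle(title: string): string {
  const match = title.match(/^(\w+)(\([^)]*\))?!?:/);
  switch (match?.[1]) {
    case 'feat':
      return 'feature';
    case 'fix':
      return 'fix';
    case 'refactor':
      return 'refactoring';
    case 'chore':
      return 'chore';
    case 'docs':
      return 'documentation';
    default:
      return 'other';
  }
}
```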

Which types of work attract humans, and which attract AI?

Work Type      AI Adoption   Dominant Pattern
Refactoring    88.9%         mostly-ai (44.4%)
Documentation  88.9%         mostly-human (77.8%)
Features       75.4%         half-half (39.3%)
Fixes          41.3%         all-human (58.7%)
Chores         28.2%         all-human (71.8%)

Key Findings:

  1. Clear specialization: AI was used the most for new features and refactoring, while humans retained debugging and maintenance.
  2. Features favor collaboration: 39.3% half-half, the most collaborative work type.
  3. Refactoring is the most AI-friendly: 61.1% use heavy AI assistance (mostly-ai or all-ai). Mechanical transformations suit AI assistants well.
  4. Bug fixes stay human: Debugging often required context and intuition that AI assistants lacked.

AI usage and batch size

Our work is organized around PRs. We try to create small, short-lived PRs that we merge early and often (low holding cost) and deploy continuously (low transaction cost) to get feedback from production instantly and reduce the risk of large breaking changes.

We wanted to find out how AI usage affects PR size.
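
For the numbers below, a minimal sketch of the aggregation, assuming PR size means total changed lines (additions plus deletions) and a hypothetical per-PR record that already carries the AI label and the line counts (e.g. from the GitHub API):

```typescript
// Hypothetical shape: one record per labelled PR.
interface PrStats {
  aiLabel: string; // e.g. 'half-half'
  additions: number;
  deletions: number;
}

// Average and median lines changed, grouped by AI usage label.
function sizeByLabel(prs: PrStats[]): void {
  const byLabel = new Map<string, number[]>();
  for (const pr of prs) {
    const lines = pr.additions + pr.deletions;
    byLabel.set(pr.aiLabel, [...(byLabel.get(pr.aiLabel) ?? []), lines]);
  }

  const median = (values: number[]): number => {
    const sorted = [...values].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  };

  for (const [label, lines] of byLabel) {
    const avg = lines.reduce((sum, n) => sum + n, 0) / lines.length;
    console.log(`${label}: avg ${avg.toFixed(0)} lines, median ${median(lines)} lines`);
  }
}
```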

AI Category   PRs   Avg Lines   Median Lines   Avg Files   Median Files
Mostly-Human   43   453         103            16.0        4
Half-Half      65   140          85             4.7        3
Mostly-AI      46   123          58             5.3        4
All-AI          9   100          74             3.6        3
All-Human     122    65          19             3.3        2

Key Findings:

AI sweet spot: 58-85 median lines

  • half-half and mostly-ai cluster around medium complexity
  • Predictable, well-scoped tasks suit AI assistants best

Humans handle both extremes

  • Tiny fixes (1-line changes): all-human
  • Massive refactors (10,000+ lines): mostly-human with AI assistance

AI usage by developer

Here we shift our focus from work to the people doing the work.

Do certain developers use AI differently? How does someone else use AI differently from me, so that we can compare our workflows and learn from each other? Why is someone else a skeptic?

By identifying how different people use AI, we can find out whom we should talk to and learn from.

Rank  Developer  Total PRs  All-Human  Mostly-Human  Half-Half  Mostly-AI  All-AI  AI %
1     A          111        47         8             35         19         2       57.7%
2     B           60        46         9              2          0         3       23.3%
3     C           35         4         3             17         10         1       88.6%
4     D           18        11         3              1          3         0       38.9%
5     E           15         0         3              5          7         0       100.0%
6     F           13         4         6              3          0         0       69.2%
7     G           11         2         8              0          1         0       81.8%
8     H            7         1         2              2          2         0       85.7%
9     I            6         5         0              0          1         0       16.7%

Key Findings:

Wide variance: 16.7% to 100% (weighted: 8.8% to 58.3%)

  • AI usage is preference-driven, not mandated

Four archetypes emerge:

  • AI First (76-100%): 4 developers
  • AI Balanced (51-75%): 2 developers
  • AI Cautious (26-50%): 1 developer
  • AI Minimal (0-25%): 2 developers

Is this transferable to other projects?

We ran the same analysis on our private repo with enterprise features (102 PRs).

The core patterns transfer: collaboration over automation holds in both repos, with similar work type patterns and developer archetype distributions.

Executive Summary

How much AI-assisted coding?

  • 57.2% of PRs use AI (163 of 285)
  • Collaborative use dominates over full delegation

What type of work is AI- vs. human-centric?

  • AI-Centric: Refactoring (88.9%), Documentation (88.9%), Features (75.4%)
  • Human-Centric: Bug fixes (41.3%), Chores (28.2%)

Developer Archetypes:

  • AI First (76-100%): 4 developers
  • AI Balanced (51-75%): 2 developers
  • AI Cautious (26-50%): 1 developer
  • AI Minimal (0-25%): 2 developers

PR size with and without AI:

  • Humans handle both extremes (tiny fixes and massive changes with minor AI help)
  • AI assistants are best for predictable, medium-complexity tasks

Note: This article itself was written by a human and edited by AI (mostly-human), while the reports were generated in half-half mode on two different machines.

What’s next?

Here are some ideas for future investigation:

  • Long-term trends: Do we use AI assistance more over time? Or do we go back to more human-written code?
  • Do developers want to participate in these experiments? Or is AI usage tracking destined to become as boring as IDE usage?
  • How does AI usage differ across codebases in different programming languages (e.g. in our SDKs)?
