How we split our codebase down the middle


By far the biggest code improvement we made to Wave was to split our codebase in half.

Our first product was building faster and cheaper money transfer to Africa, by delivering funds directly to M-Pesa and similar systems. That business grew incredibly quickly, but eventually hit a wall: most countries in Africa didn’t have a system like M-Pesa. We realized that this roadblock was actually an opportunity. Instead of just international money transfer, why not build our own mobile money systems in the countries that didn’t have them yet?

To build a mobile money system, we’d need a network of agents throughout each country that users could deposit or withdraw funds at. We decided to bootstrap this agent network using international money transfers, then build the rest of the mobile money system on top of it once it was working. So we made a few tweaks to our existing international money transfer app, adding a “ledger” (keeping track of how much each recipient could withdraw) and “agents” (special Wave user accounts that could process withdrawals for transfer recipients). When it was time to add domestic money transfer, we repurposed our US-to-Africa smartphone app to support Africa-to-Africa as well.

Over the next year we learned a few different things:

Before long, we’d rebuilt our entire user interface, business logic, and infrastructure for domestic transfers, all the way down to the transport layer. But it was still unhappily shackled to the international code base, through our shared abstractions for things like a “user” and “money transfer.” Over time, these started to cause more and more pain:

A couple years after starting the domestic mobile money project, we reorganized the company to split international and domestic transfers into completely different business units, with separate leadership, separate engineering teams, and so on. At this point, it became obvious that they should be separate code bases so that the two teams could work on them with complete independence.

What surprised me about this transition was actually just how separate they already were. I removed huge chunks of our code that were only needed for international transfers, and one of my colleagues on the other team removed all of our domestic code from their codebase. At the end, we compared notes and sloccount output and discovered that the combined total lines of code between the two codebases was lower than the original total! In other words, on net the two codebases were not just not sharing any code, but were actively interfering with each other.

The size reduction was a huge benefit, but the second-order benefit was even bigger: it became way easier to make codebase-wide improvements. Not only did we have less than half as much code to migrate, but we also had less than half as many developers to train on whatever migration was taking place. For the domestic mobile money team, splitting up the codebase unlocked a huge number of other code improvements that ultimately made a huge difference to our tech quality and velocity.

In retrospect, we clearly should have split up our code earlier. The more interesting question is, when exactly should we have done it? What lesson should we actually learn from this? I have a few different ideas.

The biggest lesson for me is actually one of business strategy. In retrospect, we probably could have predicted that international transfers weren’t the best entry point into the domestic money transfer market. If we’d dug into the exchange rate issue more, we would have discovered how hard it would be to convert users from black market remittance to Wave. Not only would this have saved us from viewing our domestic mobile money tech as “international with a few tweaks,” it also would have let us skip months of iterating on a product that wasn’t what we cared most about anyway.

(Why didn’t we see this at the time? I think we weren’t confident enough in our vision. It seemed much less crazy to work on mobile money if it was a small offshoot of our existing business, rather than a completely different product that had a small integration point with our existing business. But fundamentally, it was completely different, and we shouldn’t have convinced ourselves otherwise.)

Given the business strategy we followed, I think starting out with a single codebase was the right move, but we could have made the transition to a split codebase easier in two ways.

First, we could have been more worried about polymorphism and less worried about duplication. Even if we had started with shared User and Transfer abstractions, it became clear almost right away that they had nothing in common. At that point, we should have been way more willing to split apart the concepts into international and domestic versions. In architecture reviews, I’ve learned to look out for this, and bias against reusing existing code or tables for something that seems similar today but likely to diverge in the future.

Second, we could have drawn sharper interface boundaries between modules that would eventually split up. If we had started duplicating instead of polymorphizing, we would have noticed fairly quickly that the international and domestic codebases had a very small interface of shared behavior (in particular, a single endpoint for delivering an international money transfer to a domestic user). At that point, we could have drawn a hard interface boundary and forbidden dependencies between international and domestic code, so that when we did want to split them apart, “all” we had to do was turn some function calls into API endpoints.

Of course, keeping interface boundaries as sharp as possible is mostly just a subset of the general good practice of having cohesive, decoupled modules. But I think it’s useful to have a few particular interface boundaries where we take extra care not to add coupling across them, because we know that they’re especially likely to become a service boundary in the future. “Extra care” can mean things like configuring our ORM to refuse to join across the relevant tables by default (so the database schemas aren’t coupled), or adding a lint rule to forbid the main monolith from importing the service to be split off.


We work on Wave because we think it’s an extremely effective way to improve the world. If that’s how you want to spend your career too, come work with us!