Engineers Do Not Get To Make Startup Mistakes When They Build Ledgers
Not losing track of money is the bare minimum for fintech companies.
And yet, I used to work for a startup that, on every transaction, simply lost track of a couple of cents. As if they fell from our pockets every time we pulled out our wallets.
At this startup, a stock trading platform, the engineering team had followed the mantra of “make it work, make it right, make it fast”; we refused to build a double-entry accounting system.
If you’ve never worked at a startup, you might be shocked. But it’s perfectly normal. Engineers at startups have to buy time from wherever they can, to make room for the key design decisions.
It’s just that…it goes without saying that fintech companies should know better.
We could’ve taken the time to build it right. We could’ve done things better. But we didn’t.
Soon after we launched, we started noticing those few cents. Our vendor acknowledged one amount, and we acknowledged an amount that was almost exact. Just a few cents away.
We were full of excuses: numbers were almost right. They were only a few cents here and there. We called them dancing cents.
Stories like this don’t get aired very often because they’re embarrassing. But I believe it happens all the time.
The problem wasn’t the few cents. Startups are willing to burn way more than that if they can grow exponentially.
The problem was that users were furious. When they bought $5 of Apple stock, and saw an order for $4.98, they instantly hit our customer support chat.
Hell, I get pissed off when I’m charged a custodial fee on my brokerage account, and that’s a standard fee. Imagine if you got charged for no apparent reason!
Furious users weren’t recommending us. Our startup wasn’t growing as a result.
So what did we do? We bit the bullet. Our CEO ordered customer support to manually compensate those cents when a wrong transaction happened.
We even built a Slack bot.
While desperately trying to fix the dancing cents, I learned a lot. It was an intense course on accounting and engineering, at the highest stakes possible.
And in this week’s article, I’ll tell you all the lessons I took away from it.
I’m Alvaro Duran, and this is The Payments Engineer Playbook. Scroll for a while on YouTube and you’ll find tons of tutorials that show you how to pass software design interviews that use payment systems. But there’s not much that teaches you how to build this critical software for real users and real money.
The reason I know this is because I’ve built and maintained payment systems for almost ten years. I’ve been able to see all types of interesting conversations about what works and what doesn’t behind closed doors.
And I thought, “you know what? It’s time we have these conversations in public”.
In The Payments Engineer Playbook, we investigate the technology that transfers money. All to help you become a smarter, more skillful and more successful payments engineer. And we do that by cutting off one sliver of it and extracting tactics from it.
Today’s post is about how to design a ledger.
Ledgers are systems that track down money. But money is a pain in the ass to track down. Why? Sam Denby from Wendover Productions put it very eloquently: money is created when it moves.
Most people don’t think too much about it. It’s one of those things that just works. But money is just a way to keep the score. Without an accounting system, that score is meaningless.
Cash is only needed when you need to keep track of value you expect to get or to give in the future. Money, conceptually, is assets in the future.
If you knew nothing about accounting, you’d probably start by tracking what you sell to your customers. This is what we did at the startup: this user paid $5, that user paid $6, and so on. We naively thought that we only needed to monitor the money coming in, and going out.
What a huge mistake.
I bet that you’ve already noticed that bank transfers aren’t immediate. They’re pretty slow, by Internet standards. Most banks clear transfers on the next business day. Payments online are fast, but transfers aren’t.
This became a headache for us. See, once a payment went through, we were certain that we were going to get some money someday. And we had to use that future money to buy shares on the stock market now.
The way we were tracking money didn’t account for that. We had to have some pending amount somewhere that would clear a few days later, and another amount had to go out immediately to our broker to buy some shares.
Rolling everything back when something went wrong was very difficult. In some corner cases, we didn’t even try.
Single-entry systems make normal use cases very complex. That’s because money is the natural consequence of designing a double-entry system, and not the other way around.
Think about it: It is impossible to decouple the need to record the promise of future assets from the amount that is owed to the owner of those assets.
Not only is it money that makes debt possible: money and debt appear on the scene at exactly the same time. Some of the very first written documents that have come down to us are Mesopotamian tablets recording credits and debits, rations issued by temples, money owed for rent of temple lands, the value of each precisely specified in grain and silver. Some of the earliest works of moral philosophy, in turn, are reflections on what it means to imagine morality as debt-that is, in terms of money.
— David Graeber, Debt: The First 5000 Years
A double-entry system is an accounting method that tracks money at both its source and destination.
There have been many “accounting for developers” guides published already (like Beancount’s, Martin Kleppmann’s and Modern Treasury’s). In this article, I’ll get you up to speed, so that you can read those references later with much more context. I’ve included them at the end.
But before I start, I want to highlight the importance of context when building ledgers.
In The Most Important Book In Payments Is a Data Systems Book, I challenged the generalists: Those who believe that they are ready to tackle any kind of engineering problem, particularly money software, with generic engineering practices.
Ledgers fly in the face of the generalists. Building a ledger correctly is:
- Straightforward in the beginning
- A business enabler down the line
- Impossibly difficult to retroactively “make them right”
Single-entry ledgers, like the one we rushed to build at my startup, can provide information about the flow of funds, but not why those flows happened.
If we wanted to know why a particular movement of money happened, we had to stitch together data from different models. Sometimes, it wasn’t even possible.
Single-entry ledgers are undebuggable:
I think [there] is a very kind of important difference [between] the way we view debugging versus the way it’s been viewed traditionally. Debugging is not merely the act of making bugs go away. It is the act of understanding, gaining knowledge, new knowledge about the way the system works.
— Bryan Cantrill, Debugging Under Fire: Keep your Head when Systems have Lost their Mind
On the other hand, double-entry ledgers give you the what and the why. By adding an extra bit of complexity, grounded in thousands of years of use, you can see money flowing from the point of view of any account. Single-entry ledgers are journals; double-entry ledgers are complete maps.
That’s why dancing cents were impossible to fix in our single-entry system. Once gone, we didn’t know where they were anymore. It could’ve been the foreign exchange, or the broker’s rounding to even mechanism, or the FINRA TAF fees that were collected at the end of the day.
We couldn’t understand the way the system works. And so, we couldn’t make bugs go away.
Had we had a double-entry accounting system, each and every cent would’ve been accounted for. That’s because, under a double-entry system, money always moves from one account to another.
That’s what context gives you: the ability to gain knowledge.
And with that said, let’s discuss the data model of a double-entry ledger.
The very first thing that many engineers do when they want to track money is to do that within their domain models. I call that the balance as property approach: when the Order has a price attribute, or when the expenses table has an amount column, you’re doing exactly that.
Like many bad designs, this approach works best when you’re starting out. It lets you go fast. But, over time, reporting becomes more sophisticated and slow, and payment processing and analysis become more difficult.
If your overnight report job takes multiple hours to run, using the balance as property approach is probably the underlying cause.
If you’re about to build a ledger from scratch, I suggest you treat it as a completely separate data model from which you can derive any financial transaction in your system.
Getting the data model right is crucial in finance. It’s one of the reasons there are many companies that specialize in this.
Ledgers are conceptually a data model, represented by three entities: Accounts, Entries and Transactions.
Most people think of money in terms of what’s theirs, and Accounts are the representation of that point of view. They are the reason why engineers naturally gravitate towards the “simplicity” of single-entry systems. Accounts are both buckets of value, and a particular point of view of how that value changes over time.
Entries represent the flow of funds between Accounts. Crucially, they are always an exchange of value. Therefore, they always come in pairs: an Entry represents one leg of the exchange.
The way we ensure that Entries are paired correctly is with Transactions. Ledgers shouldn’t interact with Entries directly, but through the Transaction entity.
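To make the three entities concrete, here is a minimal sketch in Python. The class names mirror the entities above, but every field, method and account name (`cash`, `shares_at_broker`, `is_balanced`) is an assumption of mine for illustration, not a reference implementation.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from enum import Enum


class Direction(Enum):
    DEBIT = "debit"
    CREDIT = "credit"


@dataclass(frozen=True)
class Account:
    name: str


@dataclass(frozen=True)
class Entry:
    """One leg of an exchange of value."""
    account: Account
    direction: Direction
    amount: Decimal  # never negative: the direction says which way value moves

    def __post_init__(self):
        if self.amount < 0:
            raise ValueError("amount must be non-negative")


@dataclass
class Transaction:
    """Groups the Entries of one exchange so they can be checked together."""
    description: str
    entries: list[Entry] = field(default_factory=list)

    def total(self, direction: Direction) -> Decimal:
        return sum(
            (e.amount for e in self.entries if e.direction is direction),
            Decimal("0"),
        )

    def is_balanced(self) -> bool:
        return self.total(Direction.DEBIT) == self.total(Direction.CREDIT)


# Hypothetical accounts: cash leaves one bucket, shares land in another.
cash = Account("cash")
shares = Account("shares_at_broker")

buy = Transaction(
    "Buy $5 of stock",
    [
        Entry(cash, Direction.CREDIT, Decimal("5.00")),
        Entry(shares, Direction.DEBIT, Decimal("5.00")),
    ],
)
print(buy.is_balanced())  # True
```

Going through the Transaction, rather than touching Entries one by one, gives you a single place to reject an exchange whose legs don’t add up.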
Not every ledger has Transactions, but I think they’re necessary. In Building a powerful Double Entry Accounting system, Nubank’s Lucas Cavalcanti says that they’ve designed Entries with references both to the credit and the debit accounts. That way, the balancing of accounts is enforced by design.
But I think that’s a mistake. Transfers can fail for many reasons, and such a design would make it very complex to represent partial failures.
Transactions are very helpful with that. The Saga pattern fits very well with this way of representing money flows in a ledger.
Sagas trade atomicity for availability. Rather than having a slow transaction locking many tables, you have smaller, discrete actions that break down the same process while exposing intermediate checkpoints along the way. At any of those points, another transaction can come in and do its work. That makes throughput better.
Accounting and Engineering are the two ledger systems.
The Accounting system is how the ledger is seen from the outside, the interface. Ledgers are meant to expose their data aggregated from multiple points of view: Reporting, Financial ratios, and Business Intelligence.
The Engineering system is how the ledger sees itself, the implementation. Ledgers are meant to have checks and balances in place, and must ensure the consistency and accuracy of the data. They’re for fintech companies what CRMs are to the sales team: a system of record, the source of truth.
This, by the way, is the key reason why ledgers are so difficult to scale: the Accounting system pulls towards high availability and low latency, whereas the Engineering system pulls towards strong consistency and schema-on-write checks.
Entries can be in one of three statuses: pending, discarded and posted.
Entries are always created with pending status, containing information on the value exchanged, the direction (credit or debit) and the account they reference.
A common mistake here is to use positive and negative numbers in the amount to represent the direction. I’ll tell you more about this in How Accounts work.
Entries are immutable, except for one thing: pending Entries can be discarded in order to create posted Entries.
There’s a clear alternative design here: keep Entries fully immutable and, instead of discarding a pending Entry, create a reversal Entry to undo it before posting. But I think that Modern Treasury’s design is better.
While valid, this approach [create a reversal Entry to undo a pending one] leads to a messy Account history. We can’t distinguish between debits that were client-initiated and debits that are generated by the ledger as reversals. Listing Entries on an Account doesn’t match our intuition for what Entries actually comprise the current balance of the Account.
Discarding solves this problem by making it easy to see the current Entries (simply exclude any Entries that have discarded_at set), while also not losing any history.
— Matt McNierney, How to Scale a Ledger, Part II
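The pending, discarded and posted lifecycle can be sketched in a few lines of Python. This is my own illustration under stated assumptions: the `Status` enum, the `post` and `current_entries` helpers and the in-memory list are all hypothetical; only the `discarded_at` field name comes from the quote above.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from decimal import Decimal
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    DISCARDED = "discarded"
    POSTED = "posted"


@dataclass(frozen=True)
class Entry:
    account: str
    direction: str  # "debit" or "credit"
    amount: Decimal
    status: Status = Status.PENDING
    discarded_at: Optional[datetime] = None


def post(history: list[Entry], pending: Entry) -> list[Entry]:
    """Return a new history: `pending` is marked discarded, a posted copy is appended."""
    if pending.status is not Status.PENDING:
        raise ValueError("only pending Entries can be posted")
    discarded = replace(
        pending,
        status=Status.DISCARDED,
        discarded_at=datetime.now(timezone.utc),
    )
    posted = replace(pending, status=Status.POSTED)
    return [discarded if e is pending else e for e in history] + [posted]


def current_entries(history: list[Entry]) -> list[Entry]:
    """The Entries that make up the current balance: anything not discarded."""
    return [e for e in history if e.discarded_at is None]


pending = Entry("cash", "credit", Decimal("5.00"))
history = post([pending], pending)
print(len(history), len(current_entries(history)))  # 2 1
```

Nothing is deleted: the discarded Entry stays in the history, and filtering on `discarded_at` gives back the list that matches your intuition of the Account.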
In a double-entry system, the collective amount of all non-discarded Entries with type credit and the collective amount of all non-discarded Entries with type debit are the same. Conceptually, this means that no matter how you move money between your pockets, the total amount stays the same.
A few special accounts, those that represent the outside world (consolidated into the Profit and Loss statement) are exceptional: They can’t be balanced. How much money does the world have, anyway?
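Here is a small sketch of that invariant, assuming a simplified in-memory list of Entries. The account names, including the `rounding_loss` account that catches the two dancing cents, are hypothetical; the point is only that every cent has to land somewhere for the totals to match.

```python
from decimal import Decimal
from typing import NamedTuple


class Entry(NamedTuple):
    account: str
    direction: str  # "debit" or "credit"
    amount: Decimal  # never negative
    discarded: bool = False


def ledger_totals(entries):
    """Sum debits and credits over non-discarded Entries only."""
    live = [e for e in entries if not e.discarded]
    debits = sum((e.amount for e in live if e.direction == "debit"), Decimal("0"))
    credits = sum((e.amount for e in live if e.direction == "credit"), Decimal("0"))
    return debits, credits


entries = [
    # The original pending pair, discarded once the real amounts were known.
    Entry("cash", "credit", Decimal("5.00"), discarded=True),
    Entry("shares", "debit", Decimal("5.00"), discarded=True),
    # The posted legs: the two missing cents get an account of their own.
    Entry("cash", "credit", Decimal("5.00")),
    Entry("shares", "debit", Decimal("4.98")),
    Entry("rounding_loss", "debit", Decimal("0.02")),
]

debits, credits = ledger_totals(entries)
print(debits == credits)  # True
```

Drop the `rounding_loss` line and the check fails, which is exactly the alarm we never had.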
Entries are created in pairs, and we use Transactions to make sure that everything goes how it’s supposed to go:
- A Transaction is posted only when its associated Entries are either posted or discarded (and have been replaced by the posted ones).
- A Transaction that fails partially can be semantically undone with compensating Entries.
Again, this approach makes a lot of sense in the context of Sagas.
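The two rules above can be sketched as follows. This is one possible reading, not the only one: I’m assuming that undoing a partial failure means discarding the legs that are still pending and mirroring the legs that were already posted. The `compensate` helper and the string statuses are hypothetical.

```python
from dataclasses import dataclass, field, replace
from decimal import Decimal

FLIP = {"debit": "credit", "credit": "debit"}


@dataclass(frozen=True)
class Entry:
    account: str
    direction: str  # "debit" or "credit"
    amount: Decimal
    status: str  # "pending", "posted" or "discarded"


@dataclass
class Transaction:
    entries: list[Entry] = field(default_factory=list)

    def is_posted(self) -> bool:
        """Posted only when every non-discarded Entry is posted."""
        live = [e for e in self.entries if e.status != "discarded"]
        return bool(live) and all(e.status == "posted" for e in live)


def compensate(failed: Transaction) -> tuple[Transaction, Transaction]:
    """Undo a partially failed Transaction without deleting anything.

    Returns the original with its pending legs discarded, plus a new
    Transaction whose Entries mirror the legs that had already posted.
    """
    cleaned = Transaction([
        replace(e, status="discarded") if e.status == "pending" else e
        for e in failed.entries
    ])
    reversal = Transaction([
        Entry(e.account, FLIP[e.direction], e.amount, "posted")
        for e in failed.entries
        if e.status == "posted"
    ])
    return cleaned, reversal


# The cash leg posted, but the broker leg never went through.
failed = Transaction([
    Entry("cash", "credit", Decimal("5.00"), "posted"),
    Entry("shares", "debit", Decimal("5.00"), "pending"),
])
cleaned, reversal = compensate(failed)
print(failed.is_posted(), reversal.entries[0].direction)  # False debit
```

Each step leaves the ledger in a state you can inspect, which is the checkpointing that Sagas rely on.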
From the point of view of a single account, the ledger looks as if it implemented a single-entry system. It is associated with multiple Entries in a one-to-many relationship, and the total balance should match the aggregation of all its entries’ individual balances.
With one caveat: depending on the Account’s normal balance, the way to calculate the total is different.
Remember when I said that using negative numbers in an Entry was a mistake? This is because some accounts are meant to be “net credit” and others “net debit”. “Meant to be” is the key here: just because an Account is supposed to be net debit (e.g., the cash in the bank) doesn’t mean it cannot be negative (e.g., overdrawn).
Bundling the amount with the sign is a huge accounting no-no, because you’re left wondering if being negative or positive is the right state of affairs.
Instead of using negative numbers, Accounts have a normal balance: a normal credit balance literally means that the Account is in its normal state when its associated Entries with type credit have a total amount that outweighs its associated Entries with type debit. The reverse is true for a normal debit balance.
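As a rough sketch, the balance calculation could look like this. The `balance` function and the two example accounts are assumptions for illustration; amounts stay non-negative, and the normal balance decides which side is subtracted from which.

```python
from decimal import Decimal


def balance(normal_balance: str, entries) -> Decimal:
    """Compute an Account balance from (direction, amount) pairs.

    Amounts are never negative. The sign of the result only tells you
    whether the Account is on its normal side or not.
    """
    debits = sum((amt for d, amt in entries if d == "debit"), Decimal("0"))
    credits = sum((amt for d, amt in entries if d == "credit"), Decimal("0"))
    if normal_balance == "debit":
        return debits - credits
    if normal_balance == "credit":
        return credits - debits
    raise ValueError("normal_balance must be 'debit' or 'credit'")


# Cash in the bank is normal debit; here it ends up overdrawn.
bank_cash = [("debit", Decimal("100.00")), ("credit", Decimal("120.00"))]
# Money owed to a user is normal credit.
user_funds = [("credit", Decimal("50.00")), ("debit", Decimal("20.00"))]

print(balance("debit", bank_cash))    # -20.00
print(balance("credit", user_funds))  # 30.00
```

A negative result now carries a single, unambiguous meaning: the Account is on the wrong side of its normal balance.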
Ledgers are the clearest example of a hard computer science problem disguised as a non-technical discipline. As many payments engineers know, and as the news will attest, ledgers are hard to get right without proper context.
Feel free to bookmark this article and use it as a handbook when you work with ledgers. It should give you enough to keep you going.
I’ve drawn from many references to put this article together. My favorite “Accounting 101 for Engineers” is this article by Anvil called An Engineer’s Guide to Double-Entry Bookkeeping, which includes some basic Python code to guide the conversation along. Django Hordak, the plug-and-play double accounting system Django package, has a thorough explanation on Double Entry Accounting For Developers. If you want to go in depth, Modern Treasury has a fantastic 6 part series on ledgers (Here’s part I).
I mentioned earlier that there are a few high quality “Accounting for Developers” guides in circulation. Beancount’s, Martin Kleppmann’s and Modern Treasury’s are all worth reading, especially because they try to explain accounting from different angles. See which one resonates with you the most.
Or maybe you want the full tutorial. In which case, Peter Selinger’s is the best.
And last, companies like Uber, Square and Airbnb have published how they’ve implemented double-entry ledgers in their systems. Now that you know more, feel free to check them out.
This has been The Payments Engineer Playbook. I’ll see you next week.
PS: I want to ask you a question.
When I decided to write a post on ledgers, I already knew there were a few good resources out there to help me. But I’ve been reading and watching and reading again, and after more than 100 hours on it, I haven’t finished yet.
Granted, 100 hours is an estimation. But I would be surprised if the actual number was less than 99.
As you can imagine, I have enough to write a small book on How to Build a Future-Proof Ledger (title TBD). Such a book would make fintech startups like the one I used to work for ready to tackle much more valuable problems than keeping track of their users’ money. Or maybe do it at a much bigger scale, who knows.
Would you buy such a book if it existed?
You’ve made it this far in the article, so I’m assuming the topic was interesting. The question that’s percolating in my mind is: would somebody buy a book on engineering ledgers?
I thoroughly enjoyed writing my first book, The Databases of Money, and it would be amazing if enough people expressed their interest in this new book. If so, I’ll be writing a few more posts on the engineering aspects of a scalable ledger, a “building in public” project kind of thing.
Otherwise, I will drop the topic altogether. No hard feelings.
So, if having a defined engineering strategy for building a double-entry ledger is valuable to you, you can do two things.
The first thing is to give a like to this article on LinkedIn. Not only does it help other people find this newsletter, it also gives people who aren’t subscribers the opportunity to express their interest in my writing. A 👍 is all it takes.
The second thing is to tell a friend. I suggest you share this article with this convenient button below. But word of mouth works too!
And if someone you respect shared this article with you, do me a favor and subscribe. Every week I feel I’m getting better at this. That means that my best articles on how to build payment systems are probably yet to be written.
You can only find out if you subscribe to The Payments Engineer Playbook. I’ll see you around.