Ignorance may be Strength : ChainLedger

I'm going to write a blockchain thingy.

I've been aware of blockchain technology for a long while, and "involved" in some sense since 2014 when I met with the Ethereum founders shortly before they held their token sale that launched the public chain. As a financial tool, and in terms of the technology, I was not impressed, but what did impress me - and the team I was working with at the time - was the ability to have a globally mediated, public record of arbitrary facts. The key benefit here is "trust".

Who can you trust? This is always a very difficult question, and we usually introduce systems with "checks and balances", a problem generally summarized by the Latin tag "Quis custodiet ipsos custodes?" (who watches the watchers?). With blockchain, everything happens out in the open, and we can all watch. In order to commit fraud on the blockchain, the miners controlling a majority of the mining power must agree to commit the fraud, and even then anybody who is tracking the chain can see that it happened - even if they can't stop it. Essentially, while committing fraud on a public blockchain may just about be technically possible, it isn't feasible. It's a bit like trying to commit bank fraud when your every move is being reported live on cable TV.

Around the same time in 2014, I was talking to a director of one of the main Wall St banks who was looking at ways to leverage blockchain technology to record contracts between the banks in a way that would allow the fact of the contract having happened to be a matter of public record without actually exposing the details of the contract. In the event of a subsequent dispute about the terms of the contract, the contract could be opened for judicial examination, and would have to match the data stored on the blockchain. In this way, the "trust" of the blockchain can be extended to private documents.

How would that work? Let's consider tax returns as an example. You want to submit a tax return to the government, but while you don't want to make it public, you don't really trust the government. On the other hand, they don't trust you, either. What you can do is sign your tax return with a digital signature, and then they can't fake it. But what if (for whatever reason) you end up submitting multiple versions? They can delete some of them and claim you only submitted the others. It would be good to have a record of what you submitted.

What could happen is that when you submit your return, the government redirects you to a webpage which contains your return. That webpage has a unique URL. You can see the document and validate its hash, check it thoroughly, and then both you and the government computer could sign the same entry on the blockchain, asserting that at this moment, you both see the same thing. Auditors, judges, whoever could later come back and inspect that blockchain and compare the contents of the webpage with the hash. If you submit multiple versions, each submission, along with its date, is right there on the blockchain. Cast-iron security.
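To make the auditor's check concrete, here is a minimal sketch in Go (the language I'll be using for the project). It assumes the hash is SHA-512, which is what ChainLedger uses for transactions; the names `HashDocument` and `MatchesRecord` are invented for illustration, and the "document" is just a byte slice rather than a fetched webpage.

```go
package main

import (
	"bytes"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
)

// HashDocument returns the hex-encoded SHA-512 digest of the document bytes.
// This is the value both parties would sign and record on the chain.
func HashDocument(content []byte) string {
	sum := sha512.Sum512(content)
	return hex.EncodeToString(sum[:])
}

// MatchesRecord reports whether the document presented later still hashes
// to the value that was recorded at submission time.
func MatchesRecord(content []byte, recordedHex string) bool {
	recorded, err := hex.DecodeString(recordedHex)
	if err != nil {
		return false // a malformed record can never match
	}
	sum := sha512.Sum512(content)
	return bytes.Equal(sum[:], recorded)
}

func main() {
	original := []byte("tax return, version 1")
	recorded := HashDocument(original)

	fmt.Println(MatchesRecord(original, recorded))                        // true
	fmt.Println(MatchesRecord([]byte("tax return, version 2"), recorded)) // false
}
```

Any change to the document, however small, gives a different hash, which is what lets a judge or auditor compare a private document against the public record.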

So what is stopping us putting (a record of) all our most important documents on a blockchain? In short, performance: Bitcoin can manage fewer than 10tx/s. To handle (say) 100m US tax returns in the month before the filing deadline, you would need an average of something like 40tx/s - without considering peaks in the load. And that's just for one application.
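As a quick sanity check on that figure, here is the arithmetic as a tiny Go program. It assumes a 30-day month and a perfectly even load, which real filing seasons certainly don't have; `RequiredTPS` is a name invented here.

```go
package main

import "fmt"

// RequiredTPS returns the average transactions per second needed to handle
// `total` transactions spread evenly over `days` days.
func RequiredTPS(total, days float64) float64 {
	const secondsPerDay = 24 * 60 * 60
	return total / (days * secondsPerDay)
}

func main() {
	// 100 million returns over 30 days.
	fmt.Printf("%.1f tx/s\n", RequiredTPS(100_000_000, 30)) // 38.6 tx/s
}
```

So "something like 40" is the average; any realistic peak in the last few days before the deadline would be well above it.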

A lot of things in computer science are trade-offs. But there are a handful of hard-and-fast rules, and possibly the most important (and annoying) of them all is the CAP Theorem. It is one of those "choose any 2" rules: you can have any two of consistency, availability and tolerance of network partitions. Bitcoin and Ethereum are big on consistency - before a block can be mined, it must be agreed what the block is. Understandably enough.

What I am going to do here is choose the other two, and say "consistency, who needs it?" Or, to be more precise, I will work towards consistency, but at any given moment, latency in the system may mean that different nodes see different things. But there should be a point in the past where they all agree on what the situation was.

What we're doing here

OK, enough general background and waffle. Why am I doing this, and specifically, why am I doing this here, on a blog that is supposed to be about things I know nothing about?

Well, to be fair, I don't have a deep understanding of existing blockchains, and I don't think I'm going to gain one here.
I think I do understand the fundamentals of a blockchain, and I hope to prove that (to myself, if nobody else).
This idea has been bouncing around in my head for a decade now, and I think I have it sorted out, but the only way to be sure is to try it.
This place is as good a place as any to "work out loud" and I'd be interested in feedback from anyone who's following along.

I'm going to build this in Go, for "all the right reasons": it is fast; it compiles to a lightweight binary; and it is supported natively by AWS, so I can deploy it there. It is also the case that while I have used Go before, I haven't done that much with it, so this will be an experiment, and hopefully interesting to people out there (especially those that haven't used Go).

I have done a lot of work with truly distributed systems, but I'm aware that makes me one of an ever-smaller minority of software professionals. Computer systems have vastly different characteristics when they are:

just a simple program or script running in one place;
a multi-threaded program or script running in one process;
a set of programs that are working together, whether on one box or in a data center;
distributed around the world.

I am hoping to talk about that, along with some of the ways in which we can develop a system in Go on a single developer machine which still has all the characteristics of a system we can deploy around the world.

And then I'm hoping to actually deploy it around the world using AWS, and see if it does in fact work.

What am I planning to build?

So the plan is to build a distributed ledger with multiple nodes which can record "transactions" and then encode them all on a blockchain.

The set of nodes is "pre-defined" and each node knows that all of the others exist and can communicate with them.

The goal will be to have a system with 4 nodes deployed to AWS (using Lambda and DynamoDB) around the world and to have the system able to "keep up" with each node processing 1000tx/s.

Each transaction is a simple record which has three parts:

A URL which is supposed to point to the "content" of the transaction; ChainLedger is completely agnostic about this, except that it must be a well-formed URL: the URL does not need to point to an actual resource, and there is no requirement that it be readable;
A SHA-512 hash of the contents of the transaction document at the time the transaction is submitted; since ChainLedger has no access to the document, it cannot independently verify its accuracy but depends on the signatories all asserting the document to have the same hash;
One or more signatories, each of which consists of the id for a (previously registered) user and a signature for that user; the content that the users sign is a concatenation of the content URL, the hash and the user ids of all the signatories in collated order; ChainLedger must have access to the public keys for each of the signatories' signing keys and must assert that the signatures are valid before accepting a transaction.
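The three parts above can be sketched roughly as follows. This is a sketch under assumptions, not the ChainLedger implementation: I have picked Ed25519 as the signature scheme (nothing above names one), used newline separators and a hex-encoded hash in the signed payload (the description only says "a concatenation"), and invented all the type and function names. The `keys` map stands in for wherever the public keys of registered users end up living.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"net/url"
	"sort"
)

// Signatory pairs a registered user id with that user's signature.
type Signatory struct {
	UserID    string
	Signature []byte
}

// Transaction holds the three parts listed above.
type Transaction struct {
	ContentURL  string
	ContentHash [sha512.Size]byte
	Signatories []Signatory
}

// SigningPayload builds the bytes every signatory signs: the URL, the hash
// and the user ids in sorted order. Separators and hex are assumptions.
func SigningPayload(contentURL string, hash [sha512.Size]byte, userIDs []string) []byte {
	ids := append([]string(nil), userIDs...) // copy, so the caller's slice is not reordered
	sort.Strings(ids)
	var buf bytes.Buffer
	buf.WriteString(contentURL)
	buf.WriteByte('\n')
	buf.WriteString(hex.EncodeToString(hash[:]))
	for _, id := range ids {
		buf.WriteByte('\n')
		buf.WriteString(id)
	}
	return buf.Bytes()
}

// Verify checks that the URL is well formed and that every signature is valid.
func Verify(tx Transaction, keys map[string]ed25519.PublicKey) error {
	if _, err := url.ParseRequestURI(tx.ContentURL); err != nil {
		return fmt.Errorf("malformed URL: %w", err)
	}
	if len(tx.Signatories) == 0 {
		return fmt.Errorf("at least one signatory is required")
	}
	ids := make([]string, len(tx.Signatories))
	for i, s := range tx.Signatories {
		ids[i] = s.UserID
	}
	payload := SigningPayload(tx.ContentURL, tx.ContentHash, ids)
	for _, s := range tx.Signatories {
		pub, ok := keys[s.UserID]
		if !ok {
			return fmt.Errorf("unknown user %q", s.UserID)
		}
		if !ed25519.Verify(pub, payload, s.Signature) {
			return fmt.Errorf("invalid signature for %q", s.UserID)
		}
	}
	return nil
}

// NewSignedDemo builds a transaction signed with throwaway keys for the named
// users and returns it with the matching public keys (illustration only).
func NewSignedDemo(contentURL string, content []byte, ids ...string) (Transaction, map[string]ed25519.PublicKey) {
	tx := Transaction{ContentURL: contentURL, ContentHash: sha512.Sum512(content)}
	payload := SigningPayload(tx.ContentURL, tx.ContentHash, ids)
	pubs := map[string]ed25519.PublicKey{}
	for _, id := range ids {
		pub, priv, err := ed25519.GenerateKey(rand.Reader)
		if err != nil {
			panic(err)
		}
		pubs[id] = pub
		tx.Signatories = append(tx.Signatories, Signatory{UserID: id, Signature: ed25519.Sign(priv, payload)})
	}
	return tx, pubs
}

func main() {
	tx, pubs := NewSignedDemo("https://example.gov/returns/abc123", []byte("the tax return"), "taxpayer", "tax-office")
	if err := Verify(tx, pubs); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("accepted")
}
```

Because the user ids are sorted before signing, every signatory signs exactly the same bytes no matter what order they are listed in, and changing the URL, the hash or the set of signatories after the fact invalidates all of the signatures.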

The chain is built up as follows:

Once accepted, each transaction is rehashed (with the signatures included, and with a timestamp of when the transaction was accepted); this hash is the ID of the transaction. The accepting node will then sign this hash as a guarantee that none of it can be changed.
Each node will divide the transactions it receives into blocks based on time. Each block will contain the ID of the previous block and the IDs of the individual transactions in ascending collated order.
Each node will produce a hash of the block and then sign that with its own signing key. The hash is the ID of the block.
Each node distributes all of its work to all the other nodes.
Each receiving node double-checks the work of all the originating nodes.
At regular intervals, each node will produce a summary record of the world as it believed it to be a short time in the past. All the nodes should produce the same summary records for the same set of nodes at the same point. However, in the presence of latency or network partitioning, the nodes may have access to different sets of blocks, in which case the summaries may diverge. Unlike transactions and blocks, which are created only once and never change, multiple versions of a summary record may be produced as more data becomes available. However, no history will ever be updated; all the summaries for a given time will continue to exist but can be distinguished by the nodes whose data they contain.
It is an error for two nodes to disagree about a summary if they are working on the same data.
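The first three steps above (transaction IDs, blocks and block IDs) might look something like this in Go. Again, this is a sketch under assumptions: the byte layout of what gets hashed is my own guess, the names are invented, and the node's signatures over the hashes, the distribution between nodes and the summary records are all left out.

```go
package main

import (
	"bytes"
	"crypto/sha512"
	"encoding/binary"
	"fmt"
	"sort"
	"time"
)

// ID is a SHA-512 digest identifying a transaction or a block.
type ID [sha512.Size]byte

// TransactionID rehashes the signed transaction bytes together with the
// time at which the node accepted it.
func TransactionID(signedTx []byte, acceptedAt time.Time) ID {
	var ts [8]byte
	binary.BigEndian.PutUint64(ts[:], uint64(acceptedAt.UnixNano()))
	return sha512.Sum512(append(append([]byte(nil), signedTx...), ts[:]...))
}

// Block holds the previous block's ID and the IDs of its transactions.
type Block struct {
	Prev  ID
	TxIDs []ID
}

// NewBlock copies the transaction IDs and stores them in ascending order,
// so the block does not depend on the order in which they arrived.
func NewBlock(prev ID, txIDs []ID) Block {
	sorted := append([]ID(nil), txIDs...)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i][:], sorted[j][:]) < 0
	})
	return Block{Prev: prev, TxIDs: sorted}
}

// Hash returns the block's ID: a hash over the previous block's ID followed
// by the sorted transaction IDs.
func (b Block) Hash() ID {
	h := sha512.New()
	h.Write(b.Prev[:])
	for _, id := range b.TxIDs {
		h.Write(id[:])
	}
	var out ID
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	now := time.Now()
	a := TransactionID([]byte("signed transaction A"), now)
	b := TransactionID([]byte("signed transaction B"), now)

	var none ID // all zeros, standing in for "no previous block"
	firstID := NewBlock(none, []ID{b, a}).Hash()
	secondID := NewBlock(firstID, nil).Hash()

	fmt.Printf("first block:  %x...\n", firstID[:8])
	fmt.Printf("second block: %x...\n", secondID[:8])
}
```

Because each block's hash covers the previous block's ID, changing anything in an earlier block changes every ID after it, which is what makes this a chain and not just a list.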

There is a lot to unpack there, but this is just an overview, not a waterfall specification. We will come back to all of this in a lot more detail later, and hopefully show how it is possible to consider - and discover - more error cases for a distributed system using automated, repeatable testing.

A Note on Performance

It is often the case with distributed systems that if you add more nodes, you add more performance because they share the work. In terms of the "nodes" I am discussing here, that is not the case because of the requirement for them to cross-check each other's work and duplicate it. In fact, the more nodes there are, the slower the system will go (or else, the more resources it will use).
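To put rough numbers on that, here is a back-of-the-envelope model in Go. It assumes that checking another node's transaction costs about the same as accepting one of your own, which is a simplification I have not measured; the function names are invented.

```go
package main

import "fmt"

// PerNodeLoad is the rate each node must handle: its own transactions plus
// a re-check of the transactions accepted by every other node.
func PerNodeLoad(nodes, ratePerNode int) int {
	return nodes * ratePerNode
}

// TotalWork is the work done across the whole system, which grows with the
// square of the number of nodes.
func TotalWork(nodes, ratePerNode int) int {
	return nodes * PerNodeLoad(nodes, ratePerNode)
}

func main() {
	for _, n := range []int{1, 2, 4, 8} {
		fmt.Printf("%d nodes: %d tx/s per node, %d tx/s of work overall\n",
			n, PerNodeLoad(n, 1000), TotalWork(n, 1000))
	}
}
```

Under this model, four nodes each accepting 1000tx/s means every node is really handling 4000tx/s once the cross-checking is included.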

However, internally, each node can be distributed in a more conventional way in which the work is shared and performance increases with the scale of an individual node. We will probably look into this in great detail as we progress on the journey.

The plan

I've divided this project into four phases, and hopefully I make it all the way through:

The first phase is a quick whip-through building up a client and a server that can build a non-distributed ledger. Where we do slow down to catch our breath, I will talk about my experiences of Go, but basically I'm just going to build something out really quickly.
The second phase is to actually build a "distributed" system with cross-checking and a blockchain. But it's still all going to be on one machine and just looking at "the happy path".
The third phase is going to remain on one machine (and, I hope, just one process) but I'm going to introduce testing elements that enable us to simulate things going wrong, and then to make things go wrong automatically so that we can test that the blockchain is resistant to that.
The fourth phase will hopefully be to put some AWS infrastructure in place to deploy this around the world and see if it can actually deliver the desired performance.

Let's go!


Wednesday, December 11, 2024
