Ignorance may be Strength : Resolving a transaction

Now that we have received and unpacked a transaction on the server, the next step is to handle the fact that multiple clients will all send us "the same" transaction; we need to reconcile those with each other and produce exactly one StoredTransaction.

But how do we know that two transactions are "the same"? Well, very simply, for two transactions to be the same, they must have:

the same content link (exactly the same URL, stray / or ? at the end of either will cause them not to match);
the same hash of the content (this is EXCEEDINGLY important, as it is the only evidence we have of what is in the document at the alleged URL);
the same set of signatories (that is, their URL identities, since each one will only have one signature).

This process is complete when we have seen at least one signature for each of the signatories (it is fine if a client sends through the same request multiple times, even with different signatures, as long as the signature they provide each time is valid).
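The matching rules and the completion rule above can be sketched in a few lines. This is only an illustration: pendingTx, sameTransaction and complete are hypothetical names over simplified string fields, not the project's real types, and the signatory comparison here ignores order.

```go
package main

import (
	"fmt"
	"slices"
)

// pendingTx is a simplified, hypothetical stand-in for the real transaction type.
type pendingTx struct {
	Link    string            // content link, compared byte for byte
	Hash    string            // content hash (a plain string here, for readability)
	Signers []string          // signatory identity URLs
	Sigs    map[string]string // signer URL -> a signature we have seen, if any
}

// sameTransaction applies the three rules: identical link, identical hash,
// and the same set of signatories (order is ignored in this sketch).
func sameTransaction(a, b pendingTx) bool {
	if a.Link != b.Link || a.Hash != b.Hash {
		return false
	}
	as := slices.Clone(a.Signers)
	bs := slices.Clone(b.Signers)
	slices.Sort(as)
	slices.Sort(bs)
	return slices.Equal(as, bs)
}

// complete reports whether at least one signature has been seen for every signatory.
func complete(tx pendingTx) bool {
	for _, s := range tx.Signers {
		if tx.Sigs[s] == "" {
			return false
		}
	}
	return true
}

func main() {
	a := pendingTx{Link: "https://test.com/msg1", Hash: "abc",
		Signers: []string{"https://user1.com/", "https://user2.com/"}}
	b := a
	b.Link = "https://test.com/msg1/" // a stray trailing slash breaks the match
	fmt.Println(sameTransaction(a, a), sameTransaction(a, b))

	a.Sigs = map[string]string{"https://user1.com/": "sig1"}
	fmt.Println(complete(a)) // user2 has not signed yet
}
```

Note that validating each signature is deliberately left out of this sketch; it only shows the comparison and the "is everybody in?" check.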

So, our basic process is going to be this: for each transaction that comes in, we are going to find (or create) an "in-progress" StoredTransaction corresponding to the unique features above. We will then update (or initialize) this with any valid signatures in the submitted transaction. If the signature block is then complete, we will send it on for further processing; otherwise we will store it in a temporary repository. In the fullness of time, we will need to decide on a "housekeeping" strategy for our temporary repository, probably cleaning out everything that is not complete after 24 hours. Given that we are offering "high performance", our node will be expecting to process > 50M transactions per day: if something like 2% of transactions don't get matched, we will be looking at collecting something like 1M "in-progress" transactions every day. On the other hand, it is to be expected that the actual process of signing these documents will probably be carried out by people, and it is thus unreasonable to expect all of the signatories to handle that in less than a day. And finally, before I get to the end of this paragraph, the client end should probably keep these transactions around and track whether or not each of them has gone through and, if one was cleaned up, resubmit it after checking with the other counterparties that they are still planning on signing it.
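To make the housekeeping idea concrete, here is a rough sketch of a sweep over an in-memory map that drops anything first seen more than 24 hours ago. The entry type, the sweep function and the firstSeen field are all assumptions made for illustration; nothing like this exists in the project yet.

```go
package main

import (
	"fmt"
	"time"
)

// entry is a hypothetical record in the temporary repository.
type entry struct {
	firstSeen time.Time
}

// sweep deletes every pending entry first seen more than maxAge before now
// and returns how many entries were removed.
func sweep(pending map[string]entry, now time.Time, maxAge time.Duration) int {
	removed := 0
	for id, e := range pending {
		if now.Sub(e.firstSeen) > maxAge {
			delete(pending, id) // deleting while ranging over a map is safe in Go
			removed++
		}
	}
	return removed
}

// demoSweep runs one sweep over an old and a recent entry.
func demoSweep() (removed, remaining int) {
	now := time.Date(2025, 1, 2, 12, 0, 0, 0, time.UTC) // arbitrary fixed clock
	pending := map[string]entry{
		"old": {firstSeen: now.Add(-25 * time.Hour)},
		"new": {firstSeen: now.Add(-1 * time.Hour)},
	}
	removed = sweep(pending, now, 24*time.Hour)
	return removed, len(pending)
}

func main() {
	// The back-of-envelope figure from the text: 2% of 50M per day.
	fmt.Println(50_000_000 * 2 / 100) // 1000000

	fmt.Println(demoSweep()) // 1 1
}
```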

The Resolver

Let's create a new class (i.e. an interface, a struct and some methods) to handle all this logic, and then have the handler delegate all the work to that class. This is especially worthwhile as I don't like web handlers getting too long any more than I like main methods getting too long. "Unpack the arguments, then delegate" is my motto.

This is just about the minimal thing I can write:

package clienthandler

import (
    "github.com/gmmapowell/ChainLedger/internal/api"
    "github.com/gmmapowell/ChainLedger/internal/records"
    "github.com/gmmapowell/ChainLedger/internal/storage"
)

type Resolver interface {
    ResolveTx(tx *api.Transaction) (*records.StoredTransaction, error)
}

type TxResolver struct {
}

func (r TxResolver) ResolveTx(tx *api.Transaction) (*records.StoredTransaction, error) {
    return nil, nil
}

func NewResolver(store storage.PendingStorage) Resolver {
    return new(TxResolver)
}

RESOLVE_TX_INFRASTRUCTURE:internal/clienthandler/resolver.go

FWIW, I agonized for a while about whether to put this in clienthandler, or whether to create a new package and came down on the side of putting it in clienthandler; but this is not a lifetime decision, so I may well move it when I see some more things to move it nearer.

Then we need to call ResolveTx from the client handler. At the end of the handler in recordstorage.go:

    log.Printf("Have transaction %v\n", tx)
    if stx, err := r.resolver.ResolveTx(&tx); stx != nil {
        // TODO: move the transaction on to the next stage
        log.Printf("TODO: move it next stage")
    } else if err != nil {
        log.Printf("Error resolving tx: %v\n", err)
        resp.WriteHeader(http.StatusInternalServerError)
        return
    } else {
        log.Printf("have acknowledged this transaction, but not yet ready")
    }
}

RESOLVE_TX_INFRASTRUCTURE:internal/clienthandler/recordstorage.go

This assumes we have a field resolver in our object r which is a Resolver. The handler asks it to resolve the incoming transaction against its existing set of records awaiting resolution. It returns two values: if it fully resolves the transaction, it will return a StoredTransaction that can then be moved on to the next stage. On the other hand, if something goes badly wrong, it can return an error, and we report that back to the client to let them know that the transaction request was not processed. And if both are nil, we assume that the resolver has noted the signed request and kept a copy of it, and we have nothing left to do.
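One benefit of delegating through an interface is that the three outcomes can be exercised without any real resolver at all. The following is a sketch under assumptions: transaction, storedTransaction, stubResolver, handle and statusFor are simplified stand-ins invented for this example, and it checks the error before the stored transaction, which is a small variation on the handler above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Simplified stand-ins for api.Transaction and records.StoredTransaction.
type transaction struct{}
type storedTransaction struct{}

type resolver interface {
	ResolveTx(tx *transaction) (*storedTransaction, error)
}

// stubResolver returns whatever it was configured with.
type stubResolver struct {
	stx *storedTransaction
	err error
}

func (s stubResolver) ResolveTx(*transaction) (*storedTransaction, error) {
	return s.stx, s.err
}

var errBadSignature = errors.New("bad signature")

// handle mirrors the tail of the handler: delegate, then map the three outcomes.
func handle(r resolver, resp http.ResponseWriter) string {
	stx, err := r.ResolveTx(&transaction{})
	switch {
	case err != nil:
		resp.WriteHeader(http.StatusInternalServerError)
		return "error"
	case stx != nil:
		return "resolved"
	default:
		return "pending"
	}
}

// statusFor runs handle against a recorder and reports the outcome and HTTP status.
func statusFor(r resolver) (string, int) {
	rec := httptest.NewRecorder()
	outcome := handle(r, rec)
	return outcome, rec.Code
}

func main() {
	fmt.Println(statusFor(stubResolver{stx: &storedTransaction{}})) // resolved 200
	fmt.Println(statusFor(stubResolver{err: errBadSignature}))      // error 500
	fmt.Println(statusFor(stubResolver{}))                          // pending 200
}
```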

Where did that resolver field come from? Well, we changed the definition of RecordStorage and the NewRecordStorage function to store it during construction:

type RecordStorage struct {
    resolver Resolver
}

func NewRecordStorage(r Resolver) RecordStorage {
    return RecordStorage{resolver: r}
}

RESOLVE_TX_INFRASTRUCTURE:internal/clienthandler/recordstorage.go

Now this, in turn, expects to have someone handle a simple store (read: database) for all the transactions that have been submitted so far by all the clients. Let's add a storage interface and a sub-trivial implementation in storage/pending.go:

package storage

import "github.com/gmmapowell/ChainLedger/internal/api"

type PendingStorage interface {
    PendingTx(*api.Transaction) *api.Transaction
}

type MemoryPendingStorage struct {
}

func (mps MemoryPendingStorage) PendingTx(tx *api.Transaction) *api.Transaction {
    return tx
}

func NewMemoryPendingStorage() PendingStorage {
    return new(MemoryPendingStorage)
}

RESOLVE_TX_INFRASTRUCTURE:internal/storage/pending.go

Finally, we can tie all this together by creating things and passing them to the various constructors in the chainledger/main.go file:

func main() {
    log.Println("starting chainledger")
    pending := storage.NewMemoryPendingStorage()
    resolver := clienthandler.NewResolver(pending)
    storeRecord := clienthandler.NewRecordStorage(resolver)
    cliapi := http.NewServeMux()
    cliapi.Handle("/store", storeRecord)
    err := http.ListenAndServe(":5001", cliapi)
    if err != nil && !errors.Is(err, http.ErrServerClosed) {
        fmt.Printf("error starting server: %s\n", err)
    }
}

RESOLVE_TX_INFRASTRUCTURE:cmd/chainledger/main.go

Let's start testing!

It may seem that we haven't really achieved very much there for a lot of code. That's true, but in my experience, that's what building something out often looks like. You put a lot of plumbing in place, but the actual functionality isn't there. But what we have done is to detach all the bits of code that do something from all the bits that are hard to configure, set up and test.

One of the nice things about the way the Go infrastructure is designed (particularly with goroutines) is that it is quite easy to build a test case with a client and a server all running in the same test (spoiler: we will do that later). But if you don't want to test all that, there is still a lot of setup.

On the other hand, I generally like to build tests with more moving parts than is typical - after all this time, I still haven't come to a conclusion as to whether this is a genuine "preference" (I think it's better), or if I just still haven't figured out how to write tests down at the "unit" level properly so that they deliver the value I'm looking for.

Anyway, we are now at the point where I can see how to wire up a "proto-client" (the test case), the Resolver and a simple, in-memory "pending transaction store" to create tests in such a way that we will drive a number of things including the creation of the "pending transaction store", the resolution algorithm, and force the hashing algorithm to become more realistic.

So let's get started.

Automated tests in Go live alongside the code they test, in files whose names end in _test.go, and they are placed in the same package, but with _test on the end. On the whole, I like this. It has the advantage that all the code talking about the same thing is in the same place, it doesn't force you to think of a different (but similar) package name, the test has the ability to "pry" inside the source code if it wants to and when you are looking for tests for a particular function, you know where to look. On the other hand, if you want to understand the disadvantages, merely reverse all those statements. For me, the positives outweigh the negatives, and I took the same decision when designing my programming language.

Apart from the naming conventions, Go tests are just Go code. Each test has a special name and is passed a special argument of type *testing.T which it can use to report results back to the test infrastructure, but there are no special "commands" to check anything: just write ordinary Go conditions and then flag up any errors.

So, we can write our first test:

package clienthandler_test

import (
    "net/url"
    "testing"

    "github.com/gmmapowell/ChainLedger/internal/api"
    "github.com/gmmapowell/ChainLedger/internal/client"
    "github.com/gmmapowell/ChainLedger/internal/clienthandler"
    "github.com/gmmapowell/ChainLedger/internal/storage"
    "github.com/gmmapowell/ChainLedger/internal/types"
)

func TestANewTransactionMayBeStoredButReturnsNothing(t *testing.T) {
    repo, _ := client.MakeMemoryRepo()
    s := storage.NewMemoryPendingStorage()
    r := clienthandler.NewResolver(s)
    tx, _ := api.NewTransaction("https://test.com/msg1", types.Hash([]byte("hash")))
    u1, _ := url.Parse("https://user1.com/")
    pk, _ := repo.PrivateKey(u1)
    u2, _ := url.Parse("https://user2.com/")
    tx.Signer(u1)
    tx.Signer(u2)
    tx.Sign(u1, pk)
    stx, err := r.ResolveTx(tx)
    if stx != nil {
        t.Fatalf("a stored transaction was returned when the message was not fully signed")
    }
    if err != nil {
        t.Fatalf("ResolveTx returned an error: %v\n", err)
    }
}

FIRST_TRIVIAL_TEST:internal/clienthandler/resolver_test.go

Because the code is in a _test package, you need to explicitly import the production code from the same directory, which feels a little strange to me, but I'm sure I'll get used to it.

This test has been carefully crafted to be the one that works with the "dummy" code that I put in the resolver. But it is a valid case: when you enter a first transaction with two signers and only one signature, the resolver is going to accept it without complaint, but is not going to return a StoredTransaction for further processing.

The test is named TestANewTransactionMayBeStoredButReturnsNothing. The Test here is an ongoing part of the naming convention in Go, which is (part of) how functions that are tests are distinguished from other (supporting) functions. The rest of the name describes what we are testing - that we can create a new transaction and when we pass it to the resolver we expect it to be stored, but we receive back neither a resolved transaction nor an error. Of course, at the moment the transaction is not stored, but we are not looking inside the pending store to test that - to drive that behaviour, we have to write our second test.

You'll notice that I'm just assigning all the errors to _ which means that basically I'm ignoring them. While testing, I tend to assume that the ONLY things I care about are the things that I'm explicitly testing. The error return values are there to tell you something was wrong with your input (e.g. url.Parse: the URL is not valid). I hope I don't make those kinds of mistakes in my tests, but if I do, I assume that the value I want back (e.g. u1) won't be valid, and that will break the test later on.
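For what it's worth, here is what "won't be valid" looks like for url.Parse: on failure it returns a nil *url.URL alongside the error, so anything that later dereferences the result will panic. The mustParse and parseFails helpers below are hypothetical; they sketch one middle ground between ignoring errors and checking every one inline.

```go
package main

import (
	"fmt"
	"net/url"
)

// mustParse is a hypothetical test helper: it fails loudly at the point of the
// mistake rather than handing a nil *url.URL to later code.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(fmt.Sprintf("bad test URL %q: %v", raw, err))
	}
	return u
}

// parseFails reports whether url.Parse rejects raw and returns a nil URL.
func parseFails(raw string) bool {
	u, err := url.Parse(raw)
	return u == nil && err != nil
}

func main() {
	// A colon with no scheme in front of it is rejected by url.Parse.
	fmt.Println(parseFails(":not-a-url")) // true

	fmt.Println(mustParse("https://user1.com/").Host) // user1.com
}
```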

In order to run the tests in VSCode, it is necessary to add either a launch configuration or a task. I tried adding both; the launch configuration does not work for me at the moment (I'll keep working at that), but I added these task definitions; running testall does work:

{
    "version": "2.0.0",
    "type": "shell",
    "command": "go",
    "cwd": "${workspaceFolder}",
    "tasks": [
        {
            "label": "install",
            "args": ["install", "-v", "./..."],
            "group": "build",
        },
        {
            "label": "run",
            "args": ["run", "${file}"],
            "group": "build",
        },
        {
            "label": "testall",
            "args": ["test", "-v", "./..."],
            "group": "test",
        },
    ],
}

FIRST_TRIVIAL_TEST:.vscode/tasks.json

(The install and run tasks are there because I copied it from Stack Overflow and it didn't seem worth deleting them.)

Now we can run this task and see something like the following output:

* Executing task: go test -v ./...

? github.com/gmmapowell/ChainLedger/cmd/chainledger [no test files]
? github.com/gmmapowell/ChainLedger/cmd/ledgerclient [no test files]
? github.com/gmmapowell/ChainLedger/internal/api [no test files]
? github.com/gmmapowell/ChainLedger/internal/client [no test files]
? github.com/gmmapowell/ChainLedger/internal/records [no test files]
? github.com/gmmapowell/ChainLedger/internal/storage [no test files]
? github.com/gmmapowell/ChainLedger/internal/types [no test files]
=== RUN TestANewTransactionMayBeStoredButReturnsNothing
--- PASS: TestANewTransactionMayBeStoredButReturnsNothing (0.15s)
PASS
ok github.com/gmmapowell/ChainLedger/internal/clienthandler 0.469s
* Terminal will be reused by tasks, press any key to close it.

Now, I don't know if this is the intention, but that rather guilts me into feeling I should have written more tests. But I'm fairly sure that I don't want to write any tests in the cmd directories, and I don't think I can think of anything interesting to test down at the level of types, so I wonder if I can suppress the messages in "some" directories but not all? A quick Google did not reveal anything, but I will keep my eyes open.

Driving Development

Let's add a second test that verifies that if and when the second user comes along, the transaction is resolved. For this to pass, we are going to have to actually have most of the code working, so this will drive the development process at this point.

func TestTwoCopiesOfTheTransactionAreEnoughToContinue(t *testing.T) {
    repo, _ := client.MakeMemoryRepo()
    s := storage.NewMemoryPendingStorage()
    r := clienthandler.NewResolver(s)
    {
        tx, _ := api.NewTransaction("https://test.com/msg1", types.Hash([]byte("hash")))
        u1, _ := url.Parse("https://user1.com/")
        pk, _ := repo.PrivateKey(u1)
        u2, _ := url.Parse("https://user2.com/")
        tx.Signer(u1)
        tx.Signer(u2)
        tx.Sign(u1, pk)
        r.ResolveTx(tx)
    }
    var stx *records.StoredTransaction
    var err error
    {
        tx, _ := api.NewTransaction("https://test.com/msg1", types.Hash([]byte("hash")))
        u1, _ := url.Parse("https://user1.com/")
        u2, _ := url.Parse("https://user2.com/")
        pk, _ := repo.PrivateKey(u2)
        tx.Signer(u1)
        tx.Signer(u2)
        tx.Sign(u2, pk)
        stx, err = r.ResolveTx(tx)
    }
    if stx == nil {
        t.Fatalf("a stored transaction was not returned after both parties had submitted a signed copy")
    }
    if err != nil {
        t.Fatalf("ResolveTx returned an error: %v\n", err)
    }
}

FAILING_SECOND_TEST:internal/clienthandler/resolver_test.go

This is starting to get long and verbose, but we'll deal with that in a moment. When we run it, it fails:

=== RUN TestTwoCopiesOfTheTransactionAreEnoughToContinue
--- FAIL: TestTwoCopiesOfTheTransactionAreEnoughToContinue (0.41s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2e08ef4]

goroutine 8 [running]:
testing.tRunner.func1.2({0x2ee2680, 0x3053130})
        /usr/local/go/src/testing/testing.go:1632 +0x230
testing.tRunner.func1()
        /usr/local/go/src/testing/testing.go:1635 +0x35e
panic({0x2ee2680?, 0x3053130?})
        /usr/local/go/src/runtime/panic.go:785 +0x132
crypto/rsa.SignPSS({0x2f19fa0, 0xc000080020}, 0x0, 0x0?, {0xc00001c1c0, 0x40, 0x40}, 0xc?)
/usr/local/go/src/crypto/rsa/pss.go:315 +0xb4
github.com/gmmapowell/ChainLedger/internal/api.makeSignature(0x0, {0x2f1ae60?, 0xc0003621c0?})
        /Users/gareth/Projects/ChainLedger/internal/api/transaction.go:89 +0x65
github.com/gmmapowell/ChainLedger/internal/api.(*Transaction).doSign(0xc00009eb00, 0xc0000c2510, 0x0, {0x0?, 0x0?})
        /Users/gareth/Projects/ChainLedger/internal/api/transaction.go:63 +0x4e
github.com/gmmapowell/ChainLedger/internal/api.(*Transaction).Sign(...)
        /Users/gareth/Projects/ChainLedger/internal/api/transaction.go:52
github.com/gmmapowell/ChainLedger/internal/clienthandler_test.TestTwoCopiesOfTheTransactionAreEnoughToContinue(0xc0000a2820)
        /Users/gareth/Projects/ChainLedger/internal/clienthandler/resolver_test.go:58 +0x485
testing.tRunner(0xc0000a2820, 0x2f17ce8)
        /usr/local/go/src/testing/testing.go:1690 +0xf4
created by testing.(*T).Run in goroutine 1
        /usr/local/go/src/testing/testing.go:1743 +0x390
FAIL github.com/gmmapowell/ChainLedger/internal/clienthandler 0.804s
FAIL

Leaving aside how messy this message is, the problem here is exactly the one I said above would happen if you don't check all those error return values. I asked the repo for the private key for u2 and it returned nil because it didn't have one, along with an error saying so. By ignoring that error, I ended up with a panic when I tried to use the nil pointer. I could, of course, test the error condition, but (in my mind) it's easier just to fix the problem I need to fix anyway and have the MakeMemoryRepo function add both users.

func MakeMemoryRepo() (ClientRepository, error) {
    mcr := MemoryClientRepository{clients: make(map[url.URL]*ClientInfo)}
    mcr.NewUser("https://user1.com/")
    mcr.NewUser("https://user2.com/")
    return mcr, nil
}

FIXING_SECOND_TEST:internal/client/repo.go

It's worth noting here that, until forced by press of circumstances to do otherwise, I'm going to ignore the fact that MakeMemoryRepo is in part a test function - it is putting test data in there along with creating the store. The curious reader might ask, "so when will you fix that?" and the answer is, of course, "when I want a version of it that doesn't have the test data in it". Since the MemoryClientRepository can't do anything useful without some users in it, that will be exactly when I come up with some other way of initializing it. In my head, that will be from some JSON configuration file at some point in the future, but I don't want to go there yet. I might also find that I want to write multiple tests that need different users installed, and I would have to deal with it then as well.

Moving on, the test now fails in the "desired" way:

=== RUN TestTwoCopiesOfTheTransactionAreEnoughToContinue
resolver_test.go:62: a stored transaction was not returned after both parties had submitted a signed copy
--- FAIL: TestTwoCopiesOfTheTransactionAreEnoughToContinue (0.37s)

Now we can write the code that trivially solves that. First off, in the resolver:

type TxResolver struct {
    store storage.PendingStorage
}

func (r TxResolver) ResolveTx(tx *api.Transaction) (*records.StoredTransaction, error) {
    curr := r.store.PendingTx(tx)
    complete := true
    for i, v := range tx.Signatories {
        if v.Signature != nil && curr != nil {
            curr.Signatories[i] = v
        } else if v.Signature == nil {
            if curr == nil || curr.Signatories[i].Signature == nil {
                complete = false
            }
        }
    }

    if complete {
        return &records.StoredTransaction{}, nil
    }

    return nil, nil
}

func NewResolver(store storage.PendingStorage) Resolver {
    return &TxResolver{store: store}
}

TRIVIAL_RESOLVER_IMPLEMENTATION:internal/clienthandler/resolver.go

We have completed the implementation of the NewResolver function: because the PendingStorage was not previously used, we hadn't actually placed it in the struct. That now happens.

The ResolveTx method first obtains a pending transaction (i.e. one that was submitted previously) from the store. It then looks at all the signatures on the transaction that has just been presented and:

If there is a signature AND there is a pending transaction, it updates the pending transaction with the signature;
If there is not a signature AND either there is NO pending transaction or the pending transaction does NOT have a signature for this signatory, then we say we will still need more signatures.

If we don't need any more signatures, we return a StoredTransaction and no error; otherwise we return nil for both results.
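If you want to watch those rules play out without the rest of the project, the following miniature reproduces the same loop over simplified types. The signatory, tx and store types and the resolve function are stand-ins invented for this sketch, and it returns a bool where the real method returns a StoredTransaction.

```go
package main

import "fmt"

type signatory struct {
	Signer    string
	Signature []byte
}

type tx struct {
	Signatories []signatory
}

// store holds at most one pending transaction.
type store struct{ pending *tx }

// pendingTx returns the previously stored transaction, or stores t and returns nil.
func (s *store) pendingTx(t *tx) *tx {
	curr := s.pending
	if curr == nil {
		s.pending = t
	}
	return curr
}

// resolve applies the two rules: copy any new signature onto the pending
// transaction, and report "incomplete" if a signature is missing from both.
func resolve(s *store, t *tx) bool {
	curr := s.pendingTx(t)
	complete := true
	for i, v := range t.Signatories {
		if v.Signature != nil && curr != nil {
			curr.Signatories[i] = v
		} else if v.Signature == nil {
			if curr == nil || curr.Signatories[i].Signature == nil {
				complete = false
			}
		}
	}
	return complete
}

func main() {
	s := &store{}
	first := &tx{Signatories: []signatory{
		{"https://user1.com/", []byte("sig1")},
		{"https://user2.com/", nil},
	}}
	second := &tx{Signatories: []signatory{
		{"https://user1.com/", nil},
		{"https://user2.com/", []byte("sig2")},
	}}
	fmt.Println(resolve(s, first))  // false: user2 has not signed yet
	fmt.Println(resolve(s, second)) // true: both signatures are now known
}
```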

Likewise, we need a trivial implementation of MemoryPendingStorage:

type MemoryPendingStorage struct {
    store map[int]*api.Transaction
}

func (mps MemoryPendingStorage) PendingTx(tx *api.Transaction) *api.Transaction {
    curr := mps.store[0]
    if curr == nil {
        mps.store[0] = tx
    }
    return curr
}

func NewMemoryPendingStorage() PendingStorage {
    return &MemoryPendingStorage{store: make(map[int]*api.Transaction)}
}

TRIVIAL_RESOLVER_IMPLEMENTATION:internal/storage/pending.go

If you're expecting code that looks like it might be "plausible", this isn't it. At the moment, we are only considering the case where one transaction is being talked about. In that context, even using the map I've used here is overkill. We will soon need an ID to distinguish between pending transactions, but for now we just use entry "0" to indicate that we either do or do not already have a transaction.

Note that this code is making a whole bunch of assumptions that may or may not be true. They are true for our current test cases, so they pass; we will need to go back and add more test cases that probe these assumptions and fail when the assumptions are exposed. We will then be justified in fixing the code to remove those assumptions. We are also not yet filling in the values in the StoredTransaction because the test does not need it.

(While we are critiquing this code, we should also note that it is not thread safe and that we are updating an object returned from the store in memory: surely that will not work? Well, yes it does for now.)
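As a pointer to where the thread-safety part might go, here is a sketch of a store whose check-and-insert happens under a single mutex, so that only one of many concurrent submitters is told "nothing pending yet". Everything here (lockedStore, the string ID key, countFirsts) is an assumption for illustration, and it only protects the map: mutating the returned transaction outside the lock would still need its own answer.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type tx struct{ link string }

// lockedStore is a hypothetical thread-safe variant keyed by a string ID.
type lockedStore struct {
	mu    sync.Mutex
	store map[string]*tx
}

func newLockedStore() *lockedStore {
	return &lockedStore{store: make(map[string]*tx)}
}

// pendingTx returns the existing entry for id, or stores t and returns nil.
// The lookup and the insert happen under one lock, so they cannot interleave.
func (s *lockedStore) pendingTx(id string, t *tx) *tx {
	s.mu.Lock()
	defer s.mu.Unlock()
	curr := s.store[id]
	if curr == nil {
		s.store[id] = t
	}
	return curr
}

// countFirsts submits the same ID from n goroutines and counts how many of
// them were told that nothing was pending.
func countFirsts(n int) int32 {
	s := newLockedStore()
	var wg sync.WaitGroup
	var firsts atomic.Int32
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.pendingTx("tx1", &tx{link: "https://test.com/msg1"}) == nil {
				firsts.Add(1)
			}
		}()
	}
	wg.Wait()
	return firsts.Load()
}

func main() {
	fmt.Println(countFirsts(100)) // 1
}
```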

But, for now, both tests pass:

=== RUN TestANewTransactionMayBeStoredButReturnsNothing
--- PASS: TestANewTransactionMayBeStoredButReturnsNothing (0.16s)
=== RUN TestTwoCopiesOfTheTransactionAreEnoughToContinue
--- PASS: TestTwoCopiesOfTheTransactionAreEnoughToContinue (0.20s)

Consistent Signers

The first assumption I'm going to deal with in the code above is that all the signers are in the same order. As it currently stands, the code will happily accept two signatures from u1 if the two transactions have the signatories listed in different orders.

To address this - something we will need to do anyway - we are going to require that the transaction automatically sorts the signatories as they are added to the transaction.

Even though we are adding it in order to have the resolution work properly, it is fundamentally a property of the Transaction, so we will add the test there.

package api_test

import (
    "net/url"
    "testing"

    "github.com/gmmapowell/ChainLedger/internal/api"
)

func TestTwoSignaturesAddedInCollatingOrderStayThatWay(t *testing.T) {
    tx, _ := api.NewTransaction("https://test.com", []byte("hashcode"))
    u1, _ := url.Parse("http://user1.com")
    tx.Signer(u1)
    u2, _ := url.Parse("http://user2.com")
    tx.Signer(u2)
    if tx.Signatories[0].Signer.Host != "user1.com" {
        t.Fatalf("the first signer was %s\n", tx.Signatories[0].Signer)
    }
    if tx.Signatories[1].Signer.Host != "user2.com" {
        t.Fatalf("the second signer was %s\n", tx.Signatories[1].Signer)
    }
}

func TestTwoSignaturesAddedInInverseCollatingOrderAreReversed(t *testing.T) {
    tx, _ := api.NewTransaction("https://test.com", []byte("hashcode"))
    u2, _ := url.Parse("http://user2.com")
    tx.Signer(u2)
    u1, _ := url.Parse("http://user1.com")
    tx.Signer(u1)
    if tx.Signatories[0].Signer.Host != "user1.com" {
        t.Fatalf("the first signer was %s\n", tx.Signatories[0].Signer)
    }
    if tx.Signatories[1].Signer.Host != "user2.com" {
        t.Fatalf("the second signer was %s\n", tx.Signatories[1].Signer)
    }
}

FAILING_COLLATION_TESTS:internal/api/transaction_test.go

These two tests are basically identical: in each case, a transaction is created and the same two signers are added. But in the first case they are added in "alphabetical" order, and in the second, the other order. In both cases, the test asserts that they have been sorted into alphabetical order, which is true in the first instance but (currently) false in the second. Let's fix that!

func (tx *Transaction) addSigner(signer *types.Signatory, err error) error {
    if err != nil {
        return err
    }

    for i, s := range tx.Signatories {
        if s.Signer.String() > signer.Signer.String() {
            tx.Signatories = slices.Insert(tx.Signatories, i, signer)
            return nil
        }
    }

    tx.Signatories = append(tx.Signatories, signer)
    return nil
}

ORDERING_SIGNATORIES:internal/api/transaction.go

The code we have added looks at each of the existing signatories in turn and checks if it is lexically greater than the one to be added. If so, the new one is added "here" (as indicated by the range parameter i) and we return immediately.

As I wrote this code, I considered the "equality" case and quickly realized that you shouldn't be able to add the same signatory twice. Let's add a test case to check that:

func TestTheSameSignatoryCannotBeAddedTwice(t *testing.T) {
    tx, _ := api.NewTransaction("https://test.com", []byte("hashcode"))
    u1, _ := url.Parse("http://user1.com")
    tx.Signer(u1)
    err := tx.Signer(u1)
    if err == nil {
        t.Fatalf("we were allowed to add the same signer twice")
    }
}

NO_DUPLICATE_SIGNERS:internal/api/transaction_test.go

In this case, we do note the error return value, since we expect an error. Obviously this fails ...

=== RUN TestTheSameSignatoryCannotBeAddedTwice
transaction_test.go:44: we were allowed to add the same signer twice
--- FAIL: TestTheSameSignatoryCannotBeAddedTwice (0.00s)

So now we need to add another case to addSigner:

func (tx *Transaction) addSigner(signer *types.Signatory, err error) error {
    if err != nil {
        return err
    }

    for i, s := range tx.Signatories {
        if s.Signer.String() > signer.Signer.String() {
            tx.Signatories = slices.Insert(tx.Signatories, i, signer)
            return nil
        } else if s.Signer.String() == signer.Signer.String() {
            return fmt.Errorf("duplicate signer: %s", signer.Signer.String())
        }
    }

    tx.Signatories = append(tx.Signatories, signer)
    return nil
}

NO_DUPLICATE_SIGNERS:internal/api/transaction.go
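
As an aside, the same "find the slot or spot the duplicate" logic can be expressed with slices.BinarySearch, which returns both the insertion index and whether the value is already present. This is a sketch over plain strings rather than the project's Signatory type, and addSorted is a made-up name; for a handful of signers the linear scan above is perfectly reasonable.

```go
package main

import (
	"fmt"
	"slices"
)

// addSorted inserts signer into an already sorted slice, keeping it in
// ascending order, and rejects a signer that is already present.
func addSorted(sorted []string, signer string) ([]string, error) {
	i, found := slices.BinarySearch(sorted, signer)
	if found {
		return sorted, fmt.Errorf("duplicate signer: %s", signer)
	}
	return slices.Insert(sorted, i, signer), nil
}

func main() {
	var signers []string
	signers, _ = addSorted(signers, "http://user2.com")
	signers, _ = addSorted(signers, "http://user1.com")
	_, err := addSorted(signers, "http://user1.com")
	fmt.Println(signers) // [http://user1.com http://user2.com]
	fmt.Println(err)     // duplicate signer: http://user1.com
}
```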

Multiple Transactions

The next challenge to deal with is the idea that we have multiple transactions in progress simultaneously. This means each one needs to be able to be uniquely identified. In my opinion, the content URL should be enough: it seems a reasonable rule that every transaction should have a unique URL. But I am not prepared to be that draconian, nor indeed to insist that duplicate transactions are never allowed in the system. While I don't understand what the use case would be, it seems that some group of people out there might want to have two separate groups of people agree the same contract. (Even as I write this, I consider that it might be that multiple groups of people agree to sign the same contract at different times, in a way similar to how the U.S. States all ratified the same constitution over the course of a couple of years).

So for now, my definition of the same "pending" transaction is one with the same URL, content hash and current Signatories (their signatures will not be considered for obvious reasons). We have already guaranteed that the signers are in the same order every time.

This again feels like a property of the transaction, so we are going to add an ID method to a transaction, which will return a unique SHA-512 hash built up from these elements, each of the URLs being hashed in string form and terminated with a newline. The content hash will just be used in its binary form.

I'm going to be honest here: this is one of the kinds of code I find it difficult to test. What exactly do I want to test and how accurate does it need to be? I want to test that the code does what I have just described, but in fact I don't really care about that, and I certainly don't want it to be fragile if it depends on some implementation detail. What I really want is to be sure that "every" transaction has a unique ID, although obviously at some level that can't be true. So I'm going to settle for saying that a handful of transactions with superficial similarities end up with different IDs.

To write this test, I'm going to imagine two values for each of the four properties, and then generate each of the transactions for the cross-product of those properties (that's 16 transactions) and check that all the transaction IDs are unique.

func TestTransactionsHaveDistinctIDs(t *testing.T) {
    all := make([]types.Hash, 0)
    options := [2]struct {
        l  string
        h  string
        u1 string
        u2 string
    }{
        {"https://test.com/tx1", "hash1", "https://user1.com", "https://user2.com"},
        {"https://test.com/tx2", "hash2", "https://user3.com", "https://user4.com"},
    }

    for i := 0; i < 16; i++ {
        tx, _ := api.NewTransaction(options[bit(3, i)].l, types.Hash([]byte(options[bit(2, i)].h)))
        u1, _ := url.Parse(options[bit(1, i)].u1)
        u2, _ := url.Parse(options[bit(0, i)].u2)
        tx.Signer(u1)
        tx.Signer(u2)
        fmt.Printf("idx %d: %v\n", i, tx)
        all = append(all, tx.ID())
    }

    for i := 0; i < 16; i++ {
        for j := i + 1; j < 16; j++ {
            if bytes.Equal(all[i], all[j]) {
                t.Fatalf("two transactions had the same ID: %d and %d", i, j)
            }
        }
    }
}

func bit(b int, v int) int {
    return (v >> b) & 0x1
}

TX_HASH_ID:internal/api/transaction_test.go

The function bit is used to select the value of the bth bit of a value v between 0 and 15. This is then used to select between the first and second rows of the options data set. All 16 transaction IDs are then added to a slice. The final nested loop compares all distinct pairs of this slice to ensure that there are no duplicates.
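To see the bit selection in isolation, take i = 5, which is 0101 in binary: the link comes from row 0, the hash from row 1, the first user from row 0 and the second user from row 1. The distinctCombinations helper below is an addition for this sketch; it just confirms that the sixteen values of i pick sixteen different row combinations.

```go
package main

import "fmt"

// bit returns bit number b (0 = least significant) of v.
func bit(b int, v int) int {
	return (v >> b) & 0x1
}

// distinctCombinations counts how many different (link, hash, u1, u2) row
// selections the values 0..15 produce.
func distinctCombinations() int {
	seen := make(map[[4]int]bool)
	for i := 0; i < 16; i++ {
		seen[[4]int{bit(3, i), bit(2, i), bit(1, i), bit(0, i)}] = true
	}
	return len(seen)
}

func main() {
	i := 5 // 0101 in binary
	fmt.Println(bit(3, i), bit(2, i), bit(1, i), bit(0, i)) // 0 1 0 1
	fmt.Println(distinctCombinations())                     // 16
}
```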

The ID function itself is in transaction.go:

func (tx Transaction) ID() types.Hash {
    hasher := sha512.New()
    hasher.Write([]byte(tx.ContentLink.String()))
    hasher.Write([]byte("\n"))
    hasher.Write(tx.ContentHash)
    for _, s := range tx.Signatories {
        hasher.Write([]byte(s.Signer.String()))
        hasher.Write([]byte("\n"))
    }
    return hasher.Sum(nil)
}

TX_HASH_ID:internal/api/transaction.go
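
The newline terminators are doing real work here. Without a separator, two different signer lists can concatenate to exactly the same bytes and so hash to the same ID. The sketch below uses a simplified id function over plain strings (not the project's Transaction type) and a deliberately contrived pair of signer lists to show the difference.

```go
package main

import (
	"bytes"
	"crypto/sha512"
	"fmt"
)

// id mirrors the shape of the ID method over simplified inputs; sep is the
// terminator written after the link and after each signer.
func id(link string, contentHash []byte, signers []string, sep string) []byte {
	h := sha512.New()
	h.Write([]byte(link))
	h.Write([]byte(sep))
	h.Write(contentHash)
	for _, s := range signers {
		h.Write([]byte(s))
		h.Write([]byte(sep))
	}
	return h.Sum(nil)
}

// collide reports whether two contrived signer lists produce the same ID
// when sep is used as the terminator.
func collide(sep string) bool {
	hash := []byte("hash")
	one := []string{"https://a.com/b", "https://c.com/"}
	two := []string{"https://a.com/", "bhttps://c.com/"}
	return bytes.Equal(
		id("https://test.com/tx1", hash, one, sep),
		id("https://test.com/tx1", hash, two, sep),
	)
}

func main() {
	fmt.Println(collide(""))   // true: both lists concatenate to the same bytes
	fmt.Println(collide("\n")) // false: the terminator keeps the boundaries distinct
}
```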

Right, so we're pretty much in a position to use that in MemoryPendingStorage, but first we need a new test. But, as I go to write (yes, I do mean copy-and-paste) a new test, I realize that I don't like how ugly, bloated and repetitive they have become, and I'm going to do a quick refactoring or two.

var repo client.ClientRepository
var s storage.PendingStorage
var r clienthandler.Resolver

func setup() {
    repo, _ = client.MakeMemoryRepo()
    s = storage.NewMemoryPendingStorage()
    r = clienthandler.NewResolver(s)
}

func maketx(link string, hash string, userkeys ...any) *api.Transaction {
    tx, _ := api.NewTransaction(link, types.Hash([]byte(hash)))
    var ui *url.URL
    for _, v := range userkeys {
        if vs, ok := v.(string); ok {
            ui, _ = url.Parse(vs)
            tx.Signer(ui)
        } else if vb, ok := v.(bool); ok && vb {
            pk, _ := repo.PrivateKey(ui)
            tx.Sign(ui, pk)
        }
    }
    return tx
}

func TestANewTransactionMayBeStoredButReturnsNothing(t *testing.T) {
    setup()
    tx := maketx("https://test.com/msg1", "hash", "https://user1.com/", true, "https://user2.com/")
    stx, err := r.ResolveTx(tx)
    if stx != nil {
        t.Fatalf("a stored transaction was returned when the message was not fully signed")
    }
    if err != nil {
        t.Fatalf("ResolveTx returned an error: %v\n", err)
    }
}

func TestTwoCopiesOfTheTransactionAreEnoughToContinue(t *testing.T) {
    setup()
    {
        tx := maketx("https://test.com/msg1", "hash", "https://user1.com/", true, "https://user2.com/")
        r.ResolveTx(tx)
    }
    var stx *records.StoredTransaction
    var err error
    {
        tx := maketx("https://test.com/msg1", "hash", "https://user1.com/", "https://user2.com/", true)
        r.ResolveTx(tx)
        stx, err = r.ResolveTx(tx)
    }
    if stx == nil {
        t.Fatalf("a stored transaction was not returned after both parties had submitted a signed copy")
    }
    if err != nil {
        t.Fatalf("ResolveTx returned an error: %v\n", err)
    }
}

REFACTOR_RESOLVER_TESTS:internal/clienthandler/resolver_test.go

Here I pulled the declarations out of the individual tests and up to package level in the test file, and moved the initialization into the setup function.

I then added a helper function maketx that does the heavy lifting of assembling a transaction from its constituent parts. It takes a variadic argument list of any, each element of which is either the string form of a user's URL identity or a boolean: true indicates that the preceding user should sign the transaction.
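The "string, then optional true" convention can be seen in isolation in the sketch below. It is not maketx itself: signer and parseSigners are hypothetical names, there are no keys or signatures involved, and I have used a type switch rather than chained type assertions. It also quietly ignores a boolean that has no user in front of it, which is a choice of mine for the sketch.

```go
package main

import "fmt"

// signer is a hypothetical stand-in for a signatory on a transaction.
type signer struct {
    user   string
    signed bool
}

// parseSigners walks a mixed argument list: a string introduces a new
// signer, and a true that follows marks that signer as having signed.
func parseSigners(userkeys ...any) []signer {
    var out []signer
    for _, v := range userkeys {
        switch x := v.(type) {
        case string:
            out = append(out, signer{user: x})
        case bool:
            // Ignore a boolean that has no preceding user.
            if x && len(out) > 0 {
                out[len(out)-1].signed = true
            }
        }
    }
    return out
}

func main() {
    for _, s := range parseSigners("https://user1.com/", true, "https://user2.com/") {
        fmt.Printf("%s signed=%v\n", s.user, s.signed)
    }
}
```

The price of this convenience is that the compiler cannot check the argument list: anything that is neither a string nor a bool is silently skipped.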

Very good.

Let's add another test: that there can be two transactions in progress at once, and neither completes just because we have two signatures.

func TestTwoIndependentTxsCanExistAtOnce(t *testing.T) {
    setup()
    {
        tx := maketx("https://test.com/msg1", "hash", "https://user1.com/", true, "https://user2.com/")
        stx, err := r.ResolveTx(tx)
        checkNotReturned(t, stx, err)
    }
    {
        tx := maketx("https://test.com/msg2", "hash4", "https://user1.com/", "https://user2.com/", true)
        stx, err := r.ResolveTx(tx)
        checkNotReturned(t, stx, err)
    }
}

func checkNotReturned(t *testing.T, stx *records.StoredTransaction, err error) {
    if stx != nil {
        t.Fatalf("a stored transaction was returned when the message was not fully signed")
    }
    if err != nil {
        t.Fatalf("ResolveTx returned an error: %v\n", err)
    }
}

TWO_INDEPENDENT_TXS_TEST:internal/clienthandler/resolver_test.go

So, yes, I couldn't resist doing a further refactoring as I was adding this test: there was now quite a bit of duplication of the logic that checked that no "stored transaction" was returned and no error was raised. That is now in its own function.

The new test fails, of course:

=== RUN TestTwoIndependentTxsCanExistAtOnce
resolver_test.go:80: a stored transaction was returned when the message was not fully signed
--- FAIL: TestTwoIndependentTxsCanExistAtOnce (0.32s)

But the fix is fairly simple, as we have been angling for it for the past few paragraphs. We just turn the int key of the map into a string and use the ID() of the transaction as the key.

type MemoryPendingStorage struct {
    store map[string]*api.Transaction
}

func (mps MemoryPendingStorage) PendingTx(tx *api.Transaction) *api.Transaction {
    curr := mps.store[string(tx.ID())]
    if curr == nil {
        mps.store[string(tx.ID())] = tx
    }
    return curr
}

func NewMemoryPendingStorage() PendingStorage {
    return &MemoryPendingStorage{store: make(map[string]*api.Transaction)}
}

STORE_PENDING_TXS_BY_ID:internal/storage/pending.go

You may ask "why do we convert tx.ID() into a string there?" The answer is that a slice in Go is not comparable (it can only be compared with nil) and so cannot be a map key, but a string, into which a byte slice is easily converted, is fine. I'm not exactly sure why the language was designed this way. I do understand that to be put in a map, a key must fully support equality. But a string is, underneath, just an immutable sequence of bytes, so it feels as if a slice of bytes ought to work too. Does that mean I could somehow implement equality on my Hash type? If so, I haven't yet seen how. Maybe I will find out one day.
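For what it's worth, one possible route, as far as I can tell from Go's comparability rules: there is no way to define your own equality, but an array of comparable elements is itself comparable, so a fixed-size array type can be used with == and as a map key. Below is a minimal sketch under that assumption. The digest type and digestOf function are hypothetical and are not the types.Hash used in this project.

```go
package main

import (
    "crypto/sha512"
    "fmt"
)

// digest is a hypothetical fixed-size alternative to a slice-based hash.
// Being an array of bytes, it supports == and can be a map key.
type digest [sha512.Size]byte

// digestOf returns the SHA-512 sum of s as a digest.
func digestOf(s string) digest {
    return sha512.Sum512([]byte(s))
}

func main() {
    store := make(map[digest]string)
    store[digestOf("tx1")] = "first"
    store[digestOf("tx2")] = "second"

    fmt.Println(digestOf("tx1") == digestOf("tx1")) // true
    fmt.Println(store[digestOf("tx2")])             // second
    fmt.Println(len(store))                         // 2
}
```

The trade-off is that the size is baked into the type, whereas converting a slice to a string works for a hash of any length.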

Moving on

There are certainly more tests we could write, and there are improvements we could make to this code (for example, testing that the clients have not lied to us about their signatures), but for now I'm ready to move on.

Wednesday, December 25, 2024
