Thursday, February 13, 2025

Client-side Routing in a Javascript Application

This is a small thing that I'm just doing here so that I can nail down some issues in my own mind. For me at least, there is nothing really "new" here, just a consolidation of things that I think I know into one place.

Client-side routing is key to having "single page web applications" behave like "normal web applications" in that: they show you what you expect when you visit a URL; the URL updates as you navigate through the app; you can bookmark individual pages within the app; and the back button works in a "logical" way, if not necessarily the way in which you would expect.

I don't think perfection can currently be achieved though. In particular, annoying as it is to me, if you specify a "brand new" URL (either by typing it in the address bar or clicking on a bookmark), it seems to me that the browser takes over and will not let the current application intercept the new path within its own context. If it turns out that I have just missed something - or if it changes - I will issue a retraction. In the meantime, let's get on with what I do know.

I have put a "live" version of this on a website, jshistory.gmmapowell.com, which has all the versions next to each other (called v1, v2, etc.). You can look at the examples I've shown here by visiting those pages. If you look at the git repo, it's all in one directory, but the tag (e.g. JSHISTORY_V3) will tell you which is the corresponding version on the website. I'm doing that somewhat manually, so hopefully it all stays in sync.

The Basic Setup

I'm not going to do anything too amazing. In particular, when we do something to "navigate" to a "page" (by changing the URL), the actual page layout is not going to change. All that is going to happen is that a log message is going to be put in the box in the middle of the screen. The box should just have more and more messages in it - if it ever goes back to just "application loaded", that will be because the application has been reloaded from the server, which is something we are trying to avoid.

The HTML outlines our basic setup and page layout:
<!DOCTYPE html>
<html>
  <head>
    <title>Experiments with History and Paths</title>
    <link rel="stylesheet" type="text/css" href="css/history.css">
  </head>
  <body>
    <div class="main">
        <div class="title">Experiments with history and paths</div>
        <div class="logging"></div>
    </div>
    <script src="js/history.js" type="module" charset="UTF-8"></script>
  </body>
</html>

JSHISTORY_V1:jshistory/index.html

Apart from the boilerplate, this specifies that we want to load the CSS file css/history.css and the JavaScript module js/history.js. I'm not actually using modules, but I've got into the habit now.

The actual HTML consists of a "main" frame, which is going to take up the whole screen and have a blue background; a title element across the top; and an inner frame with a lighter background for all the logging information to go in.

The CSS is CSS and basically passes without comment, except for a few notes:
body {
    margin: 0px;
}
.main {
    width: 100vw;
    height: 100vh;
    background-color: lightblue;
}
.title {
    text-align: center;
    font-size: 20px;
    height: 10%;
    padding-top: 10px;
}
.logging {
    position: relative;
    background-color: aliceblue;
    height: 80%;
    margin-left: 10%;
    margin-right: 10%;
    padding: 1px;
    overflow-y: scroll;
}
.message {
    margin: 10px 10px 0px 10px;
    border: solid black 1px;
    padding: 7px;
    font-family: monospace;
}

JSHISTORY_V1:jshistory/css/history.css

The body of an HTML document is set by default to have a margin of 8px in most browsers because that gives white space around a "just text" document. But it's ugly if you want a solid background, so it's best to override it.

The logging class has a position of relative, not because we want to move it from its default position, but because we want it to become a "positioned" element so that it is the "positioned" parent of the elements we will add to it (I will explain why below). It also has a padding of 1px to avoid the collapsing margins problem in which the top margin of the first nested element is applied to the outer box. The overflow-y is set to scroll so that this box will scroll (rather than the whole page) when we have too many messages.

The message class is not used directly in the HTML but is applied to the messages we generate in JavaScript. It has a top margin but not a bottom margin since these would be collapsed into each other anyway, so it's better to be explicit about what we expect to happen.

Turning to the JavaScript code:
var logArea = document.querySelector(".logging");

function scrollTo(elt) {
    logArea.scrollTo(0, elt.offsetTop + elt.offsetHeight - logArea.clientHeight);
}

function write(msg) {
    var elt = document.createElement("div");
    elt.className = "message";
    elt.append(document.createTextNode(msg));
    logArea.append(elt);
    scrollTo(elt);
}

write("application loaded");

JSHISTORY_V1:jshistory/js/history.js

This first makes sure that it can find the element tagged logging in order to write to it. The scrollTo function enables us to scroll to any given element within logging. The offsetTop returns the distance from the "inside" of the first ancestor element which is explicitly positioned (which is logging because we specified its position: relative; in the CSS above) and the offsetHeight is the height of the element. The clientHeight of logArea is the inside height of the containing element and thus the calculation we do here should scroll the contents of the container to align the bottom of elt with the bottom of the container.
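To make the arithmetic concrete, here is a minimal sketch that repeats the sum from scrollTo as a pure function, with no DOM involved. The name scrollTargetFor and the pixel values are made up for illustration; they are not part of the app.

```javascript
// Hypothetical helper: the same sum that scrollTo() performs, as a pure function.
//   offsetTop:    distance from the top of the container's content to the top of the element
//   offsetHeight: height of the element itself
//   clientHeight: visible (inner) height of the container
function scrollTargetFor(offsetTop, offsetHeight, clientHeight) {
    return offsetTop + offsetHeight - clientHeight;
}

// Made-up numbers: a 40px-high message whose top is 560px down inside a 400px-high box.
// Its bottom edge sits at 600px, so scrolling the box by 200px brings that edge
// level with the bottom of the box.
var target = scrollTargetFor(560, 40, 400);
console.log(target); // 200
```

When there are not yet enough messages to fill the box, the result is negative; the browser clamps the scroll position at 0 in that case, so nothing moves.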

The write method takes a text message, creates an element and a text node, styles the element to be of class message, and then appends it to the logArea and scrolls the logArea to ensure that it is showing.

This all works fine as is, but in order to be able to specify "interesting" paths as URLs and still return the correct webapp HTML, we need to do a little monkeying. Since I'm using Apache, this goes in the .htaccess file using mod_rewrite. I'm not going to explain this in detail here, but this is the file:
RewriteRule (js/.*) $1 [L]
RewriteRule (css/.*) $1 [L]
RewriteRule ^.*$ index.html

JSHISTORY_V1:jshistory/HTACCESS

The third line is the one that says "rewrite any path to index.html", but we don't want it to be any file. If the path is js/ or css/, we will want to return those files specifically. While this may seem "obvious", it's easily overlooked, and I did just that this morning and wasted half an hour trying to figure out why the mime type on my JS file was wrong and unacceptable (it wasn't, in fact, just the mime type that was wrong: it was the whole file).

The first two lines handle the js/ and css/ paths and basically say "serve the file the user asked for" and the third line says "anything else, send back index.html". This file goes under v1, etc. so doesn't apply to favicon.ico and is separate for each version of the webapp. Although I've put it in the repository as HTACCESS, my deployment script renames it to .htaccess.
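If it helps to see the rule ordering as code, here is a rough JavaScript model of the three rules. It is only a sketch: it assumes the substitution in the first two rules is the captured group, that the first matching rule wins, and that paths arrive relative to the version directory. The function name resolveRequest is made up, and this is not how Apache is implemented.

```javascript
// Hypothetical model of the .htaccess rules: try each rule in order, first match wins.
function resolveRequest(path) {
    // Rules 1 and 2: an unanchored match on js/... or css/... serves that file.
    var m = path.match(/(js\/.*)/) || path.match(/(css\/.*)/);
    if (m) {
        return m[1];
    }
    // Rule 3: anything else gets the webapp HTML.
    return "index.html";
}

console.log(resolveRequest("js/history.js"));      // js/history.js
console.log(resolveRequest("css/history.css"));    // css/history.css
console.log(resolveRequest("down/again"));         // index.html
console.log(resolveRequest("down/js/history.js")); // js/history.js
```

The last case is worth a look: because the pattern is not anchored, a script requested relative to a nested route still comes back as the real file under this model.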

You can check that all this works properly by visiting the v1 pages on the live site.

Adding Some Links

So we have an application, but it isn't very interesting. What we need is to be able to attempt to navigate around the site. So I'm now going to add a "button bar" to the top (like a toolbar or whatever), and I'm going to put four different gadgets in it.
<!DOCTYPE html>
<html>
  <head>
    <title>Experiments with History and Paths</title>
    <link rel="stylesheet" type="text/css" href="css/history.css">
  </head>
  <body>
      <div class="main">
        <div class="title">Experiments with history and paths</div>
        <div class="buttons">
            <button onclick="url('first')">first</button>
            <button onclick="url('down/again')">down/again</button>
            <a href="linkto">linkto</a>
            <span class="likelink" onclick="url('textlink')">textlink</span>
        </div>
        <div class="logging"></div>
    </div>
    <script src="js/history.js" type="module" charset="UTF-8"></script>
  </body>
</html>

JSHISTORY_V2:jshistory/index.html

Two of them are buttons, called "first" and "down/again". These will attempt to link to "/first" and "/down/again". Then I have a "regular" anchor link called "linkto". And then I have something that looks similar, called "textlink", which is styled to make it look like a hyperlink but is actually just text, for which I have added a click event.

Apart from the "regular" anchor link (linkto), these all give the appearance of working: the message appears in the logging area as you would expect. What doesn't happen is any change to the URL. We need to fix that. The regular anchor link causes the page to refresh and go back to just showing "application loaded". This is always a clue we have done something "wrong". (This needs quite a bit more effort and we will return to that later; for now, ignore it.)

The CSS for this isn't interesting. And on the JavaScript side, all I've done is added the url function, which for now just logs the message:
function url(goto) {
    write(goto);
}

JSHISTORY_V2:jshistory/js/history.js

Introducing the History API

Everybody talks about the History API as if it is some amazing thing. It isn't, and it's very frustrating what it can't do. I understand that this is for security reasons (it doesn't want "this app" being able to see where else the user has been) but from the perspective of a webapp developer, it would be nice if it solved that problem by only telling you about the user history within your application, but told you everything within your application.

Winding back to the beginning, the History API is basically a programmatic version of the "back" and "forward" buttons on your browser. You can call history.back() and history.forward() to literally simulate the pushing of those buttons. You can also call history.go(n) where n is an integer which is negative to go back a number of pages, positive to go forward, or 0 to reload the current page (not something we want to do, obviously - we are trying to avoid reloads).

You can also see how big the history is using history.length, but this isn't as useful as it might first appear, since you cannot see where in that you are, and any number of entries (especially at 0 and history.length-1) might well not be in your application.

So that might all be useful in certain situations, but not for us. The final property on history is called state and by default is null. We will look at how to set it in a moment, and it may be useful to see the current state to decide on a course of action. What you can't do is see the state or url for any other page in the history stack.

But you can manipulate the stack. You can add items to it by calling pushState and change the current top item using replaceState. Because of the lack of transparency from the API (even in the debugger), it's hard to tell exactly what happens when you do these, however, and I don't feel the documentation makes it as clear as it could. But hopefully we can experiment and attempt to characterize the API. For example, when we push a new state, does it insert it or clear the rest of the stack and replace that with the new entry?

So let's make the url function we are already calling use pushState to try and update the URL (we should see the URL in the address bar change) and report on that and the new length of the stack. We'll also get it to store something in the state variable: an auto-incrementing ID that we can then read back. This will also appear in the tracing.

There are complications in JavaScript around dealing with relative paths and resolving them which I don't want to get into yet (although I may come back to it). So to avoid that, I am going to build an explicit path. To do that, I need to first figure out which version you are using, so I added this to the top of my JavaScript:
var pushid = 1;
var version = figureVersion(window.location.pathname);

function figureVersion(path) {
    var idx = path.indexOf('/', 1);
    if (idx == -1) {
        return path;
    } else {
        return path.substring(0, idx);
    }
}

JSHISTORY_V3:jshistory/js/history.js
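To see what figureVersion returns without needing a browser, here is a small sketch that copies the function and feeds it some paths by hand. The sample paths are illustrative stand-ins for window.location.pathname.

```javascript
// Copy of figureVersion from above, so that this example stands alone.
function figureVersion(path) {
    var idx = path.indexOf('/', 1);   // look for the second slash
    if (idx == -1) {
        return path;                  // no second slash: the whole path is the version
    } else {
        return path.substring(0, idx);
    }
}

console.log(figureVersion("/v3/"));           // /v3
console.log(figureVersion("/v3/down/again")); // /v3
console.log(figureVersion("/v3"));            // /v3
console.log(figureVersion("/"));              // /
```

The last case suggests a limitation: if the app were served from the site root, version would be "/", and building a path as version + '/' + goto would give a double slash. That does not arise here because every version lives under its own directory.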

This is my first attempt to implement url, and I was a little surprised at what happened:
function url(goto) {
    var url = window.location;
    url.pathname = version + '/' + goto;
    write("pushing " + pushid + " to " + goto + ": url = " + url);
}

JSHISTORY_V3:jshistory/js/history.js

You can try it for yourself by visiting jshistory.gmmapowell.com/v3 and clicking on first. That's right. The page reloads, which, as I've said before, is always a losing lottery ticket in the single-page webapp game.

The reason, of course, is that assigning to window.location forces a refresh. And even though I'd assigned window.location to another variable, it isn't what you call it that matters, but what value it has. It looks like a URL, but it's actually a more active thing, a Location. So let's try that again.
function url(goto) {
    var url = new URL(window.location.href);
    url.pathname = version + '/' + goto;
    write("pushing " + pushid + " to " + goto + ": url = " + url + "; stack = " + window.history.length);
    window.history.pushState({unique: pushid++}, null, url);
}

JSHISTORY_V4:jshistory/js/history.js

That's better. By copying the URL before we attempt to write to it, we have detached it from the Location object. We can now set the URL in the address bar by calling pushState. And we are done: all the buttons (except the anchor link) work in the way we expect and report an ever-increasing stack size.
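The difference between the two attempts can be seen in isolation. The sketch below builds the target URL from a plain href string, which is all that new URL(window.location.href) needs; the helper name buildRouteUrl is made up for illustration and the href is just an example value.

```javascript
// Hypothetical helper: build the URL for a route from a plain href string.
function buildRouteUrl(currentHref, version, goto) {
    var url = new URL(currentHref);       // a detached copy, not a Location
    url.pathname = version + '/' + goto;  // changes only the copy; nothing navigates
    return url;
}

var here = "https://jshistory.gmmapowell.com/v4/first";
var next = buildRouteUrl(here, "/v4", "down/again");

console.log(next.href); // https://jshistory.gmmapowell.com/v4/down/again
console.log(here);      // https://jshistory.gmmapowell.com/v4/first (unchanged)

// In the browser, the result would then be handed to the History API, e.g.
//   window.history.pushState({unique: 1}, null, next);
```

Because a URL object is just data, writing to its pathname has no side effects; it is only the Location object that treats such a write as a request to navigate.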

Not that we're done yet.

(If you're wondering, the null middle argument to pushState is a weird beast. It is described as title but is not used as the window title. In fact, it doesn't seem to be used at all. So I always set it to null.)

Handling the Page Load

There are two things we need to consider on the initial page load. Recording where we are, and putting something on the stack to go back to.

In some ways, everything we're doing here is a classic "synchronization" or "caching" problem. We have two things that need to be kept in sync, but they are independent, and we need to make sure that we know which has changed and then change the other. Above, we looked at what happened when our app wanted to change the current route: and then we updated the address bar to follow suit. Here, we are looking at when the app first loads with a specific route set; then we need to update our app accordingly. We are just going to log a message about where we are, but a "real" application would obviously need to show the desired content for the specified route.

If the user selects a route "within" our app, it is perfectly possible that they would then expect that "going back" would take us somewhere else in our app (they might also expect to go back exactly where they came from; it's hard to know what users expect). We will assume that they want us to completely take over and have "back" go back to our home page. This means we need to put an extra element into the history stack, "below" as it were, the one that we're at. You can't actually do that, so what we are going to do is to "replace" this one with our home url, and then push the current url, and log both of these actions.

Of course, if a user selects the home page of our app, we just log that and we're done.

On the upside, there is no problem knowing when the page has loaded: it is the only time when it runs through all of our initialization JavaScript (which is why the "Application Loaded" message only comes out once).

Clear? Then let's begin by adding a new method call at the end to our "initialization" phase:
write("application loaded: " + version);
handleLoad(window.location.origin, window.location.pathname.substring(version.length));
On loading, we disassemble the window.location and pass the origin (https://jshistory.gmmapowell.com) and the path (shorn of the version identifier) to a new method handleLoad:
function handleLoad(origin, route) {
    if (route == "/") {
        write("loaded with root; setting state to " + pushid + "; stack = " + window.history.length);
        window.history.replaceState({unique: pushid++}, null);
    } else {
        var nested = new URL(window.location.href);
        var root = new URL(window.location.href);
        root.pathname = version + "/";
        write("loaded with route " + route + "; replacing with " + root + " as " + pushid + "; stack = " + window.history.length);
        window.history.replaceState({unique: pushid++}, null, root);
        window.history.pushState({unique: pushid}, null, nested);
        write("pushing " + nested + " as " + pushid + "; stack = " + window.history.length);
        pushid++;
    }
}

JSHISTORY_V5:jshistory/js/history.js

This divides into two main cases: has the user asked for the "root" of the application (the home page, as it were) or a nested page?

Not much needs doing if this was the root page: we replace the state with a state object (in lieu of the default null) but the path stays the same.

On the other hand, if the user has selected a nested page, we want to replace the current state AND url with an appropriate state and the "root" url; and then push a new state and the requested url. I want to repeat that this is a choice: we could just update the state or we could push multiple in-between urls representing "the most common way" of getting to the request url. This is only really possible to determine in the context of both the application and the intended user experience.
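The decision itself can be separated from the History calls. Here is a sketch, under the same "back goes to the home page" policy, that returns the list of calls a given initial URL would need. The name planInitialLoad and the shape of the returned objects are made up for this example.

```javascript
// Hypothetical helper: work out which History calls the initial load needs,
// without actually touching window.history.
function planInitialLoad(href, version) {
    var here = new URL(href);
    var route = here.pathname.substring(version.length);
    if (route == "/") {
        // Already at the root: just attach a state object to the current entry.
        return [{ op: "replaceState", url: here.href }];
    }
    // Nested page: turn the current entry into the root, then push the nested URL.
    var root = new URL(href);
    root.pathname = version + "/";
    return [
        { op: "replaceState", url: root.href },
        { op: "pushState", url: here.href }
    ];
}

var plan = planInitialLoad("https://jshistory.gmmapowell.com/v5/down/again", "/v5");
for (var step of plan) {
    console.log(step.op + " " + step.url);
}
// replaceState https://jshistory.gmmapowell.com/v5/
// pushState https://jshistory.gmmapowell.com/v5/down/again
```

A different policy (for example, pushing several in-between URLs) would only change the list this function returns, which is one reason to keep the decision apart from the code that carries it out.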

One point of note: the moment you replace the state (or push a new one), the value of window.location is automatically updated, so it is necessary to capture the value of nested before we perform the call to replaceState.

I also made a couple of minor changes to try and make the value of pushid always line up with the message displayed in the box.
function url(goto) {
    var url = new URL(window.location.href);
    url.pathname = version + '/' + goto;
    window.history.pushState({unique: pushid}, null, url);
    write("pushing " + pushid + " to " + goto + ": url = " + url + "; stack = " + window.history.length);
    pushid++;
}

JSHISTORY_V5:jshistory/js/history.js

Handling that Pesky <a> Link

It is, of course, possible to build your complete application without using any explicit <a> links. It is also possible to do it with them, and it would be good if our application could cope with that choice. Since we have an <a> link which doesn't currently work, let's try and wire it up.

Once processing of an <a> link has started, it's too late and there's basically nothing we can do to bring ourselves back from the brink. Instead, we need to target all the <a> links in the document and make sure that they call our function, not exhibit the default behaviour. We can do this by adding a click event handler onto the document, and then checking if it's a local <a> link. The code is not that hard:
function captureLocalAHref(origin) {
    document.addEventListener('click', (ev) => {
        var t = ev.target;
        if (t.tagName === 'A' && t.origin === origin) {
            ev.preventDefault();
            url(t.pathname.substring(version.length+1));
        }
    });
}

JSHISTORY_V6:jshistory/js/history.js

The listener checks that the link being clicked is an <a>, and that the origin in the referenced URL matches our origin (if it doesn't, we can't handle the link locally). If both these criteria are met, preventDefault is called on the event to make sure that the link is not automatically followed, and then our url method is called to perform the local routing.
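The test the listener applies can be tried out on plain strings. This sketch assumes the link's href has already been resolved to an absolute URL (which is what the href property of an <a> element gives you in the browser); the name localRouteFor is made up for illustration.

```javascript
// Hypothetical helper: return the in-app route for a link, or null if the
// link points at another origin and must be left to the browser.
function localRouteFor(href, appOrigin, version) {
    var target = new URL(href);
    if (target.origin !== appOrigin) {
        return null;
    }
    // Strip the version prefix and the slash after it, as the listener does.
    return target.pathname.substring(version.length + 1);
}

var origin = "https://jshistory.gmmapowell.com";
console.log(localRouteFor("https://jshistory.gmmapowell.com/v6/linkto", origin, "/v6")); // linkto
console.log(localRouteFor("https://example.com/v6/linkto", origin, "/v6"));              // null
```

One thing the listener does not cover: if a link contains nested elements, ev.target may be the inner element rather than the <a> itself. Walking up with t.closest('a') would be one way to handle that.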

We install this handler at the end of the script:
captureLocalAHref(window.location.origin);

JSHISTORY_V6:jshistory/js/history.js

Can I Just Go Back?

If you try pushing the back button on your browser, you should see the URL in the address bar change and, magically, the page does not reload. Once you've gone back, you should be able to go forward, and everything should behave just as you wanted it to. At least, as far as the address bar goes. But the logging area is not being updated. Since this is a proxy for updating the content on the screen, this means that while back and forward update the URL, they would not cause the content on the screen to change. Not ideal.

Let's fix that.

There is a window event called popstate which we can capture to handle this. We can add a handler at the end of our script like so:
capturePopstate();

JSHISTORY_V7:jshistory/js/history.js

This will then add the event handler onto the window, like so:
function capturePopstate() {
    window.addEventListener("popstate", (ev) => {
        ev.preventDefault();
        moveTo(new URL(window.location.href), ev.state);
    });
}

JSHISTORY_V7:jshistory/js/history.js

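The event handler calls preventDefault, although this has no effect here: popstate is not cancelable, and by the time it fires the browser has already moved to the new history entry. It then calls the moveTo function: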
function moveTo(url, state) {
    write("moved to " + state.unique + " at " + url + "; stack = " + window.history.length);
}

JSHISTORY_V7:jshistory/js/history.js

Note that this is a new function, not the url function we have been using up until now. In the real world, these two functions will have a lot in common that should be extracted (in fact, moveTo might well be the extraction). Specifically, the url function wants to change the History stack; moveTo does not - it just wants to use it. Our implementation simply logs another message that it has moved up or down the stack and reports the unique id it found there along with the current url (which is passed in from window.location). It also reports the current length of the stack, which should never be different from what it was before the method was called.

(It is at this point where you are frustrated that you cannot see more information, even just how many frames are above and below you in the stack, let alone whether they are part of your app or not. The unique id in this example is intended as a proxy for that, and I think it could be extended to keep track of where you are in the stack if you were that way inclined.)

This is worth playing around with for a while, because it's really quite instructive. Start at the v7 page on the live site and click on a couple of the links. You should see the URLs and unique numbers pop up. Press back a couple of times and check that the unique numbers match the same URLs as they did going up. Now press a link and then back. You'll notice that a gap has now appeared in the unique numbers, and that the stack size has gone down. "Forward" history is destroyed when you click on a new link.
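The idea of using the unique id to keep track of where you are in the stack could be sketched like this. It is only an illustration under stated assumptions: it relies on the observation that pushState discards forward entries, it lives in memory (so it is lost on a full reload), and createStackTracker and its method names are invented for this example.

```javascript
// Hypothetical tracker: mirrors the unique ids we have put on the stack, bottom first.
function createStackTracker() {
    var ids = [];
    var current = -1;   // index of the entry we believe we are on
    return {
        // Call after replaceState on the current entry.
        replaced: function (id) {
            if (current == -1) { ids.push(id); current = 0; }
            else { ids[current] = id; }
        },
        // Call after pushState: forward entries are dropped, the new id goes on top.
        pushed: function (id) {
            ids.length = current + 1;
            ids.push(id);
            current = ids.length - 1;
        },
        // Call from popstate with state.unique.
        // Returns the number of steps moved (negative = back), or null if the id is unknown.
        movedTo: function (id) {
            var idx = ids.indexOf(id);
            if (idx == -1) { return null; }
            var delta = idx - current;
            current = idx;
            return delta;
        },
        below: function () { return current; },
        above: function () { return ids.length - 1 - current; }
    };
}

var tracker = createStackTracker();
tracker.replaced(1);             // initial load at the root
tracker.pushed(2);
tracker.pushed(3);
console.log(tracker.movedTo(1)); // -2 (went back two entries)
console.log(tracker.above());    // 2  (two entries are now "forward" of us)
tracker.pushed(4);               // a new link: entries 2 and 3 are gone
console.log(tracker.above());    // 0
console.log(tracker.movedTo(3)); // null (3 is no longer on the stack)
```

This only knows about entries the app itself created, which is the same restriction the History API imposes, but within the app it answers the "how many frames are above and below me" question.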

Suppressing History

There are occasions when you don't actually want all of your history to be present. As an example, if you are in a "shopping cart" experience, you want the different pages to be available while you are making your selections, etc. but once you have finished with the cart you want the whole thing to be suppressed and erased from history and just be left with a confirmation screen. While I'm not going to change the overall shape of the application, I'm going to add another row of buttons that shows how this could work.

You can see this in the fully working version on the live site, which I point to so that I don't lock up your browser. (v8 does exist if you want to look at that as well, but it will lock up on confirm.)

The idea here is that you push the cart button (on the main row) when you want to enter the cart experience. To make it somewhat realistic, it is only when you do this that the second row of buttons light up. You can then go between the details, payment and review steps to your heart's content, and back works all the way through this. But when you push the confirm button, all of those steps are collapsed and you can't get back to any of the intermediate steps. Pushing the back button takes you back to where you were when you pushed cart.

How does this work? Well, it all ends up being a little complicated. Certainly more complicated than I had expected.

Part of that complication is due to the fact that I want the buttons to light up and turn off at the appropriate moments, but part of it is because of the History API.

Let's start with the UI: we can add the extra buttons to the HTML:
        <div class="buttons">
            <button onclick="url('first')">first</button>
            <button onclick="url('down/again')">down/again</button>
            <a href="linkto">linkto</a>
            <span class="likelink" onclick="url('textlink')">textlink</span>
            <button onclick="launchCart()" class='launch-cart'>cart</button>
        </div>
        <div class="cart buttons">
            <button onclick="cartStep('details')" disabled>details</button>
            <button onclick="cartStep('pay')" disabled>pay</button>
            <button onclick="cartStep('review')" disabled>review</button>
            <button onclick="cartStep('confirm')" disabled>confirm</button>
        </div>

JSHISTORY_V8:jshistory/index.html

When the cart button on the original row is pressed, the launchCart method is invoked:
function launchCart() {
    cartStep("details");
    var s = window.history.state;
    s.launchCart = true;
    window.history.replaceState(s, null);
}

JSHISTORY_V8:jshistory/js/history.js

This does two things: first it "moves" to the cart step details, and then it marks this step as being the first one of the "cart experience" by adding the launchCart property to the history state and then replacing the state. This will (hopefully) enable us to identify the start of the cart experience when we come to leave the cart later.

cartStep is the function which responds to the buttons on the second row (as well as being called from launchCart). It handles moving to the relevant URL and making sure that the cart buttons are either enabled or disabled.
function cartStep(path) {
    if (path != "confirm") {
        url(path);
        cartEnabled(true);
    } else {
        replaceConfirm();
        cartEnabled(false);
    }
}

JSHISTORY_V8:jshistory/js/history.js

The confirm button is fundamentally different from the other three. The first three all just push a new state by calling our old friend url, and then make sure the cart buttons are enabled. The confirm button squashes the stack, replacing the existing entries with /confirm, and then makes sure the cart buttons are disabled, thus moving back into the non-cart mode.

The cartEnabled method is just vanilla monkeying with DOM:
function cartEnabled(enable) {
    var cartButtons = document.querySelectorAll(".cart.buttons button");
    for (var b of cartButtons) {
        if (enable) {
            b.removeAttribute("disabled");
        } else {
            b.setAttribute("disabled", "");
        }
    }
    var launchBtn = document.querySelector(".buttons .launch-cart");
    if (enable) {
        launchBtn.setAttribute("disabled", "");
    } else {
        launchBtn.removeAttribute("disabled");
    }
}

JSHISTORY_V8:jshistory/js/history.js

At the end of the file, we have to wire up the launchCart and cartStep methods so that we can reference them in the HTML:
window.launchCart = launchCart;
window.cartStep = cartStep;

JSHISTORY_V8:jshistory/js/history.js

All this just leaves us needing the replaceConfirm method. Here's my first cut, which is what is in v8 (again, beware, this hangs the browser window):
function replaceConfirm() {
    var url = new URL(window.location.href);
    url.pathname = version + '/confirm';
    while (true) {
        var top = window.history.state.launchCart;
        var prev = window.history.state.unique;
        window.history.replaceState({ unique: pushid }, null, url);
        write("cart confirmed at " + prev + " replaced with " + pushid + " for confirmation: " + url + "; stack = " + window.history.length);
        pushid++;
        window.history.back();
        if (top) {
            break;
        }
    }
}

JSHISTORY_V8:jshistory/js/history.js

I'm not very happy with it anyway (while (true) { ... if (...) break; } never sits well with me), but the real kicker here is that window.history.back() does not take immediate effect. It only "does its thing" when it gets back to the event loop. That means that this code is tricky (impossible?) to debug. The moment you put it in the debugger, the event loop takes over and back() happens. So what's the problem? But the moment you let it rip, it hangs, just complaining that it's stuck in a loop.

On top of that, the moment we call back(), we are setting up for our popstate event handler to be called. This is not something we want on this occasion. So let's go back to the drawing board.
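The deferred behaviour of back() is easy to model. The following is a toy stand-in for the browser, not its real implementation: back() merely queues a move, and nothing changes until runPending (my stand-in for "getting back to the event loop") is called. The names createFakeHistory and runPending are invented for this sketch.

```javascript
// Hypothetical stand-in for the browser's history: back() only queues the move.
function createFakeHistory(states) {
    var index = states.length - 1;   // start on the top entry
    var pending = [];
    return {
        get state() { return states[index]; },
        back: function () { pending.push(-1); },
        // Stands in for control returning to the event loop.
        runPending: function () {
            while (pending.length > 0) {
                index = Math.max(0, index + pending.shift());
            }
        }
    };
}

var fake = createFakeHistory([
    { unique: 1, launchCart: true },
    { unique: 2 },
    { unique: 3 }
]);

fake.back();
console.log(fake.state.unique); // 3 (still on the same entry)
fake.runPending();
console.log(fake.state.unique); // 2 (the move only happens now)
```

Seen this way, the hang makes sense: inside the while (true) loop, control never returns to the event loop, so the current entry never changes and the launchCart flag lower down the stack is never seen.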

A More Complicated replaceConfirm()

We don't need to throw the baby out with the bathwater, but we do need to have replaceConfirm collaborate with popstate. That requires us to introduce a new (global) variable to track the fact that popstate needs to do something special.

After some monkeying around, I decided that the most general thing I could do would be to introduce a function variable. If null, we do the normal thing; if not null, we reset it to null and then call the function.

So here's the variable:
var popMode = null;
And here is the updated popstate handler:
function capturePopstate() {
    window.addEventListener("popstate", (ev) => {
        ev.preventDefault();
        if (popMode) {
            var popFn = popMode;
            popMode = null;
            popFn();
        } else {
            moveTo(new URL(window.location.href), ev.state);
        }
    });
}

JSHISTORY_V9:jshistory/js/history.js

Ironically, this now makes replaceConfirm simpler, mainly because all the loop stuff has gone:
function replaceConfirm() {
    var url = new URL(window.location.href);
    url.pathname = version + '/confirm';
    var atTop = window.history.state.launchCart;

    var prev = window.history.state.unique;
    window.history.replaceState({ unique: pushid }, null, url);
    write("cart confirmed at " + prev + " replaced with " + pushid + " for confirmation: " + url + "; stack = " + window.history.length);
    pushid++;

    if (!atTop) {
        popMode = replaceConfirm;
        window.history.back();
    }
}

JSHISTORY_V9:jshistory/js/history.js

This function behaves in exactly the same way regardless of whether it is being called from the code processing the confirm button, or repeatedly from the popstate event handler. Because the event handler automatically sets the popMode back to null, it only needs to consider setting it (to itself) if we have not reached the first step of the cart experience.
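To check the hand-off between replaceConfirm and the popstate handler without a browser, here is a toy simulation. It assumes a simplified history (an array of entries, with back() deferred until the loop at the bottom delivers it) and assumes that an entry marked launchCart exists somewhere at or below the current one. simulateConfirm, the entry shapes and the starting id of 100 are all made up for this sketch.

```javascript
// Toy simulation of the v9 hand-off. Entries are {state, url}; the entries array is
// modified in place. A queued back() is only delivered by the loop at the bottom,
// which then invokes the popstate handler, as the browser would do later on.
function simulateConfirm(entries) {
    var index = entries.length - 1;   // we start on the top entry
    var queued = 0;                   // number of back() calls not yet delivered
    var popMode = null;
    var nextId = 100;                 // arbitrary starting id for the new states

    function replaceConfirm() {
        var atTop = entries[index].state.launchCart;
        entries[index] = { state: { unique: nextId++ }, url: "/confirm" };
        if (!atTop) {
            popMode = replaceConfirm;
            queued++;                 // stands in for window.history.back()
        }
    }

    function onPopstate() {
        if (popMode) {
            var popFn = popMode;
            popMode = null;
            popFn();
        }
    }

    replaceConfirm();                 // the confirm button was pressed
    while (queued > 0) {              // the "event loop": deliver each deferred back()
        queued--;
        index--;
        onPopstate();
    }
    return { index: index, urls: entries.map(function (e) { return e.url; }) };
}

var result = simulateConfirm([
    { state: { unique: 1 }, url: "/" },
    { state: { unique: 2 }, url: "/first" },
    { state: { unique: 3, launchCart: true }, url: "/details" },
    { state: { unique: 4 }, url: "/pay" },
    { state: { unique: 5 }, url: "/review" }
]);
console.log(result.index);          // 2
console.log(result.urls.join(" ")); // / /first /confirm /confirm /confirm
```

In this model we end up sitting on the entry where the cart was launched, with it and everything above it rewritten to /confirm, while the entries below are untouched.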

One Last Wrinkle

There's one last wrinkle which I tried to fix but failed. Pressing confirm goes back a few items in the history and, on each occasion, sets the URL to be /confirm. It seems to me that it should be possible to delete all the "forward history" at this point. But when I tried calling pushState, which I felt would do that, it didn't. I don't know why, although I suspect it has to do with JavaScript trying to protect you (the user) from me deleting your future history without you actively pushing a button. I may revisit this at some point.

It bugs me, but at the end of the day, it is "just" an elegance thing. If the user wants to go "back" from the confirmation screen, they will get back where they expect to be. They can go forward if they want; while it doesn't make any intuitive sense, all they will see are more confirmation screens: they won't ever end up in an inconsistent state, or have the option to confirm the same transaction again.

Conclusion

I came here primarily to nail down what all the steps and cases are to handle client-side routing using the History API. That being the case, it seems important to write down what I established:
  • There are boring methods to move between the various pages in the history: back(), forward() and go().
  • You can't tell where in the history you are, or how much of it "stays" within this app, but you can see how much total history there is using history.length.
  • You can see the state object associated with the current URL using history.state.
  • When first loading the app, you may need to use replaceState and pushState to put in place the expected behaviour when the user attempts to go "back" from a nested page.
  • At all other times, the navigation should be controlled within the app and there should be a single pushState and a corresponding change in screen layout.
  • It is possible to add a document-level click handler to intercept "standard" <a> links and treat them the same as buttons within the app.
  • When the user pushes the back or forward buttons, a new state will be selected and returned through the popstate event handler. In this case, there needs to be a change in screen layout corresponding to the selected URL, but pushState should not be called. (Depending on your application, you may want to call replaceState.)
  • Because the History API is so weak, you may want to use the state object to track where you are in the stack. A technique similar to our unique flag could indicate where you are and link to a data structure in your webapp which keeps track of which pages have been visited. This can be used in popstate to see how far you have travelled.
  • It is possible to use replaceState to change the URLs at previous locations, provided you set up a loop between your rewriting method and the popstate event handler. This makes it possible to eliminate unwanted "temporary" states.

Monday, February 3, 2025

Internode Communication

Now that the nodes know about each other, they can communicate with each other. This consists of sending messages and receiving them. For now, all we want to do is broadcast every thought we have to all the other nodes and have them be accepted.

Sending Transactions

It's always hard to know whether to write the sender or the receiver first. With the sender, it is at least clear that the code is "present", if not necessarily correct, so I'm going to do the sending first, with the expectation that I will see some kind of error because the URL does not exist.

The idea is going to be that every time we create an item and store it in the journal, we will then also turn around and ask the configuration what all the other nodes are and then send the message (in binary form) to a URL constructed from the node URL and a path, say /remotetx.

So, let's review the code in clienthandler I'm planning to change:
// ServeHTTP implements http.Handler.
func (r RecordStorage) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
    log.Printf("asked to store record with length %d\n", req.ContentLength)

    body, err := io.ReadAll(req.Body)
    if err != nil {
        log.Printf("Error: %v\n", err)
        resp.WriteHeader(http.StatusBadRequest)
        return
    }
    log.Printf("have json input %s\n", string(body))

    var tx = api.Transaction{}
    err = json.Unmarshal(body, &tx)
    if err != nil {
        log.Printf("Error unmarshalling: %v\n", err)
        resp.WriteHeader(http.StatusBadRequest)
        return
    }

    log.Printf("Have transaction %v\n", &tx)
    if stx, err := r.resolver.ResolveTx(&tx); stx != nil {
        r.journal.RecordTx(stx)
    } else if err != nil {
        log.Printf("Error resolving tx: %v\n", err)
        resp.WriteHeader(http.StatusInternalServerError)
        return
    } else {
        log.Printf("have acknowledged this transaction, but not yet ready")
    }
}

REDO_CONFIG:internal/clienthandler/recordstorage.go

This is a method on the struct type RecordStorage, which has the following fields:
type RecordStorage struct {
    resolver Resolver
    journal  storage.Journaller
}

REDO_CONFIG:internal/clienthandler/recordstorage.go

What I want to do then is to have an array of endpoints to connect to and send the binary version of the stored transaction to each one. What does that look like? Well, what I have "in my hand", as it were, from the last chapter is a slice of NodeConfigs for the OtherNodes. But what I want here is something that can send a binary blob. Given that I would like to be able to test this code, I think what I'm going to do is declare a BinarySender interface and require that we are given a slice of those. (Note: I have no intention of actually writing any unit tests for this specific code because I am too lazy, but I will happily accept any pull requests; I am merely making sure that it would be testable.)

The interface looks like this:
package internode

type BinarySender interface {
    Send(path string, blob []byte)
}

INTERNODE_TRANSACTIONS:internal/internode/binarysender.go
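To show what "testable" might mean here, this is a minimal sketch of a test double for the interface. RecordingSender, sentMessage and Sent are names I have made up for illustration; they are not in the repo. The mutex is there on the assumption that Send may be called from several goroutines at once.

```go
package main

import (
    "fmt"
    "sync"
)

// BinarySender mirrors the interface shown above.
type BinarySender interface {
    Send(path string, blob []byte)
}

// sentMessage is a hypothetical record of a single Send call.
type sentMessage struct {
    Path string
    Blob []byte
}

// RecordingSender is a hypothetical test double: it remembers what it was
// asked to send instead of touching the network.
type RecordingSender struct {
    mu   sync.Mutex
    sent []sentMessage
}

// Send implements BinarySender by storing a copy of the arguments.
func (r *RecordingSender) Send(path string, blob []byte) {
    r.mu.Lock()
    defer r.mu.Unlock()
    // Copy the blob so later changes by the caller cannot alter the record.
    r.sent = append(r.sent, sentMessage{Path: path, Blob: append([]byte(nil), blob...)})
}

// Sent returns a copy of the calls recorded so far.
func (r *RecordingSender) Sent() []sentMessage {
    r.mu.Lock()
    defer r.mu.Unlock()
    return append([]sentMessage(nil), r.sent...)
}

func main() {
    rec := &RecordingSender{}
    var s BinarySender = rec
    s.Send("/remotetx", []byte{1, 2, 3})
    for _, m := range rec.Sent() {
        fmt.Printf("%s: %d bytes\n", m.Path, len(m.Blob))
    }
}
```

A test could hand a slice of these to whatever code does the sending and then inspect Sent() to see which paths and blobs were requested.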

We can do the work here to make sure we have these passed in and available to us:
type RecordStorage struct {
    resolver Resolver
    journal  storage.Journaller
    senders  []internode.BinarySender
}

func NewRecordStorage(r Resolver, j storage.Journaller, senders []internode.BinarySender) RecordStorage {
    return RecordStorage{resolver: r, journal: j, senders: senders}
}

INTERNODE_TRANSACTIONS:internal/clienthandler/recordstorage.go

And then we can marshal the transaction to binary and send it over the wire:
// ServeHTTP implements http.Handler.
func (r RecordStorage) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
    log.Printf("asked to store record with length %d\n", req.ContentLength)

    body, err := io.ReadAll(req.Body)
    if err != nil {
        log.Printf("Error: %v\n", err)
        resp.WriteHeader(http.StatusBadRequest)
        return
    }
    log.Printf("have json input %s\n", string(body))

    var tx = api.Transaction{}
    err = json.Unmarshal(body, &tx)
    if err != nil {
        log.Printf("Error unmarshalling: %v\n", err)
        resp.WriteHeader(http.StatusBadRequest)
        return
    }

    log.Printf("Have transaction %v\n", &tx)
    if stx, err := r.resolver.ResolveTx(&tx); stx != nil {
        r.journal.RecordTx(stx)
        blob, err := stx.MarshalBinary()
        if err != nil {
            log.Printf("Error marshalling tx: %v %v\n", tx.ID(), err)
            return
        }
        for _, bs := range r.senders {
            go bs.Send("/remotetx", blob)
        }
    } else if err != nil {
        log.Printf("Error resolving tx: %v\n", err)
        resp.WriteHeader(http.StatusInternalServerError)
        return
    } else {
        log.Printf("have acknowledged this transaction, but not yet ready")
    }
}

INTERNODE_TRANSACTIONS:internal/clienthandler/recordstorage.go

Note that when we invoke the sender, we do so in a new goroutine. This ensures that all of our messages are sent in parallel, which is really important because almost all network communication is a question of sitting around for a long time waiting for responses, especially when you are doing HTTPS requests halfway around the world (which is absolutely our intention here).
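The handler above fires off each send and forgets about it. If you ever wanted to wait for all the sends to finish (in a test, say, or during shutdown), one option is a sync.WaitGroup. This is only a sketch: slowSender and broadcast are hypothetical names, and the sleep stands in for network latency.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
    "time"
)

// BinarySender mirrors the interface from the post.
type BinarySender interface {
    Send(path string, blob []byte)
}

// slowSender is a hypothetical stand-in for a remote node: it sleeps to
// simulate latency and then counts the call.
type slowSender struct {
    delay time.Duration
    calls *int32
}

func (s slowSender) Send(path string, blob []byte) {
    time.Sleep(s.delay)
    atomic.AddInt32(s.calls, 1)
}

// broadcast sends the blob to every sender in its own goroutine and
// returns only when all of them have finished.
func broadcast(senders []BinarySender, path string, blob []byte) {
    var wg sync.WaitGroup
    for _, bs := range senders {
        wg.Add(1)
        go func(bs BinarySender) {
            defer wg.Done()
            bs.Send(path, blob)
        }(bs)
    }
    wg.Wait()
}

func main() {
    var calls int32
    senders := []BinarySender{
        slowSender{50 * time.Millisecond, &calls},
        slowSender{50 * time.Millisecond, &calls},
        slowSender{50 * time.Millisecond, &calls},
    }
    start := time.Now()
    broadcast(senders, "/remotetx", []byte("blob"))
    // Because the sends overlap, the elapsed time should be close to one
    // delay rather than the sum of all three.
    fmt.Printf("%d sends finished in %v\n", atomic.LoadInt32(&calls), time.Since(start))
}
```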

That leaves us with a few loose ends. Firstly, we aren't passing in any BinarySenders to the NewRecordStorage constructor. I'm going to go off and quietly sort that out since it's mainly a bookkeeping exercise turning NodeConfigs into HttpBinarySender objects. Secondly, there isn't in fact an HttpBinarySender type yet, so we need to create one.
package internode

import (
    "bytes"
    "log"
    "net/http"
    "net/url"
)

type HttpBinarySender struct {
    cli *http.Client
    url *url.URL
}

// Send implements BinarySender.
func (h *HttpBinarySender) Send(path string, blob []byte) {
    tourl := h.url.JoinPath(path).String()
    log.Printf("sending blob(%d) to %s\n", len(blob), tourl)
    resp, err := h.cli.Post(tourl, "application/octet-stream", bytes.NewReader(blob))
    if err != nil {
        log.Printf("error sending to %s: %v\n", tourl, err)
        return
    }
    // Always close the response body so the underlying connection is released.
    defer resp.Body.Close()
    if resp.StatusCode/100 != 2 {
        log.Printf("bad status code sending to %s: %d\n", tourl, resp.StatusCode)
    }
}

func NewHttpBinarySender(url *url.URL) BinarySender {
    return &HttpBinarySender{cli: &http.Client{}, url: url}
}

INTERNODE_TRANSACTIONS:internal/internode/httpbinarysender.go

We create one of these for each of the remote nodes. When we create it, we create a new http.Client instance. The net/http documentation states that a Client is safe for concurrent use by multiple goroutines, so it is fine for Send to be called from several goroutines at once. We also store the "base url" which comes from the "name" of the remote node.

When Send is called, we append the relative path which has been passed in to the base url, and then call Post on the client, specifying the binary MIME type application/octet-stream and the binary blob. Finally we check for errors and invalid status returns.
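To see the URL joining and the status check in isolation, here is a small sketch that posts to a throwaway local server created with httptest. postBlob and tryPost are hypothetical helpers, not part of the repo, and the five-second timeout is my own assumption (a zero-valued http.Client has no timeout at all).

```go
package main

import (
    "bytes"
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
    "net/url"
    "time"
)

// postBlob is a hypothetical helper that does the same work as Send but
// returns the status code instead of logging it.
func postBlob(cli *http.Client, base *url.URL, path string, blob []byte) (int, error) {
    tourl := base.JoinPath(path).String()
    resp, err := cli.Post(tourl, "application/octet-stream", bytes.NewReader(blob))
    if err != nil {
        return 0, err
    }
    defer resp.Body.Close()
    // Drain the body so the connection can be reused.
    io.Copy(io.Discard, resp.Body)
    return resp.StatusCode, nil
}

// tryPost starts a local test server that only knows about /remotetx,
// posts a small blob to the given path, and returns the status code.
func tryPost(path string) (int, error) {
    mux := http.NewServeMux()
    mux.HandleFunc("/remotetx", func(w http.ResponseWriter, r *http.Request) {
        io.Copy(io.Discard, r.Body)
        w.WriteHeader(http.StatusOK)
    })
    srv := httptest.NewServer(mux)
    defer srv.Close()

    base, err := url.Parse(srv.URL)
    if err != nil {
        return 0, err
    }
    cli := &http.Client{Timeout: 5 * time.Second}
    return postBlob(cli, base, path, []byte{1, 2, 3})
}

func main() {
    for _, p := range []string{"/remotetx", "/nosuchpath"} {
        status, err := tryPost(p)
        fmt.Println(p, status, err)
    }
}
```

The registered path should come back with a 200 and the unregistered one with a 404, which is the case the status check in Send is there to catch.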

And finally, we don't currently have a method to marshal the stored transaction to binary format, so we want to add that. It's all a bit clumsy, but the main method looks like this:
func (s *StoredTransaction) MarshalBinary() ([]byte, error) {
    ret := types.NewBinaryMarshallingBuffer()
    s.TxID.MarshalBinaryInto(ret)
    s.WhenReceived.MarshalBinaryInto(ret)
    types.MarshalStringInto(ret, s.ContentLink.String())
    s.ContentHash.MarshalBinaryInto(ret)
    types.MarshalInt32Into(ret, int32(len(s.Signatories)))
    for _, sg := range s.Signatories {
        sg.MarshalBinaryInto(ret)
    }
    s.NodeSig.MarshalBinaryInto(ret)
    return ret.Bytes(), nil
}

INTERNODE_TRANSACTIONS:internal/records/storedtransaction.go

All of the subsidiary methods are implemented in various files in the types package. You can look at them if you want, but they all have fairly obvious implementations. At the end of the day, marshalling everything comes down to either an integer or a length followed by a byte slice.
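To give a flavour of what "an integer or a length followed by a byte slice" can look like, here is a minimal sketch using the standard library. These are not the functions from the types package: the names, the use of bytes.Buffer and the big-endian byte order are all assumptions of mine for illustration.

```go
package main

import (
    "bytes"
    "encoding/binary"
    "fmt"
)

// marshalInt32Into writes v as four bytes, most significant byte first.
// (The byte order is an assumption; the real code may differ.)
func marshalInt32Into(buf *bytes.Buffer, v int32) {
    var tmp [4]byte
    binary.BigEndian.PutUint32(tmp[:], uint32(v))
    buf.Write(tmp[:])
}

// marshalBytesInto writes the length of b followed by b itself.
func marshalBytesInto(buf *bytes.Buffer, b []byte) {
    marshalInt32Into(buf, int32(len(b)))
    buf.Write(b)
}

// marshalStringInto treats a string as a length-prefixed byte slice.
func marshalStringInto(buf *bytes.Buffer, s string) {
    marshalBytesInto(buf, []byte(s))
}

// encodeExample is a hypothetical two-field record: a link and a hash.
func encodeExample(link string, hash []byte) []byte {
    var buf bytes.Buffer
    marshalStringInto(&buf, link)
    marshalBytesInto(&buf, hash)
    return buf.Bytes()
}

func main() {
    out := encodeExample("hi", []byte{0xAA, 0xBB})
    fmt.Printf("% x\n", out) // prints: 00 00 00 02 68 69 00 00 00 02 aa bb
}
```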

When we run the harness tests, we see these messages come out:
2025/01/11 21:13:04 sending blob(998) to http://localhost:5002/remotetx
2025/01/11 21:13:04 bad status code sending to http://localhost:5002/remotetx: 404
This confirms that we have built a blob (a total of 998 bytes long) and that it was sent but not received. The 404 code of course means that there wasn't anyone at home listening for a /remotetx request.

Maybe we should fix that.

Receiving Transactions

On the other end, we want to implement a handler for the /remotetx request and within that handler unmarshal the binary buffer. In the fulness of time we want to check that the message was valid and store it, as well as starting to build up a block identical to the one on the originating server. But that's for later. For now, can we just get far enough that the 404s go away and we have a StoredTransaction object in our hands?

First, let's just make the message go away:
func (node *ListenerNode) startAPIListener(resolver Resolver, journaller storage.Journaller) {
    cliapi := http.NewServeMux()
    pingMe := PingHandler{}
    cliapi.Handle("/ping", pingMe)
    senders := make([]internode.BinarySender, len(node.config.OtherNodes()))
    for i, n := range node.config.OtherNodes() {
        senders[i] = internode.NewHttpBinarySender(n.Name())
    }
    storeRecord := NewRecordStorage(resolver, journaller, senders)
    cliapi.Handle("/store", storeRecord)
    remoteTxHandler := internode.NewTransactionHandler()
    cliapi.Handle("/remotetx", remoteTxHandler)
    node.server = &http.Server{Addr: node.config.ListenOn(), Handler: cliapi}
    err := node.server.ListenAndServe()
    if err != nil && !errors.Is(err, http.ErrServerClosed) {
        fmt.Printf("error starting server: %s\n", err)
    }
}

INTERNODE_NO_404:internal/clienthandler/node.go

which of course requires an implementation of internode.NewTransactionHandler:
package internode

import (
    "io"
    "log"
    "net/http"
)

type TransactionHandler struct {
}

// ServeHTTP implements http.Handler.
func (t *TransactionHandler) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
    buf, _ := io.ReadAll(req.Body)
    log.Printf("have received an internode request length: %d\n", len(buf))
}

func NewTransactionHandler() *TransactionHandler {
    return &TransactionHandler{}
}

INTERNODE_NO_404:internal/internode/transactionhandler.go

And then when we run the harness again, we receive positive affirmation from the recipient rather than errors from the sender:
2025/01/11 21:31:41 http://localhost:5001 recording tx with id [4 218 55 214 175 54 220 250 78 117 196 247 29 62 161 49 195 14 174 5 130 83 68 180 28 142 193 230 200 152 149 6 200 103 247 206 210 105 168 58 109 68 240 129 167 170 44 72 245 166 178 5 137 126 30 96 51 3 107 47 13 169 33 22], have 12 at 0xc000500080
2025/01/11 21:31:41 sending blob(998) to http://localhost:5002/remotetx
2025/01/11 21:31:41 have received an internode request length 998
Excellent. Now let's unmarshal the buffer. This is basically the mirror image of what we did for marshalling, so as before, I'll just show you the overview. This is what happens in the HTTP handler:
func (t *TransactionHandler) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
    buf, err := io.ReadAll(req.Body)
    if err != nil {
        log.Printf("could not read the buffer from the request: %v\n", err)
        // Report the failure so that the sender's status check can see it.
        resp.WriteHeader(http.StatusBadRequest)
        return
    }
    log.Printf("have received an internode request length: %d\n", len(buf))
    stx, err := records.UnmarshalBinaryStoredTransaction(buf)
    if err != nil {
        log.Printf("could not unpack the internode message: %v\n", err)
        resp.WriteHeader(http.StatusBadRequest)
        return
    }
    log.Printf("unmarshalled message to: %v\n", stx)
}

INTERNODE_UNMARSHALLING:internal/internode/transactionhandler.go
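Since the sender logs anything other than a 2xx status as a failure, the receiver's status code is the way a bad message gets noticed at the other end. Here is a minimal sketch of that idea, exercised in-process with httptest so that no network is involved. decode, handleRemoteTx and statusFor are hypothetical stand-ins: decode simply rejects an empty body, where the real code would unmarshal a stored transaction.

```go
package main

import (
    "bytes"
    "errors"
    "fmt"
    "io"
    "log"
    "net/http"
    "net/http/httptest"
)

// decode is a hypothetical stand-in for the real unmarshalling step.
func decode(buf []byte) error {
    if len(buf) == 0 {
        return errors.New("empty message")
    }
    return nil
}

// handleRemoteTx answers 400 when the body cannot be read or decoded,
// and 200 otherwise.
func handleRemoteTx(resp http.ResponseWriter, req *http.Request) {
    buf, err := io.ReadAll(req.Body)
    if err != nil {
        log.Printf("could not read the request body: %v", err)
        resp.WriteHeader(http.StatusBadRequest)
        return
    }
    if err := decode(buf); err != nil {
        log.Printf("could not unpack the internode message: %v", err)
        resp.WriteHeader(http.StatusBadRequest)
        return
    }
    resp.WriteHeader(http.StatusOK)
}

// statusFor runs the handler against the given body and returns the
// status code it wrote.
func statusFor(body []byte) int {
    req := httptest.NewRequest(http.MethodPost, "/remotetx", bytes.NewReader(body))
    rec := httptest.NewRecorder()
    handleRemoteTx(rec, req)
    return rec.Code
}

func main() {
    fmt.Println(statusFor([]byte{1, 2, 3})) // prints: 200
    fmt.Println(statusFor(nil))             // prints: 400
}
```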

And we need to add this code to unmarshal the stored transaction:
func UnmarshalBinaryStoredTransaction(bytes []byte) (*StoredTransaction, error) {
    buf := types.NewBinaryUnmarshallingBuffer(bytes)
    stx := StoredTransaction{}
    stx.TxID, _ = types.UnmarshalHashFrom(buf)
    stx.WhenReceived, _ = types.UnmarshalTimestampFrom(buf)
    cls, _ := types.UnmarshalStringFrom(buf)
    stx.ContentLink, _ = url.Parse(cls)
    stx.ContentHash, _ = types.UnmarshalHashFrom(buf)
    nsigs, _ := types.UnmarshalInt32From(buf)
    stx.Signatories = make([]*types.Signatory, nsigs)
    for i := 0; i < int(nsigs); i++ {
        stx.Signatories[i], _ = types.UnmarshalSignatoryFrom(buf)
    }
    // Note: MarshalBinary writes NodeSig last, but it is not read back here, so it stays empty.
    _ = buf.ShouldBeDone()
    return &stx, nil
}

INTERNODE_UNMARSHALLING:internal/records/storedtransaction.go

(Note that to make this easier to understand, I have omitted all the error handling here; I am going to go back and add it now.)
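For what it's worth, this is roughly the shape that error handling might take. It is a sketch over a made-up two-field record rather than the real StoredTransaction: record, unmarshalInt32From, unmarshalBytesFrom and unmarshalRecord are hypothetical names, the byte order is assumed to be big-endian, and the final "nothing left over" check plays the role that ShouldBeDone plays above.

```go
package main

import (
    "bytes"
    "encoding/binary"
    "errors"
    "fmt"
    "io"
)

// record is a hypothetical two-field message: a link and a hash.
type record struct {
    Link string
    Hash []byte
}

// unmarshalInt32From reads four bytes as a big-endian int32.
func unmarshalInt32From(r *bytes.Reader) (int32, error) {
    var tmp [4]byte
    if _, err := io.ReadFull(r, tmp[:]); err != nil {
        return 0, fmt.Errorf("reading int32: %w", err)
    }
    return int32(binary.BigEndian.Uint32(tmp[:])), nil
}

// unmarshalBytesFrom reads a length and then that many bytes, refusing
// lengths that are negative or larger than what is left in the buffer.
func unmarshalBytesFrom(r *bytes.Reader) ([]byte, error) {
    n, err := unmarshalInt32From(r)
    if err != nil {
        return nil, err
    }
    if n < 0 || int(n) > r.Len() {
        return nil, fmt.Errorf("declared length %d but only %d bytes remain", n, r.Len())
    }
    out := make([]byte, n)
    if _, err := io.ReadFull(r, out); err != nil {
        return nil, fmt.Errorf("reading %d bytes: %w", n, err)
    }
    return out, nil
}

// unmarshalRecord stops at the first error and says which field failed.
func unmarshalRecord(blob []byte) (*record, error) {
    r := bytes.NewReader(blob)
    link, err := unmarshalBytesFrom(r)
    if err != nil {
        return nil, fmt.Errorf("link: %w", err)
    }
    hash, err := unmarshalBytesFrom(r)
    if err != nil {
        return nil, fmt.Errorf("hash: %w", err)
    }
    if r.Len() != 0 {
        return nil, errors.New("trailing bytes after record")
    }
    return &record{Link: string(link), Hash: hash}, nil
}

func main() {
    good := []byte{0, 0, 0, 2, 'h', 'i', 0, 0, 0, 1, 0xAA}
    rec, err := unmarshalRecord(good)
    if err != nil {
        fmt.Println("unexpected error:", err)
        return
    }
    fmt.Println(rec.Link, len(rec.Hash))

    // A truncated message is reported rather than silently accepted.
    _, err = unmarshalRecord(good[:7])
    fmt.Println(err)
}
```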

And then, running the harness again, we get a lot of interleaved messages, but these might be a pair:
2025/02/03 14:59:11 sending blob(1002) to http://localhost:5002/remotetx
2025/02/03 14:59:11 have received an internode request length: 1002
2025/02/03 14:59:11 unmarshalled message to: &{[222 208 107 140 79 14 157 97 132 106 128 254 150 37 195 208 157 67 234 94 15 14 57 28 42 106 84 33 94 158 113 99 114 59 115 191 11 163 49 7 12 78 173 152 174 234 179 80 103 172 83 80 58 123 30 24 29 177 103 29 74 187 45 168] 1738594751553 http://tx.info/t23ehi_zz07 [142 222 72 165 226 123 251 31 164 239 135 236 222 53 131 204 225 168 66 127 56 173 120 145 187 223 122 203 98 144 15 113 206 250 116 90 25 169 146 58 255 210 166 155 120 29 147 1 37 150 218 82 253 40 51 148 104 17 37 20 240 169 255 83] [0xc000016480 0xc0000164a0] []}

Conclusion

Great! We've managed to send across all the transactions from all the nodes to all the other nodes and, on arrival, unpack that transaction so that we're ready to use it.

Next time, we'll check the signatures and then we'll store it on the "other nodes" along with their own transactions.