Sunday, March 23, 2025

A Debugging Plugin for Chrome

Once again, in the course of my day job, I find myself needing to do something I've never done before. When that happens, I come here to do it first, so that by the time I try writing production code I have done it before, and can hopefully learn from the mistakes I will make here.

"Over there", I have a programming language, FLAS, which is a complicated and somewhat sluggish beast. I have reached the point where I am now actually using it, rather than developing it, and as a consequence I find myself wanting a debugger that helps me debug the code I'm writing rather than the compiler (this is probably a familiar feeling if you write compilers; otherwise "WTAF?" is perfectly reasonable).

Apart from anything else, my language has lazy evaluation, which makes it very hard to see where you are "in the stack" because it doesn't follow normal stack evaluation. Given that a JavaScript debugger does, that makes for an impedance mismatch which is hard to compensate for in the standard Chrome debugger.

I'm not going to tackle that here.

What I do want to tackle is the idea that I should be able to build my own debugger for my own programming language which, regardless of how far removed it gets from JavaScript, is capable of showing what is going on in that programming language, not in the JavaScript running in the browser.

So, to do that, here's what we're going to do:
  • We're going to build a compiler for a trivial and nonsensical language which is nevertheless sufficiently complex that it needs a debugger and supports non-trivial debugging operations.
  • We're going to compile down to an "intermediate form" (as Java and C# do) which is represented in JSON, not JavaScript.
  • We're going to build a runtime for this "intermediate form" in JavaScript and wire it up so we can write programs and run them in the browser.
  • When we have seen that working, we're then going to build a debugger for that in Chrome.
  • This will take the form of a Chrome plugin which opens a "side panel" which can display source code, current location, current state and possibly allow evaluations, and will have the obvious "set breakpoint", "go", "next", etc. commands.
  • It will use CDP (the Chrome DevTools Protocol) to interact with the JavaScript in the runtime library, but then interpret everything in the context of the original source code and, for example, only actually "stop" the program when it hits a breakpoint in the original source, not in the JavaScript.
I don't think I've built any kind of browser plugins before (although I've looked into it a few times), so I'm going to go slowly through that, and obviously I'll do a lot of monkeying with the CDP stuff, but I'm planning on blowing through things like writing a compiler and a server very quickly because they're "easy" (or at least familiar). I suspect I will linger over the runtime a bit more because, while it is fairly straightforward, it's going to be very important.

A Web Server

For various reasons, it is almost impossible in the modern era to serve websites statically off a filesystem (browsers refuse to load JavaScript modules from file:// URLs, for example). So the first thing I need to do is to make sure I have a web server running locally. While it's possible to use something like PHP or python or node to do this, it's not that hard to write your own in Go, and this will have the advantage that we will be in a good position to serve the code directly from the compiler when we get there.

So I am basically going to steal the WebServer code that I wrote for the ChainLedger and configure it to serve a static website. As and when, we'll come back and modify it to serve the compiled JSON.

Here we go:
package server

import (
    "errors"
    "fmt"
    "net/http"
)

func StartServer(addr string) {
    handlers := http.NewServeMux()
    index := NewFileHandler("website/index.html", "text/html")
    handlers.Handle("/{$}", index)
    handlers.Handle("/index.html", index)
    favicon := NewFileHandler("website/favicon.ico", "image/x-icon")
    handlers.Handle("/favicon.ico", favicon)
    cssHandler := NewDirHandler("website/css", "text/css")
    handlers.Handle("/css/{resource}", cssHandler)
    jsHandler := NewDirHandler("website/js", "text/javascript")
    handlers.Handle("/js/{resource}", jsHandler)
    server := &http.Server{Addr: addr, Handler: handlers}
    err := server.ListenAndServe()
    if err != nil && !errors.Is(err, http.ErrServerClosed) {
        fmt.Printf("error starting server: %s\n", err)
    }
}

CDP_WEBSERVER:cdp-till/internal/web/server.go

The server code creates a Mux handler, and then adds path handlers for / and /index.html. Note that the suffix {$} on the pattern for / indicates that it should match only the exact path: in Go, a pattern ending in / matches all subpaths by default. These, along with the handler for /favicon.ico, reference a FileHandler which always returns a specific file.

The css and js handlers use a path pattern which says that it is looking for a file in the relevant directory and gives it the name resource.

These handlers are defined in handlers.go:
package server

import (
    "fmt"
    "io"
    "log"
    "net/http"
    "os"
    "path/filepath"
)

type FileHandler struct {
    file      string
    mediatype string
}

type DirHandler struct {
    dir       string
    mediatype string
}

func (r *FileHandler) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
    sendFile(resp, r.mediatype, r.file)
}

func (r *DirHandler) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
    rsc := req.PathValue("resource")
    fmt.Printf("serving resource from dir %s\n", rsc)
    sendFile(resp, r.mediatype, filepath.Join(r.dir, rsc))
}

func sendFile(resp http.ResponseWriter, mediatype, path string) {
    log.Printf("Sending file %s of type %s\n", path, mediatype)
    info, err := os.Stat(path)
    if err != nil {
        log.Printf("could not serve %s\n", path)
        http.Error(resp, "not found", http.StatusNotFound)
        return
    }
    stream, err := os.Open(path)
    if err != nil {
        log.Printf("could not serve %s\n", path)
        http.Error(resp, "not found", http.StatusNotFound)
        return
    }
    defer stream.Close()

    resp.Header().Set("Content-Type", mediatype)
    resp.Header().Set("Content-Length", fmt.Sprintf("%d", info.Size()))
    _, err = io.Copy(resp, stream)
    if err != nil {
        log.Printf("error streaming %s\n", path)
        return
    }
}

func NewFileHandler(file, mediatype string) http.Handler {
    return &FileHandler{file: file, mediatype: mediatype}
}

func NewDirHandler(dir, mediatype string) http.Handler {
    return &DirHandler{dir: dir, mediatype: mediatype}
}

CDP_WEBSERVER:cdp-till/internal/web/handlers.go

Other than to point out that the DirHandler recovers the value of resource by using req.PathValue(), I don't think there's anything very interesting about this code.

Watching and Reloading

What I generally find when developing websites like this is that I am in a cycle of write (website) code, save it, go to the browser, press refresh, look at the results, go back to the editor.

To save time, I have often added an editor shortcut (usually some version of F12) which is bound to a command refresh-chrome which I've written to ask Chrome to refresh. On this occasion, I'm going to try and fully automate this: let's write a "watcher" module which watches for files to change in named directories and then use the CDP library (in Go) to ask the browser to refresh. This will also hopefully give us a taste of using the CDP library before we start doing complex things with it later.

I thought this was going to be fairly easy when I started out, but for a variety of reasons, it turns out to be somewhat complicated. So let's take it in stages.

The first thing I'm going to do is to handle the reloading, and then we'll come back and add a watcher to that.

And I'll take the reloader in stages as well. Here are all the headers and the main flow:
package chrome

import (
    "context"
    "log"
    "time"

    "github.com/gmmapowell/ignorance/cdp-till/internal/watcher"
    "github.com/mafredri/cdp"
    "github.com/mafredri/cdp/devtool"
    "github.com/mafredri/cdp/protocol/page"
    "github.com/mafredri/cdp/protocol/target"
    "github.com/mafredri/cdp/rpcc"
)

type Reloader struct {
    devconn *devtool.DevTools
    page    *devtool.Target
    client  *cdp.Client
    loadURL string
}

func (r *Reloader) Changed(file string) {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    if r.page == nil {
        if !r.EnsurePage(ctx) {
            return
        }
    }

    reloaded := r.TryReloading(ctx)

    if !reloaded {
        r.NavigateToURL(ctx)
    }
}

CDP_WEBSERVER:cdp-till/internal/chrome/reloader.go

It doesn't seem (to me) that there is one "official" CDP package in Go, but mafredri/cdp seemed the closest thing I could find, and it claims to generate its API directly from the current CDP spec. I have to admit I don't really understand this (that's why we're here, after all), but I will try my best to handwave through it.

The Reloader struct has three pointers in it which represent a "connection" to the browser. I think devconn is a handle to a "development tools" instance which is just local to my Go program, page seems to be a pointer to an open window or tab, and client seems to be an API on top of a connection between the two. The struct also keeps track of the URL the user wants to see displayed.

The Changed method is the main entry point here, and is called every time something changes and we want to reload the website. It creates a context (a standard thing in Go which, again, I don't really understand) which only exists for the duration of this update. The basic logic of Changed is:
  • Make sure you have a Page open (you might not when you start, or if the user has deliberately closed it);
  • Try finding an existing tab with the URL loaded and refresh that;
  • If you can't find one, open a new one.
With the exception of the context, this keeps all of the protocol ugliness out of this function.

Before covering that ugliness, here is the NewReloader method:
func NewReloader(url, devurl string) watcher.FileChanged {
    ret := &Reloader{loadURL: url, devconn: devtool.New(devurl)}

    // Make sure it loads 50ms after we start up, thus allowing the server to start
    time.AfterFunc(50*time.Millisecond, func() {
        ret.Changed("")
    })

    return ret
}

CDP_WEBSERVER:cdp-till/internal/chrome/reloader.go

This takes two URLs: the first one is the URL we want to visit (the one we are serving ourselves, http://localhost:1399/) and the other is the URL where we can connect to the development tools (http://localhost:9222/; see below). This method then creates the Reloader, passing it the visiting URL and a newly created DevTools instance with the devtools URL.

Because when we start this program, we will generally want it to display the web page, it then waits 50ms and invokes the Changed method. The delay is just there to enable the server (starting at the same time) enough time to open its socket.

Going back to the "ugliness", EnsurePage looks like this:
func (r *Reloader) EnsurePage(ctx context.Context) bool {
    var err error
    r.page, err = r.devconn.Get(ctx, devtool.Page)
    if err != nil {
        r.page, err = r.devconn.Create(ctx)
        if err != nil {
            log.Printf("Error opening a window: %v\n", err)
            return false
        }
    }

    conn, err := rpcc.DialContext(ctx, r.page.WebSocketDebuggerURL)
    if err != nil {
        log.Printf("Error connecting to CDP: %v\n", err)
        return false
    }

    r.client = cdp.NewClient(conn)
    return true
}

CDP_WEBSERVER:cdp-till/internal/chrome/reloader.go

I'm not entirely sure what all this does, but the four most important lines are the calls to Get, Create, DialContext and NewClient. We attempt to get (the? a?) Page that's currently open. If that fails, we try to create one. If that fails, it's an error.

We then try to Dial the WebSocketDebuggerURL associated with that page. (DialContext is just the name of the Dial variant which takes a context.) And then we attempt to create a CDP Client which wraps an API around that socket connection (I think). This client is what exposes the actual API.
func (r *Reloader) TryReloading(ctx context.Context) bool {
    tabs, err := r.client.Target.GetTargets(ctx, &target.GetTargetsArgs{})
    if err != nil {
        log.Printf("Error recovering targets: %v\n", err)
        r.client.Page.Close(ctx)
        r.EnsurePage(ctx)
        return false
    }
    for _, ti := range tabs.TargetInfos {
        if ti.URL == r.loadURL {
            reload := &page.ReloadArgs{}
            reload.SetIgnoreCache(true)
            err = r.client.Page.Reload(ctx, reload)
            if err != nil {
                log.Printf("Error refreshing target: %v\n", err)
                return false
            }
            return true
        }
    }
    return false
}

CDP_WEBSERVER:cdp-till/internal/chrome/reloader.go

We try to reload an existing page if it's possible to do so. I believe that GetTargets is supposed to recover all of the currently open pages, but the word Target, along with various pieces of documentation I've read, suggests that it is somewhat broader than that. I've seen this request fail, usually when I've closed a window or something, so I've added code here which, on error, closes any open page, reopens a page and returns false; that causes the Changed function to move on to loading the page from scratch (see below).

If we do manage to recover a list of Targets, we then go through them all and see if any of them have the same URL as the URL we are trying to load. Note that Chrome normalizes all of its URLs, so you need to be careful when specifying the matching URL to do the same: in particular, it has a trailing / on the end of the URL (which is actually a leading slash on the path portion). If you don't do this right, the URLs will not match.
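The normalization in question can be sketched with net/url (this mirrors what I understand Chrome to do; normalize is my own helper, not part of the reloader):

```go
package main

import (
    "fmt"
    "net/url"
)

// normalize mimics the relevant part of Chrome's URL normalization:
// a URL with an empty path gets "/" as its path, so
// "http://localhost:1399" and "http://localhost:1399/" compare equal.
func normalize(raw string) string {
    u, err := url.Parse(raw)
    if err != nil {
        return raw
    }
    if u.Path == "" {
        u.Path = "/"
    }
    return u.String()
}

func main() {
    fmt.Println(normalize("http://localhost:1399")) // http://localhost:1399/
    fmt.Println(normalize("http://localhost:1399") == normalize("http://localhost:1399/")) // true
}
```

Comparing normalized forms on both sides would be safer than requiring the configured URL to already have its trailing slash.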

I want this reload to be a "hard" reload so that all the CSS and JS files are reloaded along with the HTML file. I believe that the call here to SetIgnoreCache is supposed to do that, but it is not currently working for me. This code refreshes the HTML code but does not refresh the CSS or JS files. When I do the hard refresh manually, it works fine.

Moving on to the NavigateToURL code, which is called if there is not already a tab to refresh:
func (r *Reloader) NavigateToURL(ctx context.Context) {
    navArgs := page.NewNavigateArgs(r.loadURL)
    nav, err := r.client.Page.Navigate(ctx, navArgs)
    if err != nil {
        log.Printf("Error navigating: %v\n", err)
        return
    }
    if nav.ErrorText != nil {
        log.Printf("Error navigating: %v\n", *nav.ErrorText)
        return
    }
}

CDP_WEBSERVER:cdp-till/internal/chrome/reloader.go

This basically constructs an argument block with the desired URL in it and then passes that to Page.Navigate. Interestingly, I found that if there is a problem with loading the page (i.e. a 404 or a 500 or whatever), this is not returned as an error but is considered a success, with a pointer to a string inside the response which contains the error text. I am sure there is a good reason for this design, but it seems odd. Anyway, that is why there are two tests here: there are two separate mechanisms for reporting errors.

That's the reload code. Now we can turn our attention to the code watching the disk for changes.
package watcher

import (
    "log"
    "path/filepath"

    "github.com/fsnotify/fsnotify"
)

type FileChanged interface {
    Changed(file string)
}

CDP_WEBSERVER:cdp-till/internal/watcher/watcher.go

Apart from the boilerplate, this defines the interface FileChanged, which the Reloader implements.
func Watch(dir string, handler FileChanged) {
    watcher, err := fsnotify.NewWatcher()
    if err != nil {
        log.Printf("could not launch a watcher, you're on your own")
        return
    }
    defer watcher.Close()

    watcher.Add(dir)
    watcher.Add(filepath.Join(dir, "css"))
    watcher.Add(filepath.Join(dir, "js"))

    for {
        select {
        case event, ok := <-watcher.Events:
            if !ok {
                return
            }
            if event.Has(fsnotify.Write) {
                handler.Changed(event.Name)
            }
        case err, ok := <-watcher.Errors:
            if !ok {
                return
            }
            log.Println("error:", err)
        }
    }
}

CDP_WEBSERVER:cdp-till/internal/watcher/watcher.go

The Watch method basically copies the example provided by the author of the fsnotify library. It creates a new watcher, adds the three directories we want to watch, and then when we receive a "changed" event which represents a Write (i.e. an update), we call the handler method.

A main command

Now that we have all the pieces, we need to wire all this up in main:
package main

import (
    "github.com/gmmapowell/ignorance/cdp-till/internal/chrome"
    "github.com/gmmapowell/ignorance/cdp-till/internal/watcher"
    server "github.com/gmmapowell/ignorance/cdp-till/internal/web"
)

const port = "1399"

func main() {
    reloader := chrome.NewReloader("http://localhost:"+port+"/", "http://localhost:9222")
    go watcher.Watch("website", reloader)
    server.StartServer(":" + port)
}

CDP_WEBSERVER:cdp-till/cmd/till/main.go

And obviously we need to build it and run it:
$ go build -C cmd/till
$ cmd/till/till
This should hang "forever" because it is a webserver. Unfortunately, because of the "reload" code, it also needs to connect to port 9222, and there isn't anything there at the moment. You need an instance of Chrome running that is listening on port 9222.

As far as I can tell, you can't just "turn this on" inside an already running Chrome instance. You need to actively enable the --remote-debugging-port option on the command line. That being the case, I'm going to use Google Chrome for Testing which you can get here.

And then run like this on a Mac:
$ chrome-mac-x64/Google\ Chrome\ for\ Testing.app/Contents/MacOS/Google\ Chrome\ for\ Testing --remote-debugging-port=9222
or like this on linux:
$ chrome-linux64/chrome --remote-debugging-port=9222
Once that is running, it is listening on the port you specify (I have used 9222, which I obtained from examples on the internet, but you can use any port as long as you are consistent and change the reload code above) and the server should start and load a website.

And A Website

Ah yes, the website. I have also checked in a basic website: index.html, js/till-runtime.js and css/till.css.

I'm only going to go through this briefly because it's all fairly vanilla.
<html>
    <head>
        <title>A Simple Till Program</title>
        <template id='root'>
<div class="root"></div>
        </template>
        <template id="row">
<div class="row"></div>
        </template>
        <template id="cell">
<div class="cell">
    <div class="cell-text"></div>
</div>
        </template>
        <link rel="stylesheet" href="css/till.css" type="text/css">
        <script src="js/till-runtime.js" type="module"></script>
    </head>
    <body>
        <div id="content-goes-here"></div>
    </body>
</html>

CDP_WEBSERVER:cdp-till/website/index.html

This html lays out the basic structure of the website as a content area in the body (content-goes-here), along with three templates: one for the "root" (i.e. everything), one for each "row" of cells, and one for each "cell". It also "includes" the JavaScript and CSS.

As I find is often the case with module JavaScript, the "main" module is just a handler for the load event:
window.addEventListener('load', function(ev) {
    var div = document.getElementById("content-goes-here");

    var root = document.getElementById("root");
    var iroot = root.content.cloneNode(true);
    iroot = div.appendChild(iroot.children[0]);

    var row = document.getElementById("row");
    var irow = row.content.cloneNode(true);
    irow = iroot.appendChild(irow.children[0]);

    var cell = document.getElementById("cell");
    var icell = cell.content.cloneNode(true);
    icell = irow.appendChild(icell.children[0]);
    icell.className = "cell blue-cell";

    var tc = icell.querySelector(".cell-text");
    tc.appendChild(document.createTextNode("hello"));
})

CDP_WEBSERVER:cdp-till/website/js/till-runtime.js

When the page has finished loading, it instantiates each of the templates and wires them up together, adding the whole lot to the content area. This is a snapshot of what the runtime is going to do when it is complete.

Meanwhile, the minimal CSS styles this table:
.cell {
    display: flex;
    width: 100px;
    height: 100px;
}

.blue-cell {
    background-color: blue;
    color: white;
}

.cell-text {
    margin: auto auto;
}

CDP_WEBSERVER:cdp-till/website/css/till.css

If you check everything out, build it, run cmd/till/till in the root directory, and then point your browser at localhost:1399, you should see something like this: a single blue cell containing the word "hello".


Conclusion

We managed to write - or steal - a web server and build a basic website. Come on, you didn't actually think that was going to be hard, did you?

For kicks, I also added code to automatically refresh the page when we changed the source. That was more complicated than the rest of it put together.
