Tuesday, September 10, 2024

Handling Git Projects

One of the key use cases I have for ScriptFormatter is "generating" this blog. As I have said many times, I am incredibly lazy and not at all good at dealing with anything involving "pixels", whether images, pixel-perfect UI layouts, PowerPoint slides or manipulating a blog page to look just the way I want it.

What I would rather do is just type out (much as I am at the moment) a whole lot of words, group them according to meaning, and then have a tool translate all of that to "pixel-perfect" HTML.

Even worse than pixel-perfect layouts, however, is all the effort involved in copying across code. I just hate doing that. Apart from anything else, there's the duplication. And then the rework if you go back and fix a bug.

So, at some point (certainly not initially), I came up with the idea of including my code samples directly from git. (In fact, I thought this was such a good idea, I did it twice - we'll get to that.) The idea here is that git already has all of the changes you made to your code, over time, so at any point (e.g. when you are regenerating your blog two years later), you can reach back in time and pluck out the code exactly as it stood back then - or alternatively, you can make a change back in history, follow it through, retag everything and regenerate the blog to reflect the code as changed, but with the correct history still in place.

How does this work?

A Simple Overview of Git

I imagine most people who read this already know how git works, so I won't labour the obvious. The key features I am going to rely on here are:
  • Git keeps track of a series of versions of a file structure (a repository, itself located somewhere in the filesystem) and can, at any time, show you what changed between two versions of that structure, or any part of it;
  • Git allows you to tag any given version with a name of your choosing: that name can later be moved if you want to point to a different version, but Git will manage the mapping of name to version;
  • It is possible to pluck out what any given file looked like at any time in the past by giving the version (a commit hash or a tag) and the path to the file in the repository.
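
That last feature is the one everything else leans on, so here is a minimal sketch of it in Python. It is not the ScriptFormatter code: the function names are made up for illustration, and the only thing taken from git itself is the `git show <revision>:<path>` form, which prints a file as it stood at that revision.

```python
import os
import subprocess


def git_show_command(repo: str, revision: str, path: str) -> list[str]:
    """Build the git command that prints `path` as it stood at `revision`.

    `revision` can be a tag, a branch name or a commit hash; `path` is
    relative to the root of the repository. (Hypothetical helper name.)
    """
    repo_dir = os.path.expanduser(repo)  # allow "~/..." style locations
    # `git -C <dir>` runs git as if it had been started in <dir>.
    return ["git", "-C", repo_dir, "show", f"{revision}:{path}"]


def read_file_at(repo: str, revision: str, path: str) -> str:
    """Run the command and return the file's contents as text."""
    result = subprocess.run(
        git_show_command(repo, revision, path),
        check=True,           # raise if git reports an error (bad tag, bad path)
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `read_file_at("~/IgnoranceBlog", "FBAR_PLAYWRIGHT_DEMO", "build.gradle")` would then hand back the file exactly as it was when the tag was made, assuming the repository, the tag and the file all exist.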

My first technique

The first thing that I did, and still the thing that I use most on this blog, is based on the idea that pretty much everything I write about proceeds on a commit-to-commit basis. In other words, I write some code, get it working, check it in, give it a tag and then blog about it. That being the case, almost every time I blog, I want to show you (the readers) what I changed and why. The details are a little gory, but this basically consists of three steps:
  • I indicate where the git repository is;
  • I specify which tag I want to use and which file, and possibly starting and ending points in that file;
  • The processor removes all the "annotation" lines, together with any lines that have been removed, and then pretties the result up for display and writes it into the output as if I had cut and pasted it into the input myself.
As an example, I can write this:
&git "~/IgnoranceBlog"
which specifies the root of the repository.

Then if I write this:
&import "FBAR_PLAYWRIGHT_DEMO" "build.gradle"
it will include the whole of a file (build.gradle) as it stood at a specific moment in time. This works well for short files and especially for short files that have just been introduced into the project.

If I only want to include an excerpt of a file, I can specify patterns to indicate where to start and end a selection:
&import "FBAR_PLAYWRIGHT_CHROMIUM" "FBAR.java" "chromium" "navigate"
This includes the contents of the specified file at the specified point, from the first line matching "chromium" to the first line from there matching "navigate". For convenience, regular expressions are permitted. If the "to" pattern is omitted, the rest of the file from the "from" pattern onwards is included.
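
The from/to selection can be sketched in a few lines of Python. This is an illustration rather than the real processor, and it bakes in some assumptions that the description above leaves open: both matching lines are included in the result, the search for the "to" pattern starts on the line after the "from" match, a "to" pattern that never matches runs to the end of the file, and a "from" pattern that never matches gives nothing at all. The name `select_excerpt` is invented.

```python
import re
from typing import Optional


def select_excerpt(lines: list[str], from_pattern: str,
                   to_pattern: Optional[str] = None) -> list[str]:
    """Return the lines from the first "from" match to the next "to" match."""
    start_re = re.compile(from_pattern)
    # Index of the first line matching the "from" pattern, if any.
    start = next(
        (i for i, line in enumerate(lines) if start_re.search(line)), None
    )
    if start is None:
        return []             # assumption: no "from" match means no excerpt
    if to_pattern is None:
        return lines[start:]  # no "to" pattern: everything from the match on

    end_re = re.compile(to_pattern)
    # Assumption: look for the "to" pattern on the lines after the start line.
    for end in range(start + 1, len(lines)):
        if end_re.search(lines[end]):
            return lines[start:end + 1]  # assumption: the end line is included
    return lines[start:]      # assumption: no "to" match runs to end of file
```

Because the patterns go through `re.search`, a plain word such as "chromium" matches anywhere in a line, and a full regular expression works in the same place.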

My Second Technique

And that all worked fine while this blog stayed within its fundamental remit to be essentially "co-written" with the code. If you look at the git history, you will see that almost every commit is tagged - for this very reason.

But then one day I went "off the rails". I wrote a lot of code, checked it in and then wanted to come back and comment on this and that that I'd done. And, because of where my mind was that day, I was thinking of everything in terms of files and of the code I'd previously written for the doc processor to include code samples in documentation. And that works differently: you specify a file to include, then the section of it you want to show, and then any sections you want to hide.

Of course, there's no reason you can't do that with a file extracted from git. So I put in a couple of hacks to say "well, you do need to specify the tag, because the include code has no notion of one - it's just pulling a file from a directory" and then, deep in the bowels of the include code, I need to say "ah, that file you want isn't that file, it's this thing that I've just pulled out of git".

And we have a solution to that

And as I looked at that code with my new-fangled "I can rewrite the file system better than it's ever been done before" hat on, I realized that I was describing a new way of thinking about git: as a filesystem provider.

In other words, it's possible to view git as a filesystem provider with multiple "roots" - each tag or branch in each repository can be considered as a root. So I could specify a path something like this:
git:~/IgnoranceBlog:FBAR_PLAYWRIGHT_DEMO:playwright-fbar/build.gradle
In this case, the repository is ~/IgnoranceBlog, the tag is FBAR_PLAYWRIGHT_DEMO and the path is playwright-fbar/build.gradle (which is relative to the repository). And if I can construct this internally, I can use all of my existing code that handles importing without any difficulty. (For full disclosure, the previous version did something not dissimilar to this, but had a specific hack in it to make it work.)
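
To make the shape of such a path concrete, here is a small Python sketch that pulls one apart. It is only an illustration of the format shown above, not the filesystem provider itself; `GitPath` and `parse_git_path` are invented names, and it assumes that neither the repository location nor the tag contains a colon.

```python
from typing import NamedTuple


class GitPath(NamedTuple):
    """The three parts of a git:<repository>:<tag>:<path> string."""
    repository: str
    tag: str
    path: str


def parse_git_path(spec: str) -> GitPath:
    """Split a path of the form git:<repository>:<tag>:<path>."""
    # Split at most three times, so any colon inside the file path survives.
    parts = spec.split(":", 3)
    if len(parts) != 4 or parts[0] != "git":
        raise ValueError(f"not a git path: {spec!r}")
    _, repository, tag, path = parts
    return GitPath(repository, tag, path)
```

The colon assumption is the weak spot: a Windows-style location such as C:\blog would need a different separator or some escaping.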

Obviously there is a lot of work putting that filesystem provider together, and then there is a little more work integrating it into the processor, but, yeah. It's that easy.
