Ignorance may be Strength

Tuesday, July 18, 2023

Basic CSS

How much styling we want to add is always a difficult question. For me, it tends to be somewhat "creeping" - I'll start with "just enough" and then keep adding a bit at a time while I'm working on other features and using the code. On here, though, I tend to keep it minimal, especially since I'm trying to make it clear what is important and what isn't. So while I might usually do things such as styling the date pickers, all I'm going to do here is make it clear what's included in the "feedback" box and then style it to look like a calendar.

So, in this case, I add a border around the feedback area to make it clear what the "page" looks like, then a border around each "date" box, and then style the number in the date box. The only oddity (IMHO) is styling body-day as position: relative which has nothing to do with what it wants, but just to make it the frame of reference for the position: absolute in body-day-date.

Simples. You may also notice that I added a css class onto the feedback box, changed the id to a class for the controls box, and fixed a bug in where the date value was being added in controls.js.

.feedback {
  border: 0.1mm solid black;
}

.body-day {
  display: inline-block;
  border: 0.1mm solid black;
  margin: 2mm;
  width: 25mm;
  height: 25mm;
  position: relative;
}

.body-day-date {
  position: absolute;
  top: 3mm;
  left: 3mm;
  font-family: 'Courier New', Courier, monospace;
  font-weight: bold;
  font-size: 5mm;
}

This code is in PLANNING_CALENDAR_BASIC_CSS.

Wiring up the Basics

Now that we have the outline of the calendar, we want to actually generate the calendar that has been asked for. As you may have guessed, 90% of this blog is going to be about how to get the calendar to look the way you want; the actual generation is a few simple lines of JavaScript. Certainly I'm doing this more than anything else (well, I guess I'm actually interested in the end product, but as I already have a functioning desktop app…) to investigate building CSS stylesheets dynamically in JavaScript.

That being the case, I'm going to gloss over most of this with a couple of classic handwaves. As always, if you feel I've left out significant details, get back to me and I'll elaborate.

The outline has three ui elements, each of which controls what is seen in the feedback window, and when any of them change, we want to generate an appropriate calendar based on the current set of three values. So, we add a listener on the appropriate change event for each control, all of which invoke the same method to redraw the calendar.

While I'm about setting that up, I have also added an init method which initializes the default start and end dates to be today and, for good measure, then generates a "default" feedback pane to be this week, starting on Monday. It's not a lot, but it gets you started.

    <script src="js/controls.js"></script>
  </head>
  <body onload="init()">
    <div id="controls">
      <label for="start-date">Start date:</label>
      <input type="date" id="start-date" onchange="redraw()">
      <label for="end-date">End date:</label>
      <input type="date" id="end-date" onchange="redraw()">
      <label for="first-day">First Day:</label>
      <select id="first-day" onchange="redraw()">
        <option value="1">Monday</option>
        <option value="2">Tuesday</option>
        <option value="3">Wednesday</option>
        <option value="4">Thursday</option>
        <option value="5">Friday</option>
        <option value="6">Saturday</option>
        <option value="0">Sunday</option>
      </select>
    </div>

What we want to draw is basically the same as what I put in the outline: a nested structure consisting of the whole calendar containing weeks, containing days, containing information about that day. redraw needs to clear out the existing feedback calendar and draw a new one starting the week of the start date from the first day until the end date is in the past.

A couple of notes on this code which is, otherwise, fairly straightforward:

The calculation of the offset for the start of the week works most of the time, but when the start date is a day of the week before the "left column day", the first week might be missed, and it is important to check for this and start the calendar a week earlier.
It's important to copy the value of from into leftDate (and leftDate into cellDate), otherwise, because dates in JavaScript are references, both leftDate and from will be updated, making this check pointless.
In order to make this work correctly, I changed the values of the options in the select statement from what I had initially used.
I have used a "do" loop to make sure at least one week comes out, but after that it is just a question of deciding whether the end date is in the future or the past. This automatically handles the case where the end date is before the start date in a satisfactory way.
There are comments logged to the console to clarify what is happening.

Obviously, we have not provided any styling yet. Because of this, by default, all the dates "run together" on a line to make a mess. This did not happen in the outline because of the white space inherent in the layout of the document. To replicate this, I have added a white space character between successive date boxes which will be removed when we add the borders.

var start, end, first, fbdiv;

function init() {
  start = document.getElementById('start-date');
  end = document.getElementById('end-date');
  first = document.getElementById('first-day');
  fbdiv = document.getElementById('feedback');

  start.valueAsDate = new Date();
  end.valueAsDate = new Date();

  redraw();
}

function redraw() {
  var from = new Date(start.value);
  var to = new Date(end.value);
  var leftColumn = parseInt(first.value);
  console.log("Generating calendar from", from, "to", to, "based on", leftColumn);

  fbdiv.innerHTML = '';
  var leftDate = new Date(from);
  leftDate.setDate(leftDate.getDate() - leftDate.getDay() + leftColumn);
  if (leftDate > from) {
    leftDate.setDate(leftDate.getDate() - 7);
  }
  do {
    console.log(" showing week with", leftDate, "in the left column");

    // create a div for the whole week
    var week = document.createElement("div");
    week.className = "body-week";
    fbdiv.appendChild(week);
    for (var i=0;i<7;i++) {
      var cellDate = new Date(leftDate);
      cellDate.setDate(cellDate.getDate() + i);
      console.log(" cell", i, "has date", cellDate);

      // create a div for each day, to contain all the aspects we will have
      var day = document.createElement("div");
      day.className = "body-day";
      week.appendChild(day);

      // the first aspect is the date
      var date = document.createElement("div");
      date.className = "body-day-date";
      day.appendChild(date);

      // and set the date text in here
      var dateValue = document.createTextNode(cellDate.getDate());
      day.appendChild(dateValue);

      // add a space between days until we style this
      var space = document.createTextNode(" ");
      week.appendChild(space);
    }

    // advance to next week
    leftDate.setDate(leftDate.getDate() + 7);
  } while (leftDate <= to);
}

The code is under the tag PLANNING_CALENDAR_BASIC_WIRING.

Monday, July 17, 2023

Planning Calendars

I am, by nature, a planner.

Many who meet me through the agile community are surprised by this, but, in fact, the nature of my planning fits well with agile: it is small scale, and can be repeated endlessly as new information emerges.

As such, I am always looking for tools to help me achieve this or communicate this. And because the things I do are generally small scale and "cheap", the tools I am looking for are generally not high priced enterprise things.

One such tool is a calendar. As I always like to say when planning projects, "you can always get your money back; but once the time is gone, it's gone". In my experience, most people don't have a good grasp of time in general, and, in particular, that only one thing fits in a given spot.

The origins of this tool go back to a "Filofax replacement" project I was building on NeWS (if you're under 50, Google is your friend here) back in the late 80s. Not much came of that, but the PostScript header file is the basis of the subsequent calendar programs I've written over the past 30 years. The plan here is to replace all of those with one that doesn't use that PostScript header file because it is 100% web based (HTML and CSS with the appropriate print media definitions).

The key to this tool is threefold:

Most projects I have worked on do not run on a nice, comfortable "exactly one calendar month" basis for which a standard calendar is a good fit;
Most people I know work on a Monday-Friday schedule, and then enjoy a weekend, whereas most calendars I can buy seem to think the week runs from Sunday to Saturday;
The month breaks on standard calendars are extremely distracting and often make people count the same week twice.

So, what I want is a calendar whose first week is the first week of the project, whose last week is the last week of the project and exactly fills one sheet of A4 (or "letter paper" if you happen to live in the dark ages). Sounds simple? OK, well, in that case, for good measure, I'd like to be able to choose which day of the week it starts on (although the default should be Monday) and whether you arrange it in landscape or portrait orientation (obviously, this is just on the print menu, but I want the calendar to re-layout when I do). And obviously, I want a lot of controls and visual feedback on the screen, but just the calendar on the output.

Oh, and over the past 30 years I have acquired a few new requirements that would be good to include:

The option to display the calendar in different "styles", in particular, how and where the month/year names are displayed, colours and shading for dates, weeks and months, and so on;
Sometimes I do want "long range plans": not plans as such, but if I have certain fixed engagements over the next few months, I'd like to have a calendar that has those on it so that I can see what chunks of work I might be able to commit to in that span. Experience has taught me that 13 weeks (i.e. a quarter of a year) on a calendar works, but that 26 (half a year) does not. So I'd like to be able to have multiple pages with (approximately) the same number of weeks on each.
There are a lot of calendars out there in ICS form that I'd like to include so that I can have them show up in order to avoid conflicts and to know when holidays are; and I'd like to be able to configure how they show up.

All good? Let's get started.

The basics

I believe in keeping things simple. So what I basically want is a single page that has a control area and a feedback area. The control area allows you to select a start date (which will be in the first row of the calendar) and an end date (which should be later than the start date, in which case it will be in the end row; otherwise you will just get a one-week calendar with the start date) and which day of the week is going to appear in the left-hand column.

The feedback area shows you approximately how the calendar will display, but without laying it out "on paper", or applying any fancy styling.

And then the "print" button does all the hard work through the application of a print media stylesheet.

Let's get started. We won't get everything we want done today, but let's at least get that outline in place.

The code for this can be found tagged PLANNING_CALENDAR_OUTLINE in the directory calendar. We're just getting started here, so nothing really works as such, but hopefully it's a useful guide to you as to where we're going with this - I know it is to me.

<html>
  <head>
    <title>Planning Calendar</title>
    <link rel="stylesheet" href="css/controls.css">
    <link rel="stylesheet" href="css/feedback.css">
  </head>
  <body>
    <div id="controls">
      <label for="start-date">Start date:</label>
      <input type="date" id="start-date">
      <label for="end-date">End date:</label>
      <input type="date" id="end-date">
      <label for="first-day">First Day:</label>
      <select id="first-day">
        <option value="0">Monday</option>
        <option value="1">Tuesday</option>
        <option value="2">Wednesday</option>
        <option value="3">Thursday</option>
        <option value="4">Friday</option>
        <option value="5">Saturday</option>
        <option value="6">Sunday</option>
      </select>
    </div>
    <div id="feedback">
      <div class="body-week">
        <div class="body-day">
          <div class="body-day-date">17</div>
        </div>
        <div class="body-day">
          <div class="body-day-date">18</div>
        </div>
        <div class="body-day">
          <div class="body-day-date">19</div>
        </div>
        <div class="body-day">
          <div class="body-day-date">20</div>
        </div>
        <div class="body-day">
          <div class="body-day-date">21</div>
        </div>
        <div class="body-day">
          <div class="body-day-date">22</div>
        </div>
        <div class="body-day">
          <div class="body-day-date">23</div>
        </div>
      </div>
    </div>
  </body>
</html>

At the moment, this website is 100% static, so you can just open index.html in your browser, and everything will just work fine. This is going to be true for a while, until we start opening ICS calendars using AJAX. Then we'll need to have it running on/in a server. So that I'm prepared, I just run this command in the web directory:

python -m SimpleHTTPServer 8080

And then visit localhost:8080 in my browser. I know there are 2,000 ways of doing this, so feel free to use your own. I have just got into this habit, and it keeps working for me.

Wednesday, April 26, 2023

Populating a Side View with a Notification

Following on from my previous post, it has come to my attention that it is also possible to populate the view by using a notification from the back end to the front end.

The purpose of notifications is to provide updates without being asked. In our case, it makes sense when we have updated the repository (after parsing one or more files) to send such a notification. Otherwise, the front end needs to "guess" when the information may have changed and request it again.

So the task here is to try and turn the process around, and instead of sending a request for the information we requested, for the back end to automatically send it when it is ready.

Reading through the specification, it seems that this is possible, but that it requires us to "extend" the protocol. But I'm not entirely sure how to do that. It turns out that while it's possible, it doesn't seem to me that it's "genuinely" supported at all.

On the server side, we need to define a new interface which extends LanguageClient and specify the new notification methods that we want, tagged with the @JsonNotification attribute, which specifies the command string that will be used in sending the message.

package ignorance;

import org.eclipse.lsp4j.jsonrpc.services.JsonNotification;
import org.eclipse.lsp4j.services.LanguageClient;

public interface ExtendedLanguageClient extends LanguageClient {
@JsonNotification("ignorance/tokens")
void sendTokens(Object object);

}

It should then just be a case of telling the LSPLauncher that this is the protocol we want to use, but none of the factory methods takes an interface - they all assume that you want to use LanguageClient. But fortunately, there is nothing magic about any of these methods, so we can extract it locally and make the change we need.

Launcher<ExtendedLanguageClient> launcher = createServerLauncher(server, in, out);

  private static Launcher<ExtendedLanguageClient> createServerLauncher(IgnorantLanguageServer server, InputStream in, OutputStream out) {
    return new LSPLauncher.Builder<ExtendedLanguageClient>()
      .setLocalService(server)
      .setRemoteInterface(ExtendedLanguageClient.class)
      .setInput(in)
      .setOutput(out)
      .create();
  }

Likewise, on the client side, it should be just as simple as calling the onNotification method of the client, but this can only be done once the client is ready, and thus after communication has already started up - meaning that some messages might be sent (and lost) before the handler is installed. I tried a number of ways of working around this (there appears to be a notion of Features which can install handlers) but nothing fixed the fundamental problem, so I ended up having to add my own synchronization around this. But for now, here's the code that sets up the handler in the onReady method.

  client.onReady().then(() => {
    client.onNotification("ignorance/tokens", () => {
      console.log("received token notification");
    });

So to handle the synchronization, we add another command with the message ignorance/readyForTokens. In the onReady method we call this AFTER configuring the notification handler.

    client.onNotification("ignorance/tokens", () => {
      console.log("received token notification");
    });
    client.sendRequest(ExecuteCommandRequest.type, {
      command: 'ignorance/readyForTokens',
      arguments: [ ]
    });

    const tokensProvider = new TokensProvider();

On the server side, we can then start sending messages when appropriate.

      public CompletableFuture<Object> executeCommand(ExecuteCommandParams params) {
        switch (params.getCommand()) {
        case "java.lsp.requestTokens": {
          System.err.println("requestTokens command called for " + params.getArguments().get(0));
          return repo.allTokensFor(URI.create(((JsonPrimitive) params.getArguments().get(0)).getAsString()));
        }
        case "ignorance/readyForTokens": {
          System.err.println("readyForTokens");
          amReady = true;
          return CompletableFuture.completedFuture(null);
        }
        default: {
          System.err.println("cannot handle command " + params.getCommand());
          return CompletableFuture.completedFuture(null);
        }
        }
      }

Which just leaves the process of actually sending a message, which we want to do once we have finished parsing. There is a certain amount of off-camera wiring to make this happen reliably.

  private void parseAllFiles() {
    File file = new File(workspaceRoot.getPath());
    parseDir(file);
    client.sendTokens("hello, world");
  }

All the complex wiring is once again left as an exercise for the reader (see VSCODE_NOTIFICATIONS_COMPLEX_WIRING).

Conclusion

It is certainly possible to configure both client and server to handle custom notifications, but there are enough tricky issues involved that I don't think that it can honestly be described as "supported". But if you're an event-driven freak like me, you just have to bite it off.

Tuesday, April 25, 2023

Adding a side view to LSP

It seems like a while since I last worked on my LSP server (*checks git history*: it's a very long while, OK).

But right now I find myself wanting to add a new view with some meta data to my LSP server in "the real world". And so I come back here to figure out how to support that.

I was originally thinking I could add a window at the bottom (akin to Problems, or Output) but it doesn't seem possible to add something there. I considered adding a virtual document but then I realized that was a lot of hard work for not much gain and what I think I probably want is a view.

So here goes …

The basic setup is that we want to add a view which is capable of presenting some hierarchical information as a tree. The thought would be that the server has some way of deducing "global" information about a project which it can then feed back to you (a class explorer, for example).

Looking back at what I did before, I can see that for each project in the workspace, I collected together a list of tokens (called Names) which have a file (in the form of a URI), and a location. For the purposes of testing what I want to, I am going to use this a source of information and build a tree which has a list of Names, and then for each Name, it will return a list of URIs, and within each URI a list of locations. If I understand what I did previously, each name will only have one URI and location, but that is not all that interesting: it's the fact that it's presented as a list which I care about.

But before we get to that, we need to start on the front end, and add a new contribution to our existing package.json.

    "views": {
      "explorer": [
        {
          "id": "ignorantTokens",
          "name": "Ignorant Tokens"
        }
      ]
    }

And run the extension using F5. Well, yes, that was easy. The "Ignorant Tokens" view appears at the bottom of the left hand column. Click on it and it expands to show … nothing.

So now we need to put some content there. For now, we're not going to do the hard work of getting the content from the server, we're just going to hack something in, say a list of items with lists in them. We'll use multiple items here to make sure our code works, even if that isn't something the Java backend is going to offer.

Start small, Gareth, start small. So let's just start with a list of the currently active folders in the workspace.

The key to providing the content is to call the window.registerTreeDataProvider method and to provide it with an object which can provide all the data that you need. That, in itself, is not that hard.

const tokensProvider = new TokensProvider();
window.registerTreeDataProvider('ignorantTokens', tokensProvider);

Now we can move on to actually implementing the provider, along with the data model representing the tree. The provider has to be an instance of the interface vscode.TreeDataProvider, and this has basically two methods: one to get the children of an item and one to get the "tree item data" itself. The Microsoft example I'm working from assumed it was, in fact, possible for the item to be an instance of the TreeItem class, and I have followed suit. This is my implementation of ProjectTokens, which basically just says that it's not hard to represent a workspace folder as a tree item by providing the super constructor with the folder name and saying that it should initially be in the "collapsed" state (click to show children).

import { TreeItem, TreeItemCollapsibleState, WorkspaceFolder } from "vscode";

export class ProjectTokens extends TreeItem {
  constructor(wf : WorkspaceFolder) {
    super(wf.name, TreeItemCollapsibleState.Collapsed);
  }
}

The provider has a constructor which looks at all the folders in the workspace and creates an item for them. It's not actually clear to me whether it is better to put this code in the constructor, or to put it in the code which returns the top-level list of items. I have put it in the constructor here to provide for a clearer separation of responsibilities, but it is not clear to me how well this will adapt to changes.

On the subject of which, I think this is the case which is supposed to be handled by the Event onDidChangeTreeData, but I have to confess that I am ignorant about this and I could not (at this juncture) be bothered to go and check it out, nor am I sure who would do the checking to generate this event (a later juncture occurred while tidying up at the end; see below for how this works and is used).

The first method that VSCode calls when trying to generate the tree view is, oddly, getChildren. But it does so passing in undefined or null or something which gives you the clue that it doesn't know whose children it wants: and the children of nobody must be the top level items, which we have to hand and can return.

Then for each of these items that you pass back, it turns around and asks you to hand it a TreeItem. As noted above, we chose to implement our ProjectTokens as TreeItems, so we can just return the very thing we are given. In other situations, you could return a member of the object, or presumably create one on the fly.

import { Event, ProviderResult, TreeDataProvider, TreeItem } from "vscode";
import { ProjectTokens } from "./tokenlocation";
import * as vscode from 'vscode';

export class TokensProvider implements TreeDataProvider<ProjectTokens> {
  locations: ProjectTokens[];
  constructor() {
    this.locations = [];
    if (vscode.workspace.workspaceFolders == null)
      return;
    for (var wf=0;wf<vscode.workspace.workspaceFolders.length;wf++) {
      this.locations.push(new ProjectTokens(vscode.workspace.workspaceFolders[wf]));
    }
  }

  onDidChangeTreeData?: Event<ProjectTokens | null | undefined> | undefined;
  getChildren(element?: ProjectTokens | undefined): ProviderResult<ProjectTokens[]> {
    if (!element) { // it wants the top list
      if (!this.locations) {
        return Promise.resolve([]);
      } else {
        return Promise.resolve(this.locations);
      }
    }
  }
  getTreeItem(element: ProjectTokens): TreeItem {
    return element;
  }
  getParent?(element: ProjectTokens): ProviderResult<ProjectTokens> {
    throw new Error("Method not implemented.");
  }
}

To support multiple levels of nesting, we now need to provide two more data classes (implementing TreeItem): Token and TokenLocation. The idea here is that Token gives you the name of the token that has been defined and TokenLocation gives you the location where it can be found.

import { TreeItem, TreeItemCollapsibleState, WorkspaceFolder } from "vscode";

export class ProjectTokens extends TreeItem {
  tokens: Token[];
  constructor(wf : WorkspaceFolder, tokens : Token[]) {
    super(wf.name, TreeItemCollapsibleState.Collapsed);
    this.tokens = tokens;
  }

  children() : Promise<Token[]> {
    return Promise.resolve(this.tokens);
  }
}

export class Token extends TreeItem {
  locations: TokenLocation[];
  constructor(name: string, locations: TokenLocation[]) {
    super(name, TreeItemCollapsibleState.Collapsed);
    this.locations = locations;
  }

  children() : Promise<TokenLocation[]> {
    return Promise.resolve(this.locations);
  }
};

export class TokenLocation extends TreeItem {
  constructor(where: string) {
    super(where, TreeItemCollapsibleState.None);
  }

  children() : Promise<Token[]> {
    return Promise.resolve([]);
  }
};

In the provider, we need to make three sets of changes. First, we need to change the signatures of the provider and all the methods to enable all three TreeItem types to be returned. The second, and most important, change is that in the getChildren method, when an element is provided, we must ask that element for its children. And finally, because we are hacking things together, we must update the constructor to provide the appropriate tokens.

import { Event, ProviderResult, TreeDataProvider, TreeItem } from "vscode";
import { ProjectTokens, Token, TokenLocation } from "./tokenlocation";
import * as vscode from 'vscode';

export class TokensProvider implements TreeDataProvider<ProjectTokens | Token | TokenLocation> {
  locations: ProjectTokens[];
  constructor() {
    this.locations = [];
    if (vscode.workspace.workspaceFolders == null)
      return;
    const tmp = [
      new Token("List", [
        new TokenLocation("46.2"),
        new TokenLocation("53.1")
      ]),
      new Token("Map", [
        new TokenLocation("15.7"),
        new TokenLocation("28.9")
      ])
    ];
    for (var wf=0;wf<vscode.workspace.workspaceFolders.length;wf++) {
      this.locations.push(new ProjectTokens(vscode.workspace.workspaceFolders[wf], tmp));
    }
  }

  onDidChangeTreeData?: Event<ProjectTokens | Token | TokenLocation | null | undefined> | undefined;
  getChildren(element?: ProjectTokens | Token | TokenLocation | undefined): ProviderResult<ProjectTokens[] | Token[] | TokenLocation[]> {
    if (!element) { // it wants the top list
      if (!this.locations) {
        return Promise.resolve([]);
      } else {
        return Promise.resolve(this.locations);
      }
    } else {
      return element.children();
    }
  }
  getTreeItem(element: ProjectTokens | Token | TokenLocation): TreeItem {
    return element;
  }
  getParent?(element: ProjectTokens | Token | TokenLocation): ProviderResult<ProjectTokens | Token | TokenLocation> {
    throw new Error("Method not implemented.");
  }
}

The back end

Now it's time to turn our attention to the back end and how we can generate this data on the server side and pass it across. You may recall (or, if you're more like me, may not - I had to go and look at the code) that the communication between client and server is handled by a language client (called client in extension.ts).

Now, the key to doing this is to be able to send a command across from the client to the server when we would otherwise generate the data ourselves - i.e. in the constructor of the data provider. But what message should we send? The documentation doesn't seem wildly clear on this, but it would seem that we can repurpose executeCommand for this purpose. The documentation states that "usually" this will return an applyEdit message, but it seems entirely possible that we could return something else - maybe "hello, world"? Let's give it a try, shall we?

First things first: we need to pass the client to the token provider, and that in turn means that we need to move the declaration down to the bottom. There are a couple of other caveats here: because we are going to call a method in client, we need to wait to make sure that it has initialized, so we attach this code to its onReady method. And because this call is asynchronous, we will want to call await, so it cannot be located in the constructor, so we need to move it out into a separate method. Note that because of this, the constructor could be outside the onReady callback if you needed to store it somewhere; I don't, and it looks neater to me all grouped together.

  client.onReady().then(() => {
    const tokensProvider = new TokensProvider();
    tokensProvider.loadTokens(client);
    window.registerTreeDataProvider('ignorantTokens', tokensProvider);
  });

Inside the tokensprovider, we need to call sendRequest specifying that we want an ExecuteCommandRequest and providing a unique string as the command identifier and then any arguments we might want (in this case the URI of the workspace folder). When we get the response back, we need to process it but this is left as an exercise for the reader.

  async loadTokens(client: LanguageClient) : Promise<undefined> {
    if (vscode.workspace.workspaceFolders == null)
      return;
    for (var wf=0;wf<vscode.workspace.workspaceFolders.length;wf++) {
      let uri = vscode.workspace.workspaceFolders[wf].uri.toString();
      const result = await client.sendRequest(ExecuteCommandRequest.type, {
        command: 'java.lsp.requestTokens',
        arguments: [ uri ]
      })
      // this.locations.push(new ProjectTokens(vscode.workspace.workspaceFolders[wf], result));
    }
  }

On the back end, we need to make two changes. First, we need to say that we now support executeCommand and to list the unique command identifiers we support. Then, in the WorkspaceService, we need to provide a trivial implementation of the command execution - in this case, returning "hello, world".

        ServerCapabilities capabilities = new ServerCapabilities();
        ExecuteCommandOptions requestTokens = new ExecuteCommandOptions(Arrays.asList("java.lsp.requestTokens"));
    capabilities.setExecuteCommandProvider(requestTokens);
        return new WorkspaceService() {
          @Override
          public CompletableFuture<Object> executeCommand(ExecuteCommandParams params) {
            System.err.println("execute command called for " + params.getArguments().get(0));
            return CompletableFuture.completedFuture("hello, world");
          }

I thought that was all I needed to do, but it turns out that it's important to take the final step of updating the list, because we need to make sure that we send notifications when we do.

I had commented earlier about the Event object that was part of the interface and I wasn't quite sure what to do about it. The answer would appear to be that it is an idiom that you define another, similar object called an EventEmitter and then wire up the Event to be the event field of this, and then to call the fire() method on the EventEmitter when you want to the tree to be updated.

      this.locations.push(new ProjectTokens(vscode.workspace.workspaceFolders[wf], result));
    }
    this._onDidChangeTreeData.fire();
  }

  private _onDidChangeTreeData: vscode.EventEmitter<ProjectTokens | Token | TokenLocation | null | undefined> = new vscode.EventEmitter<ProjectTokens | Token | TokenLocation | null | undefined>();
  readonly onDidChangeTreeData: vscode.Event<ProjectTokens | Token | TokenLocation | null | undefined> = this._onDidChangeTreeData.event;

In order to support this, I had to change "hello, world" to be an empty JSON array:

          public CompletableFuture<Object> executeCommand(ExecuteCommandParams params) {
            System.err.println("execute command called for " + params.getArguments().get(0));
            return CompletableFuture.completedFuture(new JsonArray());
          }

Exercise

As I always go back and read my own work, I also have to follow through on the exercise of wiring up the dots. I'm not going to bore you with the details here, but you can check it out in the repository (tag VSCODE_SIDEVIEW_EXERCISE).

Conclusion

It is possible to create side views in VSCode and populate them from a repository of information that can be found on the server side.

Late in the game, I also came across https://code.visualstudio.com/api/extension-guides/tree-view which contains explanations for some of the things I found confusing while doing this work. It might be worth a read.

Monday, February 20, 2023

Playwright and FBAR forms

For those fortunate enough not to know, an FBAR is a form required by the US Government for all US Persons who have assets worth more than $10,000 in another country. And once you have more than that, they want all the details, no matter how big or small. And to make matters worse, while they will allow you to fill in your personal details just once, if you have a joint account, you need to identify that individual for each and every jointly held asset.

And you need to do this every year, even though almost no information changes from year to year.

So, for a long time, I have wanted to automate filling this form in. Normally, I download the PDF version and complete it. Two years ago, I tried to see if I could automate that using the PDFBox tool, but for various reasons that did not work. But last year, I discovered that there is also an online version of the form, and a few weeks ago I discovered the Playwright Chrome Driver library. So …

Let's download Playwright

One of the cool things about Playwright (from my perspective) is that it has an API in Java. So I'm going to use that. In order to build everything, I'm going to use gradle since that seems quite common these days, and so to start with I'm going to have this build.gradle file:

plugins {
    id 'java'
    id 'application'
}

mainClassName = 'ignorance.FBAR'

repositories {
    jcenter()
}

dependencies {
  implementation 'com.microsoft.playwright:playwright:1.30.0'
}

task copyToLib(type: Copy) {
    from configurations.default
    into "$buildDir/output/lib"
}

Following along from the documentation, the first step is to create a central Playwright instance and then use that to open a browser window. I tend to use Chrome, so that's what I'm doing here, but you can also use Webkit or Firefox.

package ignorance;

import com.microsoft.playwright.Browser;
import com.microsoft.playwright.Page;
import com.microsoft.playwright.Playwright;

class FBAR {
  public static void main(String[] argv) {
    try (Playwright playwright = Playwright.create()) {
          Browser browser = playwright.webkit().launch();
          Page page = browser.newPage();
          page.navigate("http://whatsmyuseragent.org/");
    }
  }
}

(In order to get this to work with eclipse, at least for me, I needed to copy the files into a local directory, for which I created the gradle task copyToLib, ran it, and then added the copied JAR files to my classpath).

When I first run this, it downloads a whole bunch of files, which appear to be the browsers it supports. And then, to my (not) very great surprise, it threw an exception:

Caused by: com.microsoft.playwright.impl.DriverException: Error {
message='Target closed
name='Error
stack='Error: Target closed

Now, I have basically no idea what this means, but I'm going to assume I haven't set something up correctly. Before panicking though, I'm going to try it again. No, still no joy.

Reviewing the code, I realized I was a little over-zealous with my copying and, instead of launching chromium as planned, I launched webkit. I'm not really sure what that does, or what browser it would use, but changing it to chromium certainly solves the problem.

And I also want to see what I am doing, so I have added the headless-off and slowmo options to the launch configuration. And so we have the following:

      Browser browser = playwright.chromium().launch(new BrowserType.LaunchOptions().setHeadless(false).setSlowMo(50));
      Page page = browser.newPage();
      page.navigate("https://bsaefiling1.fincen.treas.gov/lc/content/xfaforms/profiles/htmldefault.html");

Filling in some fields

So, since I don't know what I am doing, I am going to just start randomly filling in the form (my overall plan is to pull all the info from my personal records, probably through a JSON intermediary) using the most stable references I can find. If you haven't pulled the link from the code, the form I'm filling in is here. And obviously, I have this open in a regular Chrome window with the inspector on so I can find things in it.

It seems to me the best way of identifying the first field - Email Address - is to use the div with class EmailAddress and then find the input within that. So let's do that and give my official email address - mickey.mouse@disney.com.

page.fill("div.EmailAddress input", "mickey.mouse@disney.com");

OK, that didn't work. It loaded the form, and then paused for a long while (30s to be precise) before giving up and telling me it couldn't find the div:

Timeout 30000ms exceeded.
=========================== logs ===========================
waiting for locator("div.EmailAddress input")
============================================================

And, after mature reflection, I realized that the div class is actually Email, not EmailAddress, so let's try that again:

page.fill("div.Email input", "mickey.mouse@disney.com");

Indeed, that does work. So let's quickly fill in the rest of the details in the form and check in.

      page.fill("div.Email input", "mickey.mouse@disney.com");
      page.fill("div.ConfirmedEmail input", "mickey.mouse@disney.com");
      page.fill("div.FirstName input", "Mickey");
      page.fill("div.LastName input", "Mouse");
      page.fill("div.PhoneNumber input", "770-555-1234");

The next thing to do is to "start" filling in the form. I'm not quite sure why those first fields don't count as filling in the form, I don't know. This involves pushing the "Start FBAR" button, which is the click action on the page.

But, as I go to look at the documentation for this, I discover that both fill and click on the Page have been deprecated in favour of locators, so I am going to digress for a moment into refactoring to use these.

It would seem that this is an attempt to abstract away CSS selectors, and this makes a lot of sense to me. It's more typing, to be sure, but we should never be afraid of trading typing for reliability and correctness.

Interestingly, in doing this, it turns out that there are a number of "duplicate" entries in the form. In particular, because it appears that it just matches some of the text you provide, the phrase "Enter your email address" also matches the confirmation message. To clarify, it is necessary to add .setExact(true) to the end. But I have to say, the error message is exceedingly helpful and clear:

Error: strict mode violation: getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your email address.")) resolved to 2 elements:
1) <input type="text" class="_O" name="Email_5" placehold…/> aka getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your email address.").setExact(true))
2) <input type="text" class="_O" placeholder="" maxlength…/> aka getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Re-enter your email address."))

So, with the refactoring done and the click() added, let's check in again:

      page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your email address.").setExact(true)).fill("mickey.mouse@disney.com");
      page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Re-enter your email address.")).fill("mickey.mouse@disney.com");
      page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your first name.")).fill("Mickey");
      page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your last name.")).fill("Mouse");
      page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Enter your telephone number. Do not include formatting such as spaces, dashes, or other punctuation.")).fill("770-555-1234");

      page.getByRole(AriaRole.BUTTON, new Page.GetByRoleOptions().setName("Please click this button to begin preparing your FBAR.")).click();

So now we can quickly fill in the next few fields. I am hoping that I will never file this form late again (because it will be so easy when I have this script working!) but in the past few years I have struggled because they moved the deadline from June 30 to April 15 (to align with the US tax year). And I keep forgetting that. So, for the purposes of this blog, I will choose a reason and provide an explanation.

page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Filing name")).fill("Mouse FBAR 2022");
page.getByRole(AriaRole.COMBOBOX, new Page.GetByRoleOptions().setName("reason")).selectOption("A");
page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Explanation")).fill("I keep forgetting the deadline has changed.");

Once again, it fails. And once again, playwright's exception message is very clear:

Timeout 30000ms exceeded.
=========================== logs ===========================
waiting for getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Explanation"))
locator resolved to <textarea class="_k" placeholder="" maxlength="750" tabin…></textarea>
elementHandle.fill("I keep forgetting the deadline has changed.")
waiting for element to be visible, enabled and editable
element is not enabled - waiting...
============================================================

The element is not enabled. Checking by hand, it seems that "I forgot" is enough of an explanation and that you only need to provide an explanation if you choose "Other". I'm not that bothered, so I'm just going to comment all that lot out and move on.

page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Filing name")).fill("Mouse FBAR 2022");
// page.getByRole(AriaRole.COMBOBOX, new Page.GetByRoleOptions().setName("reason")).selectOption("A");
// page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("Explanation")).fill("I keep forgetting the deadline has changed.");

Filling in the joint assets

Right, well, so far, I don't think I've achieved anything very much. As my wife would say, "you could have done that by hand with a lot less typing". Fair enough. So let's skip to the interesting part of the operation (page 3 is just more information about the primary filer which only needs to be provided once). Parts II and III reflect accounts owned individually or jointly, and the forms can be duplicated by using the appropriate + button in the top right hand corner of the page. Now, as noted above, the main thing I want to do is not provide my wife's details ten times on the ten copies of the page for each of the forms (I don't actually want to do any of it) but this is the thing that drives me crazy).

In order to show the important things, then, I'm going to create two classes now: Portfolio, which holds all my assets, and JointAsset which holds the information about a joint asset. Because this shares most of its information with an individually held asset, I'm only going to have this have two fields: an AccountInfo and a Asset; I'm going to reuse the former when I go back and fill in Part I.

In the fullness of time, I will extract all the information about the portfolio from its ultimate sources of truth; for now, I am just going to hack something together. Anyway, it's all in PortfolioLoader.java:

package ignorance;

public class PortfolioLoader {

  public Portfolio load() {
    Portfolio ret = new Portfolio();
    AccountInfo me = new AccountInfo();
    AccountInfo other = new AccountInfo();
    ret.user(me);

    ret.joint(new JointAsset().jointWith(other).setMaximumValue(10000).setType("A"));
    return ret;
  }

}

All the other classes I created are just boring POJOs, although you could think of them as DTOs between the two systems (the loader and the form-filler).

For now, we are just going to try and load one account. This should not be too difficult. Having said that, we are going to build it as if we are loading multiple accounts and just throw an error if we reach the second.

So, we start by doing the obvious thing:

page.getByRole(AriaRole.TEXTBOX, new Page.GetByRoleOptions().setName("*15")).fill(Integer.toString(joint.getMaximumValue()));

which should identify the maximum account value field, but in fact, there are four of them (one in each of sections II, III, IV and V). So that doesn't work that well. We need some means of distinguishing them.

Looking through the structure there is a div with a class subForm Part3 and that would seem enough of a distinction. Note that although we would prefer to use a nice, stable mechanism for identifying the fields, in a pinch it is still possible to use a selector. So let's do that now.

Very good. That works. Let's check in again.

      boolean first = true;
      for (JointAsset joint : portfolio.joints()) {
        if (!first) {
          throw new RuntimeException("Not implemented");
        }
        first = false;

        Locator mypage3 = page.locator("div.subform.Part3");
        mypage3.getByRole(AriaRole.TEXTBOX, new Locator.GetByRoleOptions().setName("*15")).fill(Integer.toString(joint.getMaximumValue()));
        mypage3.getByRole(AriaRole.COMBOBOX, new Locator.GetByRoleOptions().setName("*16")).selectOption(joint.getType());
      }
      Thread.sleep(10000);

So the one remaining thing I'm interested in experimenting with before I get serious and start integrating things is to try adding a second page for a second asset. Adding the second asset to PortfolioLoader is easy enough:

    ret.joint(new JointAsset().jointWith(other).setMaximumValue(10000).setType("A"));
    ret.joint(new JointAsset().jointWith(other).setMaximumValue(20000).setType("B"));
    return ret;

which is all fine and dandy until we reach the exception we included earlier for the "more than one" case. Now we need to go back and handle that.

The first thing to do is to add another page by clicking on the "+" button. This has "+" as its aria label, so we can do this quite easily:

      for (JointAsset joint : portfolio.joints()) {
        Locator mypage3 = page.locator("div.subform.Part3");
        if (!first) {
          mypage3.getByRole(AriaRole.BUTTON, new Locator.GetByRoleOptions().setName("+").setExact(true)).click();

which works first time, but then puts us in the situation where it cannot resolve which of the two "Maximum Value" entries it should be considering:

Error: strict mode violation: locator("div.subform.Part3").getByRole(AriaRole.TEXTBOX, new Locator.GetByRoleOptions().setName("*15")) resolved to 2 elements:
1) <input class="_s" value="10000" type="numeric" placeho…/> aka locator("input[name=\"MaxAcctValue_137\"]")
2) <input class="_s" type="numeric" placeholder="" tabind…/> aka locator("input[name=\"MaxAcctValueCL_1676554053694\"]")

It turns out that the Locator abstraction is happy to contain one or many potential DOM nodes, until you decide to do something with them - then it complains about the fact that it cannot choose. But we can easily force that choice using the last() operator (we could use first() or nth(), but since the new forms are added at the end, last() is what is wanted).

So now we have this code:

        Locator mypage3 = page.locator("div.subform.Part3");
        if (!first) {
          mypage3.getByRole(AriaRole.BUTTON, new Locator.GetByRoleOptions().setName("+").setExact(true)).click();
          mypage3 = mypage3.last();

Conclusion

In the space of this blog post, I have convinced myself that Playwright is a reasonable tool for interacting with websites. Hopefully after the conclusion, I will be able to go on and build a tool to import JSON files and fill out FBARs. This will save me a headache for years to come - and may be of use to others too! The repository will be updated to include the final version, even though it is not shown here.

Addendum

In the process of working through the rest of the features, I discovered a number of wrinkles that needed to be resolved.

Selecting the country on this form is a little tricky, as it has an aria-label which is the empty string. I don't think there is anything that can capture that. I solved this problem by instead selecting based on a good, old-fashioned CSS locator. It's not as elegant, but it does the trick.

mypage3.locator("div.partSub div.choicelist.Country select").selectOption(joint.getCountry());

When filling out the address of the owners, it turns out that there is JavaScript logic that connects the country to the list of states. Before a state can be selected, this logic must be run. For whatever reason, this happens in real life but not in Playwright. A little bit of googling suggested that a "blur" event was necessary. (By the way, I discovered the monitorEvents operation in the Chrome Developer Tools while I was doing this: check it out).

with.getByRole(AriaRole.COMBOBOX, new Locator.GetByRoleOptions().setName("33")).selectOption(other.getCountry());
with.getByRole(AriaRole.COMBOBOX, new Locator.GetByRoleOptions().setName("33")).dispatchEvent("blur");

Early on, I had tested that I could add additional pages to the document. What I had not considered was that this would add additional + buttons. So when I came to add a third asset, it could not tell which + button to push. A simple last() fixed that problem:

mypage3.getByRole(AriaRole.BUTTON, new Locator.GetByRoleOptions().setName("+").setExact(true)).last().click();

At the end of the operation, I manually sign and submit the form - which gives me an opportunity to download what I have submitted. Sadly, this download ended up somewhere in the ether (possibly just in memory) where I could not find it. Before next year, I need to figure this out and save it responsibly to somewhere I can keep records of it.

Thursday, January 5, 2023

Lambda Snapstart is Harder than I Thought

Apologies, but kind of in violation of my rules, I don't have any actual example working code. That's because this is more complex than that and the number of moving parts were just too vast. But if you want any help or conversation about this, drop me a line and I'll do/share whatever I can.

When I saw that AWS had released a "snapstart" feature for Lambda, I was ecstatic. I have taken to using Lambda as a way of delivering servers with minimum fuss, but I have somewhat abused the technology by basically moving my existing server-based code into a lambda, along with its long start time.

I grew up in a world where the logic has always been that what you want to optimize is the time spent doing frequent operations; initialization time is effectively "free" (up to the point where it becomes minutes or hours: at which point you really have to do something about it). Lambda, on the other hand, says you have 10s: if you fail to complete in this time, you will be rejected and the whole thing starts again. When you are trying to configure multiple things with multiple services … it doesn't quite make it.

I finally went over the edge when I found that in order to support the latest version of JavaFX, I needed to copy a whole bunch of ".so" files from S3 to the "local disk". This was taking pretty much all the 10s … As I started to consider my options (I'd got as far as panicking, and decided that was not very productive), AWS announced the "SnapStart" feature: initialize once, use repeatedly. Excited, I turned it on for my functions (so excitedly, I just did it in the console, rather than using CloudFormation; more on that later).

And nothing happened. Or, at least, I continued to have problems. But why?

It only works on published functions

Uh, boss, it's not as simple as that. On the upside, it is very clear that the 10s timeout does not apply to SnapStart functions - you have up to 900s to be used when the function is published. On the downside, this implies and, indeed, it is actively stated that you need to publish your functions in order to take advantage of this. Except my test environment does not bother with publishing and versioning lambdas, so I still want to bring that initialization time down.

The Runtime Hooks

Before doing that, I decided to at least try and integrate the runtime hooks and issue tracing messages. My thought process was firstly to see if these were called when the lambda wasn't published, and even if they weren't, I would then at least know when I had successfully published the lambda, because my tracing would come out.

In order to turn on the runtime hooks, it is necessary to download the CRAC library and attach this to your project. The AWSHandler then needs to implement the CRAC Resource interface, and register with the CRAC global context. So I added code like this:

public AWSHandler() {
Core.getGlobalContext().register(this);
}

and then it's a simple matter of implementing the callbacks provided in the Resource interface:

public void beforeCheckpoint(org.crac.Context<? extends Resource> arg0) {
logger.info("before checkpoint");
}
public void afterRestore(org.crac.Context<? extends Resource> arg0) {
logger.info("after restore");
}

When It Works …

I added the relevant code to publish and alias my lambdas, and also to ensure that the code inside APIGateway called the alias (and thus the published version), rather than the "$LATEST" version, and, lo and behold, it all worked.

During the publication, this happens:

INIT_START Runtime Version: java:11.v15 Runtime Version ARN: arn:aws:lambda:us-east-1::runtime:0a25e3e7a1cc9ce404bc435eeb2ad358d8fa64338e618d0c224fe509403583ca
Picked up JAVATOOLOPTIONS: -Dui4j.headless=true
-Dglass.platform=Monocle
-Dmonocle.platform=Headless
-Dprism.order=sw
-Djavafx.cachedir=/tmp/solibs
20230104-13:06:48.640 tdaserver/Thread-0 INFO: In config

The key thing here being the "INIT_START" rather than just "START". Interestingly, it doesn't seem to issue any message when the initialization is done, it just stops issuing messages.

And then, when the lambda is called during API Gateway access, I see this:

RESTORE_START Runtime Version: java:11.v15 Runtime Version ARN: arn:aws:lambda:us-east-1::runtime:0a25e3e7a1cc9ce404bc435eeb2ad358d8fa64338e618d0c224fe509403583ca
RESTORE_REPORT Restore Duration: 383.49 ms
START RequestId: c55dfaca-70b4-4a7e-8e4e-b6fc920269aa Version: 1
…
END RequestId: c55dfaca-70b4-4a7e-8e4e-b6fc920269aa
REPORT RequestId: c55dfaca-70b4-4a7e-8e4e-b6fc920269aa Duration: 785.60 ms Billed Duration: 1044 ms Memory Size: 1024 MB Max Memory Used: 401 MB Restore Duration: 383.49 ms

Here the RESTORE_START and RESTORE_REPORT make it clear that a SnapStart image is being used, and how much time has been used to literally start the lambda (383ms may seem a lot, until you realize that it was over 15000ms to actually do the initialization).

After that, the lambda proceeds in the normal way.

No CRAC Output

Interestingly, up to this point, I have not seen any of the tracing output I would expect from my CRAC callback. I don't know whether I didn't succeed in registering correctly, or whether my tracing is simply not coming out. In the fullness of time, I will need to sort this out because it is necessary to check that all the initialization that has been done up to this point is up to date.

Configuring from CloudFormation

As a bleeding-edge adopter, when I first tried to use SnapStart, there wasn't any active CloudFormation documentation on using it. For all I know, it wasn't supported in CloudFormation. However, now, a couple of months later, all the relevant documentation is there.

As you'd expect, you configure this by adding a SnapStart property to your Lambda Function configuration which is quite simple so that it basically amounts to adding:

"SnapStart": {
"ApplyOn": "PublishedVersions"
}

to your existing function declarations.

Conclusion

For a long time, Lambda on AWS with Java has been plagued by painfully slow startup times. It does seem that SnapStart makes major strides towards addressing these and, provided you are publishing your lambdas, is relatively easy to set up.

On the other hand, it seems somewhat opaque to use.