Thursday, December 12, 2019

Generating a Response for Lambda Proxy Integration

Go to the Table of Contents for the Java API Gateway

We now find ourselves in the position where we have working code and scripts that can redeploy that code: what is commonly known as "Green".

But the objective of our code at the moment is to output "hello, world" and it doesn't do that.  We need to return a body output somehow.

The output we are providing is in the IgnorantResponse class, but at the moment that's just an empty class.  We need to update it to return a body.

As far as I can tell, all of this works by Amazon magic using Java reflection on beans.

Lambda Proxy Communication

You can see what the lambda proxy input looks like by going back and studying the output from testing the API in the UI.  About halfway down there is a line that starts "Endpoint request body after transformations:".   Looking along that, you can see a big JSON structure which has fields called "resource", "path" and "httpMethod" among others.  If you are familiar with how HTTP works, all of this should seem very reasonable, even if some of it is AWS specific and generally there is too much information so it ends up being truncated.

Fortunately, the input and output formats are described in the AWS documentation.

So, fairly obviously, we want to set the body field of the response.

Creating a Response

Because AWS uses Java reflection, and assumes our response object is a POJO or bean, all we should need to do is to create the getBody() method on our IgnorantResponse and have it return the relevant message.

public class IgnorantResponse {
  public String getBody() {
    return "hello, world";
  }
}
We can then repackage and retry our curl command:
scripts/package.sh
curl https://tovogqsfoj.execute-api.us-east-1.amazonaws.com/ignorance/hello
We can also have the handler specifically set the status code.  This, for example, will return a 204 with no data:
public class IgnorantResponse {
  public int getStatusCode() {
    return 204;
  }

  public String getBody() {
    return "";
  }
}
In order to see the status code, you will need to use curl -v.
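Putting the two together, a response bean with settable values might look something like the sketch below. It assumes the statusCode, headers and body property names from the Lambda proxy output format; the constructor and the ok() helper are my own illustrative additions and are not in the repository.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical fuller version of IgnorantResponse: a plain bean whose getters
// match property names in the Lambda proxy output format.
// Declared without "public" so the sketch compiles as a single file; the real
// class would be public and live in its own file.
class IgnorantResponse {
  private final int statusCode;
  private final Map<String, String> headers;
  private final String body;

  IgnorantResponse(int statusCode, Map<String, String> headers, String body) {
    this.statusCode = statusCode;
    this.headers = new HashMap<>(headers);
    this.body = body;
  }

  // Illustrative helper (an assumption): a 200 response with a plain text body.
  static IgnorantResponse ok(String body) {
    return new IgnorantResponse(200,
        Collections.singletonMap("Content-Type", "text/plain"), body);
  }

  public int getStatusCode() { return statusCode; }
  public Map<String, String> getHeaders() { return headers; }
  public String getBody() { return body; }
}
```

With something like this, the handler could return IgnorantResponse.ok("hello, world") rather than hard-coding the values in the getters.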

Conclusion

Working with the Response object is fairly simple: we just create the appropriate POJO getters for the fields that we want to populate in the lambda response.

You can find all the code in the repository tagged API_GATEWAY_HELLO_WORLD.

Next: We Need to Handle Input

Diagnosing 403 Errors on AWS API Gateway

Go to the Table of Contents for the Java API Gateway

Inwardly I groan every time I see a 403 come back from API Gateway.

Personally, I think they do it "wrong", but at the same time, I can see where they are coming from.

404 Issues

According to the HTTP standards, if a page doesn't exist, you should see a 404 response.  This is what we all expect.  When we see a 403, we start asking security-like questions, not "have I typed the right URL" questions.

In fact, the majority of times that I see a 403 with AWS API Gateway, the problem is that I have typed the wrong URL or am using the wrong method.  To make matters worse, API Gateway gives you absolutely no logging to let you know that this is happening (at least, I haven't found any), possibly because it cannot think of anywhere to put that logging for a resource it cannot find.

The root cause of this appears to be that API Gateway can't tell if the non-existent resource would be secured if it did exist.  Since it can't tell, it decides that it should tell you that you can't access it, rather than telling you that it doesn't exist.

Fixing the problem

So the first thing you need to do when encountering a 403 is to look very, very carefully at your URL and see if there is anything that could possibly be wrong with it.  On the upside, if the hostname is wrong you do get a sensible error - but mainly because Route53 has not registered the domain name and so your browser cannot resolve it!

If you are still wondering what went wrong in the first post, it's that the stage/deployment name "Ignorance" is misspelled "ignorance" (yes, just a case error) in the URI that is assembled at the end of the createGateway.sh script.  But it's a valuable point in two regards:
  • URLs must contain the "stage name" in order to function (otherwise you'll get a 403)
  • The stage name and resource name must both be spelled correctly (including case) or you will see a 403.
I am going to fix this from now on by having the stage name be lower case - but you will need to drop and recreate the gateway to see the changes.
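
The shape of the URL can be made explicit with a small helper. This is only a sketch: InvokeUrl and build are names I have made up, and the pattern is simply the one visible in the curl commands in these posts.

```java
// Hypothetical helper: assembles an API Gateway invoke URL from its parts.
class InvokeUrl {
  static String build(String apiId, String region, String stage, String resource) {
    // Without a stage name the URL cannot work, so fail early.
    if (stage == null || stage.isEmpty()) {
      throw new IllegalArgumentException("a stage name is required");
    }
    String path = resource.startsWith("/") ? resource : "/" + resource;
    // The stage and resource are passed through untouched because case matters.
    return "https://" + apiId + ".execute-api." + region
        + ".amazonaws.com/" + stage + path;
  }
}
```

Calling InvokeUrl.build("tovogqsfoj", "us-east-1", "ignorance", "hello") gives the URL used in the curl examples; passing "Ignorance" would give a different URL.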

Next: Let's Generate a Response

Wednesday, December 11, 2019

Using API Gateway with Java

This is a multipart experiment at trying to deploy an API Gateway instance on AWS which can integrate with a tell-don't-ask dispatch server, websockets and Couchbase.

The first few steps are to try and get a basic API Gateway server up with a Proxy Lambda behind it.
Then we want to merge in the Tell-Don't-Ask server
Then we want to introduce websockets
Finally, we want to handle unsolicited traffic

Minimal Setup for AWS API Gateway with Java

Go to the Table of Contents for the Java API Gateway

Although I've been using clouds for the past decade or so, I do not (yet) consider myself an expert with all things serverless.  This year,  I have done some work with AWS API Gateway and Lambdas, but everything I've done has been with node.js.

The node.js environment in Lambda is really quite friendly: you can (for small enough packages) view and even edit your code online.  But it seems fairly obvious that the same approach will not work for Java since "runtime" Java consists of bytecodes and jar files.

The background for this investigation is that I want to take one of my mainline servers, currently running on provisioned AWS boxes with high availability proxies and get it running in a serverless environment.

Of course, because it's a real system, it's a little complicated.  So I need to make sure that I can do all of the following things:
  • Actually get something up and running with API Gateway, Lambda and Java
  • Obtain all the information sent across from the client (method, headers, path information, payloads, etc)
  • Integrate with my existing code which uses custom servlet handlers
  • Handle WebSocket traffic (a showstopper a couple of years ago, but available since last December)
  • Integrate with complex backend components (specifically Couchbase on a different configured server)
Given that this is quite a few separate things, I'm going to split my treatment up into a series of blog posts (there is a contents page linked from the top of every post).  All the code is in one place - git@github.com:gmmapowell/ignorance-may-be-strength.git - under the directory aws-java-gateway.  Each post has its own git tag: for this post you will want to check out the tag API_GATEWAY_MINIMAL.

Before we Begin

If you want to do more than follow along, you'll obviously need an AWS account and you'll need to have it appropriately configured along with suitable configuration on your machine.  I think most of that is beyond the scope of this article.

You will definitely need to install the AWS CLI.  For MacOS, you can follow these instructions.

The operations we are going to perform need to take place in an AWS region.  You need to specify this to the scripts I have written in the REGION environment variable.  To move the code from your machine to AWS Lambda, we use an S3 bucket.  You need to provide the name of a bucket you have permission to write to in the BUCKET environment variable.

Getting Something Working

The first thing we want to do is to get an AWS API Gateway up and running.  I've done this bit before and - if you know me at all - you will know that there's no way I'm going to do it through the UI other than to experiment.  So we're going to start by using CloudFormation to build an API Gateway.  If you've not used CloudFormation before, this is a good time to start.  I should probably write a separate blog about that, but I haven't yet.

You can invoke CloudFormation in a number of ways, but for simplicity I'm just going to invoke it from the command line.  Everything we're going to do here is wrapped up in a couple of scripts (createGateway.sh and dropGateway.sh) and they reference a CloudFormation description (gateway-cf.json).  You can specify CloudFormation in either JSON or YAML; YAML is more terse, but I tend to find it's harder to read.  Moreover, it is less "machine-friendly": it's harder to parse and generate (at least with languages I tend to use).

You would think that it wouldn't be that hard to get a simple gateway up and running, and it's probably not, but by the time you have all the plumbing in place, you'll find you've created:
  • An execution role for the lambda function we're going to want to execute
  • The lambda function itself
  • A "permission" that allows api gateway to invoke the lambda
  • The gateway itself
  • A resource on the gateway
  • A method on the resource
  • A "deployment" for the gateway (i.e. where it shows up in the real world)
Running the createGateway.sh script creates all these resources which you can confirm in the UI if you'd like.

Buried deep in there is the fact that we are pulling a zip file from S3 in the hopes that it will have some useful Java resources in it.  If you look, you'll see that zip file is built earlier on in the script with just junk in it.  For now, what is in the zip file is unimportant: it just matters that it is a zip file.

One thing I do find very useful to do in the UI is testing.  It may well be possible to do the same level of testing with the same quality of output from the command line, but if so, I don't know how.  Even if it is, the ability to see and scroll through the output makes the UI worthwhile.

To test the gateway in the UI, start on the CloudFormation page in the console. Select the Resources and identify the gateway (called SimpleGateway).  Click on the named link and you will go to the API Gateway page.  Click on through to the appropriate gateway and you can see the resources.  Select the GET method on the left hand side and press "TEST".  Scroll down and press "TEST" at the bottom.  On the right hand side, you'll see a testing pane appear.  At this point, I'd expect you to see an error something like this:

Lambda execution failed with status 200 due to customer function error: Class not found: blog.ignorance.apigateway.Handler
As noted above, we created a zip file from the scripts directory - there is no Java in sight!  It's not surprising AWS can't find a handler.

Which is really the point of this blog post - and what we have been building up to.  We need to create some Java, put it in the right place and see if it works.

Creating a Java Handler for a Lambda

There are a number of different ways of writing Java Lambda Handlers and the exact technique you choose depends on what you are trying to do.  The options are outlined in the AWS Java Documentation, but I have to say that I didn't myself come away very clear on how any of it worked - hence these experiments here.

But after considerable research, and finding madhead's repository, I realized that with a Lambda Proxy integration, the POST packet from API Gateway gives you a serious quantity of information that you can pull into a POJO.  This uses the technique that AWS describes as Leveraging Predefined Interfaces for Creating Handler (Java).

So, let's begin to write some code.

First, we need a class.  If you look back carefully at what we created, you will see that the handler for the lambda was defined to be blog.ignorance.apigateway.Handler.  The plethora of options in how to define a Handler seems complicated, but for the case I want to pursue, this appears to be the name of a class that implements the com.amazonaws.services.lambda.runtime.RequestHandler interface.

Let's create that class, and provide a minimal implementation of the required Request and Response objects and call our work here "done".

public class Handler implements RequestHandler<IgnorantRequest, IgnorantResponse> {
  @Override
  public IgnorantResponse handleRequest(IgnorantRequest arg0, Context arg1) {
    return new IgnorantResponse();
  }
}
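
The Handler refers to IgnorantRequest and IgnorantResponse, and at this stage both can be empty classes. As a rough sketch of where the request class might go, here is a bean for three of the fields that appear in the proxy input (resource, path and httpMethod). That the bean is filled in through setters named after the JSON fields is my assumption about how the AWS mapping works, and the toString() is purely for illustration.

```java
// Hypothetical IgnorantRequest: a bean for a few of the Lambda proxy input fields.
// Declared without "public" so the sketch compiles as a single file.
class IgnorantRequest {
  private String resource;
  private String path;
  private String httpMethod;

  public String getResource() { return resource; }
  public void setResource(String resource) { this.resource = resource; }

  public String getPath() { return path; }
  public void setPath(String path) { this.path = path; }

  public String getHttpMethod() { return httpMethod; }
  public void setHttpMethod(String httpMethod) { this.httpMethod = httpMethod; }

  // Illustrative only: handy when logging what arrived.
  @Override
  public String toString() { return httpMethod + " " + path; }
}
```
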
We can now build this code and update the lambda with the appropriate zip file using the scripts/package.sh script.  This script does four things:

  • Downloads all the necessary libraries from maven
  • Compiles and assembles a lambda zip file
  • Uploads the file to S3
  • Calls aws lambda update-function-code to notify AWS to refresh the lambda code from the S3 bucket 

Note that this final step is very important - without it, your lambda will keep on using the same code it has always had.

With all the code in place, it should be possible to test the handler again with hopefully delightful results:

Thu Dec 05 10:09:10 UTC 2019 : Successfully completed execution
Thu Dec 05 10:09:10 UTC 2019 : Method completed with status: 200
It doesn't do anything useful yet, but in addition to testing in the UI, you should also be able to hit it from the command line or a browser using the URI that was printed during the creation of the gateway.

curl https://tctf8f9r45.execute-api.us-east-1.amazonaws.com/ignorance/hello
{"message":"Forbidden"}
Oh, well, such is life.

Cleaning Up

There is a script to clean up as well: scripts/dropGateway.sh.

Conclusion

So ends the first part of this tutorial.  I have created (and committed to github) a very simple project that builds the minimal resources to deploy an AWS API Gateway backed by a very simple Java Lambda.

What I haven't done yet (apart from being able to access it from outside the test environment) is to access any of the fields coming from AWS, do any useful work, integrate with anything else or return anything to the user.

We clearly have more work to do.

Next: Let's figure out that 403

Friday, July 12, 2019

A New Computer

I bought a new computer today.  It's a bright, shiny 2019 MacBook Pro with 32GB of memory and the super-fast i9 processor.  Now I need to set it up.
I don't buy development machines very often.  My current machine was purchased in 2013 and since then I've become a cloud aficionado.  I don't expect to have to set boxes up by hand; that should happen automatically.  In those six years, I've set up - and torn down - thousands of virtual machines, mainly on AWS but also on my own fleet of Zini and Raspberry Pi boxes.  Basically all by remote control (they may need a bit of initial manual setup, but after that ...).
So, although I'm tempted to see what would happen if I did the Apple recommended thing and tried to restore from TimeMachine (especially since I've spent a lot of time and effort backing up over the past six years), I'm going to go the opposite way and say "what would happen if I tried to treat this as a cloud box".
One thing I'm fairly sure will happen is that I will waste a lot of time with this.
Hopefully I'll also learn a thing or two.

The Strategy

My fundamental strategy is to create a script on my personal webserver that has all the relevant setup commands in it.  Then I'll download that and execute it.  It should go off and do three sets of things:
  • Download and install any tools that can be done on auto-pilot;
  • Recover any and all git repositories I want to clone locally;
  • Help me get started with other tasks that cannot be done on auto-pilot.
I'm not (at this point) sure what's in the third category; but I'm thinking of software that I might need to purchase or that needs a complicated login to access.  If I can't do it all automatically, I'm hoping that I might be able to open the relevant pages in Chrome.

The First Script

So this is what I'm going to run:

$ curl https://gmmapowell.com/autopilot/macbook.sh | bash -x

This references a script that I've put up there from an existing machine linking all the things I'll need to do.  You should be able to reference it (it shouldn't have any of my personal data in it, unless you consider the things I find useful on a development machine personal).

First off, it sets up ssh by creating an SSH key and turning on the ssh daemon.

mkdir -p .ssh

if [ ! -r .ssh/id_rsa ] ; then
  ssh-keygen -f .ssh/id_rsa -t rsa -b 4096 -q -N ""
fi

if sudo systemsetup -getremotelogin | grep -q Off ; then
  sudo systemsetup -setremotelogin On
fi

Then it creates a directory to put things it's going to download from the internet, rather than downloading into Downloads as you'd expect.

mkdir -p autosetup

Then it starts to download and install things.  I've had to figure these recipes out by hand, and they'll probably have changed by the next time I want to do this, but "treating it as a cloud" for a moment ...

I can install Chrome:

if [ ! -d "/Applications/Google Chrome.app" ] ; then
  curl -s -o autosetup/chrome.dmg
      https://dl.google.com/chrome/mac/stable/GGRO/googlechrome.dmg
  volume=`hdiutil attach autosetup/chrome.dmg |
              sed -n 's/.*\(\/Volumes.*\)/\1/p'`
  cp -r "$volume/Google Chrome.app"
        "/Applications/Google Chrome.app"
  hdiutil detach "$volume"
fi

Note: I've used indenting to show line wrapping, but in reality long lines are all on one line.

The if block makes sure that we don't put all this effort in multiple times.  If you do want to put it in (for example, to move to a later version) you just need to move the existing app into the trash.

The curl line downloads the appropriate dmg file into the autosetup directory.  This is basically exactly what you'd do with the browser downloading into Downloads.

To understand hdiutil, I'd suggest going to google; to understand what I've done with it, I'd just suggest running the command and playing with it.  Basically, I'm mounting (or opening if you prefer) the dmg file and figuring out where on the file system the directory is placed.  I'm then able to find the actual app on that dmg and cp it into Applications.  Obviously the final step is to unmount the dmg.

I'm hoping to use Docker a lot to simplify application configuration and avoid downloading and setting up a lot of things.  I'm a relative Docker noob, so you'll see other posts dealing with this, but there is a similar construct to install the main Docker runtime.

I run multiple copies of Eclipse, generally with different configurations and plugins, but they basically all start from the same dmg; I download this in much the same way (although the path needs a certain amount of construction) and then copy it into different places as needed.

I found I can install git in much the same way using a package from newcontinuum on sourceforge.  This avoids installing XCode (for me, as I don't generally do much MacOS or iOS development) and I'm hoping it will be a more up-to-date and complete version.

Moving onwards into the realms of "dodgy" software (i.e. not open source, so probably somewhat guarded), I'm going to consider Microsoft Office 365, Quickbooks and Adobe Creative Cloud.

Office surprised me: there was even a web page telling you exactly what to do.  It was almost the easiest thing I did.  Of course, you need to register and log in once you've installed it; and I don't think that can be automated.

Quickbooks meandered through the purchase process and then provided me with a download link.  I think it's time-limited and all that, but I'm not really sure why.  When you download it and open it you still have to enter your credentials.

Adobe Creative Cloud defeated me.  Its download appears to be open, but it's hidden behind visiting another page and involves invoking some javascript.  I'm sure if I'd tried hard enough, I could have figured it out, but I get bored easily and I need to move on ...

Git Repos

In order to access git, I needed to add the new computer as a collaborator.  I probably should have put the effort in to automatically upload the SSH key to the server, but I didn't (I did it by hand).

From there, I wrote another script that added the SSH key both to the server's authorized_keys file, and also as a github key using the V3 API:

curl -u<user> -XPOST -H"Content-Type: application/json"
    --data '{"title":"'"$PUB"'","key":"'"`cat $PUB`"'"}'
    https://api.github.com/user/keys

Then I was able to run another script which downloaded all of my personal scripts and configuration and ran all the checkout commands on the various git repos.

Lessons Learned

It certainly is possible to treat the install of a new desktop machine as a cloud install.  Many of the problems are the same: much software is not packaged to be easily deployed this way, particularly on a Mac compared to a Linux box.

But the payback in the cloud environment is that you do this literally many times a day; on a development machine you probably want to aim for no more than once every couple of years as a freelancer.  In a corporate setting, it might be worth the investment, but then you probably have images that you burn from anyway.

Wednesday, June 19, 2019

Experiments with Formal Grammars

This post is technically a bit off topic, since the theme is supposed to be things I am ignorant about and I am in fact a compiler expert - I read the Dragon Book in 1986, after I'd already written a few parsers; I've got a PhD in the area; I've worked for a compiler company; I implemented the IDL spec for TIBCO's CORBA implementation; yada, yada, yada.

But most of the parsers I've written over the past 25 years have been done using Agile methods and TDD in particular while discovering the application domain, rather than relying on formal grammars.  But while working on my FLAS compiler recently, I felt I'd hit a brick wall in terms of understanding the parser in terms of the wider context and looked back at that IDL spec I'd implemented in 1994 and thought, "I need something like this".

But I don't like writing documents like that, let alone maintaining them.  So I thought "let's get the computer to write the document".  And so, let the experimentation begin ...

Obviously the computer can't actually write documents ...

Computers can't write documents.  At least not in general.  But that isn't really what I meant.  I want it to do all the hard formatting and repetition for me, based on an explanation of what the grammar should be.  I want it to be flexible enough to change whenever I want to add or remove a rule, or just change the name of something.  I want it to renumber everything and make sure all the references are accurate.  I want it to tell me when I've made a mistake (omitted something, defined something twice, not used anything ...).  In short, all I want to do is to specify the grammar formally, and that only in order to make myself commit to exactly what I want it to be and describe what the semantics are supposed to be.

To understand what I'm talking about, look at that IDL spec, particularly the grammar starting in section 3.4.  Everything in the grammar is duplicated: section 3.4 has a "summary" of all the rules, and then section 3.5 goes back and looks at each group of rules together explaining both the intricacies of parsing them and the intended semantics of the various constructions.

Each of the rules references various other items, including tokens (described in section 3.2) and other rules in the grammar.

Each rule has a number associated with it, and fairly obviously, the numbers are presented in ascending order.

I want all of that, I just don't want to have to create it, and I certainly don't want to maintain it when I decide to add a new rule between 35 and 36.  Or to change one name.

In short, I want a programming language.  Except I don't want to use something like YACC, because I don't want to try and go straight from the specification to generating a parser; that never works for me.

So I'm going to use XML.

XML gets a lot of bad press, and has a lot of haters.  Particularly among the "cool kids" who do front end development and think for some reason that YAML and JSON are so much better, apparently based on the fact that they have fewer angle brackets and more curly ones.  But on this specific occasion, I think it's the right tool for the job.  Let me briefly explain why.
  • First and foremost, it's very easy to parse.  I've worked with various XML formats before for one reason or another (and often hated it) but have developed a very convenient XML processing library that wraps the standard Java one and makes this kind of work remarkably easy.
  • It is clear from the outset that this definition needs to combine "definitions" and "text".  XML is very good at this - much better than JSON or YAML.  Including HTML inside XML can be a bit tricky, but as long as you stick to XHTML and don't try and get fancy with DTDs, it all works fine.
  • The structure of the formal grammar is very repetitive and quite flat, and so the XML representation is remarkably easy both to represent and read (as I hope you'll agree with the examples below).
OK, enough of the XML sales pitch.  Let's get down to the meat of it.

Representing a production as XML

So, all those rules in the grammar are basically just productions - that's the technical term for a name (specifically a non-terminal) on the left which can be produced when the parser sees one of the possible sequences of terminals (or tokens) and non-terminals on the right hand side.

Consider this rule from my grammar:

(50) list-literal ::= OSB CSB
                    | OSB <expression> <comma-expression>* CSB

This defines a list literal in the language, which is basically exactly the same as it is in JavaScript or any functional language.  Things like this:
  • [] - the empty list
  • [1] - a simple list
  • [a,b] - a list with two elements
  • [2+3, "hello".length()] - a list with two expressions
Formally, it says that there are two cases for defining a list literal.  It is possible to produce a list literal either by following the first case and having an "Open Square Bracket" followed by a "Close Square Bracket" (i.e. the empty list) or by following the second case and having those two brackets enclose a non-terminal expression followed by zero or more non-terminal comma-expressions. You'd have to go and look for those rules (currently 48 and 51 respectively) to check, but informally it's fairly obvious that that covers all the other cases - one or more expressions separated by commas.

Note, however, that this formulation expressly prohibits the trailing comma which is allowed in JavaScript (but not in JSON).  Changing the grammar to support that (in one of several different ways) is left as an exercise to the reader.  What I'm interested in here is how that is represented in XML.

At the outermost level, it's a production and we want to call it list-literal, so let's do that:

<production name='list-literal'>
...
</production>

That's enough to define the production name, make it available as a reference, and give it a number.  But that only defines the left hand side of the production.
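
As a sketch of how the numbering can fall out of the XML for free, the following reads the production elements with the standard Java DOM parser and numbers them in document order. The enclosing grammar root element in the usage note and the ProductionNumbering class are assumptions of mine; the real tool uses my own XML library and numbers rules in its own way.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: maps each production name to its rule number.
class ProductionNumbering {
  static Map<String, Integer> number(String xml) {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    } catch (Exception e) {
      throw new IllegalArgumentException("could not parse grammar XML", e);
    }
    // getElementsByTagName returns elements in document order.
    NodeList productions = doc.getElementsByTagName("production");
    Map<String, Integer> numbers = new LinkedHashMap<>();
    for (int i = 0; i < productions.getLength(); i++) {
      String name = ((Element) productions.item(i)).getAttribute("name");
      // Numbers start at 1; a repeated name is a grammar error.
      if (numbers.put(name, i + 1) != null) {
        throw new IllegalStateException("production defined twice: " + name);
      }
    }
    return numbers;
  }
}
```

Adding a new production between two existing ones just shifts the later numbers on the next run, which is exactly the renumbering I don't want to do by hand.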

The internals give more information.

For the more detailed description of the grammar, each production needs to be placed in a particular section, along with a description.  It would be possible, indeed reasonable, to group the productions in XML inside a <section> block, but in the end I decided I wanted the flexibility to have the rules in one order and the sections potentially grouping from different areas.  I haven't used that flexibility, but it led to me putting the section inside the production:

<section title='Expressions' /> 

One consequence of this choice is the odd feature that the first time a particular section is named, it has to include a <description> element; on subsequent usages, such as this one, that is forbidden and only the title is required.

Finally, the rule has a body; that is the set of productions that make it up.  As you would expect, this is basically a tree of all the things you see on the right hand side.

As a tree, it consists of multiple "levels" of definition.  At the top level, this is an "OR" definition, so we declare an <or> element, and then nest the cases inside it - one per line of the text representation.  In this case, each of those is a "sequence" of other definitions, so we use a <seq> element for each line.  The first line consists of just two named tokens (using the <token> element) so we include those, in order, inside the first <seq>.

   <or>
     <seq>
       <token type='OSB' />
       <token type='CSB' />
     </seq>
     <seq>
       <token type='OSB' />
       <ref production='expression' />
       <many>
         <ref production='comma-expression' />
       </many>
       <token type='CSB' />
     </seq>
   </or>

The second line is more complicated (understandably).  It has the same two tokens (at the start and the end).  Between them is a reference (<ref>) to another non-terminal, expression.  This is followed by a <many> rule which says that whatever it contains will occur zero or more times; the content in this case is a reference to another non-terminal, comma-expression.
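
Once parsed, a tree like this maps naturally onto a handful of small classes, one per element type. The sketch below is an illustration rather than the code in my tool: the class names are invented, and render() simply turns the tree back into the one-line text form of the rule.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory form of a production body; one class per XML element.
interface GrammarNode { String render(); }

class TokenNode implements GrammarNode {
  private final String type;
  TokenNode(String type) { this.type = type; }
  public String render() { return type; }
}

class RefNode implements GrammarNode {
  private final String production;
  RefNode(String production) { this.production = production; }
  public String render() { return "<" + production + ">"; }
}

class ManyNode implements GrammarNode {
  private final GrammarNode inner;
  ManyNode(GrammarNode inner) { this.inner = inner; }
  // Zero or more repetitions, written with a trailing star.
  public String render() { return inner.render() + "*"; }
}

class SeqNode implements GrammarNode {
  private final List<GrammarNode> parts;
  SeqNode(GrammarNode... parts) { this.parts = Arrays.asList(parts); }
  public String render() { return join(parts, " "); }

  static String join(List<GrammarNode> nodes, String separator) {
    StringBuilder text = new StringBuilder();
    for (GrammarNode node : nodes) {
      if (text.length() > 0) text.append(separator);
      text.append(node.render());
    }
    return text.toString();
  }
}

class OrNode implements GrammarNode {
  private final List<GrammarNode> cases;
  OrNode(GrammarNode... cases) { this.cases = Arrays.asList(cases); }
  public String render() { return SeqNode.join(cases, " | "); }
}

class ListLiteralExample {
  // The body of list-literal, built by hand to mirror the XML above.
  static GrammarNode body() {
    return new OrNode(
        new SeqNode(new TokenNode("OSB"), new TokenNode("CSB")),
        new SeqNode(new TokenNode("OSB"), new RefNode("expression"),
            new ManyNode(new RefNode("comma-expression")), new TokenNode("CSB")));
  }
}
```

Rendering ListLiteralExample.body() gives back the right hand side of rule 50 on a single line, with the two cases separated by a vertical bar.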

Transforming into a Grammar Document

Having parsed this, we need to generate the grammar text.  This is really not as complicated as it might seem.

The basic idea is to generate an HTML file, although we could also generate PDF with a little more effort.  We can run through the productions we read from the XML and for each one apply a simple transformation to generate an appropriate div with an appropriate CSS class.  We can then define an appropriate CSS stylesheet that lays out the rules in the way we want.

In order to generate both a summary and a detailed grammar description, it is necessary to run through the grammar twice.  The first time generates the summary; on the second pass, the <section> tags are analyzed in detail and each section is generated with a header, the actual productions for that section and then the documentation from the section description.
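
A minimal sketch of the two passes, assuming each production has already been reduced to a number, a name, a section title and its text form. The class names and the CSS class names are made up for illustration, and the section descriptions are left out for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical flattened form of one production.
class NumberedRule {
  final int number;
  final String name;
  final String section;
  final String body;

  NumberedRule(int number, String name, String section, String body) {
    this.number = number;
    this.name = name;
    this.section = section;
    this.body = body;
  }
}

class GrammarHtml {
  // Angle brackets in references must be escaped to survive in HTML.
  static String escape(String s) {
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
  }

  // One div per production; the layout is left to the stylesheet.
  static String div(NumberedRule r) {
    return "<div class='production'><span class='number'>(" + r.number + ")</span> "
        + "<span class='name'>" + escape(r.name) + "</span> ::= "
        + escape(r.body) + "</div>\n";
  }

  // First pass: every rule, in order.
  static String summary(List<NumberedRule> rules) {
    StringBuilder html = new StringBuilder();
    for (NumberedRule r : rules) html.append(div(r));
    return html.toString();
  }

  // Second pass: the same rules grouped under a header per section.
  static String detail(List<NumberedRule> rules) {
    Map<String, List<NumberedRule>> bySection = new LinkedHashMap<>();
    for (NumberedRule r : rules) {
      List<NumberedRule> group = bySection.get(r.section);
      if (group == null) {
        group = new ArrayList<>();
        bySection.put(r.section, group);
      }
      group.add(r);
    }
    StringBuilder html = new StringBuilder();
    for (Map.Entry<String, List<NumberedRule>> e : bySection.entrySet()) {
      html.append("<h2>").append(escape(e.getKey())).append("</h2>\n");
      for (NumberedRule r : e.getValue()) html.append(div(r));
    }
    return html.toString();
  }
}
```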

The Gift that Keeps on Giving

While my original motivation for doing the grammar in this way was to simplify the documentation process, I have subsequently found that, now that I have the grammar in machine-readable form, it is an easy step to encourage the machine to do just "that one more thing" for me.

Testing the Grammar

When reading the grammar, the parsing tool builds a cross reference of all the rules and tokens.  In doing so, it sometimes finds inconsistencies.  It is possible to extend this to an automated grammar consistency test.  For example, once all the grammar rules have been read, it is easy to tell whether there are some production references that refer to productions that are not defined anywhere in the grammar.  Likewise, it is possible to tell if all the tokens are defined.  With a minor extension, it is possible to check that all the productions and tokens are in fact referenced somewhere.  And so on.  Every time I change the grammar my automated build runs the process that generates the documentation (actually coded as a unit test) and the build fails if any of these invariants is not met.
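
Two of these checks can be sketched in a few lines, assuming the cross reference has been boiled down to a map from each production name to the names it references. GrammarCheck and its method names are hypothetical; the real checks also cover tokens.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical consistency checks over a cross reference of the grammar:
// each key is a defined production, each value the productions it refers to.
class GrammarCheck {
  // References to productions that are not defined anywhere.
  static Set<String> undefined(Map<String, Set<String>> grammar) {
    Set<String> missing = new TreeSet<>();
    for (Set<String> refs : grammar.values()) {
      for (String ref : refs) {
        if (!grammar.containsKey(ref)) missing.add(ref);
      }
    }
    return missing;
  }

  // Productions that nothing refers to, other than the start symbol.
  static Set<String> unused(Map<String, Set<String>> grammar, String start) {
    Set<String> unused = new TreeSet<>(grammar.keySet());
    unused.remove(start);
    for (Set<String> refs : grammar.values()) {
      unused.removeAll(refs);
    }
    return unused;
  }
}
```

A unit test then only has to assert that both sets are empty, and the build fails as soon as a rename leaves a dangling reference behind.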

Random Sentence Production

Moreover, it gives me the opportunity to automatically, randomly test the compiler's parser.  By reversing the grammar, it is possible to generate random sentences by looking at each production and filling in the blanks.  By doing so, it is possible to produce (admittedly meaningless) programs which should at the very least produce credible parse trees.

Again, my automated build contains a suite of unit tests which generate a few thousand random sentences based on the grammar and runs the parser on them.  Any syntax errors represent failures in the parser and thus broken unit tests ... and a failed build.
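
To give a flavour of the random generation, here is a toy generator for a three-rule grammar built around list-literal. Everything in it is an assumption made for the sketch: the class and method names, the depth limit used to force termination, the convention that the first case of every choice is non-recursive, and the NUMBER and COMMA tokens.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical random sentence generator: walks the grammar and emits tokens.
class SentenceGenerator {
  interface Rule { void emit(SentenceGenerator g, int depth, List<String> out); }

  private static final int MAX_DEPTH = 6;
  private final Map<String, Rule> productions = new HashMap<>();
  private final Random random;

  SentenceGenerator(long seed) { this.random = new Random(seed); }

  void define(String name, Rule body) { productions.put(name, body); }

  static Rule token(String type) {
    return (g, depth, out) -> out.add(type);
  }

  // Following a reference goes one level deeper (the name must be defined).
  static Rule ref(String name) {
    return (g, depth, out) -> g.productions.get(name).emit(g, depth + 1, out);
  }

  static Rule seq(Rule... parts) {
    return (g, depth, out) -> { for (Rule p : parts) p.emit(g, depth, out); };
  }

  // Past the depth limit always take the first case, assumed non-recursive.
  static Rule or(Rule... cases) {
    return (g, depth, out) -> {
      int pick = depth >= MAX_DEPTH ? 0 : g.random.nextInt(cases.length);
      cases[pick].emit(g, depth, out);
    };
  }

  // Zero to two repetitions, and none at all past the depth limit.
  static Rule many(Rule inner) {
    return (g, depth, out) -> {
      int count = depth >= MAX_DEPTH ? 0 : g.random.nextInt(3);
      for (int i = 0; i < count; i++) inner.emit(g, depth, out);
    };
  }

  List<String> generate(String start) {
    List<String> out = new ArrayList<>();
    productions.get(start).emit(this, 0, out);
    return out;
  }

  static SentenceGenerator toyGrammar(long seed) {
    SentenceGenerator g = new SentenceGenerator(seed);
    g.define("list-literal", or(
        seq(token("OSB"), token("CSB")),
        seq(token("OSB"), ref("expression"), many(ref("comma-expression")), token("CSB"))));
    g.define("expression", or(token("NUMBER"), ref("list-literal")));
    g.define("comma-expression", seq(token("COMMA"), ref("expression")));
    return g;
  }
}
```

Each call to toyGrammar(seed).generate("list-literal") yields a list of token types that the parser ought to accept; turning those into actual text means substituting a concrete lexeme for each token.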

Which Rule Went Wrong?

A step beyond this is statistical analysis pointing out which rules are failing.  By knowing for each random sentence which production it involved and which sentences failed, it "should" be possible to identify which production(s) are not parsing correctly by analyzing the failures.  While I have written code to do this, it is sufficiently hard to interpret the results that I have given up using it and reverted to the normal process of analyzing what I changed most recently as a guide as to what is likely to be broken ...

How Well Did I Do?

Judge for yourself ... this is the current version of my FLAS language specification (note this may well be different to what I've written above due to this blog post not being kept up to date). 

TL;DR - Lessons Learned

The main lesson to be learnt here is what a "gift that keeps on giving" it is to make things machine readable.  My experiences really did follow the path I outlined here - my original objective was just to save myself from having to lay out and update all of the rule numbers while continually adjusting the grammar.  Everything else occurred to me much as I've stated it and became possible because of  the decision to use a machine-readable format.

XML in this case worked out really well for me and I haven't regretted it at all.  The virtues I extolled above were reinforced, rather than diminished, as I played the game. 

Tuesday, June 4, 2019

Installing Haskell on a Mac

I'm not going to tell you how to install any Haskell implementation on any box.  I'm just going to help you a little bit by telling you what I did to install ghc on my Mac.

Basically, I followed these instructions.

What this amounts to is running this curl script:

$ curl -sSL https://get.haskellstack.org/ | sh

Towards the end, it asks for permission to "sudo" in order to install to /usr/local/bin and you need to give your password.

In order to do the code generation, it needs you to carry out two additional steps, which it informs you about towards the end of the installation:

$ xcode-select --install
$ open /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg

And then you need to run the compiler at least once to download the latest version and so on:

$ stack ghc