Ignorance may be Strength

Friday, December 13, 2019

Logging to CloudWatch from API Gateway

Go to the Table of Contents for the Java API Gateway

Logging is an important means of determining what is going on in any system. This is particularly true for otherwise opaque systems in the cloud.

API Gateway offers the ability to log through CloudWatch by providing the handler with a Context object whose getLogger() method returns a logging service. We can obtain this in our handler, wrap it in a "generic" interface (to log a string) and then pass that to any userspace handlers that want it.

These changes are tagged AWS_GATEWAY_LOGGING in the repository.

So, in TDAHandler we add this line to construct the logger:

  ServerLogger logger = new AWSLambdaLogger(cx.getLogger());

The AWSLambdaLogger is defined as:

public class AWSLambdaLogger implements ServerLogger {
  private final LambdaLogger logger;

  public AWSLambdaLogger(LambdaLogger logger) {
    this.logger = logger;
  }

  @Override
  public void log(String string) {
    logger.log(string);
  }
}

Finally, in ProcessorRequest we pass it off to anybody who asks for it by implementing the DesiresLogger interface:

  if (handler instanceof DesiresLogger) {
    ((DesiresLogger)handler).provideLogger(logger);
  }
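
To make the hand-off concrete, here is a dependency-free sketch of how the pieces might fit together. Only the names ServerLogger and DesiresLogger come from the code above; the interface shapes, the RecordingLogger stand-in (used in place of AWSLambdaLogger so that it runs without the AWS libraries) and the GreetingHandler are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the interface shapes are assumed, not taken from the repository.
class LoggerWiringSketch {
    interface ServerLogger {
        void log(String string);
    }

    interface DesiresLogger {
        void provideLogger(ServerLogger logger);
    }

    // Stand-in for AWSLambdaLogger that records messages in memory
    static class RecordingLogger implements ServerLogger {
        final List<String> lines = new ArrayList<>();

        @Override
        public void log(String string) {
            lines.add(string);
        }
    }

    // A userspace handler that opts in to logging
    static class GreetingHandler implements DesiresLogger {
        private ServerLogger logger;

        @Override
        public void provideLogger(ServerLogger logger) {
            this.logger = logger;
        }

        String greet(String who) {
            if (logger != null)
                logger.log("greeting " + who);
            return "hello, " + who;
        }
    }

    // Mirrors the instanceof check shown above
    static void offerLogger(Object handler, ServerLogger logger) {
        if (handler instanceof DesiresLogger) {
            ((DesiresLogger) handler).provideLogger(logger);
        }
    }

    public static void main(String[] args) {
        RecordingLogger logger = new RecordingLogger();
        GreetingHandler handler = new GreetingHandler();
        offerLogger(handler, logger);
        System.out.println(handler.greet("Fred"));
        System.out.println(logger.lines);
    }
}
```

Handlers that do not implement DesiresLogger are simply left alone, which is what keeps the logger optional for userspace code.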

It would, of course, be possible to integrate this with frameworks such as SLF4J by providing a suitable implementation of the concrete classes. This is left as an exercise to the reader (but look at aws-lambda-java-log4j).

If you don't have access to "the context" you can still log. You just need to obtain the logger from a static method on LambdaRuntime:

LambdaLogger logger = LambdaRuntime.getLogger();

One thing to note (if you are unused to CloudWatch logging): it often takes a while for logging to work its way through Amazon's cloud. This can be very distracting, particularly since a lot of issues never seem to get logged at all. How do you determine whether a log failed to get through, or just hasn't come through yet? And having to wait a minute or two (sometimes more) for feedback on what just went wrong can kill productivity. Sadly, I don't have a solution for this other than to "test" as little as possible on AWS and to assume that, every time you push, everything except the environment is perfect. Then change the environment as little as possible, and always in small steps, so that you always go from "green" to "green" and, any time you do see a failure, you know what the problem "must be".

Next: Getting Started with Websockets

Thursday, December 12, 2019

Handling Input in a Lambda Proxy

Go to the Table of Contents for the Java API Gateway

If we don't want to greet the entire world, but one individual by name, then we will need to obtain information about that person and their name. There are many ways to do this, and this section will look at some of them in an attempt to develop a more complete Lambda Integration before we go full-bore with trying to actually integrate an existing Java server.

All of the code here is checked in together and can be found at the tag API_GATEWAY_HELLO_INPUT.

Reading from a query parameter

According to the AWS documentation, the query parameters are available as a JSON object in the lambda proxy input request in a field called queryStringParameters. There is also a multiValueQueryStringParameters which allows multiple copies of the same parameter to be specified, but let's start small.

In order to obtain this, we need to create a setter on our IgnorantRequest POJO. I am guessing that in the Java binding, it will happily translate a JSON object into a Java Map. Thus I can write this code:

public class IgnorantRequest {
 private Map<String, String> values = new HashMap<>();

 public void setQueryStringParameters(Map<String, String> values) {
  if (values != null)
   this.values = values;
 }
}

In my experiments, I found that with no parameters, AWS could choose to pass null in to this function, which could cause an exception. On the other hand, it does seem to always call the function. But since I can't be sure, I wrote this code in the most defensive way possible: values will always have a valid map regardless of how it is invoked.

For full disclosure (for those that don't know me), I really don't like the usual Java-style POJOs with setters and getters (to be precise, it's the getters that get me) and prefer a "tell-don't-ask" style of programming. We'll see what this looks like when we integrate my "tell-don't-ask-server" in a later post, but for now I'm just going to grit my teeth and add a getter to this class. To make myself a little happier (and a little less primitive-obsessed), I'm not going to let you ask for all of the parameters: you have to know which one you want. I'm also going to let you ask first if we have that one.

public class IgnorantRequest {
 private Map<String, String> values = new HashMap<>();

 public void setQueryStringParameters(Map<String, String> values) {
  if (values != null)
   this.values = values;
 }
 
 public boolean hasQueryParameter(String p) {
  return values.containsKey(p);
 }
 
 public String queryParameter(String p) {
  return values.get(p);
 }
}

It's now quite easy to update both the main function and the response to handle the fact that we may have a query parameter (or may not) and that we want to use it in the greeting if we do, and carry on with the old "hello, world" behavior if not.

public class Handler implements RequestHandler<IgnorantRequest, IgnorantResponse> {
 public IgnorantResponse handleRequest(IgnorantRequest arg0, Context arg1) {
  if (arg0.hasQueryParameter("name"))
   return new IgnorantResponse(arg0.queryParameter("name"));
  else
   return new IgnorantResponse("world");
 }
}

public class IgnorantResponse {
 private final String helloTo;

 public IgnorantResponse(String helloTo) {
  this.helloTo = helloTo;
 }

 public String getBody() {
  return "hello, " + helloTo;
 }
}

And you can package that up and try to curl it again with and without a parameter:


$ scripts/package.sh

$ curl https://tovogqsfoj.execute-api.us-east-1.amazonaws.com/ignorance/hello

hello, world

$ curl https://tovogqsfoj.execute-api.us-east-1.amazonaws.com/ignorance/hello?name=Fred

hello, Fred
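
For completeness, the multiValueQueryStringParameters field mentioned earlier could probably be handled along the same lines. This is only a sketch: it assumes the Java binding will translate a JSON object of arrays into a Map of Lists, which I have not verified, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: assumes the runtime calls this setter the same way
// it calls setQueryStringParameters.
class MultiValueRequestSketch {
    private Map<String, List<String>> multiValues = new HashMap<>();

    public void setMultiValueQueryStringParameters(Map<String, List<String>> values) {
        // Defend against null in the same way as the single-valued setter
        if (values != null)
            this.multiValues = values;
    }

    // Returns every value supplied for the parameter, or an empty list if there were none
    public List<String> queryParameters(String p) {
        List<String> found = multiValues.get(p);
        return found == null ? Collections.<String>emptyList() : found;
    }

    public static void main(String[] args) {
        MultiValueRequestSketch request = new MultiValueRequestSketch();
        Map<String, List<String>> input = new HashMap<>();
        input.put("name", Arrays.asList("Fred", "George"));
        request.setMultiValueQueryStringParameters(input);
        System.out.println(request.queryParameters("name"));
    }
}
```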

Reading from an HTTP header

HTTP headers are a standard way of passing information between clients and servers. The headers are passed to a lambda proxy through the headers field of the JSON object, which can again be interpreted as a Map in Java.

There is something of a wrinkle here (which possibly applies to the query parameters as well, but simply didn't come up): HTTP headers are often thought of in upper case or mixed case, but they can be in any case, and curl in particular shifts them to lower case.

To handle this, we define a case-insensitive TreeMap to hold the header values and then put all of the headers passed across into that. Note that I don't see how it is possible for the headers map to be null, but I've defended against it anyway.

public class IgnorantRequest {
  private Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

  public void setHeaders(Map<String, String> headers) {
    if (headers != null)
      this.headers.putAll(headers);
  }
 
  public boolean hasHeader(String hdr) {
    return headers != null && headers.containsKey(hdr);
  }

  public String getHeader(String hdr) {
    return headers.get(hdr);
  }
}

(Note that I have omitted code already shown for brevity.)

The updates to the main code are very similar to those already shown for the query parameter case.

$ scripts/package.sh
$ curl -HX-USER-NAME:Fred https://tovogqsfoj.execute-api.us-east-1.amazonaws.com/ignorance/hello
hello, Fred
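
To see the effect of the case-insensitive TreeMap in isolation, here is a small standalone sketch. The class and method names are made up for the example; the only thing it relies on is String.CASE_INSENSITIVE_ORDER from the standard library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical standalone demo of the lookup behaviour used in IgnorantRequest
class CaseInsensitiveHeadersSketch {
    // Copies the incoming headers into a map whose keys compare without regard to case
    static Map<String, String> caseInsensitiveCopy(Map<String, String> incoming) {
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        if (incoming != null)
            headers.putAll(incoming);
        return headers;
    }

    public static void main(String[] args) {
        // Simulate a header name arriving in lower case
        Map<String, String> fromClient = new HashMap<>();
        fromClient.put("x-user-name", "Fred");

        Map<String, String> headers = caseInsensitiveCopy(fromClient);
        System.out.println(headers.containsKey("X-USER-NAME"));
        System.out.println(headers.get("X-User-Name"));
    }
}
```

The handler can then ask for the header in whatever case it likes without caring how the client (or anything in between) spelled it.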

Reading from a path parameter

In order to move on, we need new resource methods. I'm going to add two at the same time to the gateway configuration: a POST method on the existing hello resource to handle reading from the body (see next section) and another GET method on a new hello/{who} resource that will enable us to read from the who path parameter to find out who we should be greeting. If you are following along and have already deployed the appropriate version of the gateway, you'll be fine. If you have checked out the code but not dropped and recreated the gateway, you will need to do that before you can continue.

Path parameters come across in a map called pathParameters. This is starting to look easy (and repetitive). A little bit of cutten-and-pasten (making sure not to make any stupid duplication mistakes) and we can try again:

$ scripts/package.sh
$ curl https://tovogqsfoj.execute-api.us-east-1.amazonaws.com/ignorance/hello/George
hello, George
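
The path parameter code is not shown above, so here is a minimal sketch of what the copy-and-paste might produce. It assumes the runtime populates a pathParameters map through a setter in the same way as the query parameters; the class and method names are hypothetical and may not match the repository.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the path-parameter part of the request POJO
class PathParameterRequestSketch {
    private Map<String, String> pathParameters = new HashMap<>();

    public void setPathParameters(Map<String, String> params) {
        // A resource with no path parameters may plausibly pass null here
        if (params != null)
            this.pathParameters = params;
    }

    public boolean hasPathParameter(String p) {
        return pathParameters.containsKey(p);
    }

    public String pathParameter(String p) {
        return pathParameters.get(p);
    }

    public static void main(String[] args) {
        PathParameterRequestSketch request = new PathParameterRequestSketch();
        Map<String, String> params = new HashMap<>();
        params.put("who", "George");
        request.setPathParameters(params);
        if (request.hasPathParameter("who"))
            System.out.println("hello, " + request.pathParameter("who"));
    }
}
```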

Reading from the body

The body can also be passed in for a POST request, so it is important to add that method (done above) and then the body should be available through the body parameter. The documentation describes it as a "JSON string" but we have to assume that will be translated into a Java String for our purposes (although note that binary bodies apparently come across as Base64-encoded strings).

I am not going to try and parse the body or do anything fancy. I just assume that if there is a non-empty body, it is the name to be greeted. I store the body (if any) and provide code to test the body exists and return it much the same as for the other cases. The handler is updated to test this in turn.


$ scripts/package.sh
$ curl --data Henry https://vuhsa5vvlj.execute-api.us-east-1.amazonaws.com/ignorance/hello
hello, Henry
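
As a rough sketch of the body handling described above (the names here are hypothetical, and treating an empty string as "no body" is my reading of "non-empty body"):

```java
// Hypothetical sketch of the body part of the request POJO
class BodyRequestSketch {
    private String body;

    // Assumed to be called by the runtime with the raw request body, or null if there is none
    public void setBody(String body) {
        this.body = body;
    }

    // A body only counts if it is present and non-empty
    public boolean hasBody() {
        return body != null && !body.isEmpty();
    }

    public String body() {
        return body;
    }

    public static void main(String[] args) {
        BodyRequestSketch request = new BodyRequestSketch();
        request.setBody("Henry");
        if (request.hasBody())
            System.out.println("hello, " + request.body());
    }
}
```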

Ordering

You may be wondering at this point what happens if you specify multiple names to greet. The answer, of course, is that it depends on the logic in the handler. What I chose to do was to test each of the cases we have considered in turn and return the first one that matches; if none of them match, we continue to greet the world.

This is not the only possible choice, but it is simple and, in any case, it is not our greeting strategy that is under review here: it is whether we can integrate with AWS.
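
The "first one that matches" strategy can be sketched as below. The order in which the sources are tried is an assumption (I have used the order in which they were introduced here); the real handler may check them differently, and chooseName is a hypothetical helper.

```java
// Hypothetical sketch of the "first match wins" greeting strategy
class GreetingOrderSketch {
    // Returns the first candidate that is present and non-empty, or "world" if none is
    static String chooseName(String... candidates) {
        for (String candidate : candidates) {
            if (candidate != null && !candidate.isEmpty())
                return candidate;
        }
        return "world";
    }

    public static void main(String[] args) {
        // Assumed order: query parameter, header, path parameter, body
        System.out.println("hello, " + chooseName(null, "Fred", null, "Henry"));
        System.out.println("hello, " + chooseName(null, null, null, null));
    }
}
```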

Conclusion

We have experimented with four different ways of handling input from a client (query parameters, path parameters, headers and body) and used these to customize our greeting response.

Next: Integrating our TDA Server

Generating a Response for Lambda Proxy Integration

Go to the Table of Contents for the Java API Gateway

We now find ourselves in the position where we have working code and scripts that can redeploy that code: what is commonly known as "Green".

But the objective of our code at the moment is to output "hello, world" and it doesn't do that. We need to return a body output somehow.

The output we are providing is in the IgnorantResponse class, but at the moment that's just an empty class. We need to update it to return a body.

As far as I can tell, all of this works by Amazon magic using Java reflection on beans.

Lambda Proxy Communication

You can see what the lambda proxy input looks like by going back and studying the output from testing the API in the UI. About halfway down there is a line that starts "Endpoint request body after transformations:". Looking along that, you can see a big JSON structure which has fields called "resource", "path" and "httpMethod" among others. If you are familiar with how HTTP works, all of this should seem very reasonable, even if some of it is AWS specific and generally there is too much information so it ends up being truncated.

Fortunately, the input and output formats are described in the AWS documentation.

So, fairly obviously, we want to set the body field of the response.

Creating a Response

Because AWS uses Java reflection, and assumes our response object is a POJO or bean, all we should need to do is create the getBody() method on our IgnorantResponse and have it return the relevant message.

public class IgnorantResponse {
  public String getBody() {
    return "hello, world";
  }
}

We can then repackage and retry our curl command:

scripts/package.sh
curl https://tovogqsfoj.execute-api.us-east-1.amazonaws.com/ignorance/hello

We can also have the handler specifically set the status code. This, for example, will return a 204 with no data:

public class IgnorantResponse {
  public int getStatusCode() {
    return 204;
  }

  public String getBody() {
    return "";
  }
}

In order to see the status code, you will need to use curl -v.
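
To make the "reflection on beans" idea a little less magical, here is a sketch of how getters can be turned into named fields using plain Java reflection. This is emphatically not how AWS does it (I don't know the details of their serializer); it just illustrates the general technique, and all the names are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of bean-style reflection; not the AWS implementation
class BeanReflectionSketch {
    public static class NoContentResponse {
        public int getStatusCode() {
            return 204;
        }

        public String getBody() {
            return "";
        }
    }

    // Collects every public zero-argument getXxx() method as an "xxx" entry
    static Map<String, Object> properties(Object bean) {
        Map<String, Object> result = new TreeMap<>();
        for (Method m : bean.getClass().getMethods()) {
            String name = m.getName();
            boolean isGetter = name.startsWith("get") && name.length() > 3
                    && m.getParameterCount() == 0 && !name.equals("getClass");
            if (!isGetter)
                continue;
            // getStatusCode -> statusCode
            String field = Character.toLowerCase(name.charAt(3)) + name.substring(4);
            try {
                result.put(field, m.invoke(bean));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("could not read " + field, e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(properties(new NoContentResponse()));
    }
}
```

The point is simply that adding a getter is enough to make a new field appear in the output, which matches the behaviour we see from the Lambda runtime.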

Conclusion

Working with the Response object is fairly simple: we just create the appropriate POJO getters for the fields that we want to populate in the lambda response.

You can find all the code in the repository tagged API_GATEWAY_HELLO_WORLD.

Next: We Need to Handle Input

Diagnosing 403 Errors on AWS API Gateway

Go to the Table of Contents for the Java API Gateway

Inwardly I groan every time I see a 403 come back from API Gateway.

Personally, I think they do it "wrong", but at the same time, I can see where they are coming from.

404 Issues

According to the HTTP standards, if a page doesn't exist, you should see a 404 response. This is what we all expect. When we see a 403, we start asking security-like questions, not "have I typed the right URL" questions.

In fact, the majority of times that I see a 403 with AWS API Gateway, the problem is that I have typed the wrong URL or am using the wrong method. To make matters worse, API Gateway gives you absolutely no logging to let you know that this is happening (at least, I haven't found any), possibly because it cannot think of anywhere to put that logging for a resource it cannot find.

The root cause of this appears to be that API Gateway can't tell if the non-existent resource would be secured if it did exist. Since it can't tell, it decides that it should tell you that you can't access it, rather than telling you that it doesn't exist.

Fixing the problem

So the first thing you need to do when encountering a 403 is to look very, very carefully at your URL and see if there is anything that could possibly be wrong with it. On the upside, if the hostname is wrong you do get a sensible error - but mainly because Route53 has not registered the domain name and so your browser cannot resolve it!

If you are still wondering what went wrong in the first post, it's that the stage/deployment name "Ignorance" is misspelled "ignorance" (yes, just a case error) in the URI that is assembled at the end of the createGateway.sh script. But it's a valuable point in two regards:

URLs must contain the "stage name" in order to function (otherwise you'll get a 403)
The stage name and resource name must both be spelled correctly (including case) or you will see a 403.
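
As a small illustration of those two points, this sketch assembles an invoke URL from its parts, following the pattern of the URLs used in this series. The helper name and parameters are hypothetical; the thing to notice is that the stage and resource are copied through exactly as given, so a difference in case produces a different URL.

```java
// Hypothetical helper showing how the pieces of an API Gateway invoke URL fit together
class InvokeUrlSketch {
    static String invokeUrl(String apiId, String region, String stage, String resourcePath) {
        // Stage and resource names are case-sensitive, so no normalization is applied
        return "https://" + apiId + ".execute-api." + region + ".amazonaws.com/"
                + stage + "/" + resourcePath;
    }

    public static void main(String[] args) {
        String deployed = invokeUrl("tovogqsfoj", "us-east-1", "Ignorance", "hello");
        String typed = invokeUrl("tovogqsfoj", "us-east-1", "ignorance", "hello");
        System.out.println(deployed);
        System.out.println(typed);
        // A case difference alone is enough to make the URLs differ
        System.out.println(deployed.equals(typed));
    }
}
```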

I am going to fix this from now on by having the stage name be lower case - but you will need to drop and recreate the gateway to see the changes.

Next: Let's Generate a Response

Wednesday, December 11, 2019

Using API Gateway with Java

This is a multipart experiment in deploying an API Gateway instance on AWS which can integrate with a tell-don't-ask dispatch server, websockets and Couchbase.

The first few steps are to try and get a basic API Gateway server up with a Proxy Lambda behind it.

Then we want to merge in the Tell-Don't-Ask server

Then we want to introduce websockets

Finally, we want to handle unsolicited traffic

Minimal Setup for AWS API Gateway with Java

Go to the Table of Contents for the Java API Gateway

Although I've been using clouds for the past decade or so, I do not (yet) consider myself an expert with all things serverless. This year, I have done some work with AWS API Gateway and Lambdas, but everything I've done has been with node.js.

The node.js environment in Lambda is really quite friendly: you can (for small enough packages) view and even edit your code online. But it seems fairly obvious that the same approach will not work for Java since "runtime" Java consists of bytecodes and jar files.

The background for this investigation is that I want to take one of my mainline servers, currently running on provisioned AWS boxes with high availability proxies and get it running in a serverless environment.

Of course, because it's a real system, it's a little complicated. So I need to make sure that I can do all of the following things:

Actually get something up and running with API Gateway, Lambda and Java
Obtain all the information sent across from the client (method, headers, path information, payloads, etc)
Integrate with my existing code which uses custom servlet handlers
Handle WebSocket traffic (a showstopper a couple of years ago, but available since last December)
Integrate with complex backend components (specifically Couchbase on a different configured server)

Given that this is quite a few separate things, I'm going to split my treatment up into a series of blog posts (there is a contents page linked from the top of every post). All the code is in one place - git@github.com:gmmapowell/ignorance-may-be-strength.git - under the directory aws-java-gateway. Each post has its own git tag: for this post you will want to check out the tag API_GATEWAY_MINIMAL.

Before we Begin

If you want to do more than follow along, you'll obviously need an AWS account and you'll need to have it appropriately configured along with suitable configuration on your machine. I think most of that is beyond the scope of this article.

You will definitely need to install the AWS CLI. For MacOS, you can follow these instructions.

The operations we are going to perform need to take place in an AWS region. You need to specify this to the scripts I have written in the REGION environment variable. To move the code from your machine to AWS Lambda, we use an S3 bucket. You need to provide the name of a bucket you have permission to write to in the BUCKET environment variable.

Getting Something Working

The first thing we want to do is to get an AWS API Gateway up and running. I've done this bit before and - if you know me at all - you will know that there's no way I'm going to do it through the UI other than to experiment. So we're going to start by using CloudFormation to build an API Gateway. If you've not used CloudFormation before, this is a good time to start. I should probably write a separate blog about that, but I haven't yet.

You can invoke CloudFormation in a number of ways, but for simplicity I'm just going to invoke it from the command line. Everything we're going to do here is wrapped up in a couple of scripts (createGateway.sh and dropGateway.sh) and they reference a CloudFormation description (gateway-cf.json). You can specify CloudFormation in either JSON or YAML; YAML is more terse, but I tend to find it's harder to read. Moreover, it is less "machine-friendly": it's harder to parse and generate (at least with languages I tend to use).

You would think that it wouldn't be that hard to get a simple gateway up and running, and it's probably not, but by the time you have all the plumbing in place, you'll find you've created:

An execution role for the lambda function we're going to want to execute
The lambda function itself
A "permission" that allows api gateway to invoke the lambda
The gateway itself
A resource on the gateway
A method on the resource
A "deployment" for the gateway (i.e. where it shows up in the real world)

Running the createGateway.sh script creates all these resources which you can confirm in the UI if you'd like.

Buried deep in there is the fact that we are pulling a zip file from S3 in the hopes that it will have some useful Java resources in it. If you look, you'll see that zip file is built earlier on in the script with just junk in it. For now, what is in the zip file is unimportant: it just matters that it is a zip file.

One thing I do find very useful to do in the UI is testing. It may well be possible to do the same level of testing with the same quality of output from the command line, but if so, I don't know how. Even if it is, the ability to see and scroll through the output makes the UI worthwhile.

To test the gateway in the UI, start on the CloudFormation page in the console. Select the Resources and identify the gateway (called SimpleGateway). Click on the named link and you will go to the API Gateway page. Click on through to the appropriate gateway and you can see the resources. Select the GET method on the left hand side and press "TEST". Scroll down and press "TEST" at the bottom. On the right hand side, you'll see a testing pane appear. At this point, I'd expect you to see an error something like this:

Lambda execution failed with status 200 due to customer function error: Class not found: blog.ignorance.apigateway.Handler

As noted above, we created a zip file from the scripts directory - there is no Java in sight! It's not surprising AWS can't find a handler.

Which is really the point of this blog post - and what we have been building up to. We need to create some Java, put it in the right place and see if it works.

Creating a Java Handler for a Lambda

There are a number of different ways of writing Java Lambda Handlers and the exact technique you choose depends on what you are trying to do. The options are outlined in the AWS Java Documentation, but I have to say that I didn't myself come away very clear on how any of it worked - hence these experiments here.

But after considerable research, and finding madhead's repository, I realized that with a Lambda Proxy integration, the POST packet from API Gateway gives you a serious quantity of information that you can pull into a POJO. This uses the technique that AWS describes as Leveraging Predefined Interfaces for Creating Handler (Java).

So, let's begin to write some code.

First, we need a class. If you look back carefully at what we created, you will see that the handler for the lambda was defined to be blog.ignorance.apigateway.Handler. The plethora of options in how to define a Handler seems complicated, but for the case I want to pursue, this appears to be the name of a class that implements the com.amazonaws.services.lambda.runtime.RequestHandler interface.

Let's create that class, and provide a minimal implementation of the required Request and Response objects and call our work here "done".

public class Handler implements RequestHandler<IgnorantRequest, IgnorantResponse> {
  @Override
  public IgnorantResponse handleRequest(IgnorantRequest arg0, Context arg1) {
    return new IgnorantResponse();
  }
}

We can now build this code and update the lambda with the appropriate zip file using the scripts/package.sh script. This script does four things:

Downloads all the necessary libraries from maven
Compiles and assembles a lambda zip file
Uploads the file to S3
Calls aws lambda update-function-code to notify AWS to refresh the lambda code from the S3 bucket

Note that this final step is very important - without it, your lambda will keep on using the same code it has always had.

With all the code in place, it should be possible to test the handler again with hopefully delightful results:

Thu Dec 05 10:09:10 UTC 2019 : Successfully completed execution
Thu Dec 05 10:09:10 UTC 2019 : Method completed with status: 200

It doesn't do anything useful yet, but in addition to testing in the UI, you should also be able to hit it from the command line or a browser using the URI that was printed during the creation of the gateway.

curl https://tctf8f9r45.execute-api.us-east-1.amazonaws.com/ignorance/hello
{"message":"Forbidden"}

Oh, well, such is life.

Cleaning Up

There is a script to clean up as well: scripts/dropGateway.sh.

Conclusion

So ends the first part of this tutorial. I have created (and committed to github) a very simple project that builds the minimal resources to deploy an AWS API Gateway backed by a very simple Java Lambda.

What I haven't done yet (apart from being able to access it from outside the test environment) is to access any of the fields coming from AWS, do any useful work, integrate with anything else or return anything to the user.

We clearly have more work to do.

Next: Let's figure out that 403

Friday, July 12, 2019

A New Computer

I bought a new computer today. It's a bright, shiny 2019 MacBook Pro with 32GB of memory and the super-fast i9 processor. Now I need to set it up.
I don't buy development machines very often. My current machine was purchased in 2013 and since then I've become a cloud aficionado. I don't expect to have to set boxes up by hand; that should happen automatically. In those six years, I've set up - and torn down - thousands of virtual machines, mainly on AWS but also on my own fleet of Zini and Raspberry Pi boxes. Basically all by remote control (they may need a bit of initial manual setup, but after that ...).
So, although I'm tempted to see what would happen if I did the Apple recommended thing and tried to restore from TimeMachine (especially since I've spent a lot of time and effort backing up over the past six years), I'm going to go the opposite way and say "what would happen if I tried to treat this as a cloud box".
One thing I'm fairly sure will happen is that I will waste a lot of time with this.
Hopefully I'll also learn a thing or two.

The Strategy

My fundamental strategy is to create a script on my personal webserver that has all the relevant setup commands in it. Then I'll download that and execute it. It should go off and do three sets of things:

Download and install any tools that can be done on auto-pilot;
Recover any and all git repositories I want to clone locally;
Help me get started with other tasks that cannot be done on auto-pilot.

I'm not (at this point) sure what's in the third category; but I'm thinking of software that I might need to purchase or that needs a complicated login to access. If I can't do it all automatically, I'm hoping that I might be able to open the relevant pages in Chrome.

The First Script

So this is what I'm going to run:

$ curl https://gmmapowell.com/autopilot/macbook.sh | bash -x

This references a script that I've put up there from an existing machine linking all the things I'll need to do. You should be able to reference it (it shouldn't have any of my personal data in it, unless you consider the things I find useful on a development machine personal).

First off, it sets up ssh by creating an SSH key and turning on the ssh daemon.

mkdir -p .ssh

if [ ! -r .ssh/id_rsa ] ; then
ssh-keygen -f .ssh/id_rsa -t rsa -b 4096 -q -N ""
fi

if sudo systemsetup -getremotelogin | grep -q Off ; then
sudo systemsetup -setremotelogin On
fi

Then it creates a directory to put things it's going to download from the internet, rather than downloading into Downloads as you'd expect.

mkdir -p autosetup

Then it starts to download and install things. I've had to figure these recipes out by hand, and they'll probably have changed by the next time I want to do this, but "treating it as a cloud" for a moment ...

I can install Chrome:

if [ ! -d "/Applications/Google Chrome.app" ] ; then
curl -s -o autosetup/chrome.dmg \
    https://dl.google.com/chrome/mac/stable/GGRO/googlechrome.dmg
volume=`hdiutil attach autosetup/chrome.dmg |
    sed -n 's/.*\(\/Volumes.*\)/\1/p'`
cp -r "$volume/Google Chrome.app" \
    "/Applications/Google Chrome.app"
hdiutil detach "$volume"
fi

Note: I've used indenting (and a trailing backslash where the shell needs one) to show line wrapping, but in reality long lines are all on one line.

The if block makes sure that we don't put all this effort in multiple times. If you do want to put it in (for example, to move to a later version) you just need to move the existing app into the trash.

The curl line downloads the appropriate dmg file into the autosetup directory. This is basically exactly what you'd do with the browser downloading into Downloads.

To understand hdiutil, I'd suggest going to google; to understand what I've done with it, I'd just suggest running the command and playing with it. Basically, I'm mounting (or opening if you prefer) the dmg file and figuring out where on the file system the directory is placed. I'm then able to find the actual app on that dmg and cp it into Applications. Obviously the final step is to unmount the dmg.

I'm hoping to use Docker a lot to simplify application configuration and avoid downloading and setting up a lot of things. I'm a relative Docker noob, so you'll see other posts dealing with this, but there is a similar construct to install the main Docker runtime.

I run multiple copies of Eclipse, generally with different configurations and plugins, but they basically all start from the same dmg; I download this in much the same way (although the path needs a certain amount of construction) and then copy it into different places as needed.

I found I can install git in much the same way using a package from newcontinuum on sourceforge. This avoids installing XCode (for me, as I don't generally do much MacOS or iOS development) and I'm hoping it will be a more up-to-date and complete version.

Moving onwards into the realms of "dodgy" software (i.e. not open source, so probably somewhat guarded), I'm going to consider Microsoft Office 365, Quickbooks and Adobe Creative Cloud.

Office surprised me: there was even a web page telling you exactly what to do. It was almost the easiest thing I did. Of course, you need to register and log in once you've installed it; and I don't think that can be automated.

Quickbooks meandered through the purchase process and then provided me with a download link. I think it's time-limited and all that, but I'm not really sure why. When you download it and open it you still have to enter your credentials.

Adobe Creative Cloud defeated me. Its download appears to be open, but it's hidden behind visiting another page and involves invoking some javascript. I'm sure if I'd tried hard enough, I could have figured it out, but I get bored easily and I need to move on ...

Git Repos

In order to access git, I needed to add the new computer as a collaborator. I probably should have put the effort in to automatically upload the SSH key to the server, but I didn't (I did it by hand).

From there, I wrote another script that added the SSH key both to the server's authorized_keys file and, using the V3 API, to github as a key:

curl -u<user> -XPOST -H"Content-Type: application/json" \
    --data '{"title":"'"$PUB"'","key":"'"`cat $PUB`"'"}' \
    https://api.github.com/user/keys

Then I was able to run another script which downloaded all of my personal scripts and configuration and ran all the checkout commands on the various git repos.

Lessons Learned

It certainly is possible to treat the install of a new desktop machine as a cloud install. Many of the problems are the same: much software is not packaged to be easily deployed this way, particularly on a Mac compared to a Linux box.

But the payback in the cloud environment is that you do this literally many times a day; on a development machine you probably want to aim for no more than once every couple of years as a freelancer. In a corporate setting, it might be worth the investment, but then you probably have images that you burn from anyway.