Tuesday, December 17, 2019

Handling websocket responses with API Gateway


Apologies if the previous post seemed to end abruptly.  The simple reason is that I realized I had painted myself into something of a corner.  In every other environment I have used, websocket handlers are long-lived, leading to stickiness on the server.  That is, the websocket handler goes through a lifecycle of being created, receiving messages and being closed, all on the same in-memory object.

For all I know, API Gateway takes the same approach internally.  But the separation of concerns, by which it delegates processing to a lambda, means that even if the actual socket connection is sticky, the lambda which is invoked quite possibly has no recollection of ever having dealt with this connection before.

I was hoping that it would be simple enough to "respond" to the "calling" websocket and then defer the broader issues of connections to this post, but (reasonably enough) AWS takes the attitude that all websockets are equal and all must be accessed through the same mechanism.

In a moment, we'll look at that mechanism, but what stopped me in my tracks was a broken abstraction: in the WSProcessor abstraction presented in the previous post, the open method is handed a WSResponder and expected to hold onto it.  We simply can't do that here, because we don't know that an object in a given lambda will see the operations in the correct order.

So it's time for a new abstraction.  For now, we don't need to do anything too dramatic, although when we (finally) get around to considering other connections (as well as data storage), we will need to update the abstraction again.

For now, the code can be found in the repository tagged API_GATEWAY_RESPOND.

The Updated Abstraction

In the WSProcessor class, we have methods for handling the opening, closing and error conditions on websockets as well as processing messages.  In the original abstraction, only the open method is provided with the WSResponder object.  We need to change that so that the message processing method onText is provided with it every time as well.  Remember that WSResponder is just an interface: it doesn't give away how it's implemented; it just sends messages back to where they came from.  At the moment, neither error nor close needs this handle: they cannot actually respond.

This makes it easy to finish our implementation of onText in CounterSocket:
public void onText(WSResponder responder, String text) {
  responder.send("Length: " + text.length());
}
But this obviously won't work until we have completed the behind-the-scenes implementation.
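To see the revised abstraction in isolation, here is a minimal sketch that exercises onText without any gateway behind it.  The shape of WSResponder (a single send method taking a String) is inferred from the snippet above; RecordingResponder and CounterSocketDemo are hypothetical names introduced purely for illustration, and the other WSProcessor methods are left out.

```java
import java.util.ArrayList;
import java.util.List;

class CounterSocketDemo {
  public static void main(String[] args) {
    RecordingResponder responder = new RecordingResponder();
    // The responder is passed in on every message, so the handler holds no state.
    new CounterSocket().onText(responder, "hello");
    System.out.println(responder.sent); // prints [Length: 5]
  }
}

// Assumed shape of the interface: the post only shows that it can send text.
interface WSResponder {
  void send(String text);
}

// The handler from the post, reduced to the one method under discussion.
class CounterSocket {
  public void onText(WSResponder responder, String text) {
    responder.send("Length: " + text.length());
  }
}

// Hypothetical stand-in that records messages instead of writing to a socket.
class RecordingResponder implements WSResponder {
  final List<String> sent = new ArrayList<>();

  @Override
  public void send(String text) {
    sent.add(text);
  }
}
```

Because the handler never stores the responder, it makes no difference whether consecutive messages land on the same lambda instance or on different ones.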

Posting to Connections in API Gateway

The key to writing to a connection in API Gateway is to use the AmazonApiGatewayManagementApi interface in the AWS SDK for Java.  This has a method postToConnection which enables you to send a message to an arbitrary connection by id.

The skeleton code for this seems quite easy.  We just "create" one of these somewhere and then call it with an appropriate object.  There is a small complexity in that the post method wants a byte buffer, but briefly ignoring character encodings, we can make that happen.
AmazonApiGatewayManagementApi wsapi = ...
PostToConnectionRequest msg = new PostToConnectionRequest();
msg.setConnectionId(connId);
msg.setData(ByteBuffer.wrap(text.getBytes()));
wsapi.postToConnection(msg);
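The call to getBytes() with no argument uses the JVM's default charset, which is the encoding question being skipped over here.  As a hedged sketch of one way to pin it down, the helper below encodes explicitly as UTF-8 (the encoding WebSocket text frames use).  The class name Payloads and its methods are hypothetical, not part of the repository.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class Payloads {
  // Encode explicitly as UTF-8 rather than relying on the platform default charset.
  static ByteBuffer encode(String text) {
    return ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
  }

  // Decode a buffer back to text; duplicate() leaves the caller's position untouched.
  static String decode(ByteBuffer data) {
    return StandardCharsets.UTF_8.decode(data.duplicate()).toString();
  }
}
```

With something like this in place, the setData call could become msg.setData(Payloads.encode(text)).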
The problems start when you try to create the AmazonApiGatewayManagementApi object.  First off, you need to include the jar that contains it (at time of writing, aws-java-sdk-apigatewaymanagementapi-1.11.688.jar).  This also has a string of dependencies (which are automatically resolved if you are using maven or gradle, but make quite a difference in my hand-rolled package.sh).  The client needs to know how to reach the API Gateway, which requires the current region, the domain name and the stage name:
AmazonApiGatewayManagementApi wsapi =
  AmazonApiGatewayManagementApiClientBuilder.standard()
    .withEndpointConfiguration(
       new EndpointConfiguration(domainName + "/" + stage, System.getenv("AWS_REGION"))
    )
    .build();
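The endpoint string handed to EndpointConfiguration is the part that is easiest to get wrong, so it can help to build it in one place.  The sketch below is only an illustration: GatewayEndpoint and endpointFor are hypothetical names, and the only behaviour added to the concatenation shown above is a null check, so that a missing value fails loudly instead of producing an endpoint like "null/null".

```java
import java.util.Objects;

class GatewayEndpoint {
  // Joins the domain name and stage in the same form the builder call above uses.
  static String endpointFor(String domainName, String stage) {
    Objects.requireNonNull(domainName, "domainName");
    Objects.requireNonNull(stage, "stage");
    return domainName + "/" + stage;
  }
}
```

The result would then be passed as the first argument to new EndpointConfiguration(...), with the region as the second.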
Where can we get these from?  The region (as shown) comes from an environment variable.  The domain name and stage name are included in the request context, so we can provide these to the function from the calling site.
public WSResponder responderFor(ServerLogger logger, String connId,
                                String domainName, String stage) {
  return new WSResponder() {
    ...
  };
}
The calling site is in ProcessorRequest, well away from user code.
wsproc.onText(
  central.responderFor(logger,
                       (String)context.get("connectionId"),
                       (String)context.get("domainName"),
                       (String)context.get("stage")
  ),
  body
);
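The bare casts above will quietly pass null along if a field is absent from the request context.  As a sketch of a slightly more defensive lookup, assuming the context is a Map<String, Object> as the casts suggest, a small helper could check each field before it is used.  ContextFields and required are hypothetical names, not part of the repository.

```java
import java.util.Map;

class ContextFields {
  // Returns the named entry as a String, or fails with a message naming the field.
  static String required(Map<String, Object> context, String key) {
    Object value = context.get(key);
    if (!(value instanceof String)) {
      throw new IllegalArgumentException("request context has no string field: " + key);
    }
    return (String) value;
  }
}
```

Each cast at the calling site could then be written as, for example, ContextFields.required(context, "connectionId").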
Unfortunately, that is not everything that you need.  If you run the code in this state, it fails.  It actually took me quite a long time to figure out why, in part because (as previously noted), the debug loop when you are depending on lambdas is quite long.

One problem was that I had failed to understand how to correctly configure the Gateway Management API (the code shown above is correct, but there are a number of other ways of trying to configure it which are wrong).

But more importantly, I kept getting "OutOfMemory" errors.  These are very unusual in normal Java development, but AWS Lambda has a default memory limit of 128MB.  Amazingly, 128MB is not enough to load in all the libraries we now need.  As I upped the limit, I found (in my environment) that 150MB or so was required.  I gave it 256MB and it still didn't work.  It kept timing out.  I increased the timeout to 25s and it returned (on cold start) in about 15s, and about 3s thereafter.  While that probably counts as "working", I wasn't pleased.

It turns out that if you increase the memory limit further, you also get a bigger slice of all the other pies, specifically CPU time.  I found that at around 1GB, I was getting performance I considered acceptable.  Your mileage may vary.

Conclusion

I was surprised by how very hard it was just to respond to the client calling you.  I can't help feeling I'm missing something - especially since every message returns a response to the gateway and whenever the handler throws an exception the gateway automatically sends a message back to the client.  But it certainly doesn't seem to directly couple your response to the gateway response.

There is a further wrinkle which is that I have used lambda proxies everywhere because that seemed to be the best fit when I was working with REST APIs (in particular, being able to obtain headers).  There is a lot of information in the documentation about building response routes and response integrations but that only applies if you are using non-proxy integrations.  I did not - but possibly should - take the time to investigate that.

I was also surprised by the complexity and slowness of the overall mechanism of writing to clients in general.  It concerns me that, if it takes 2-3s just to respond to one client, it is going to be unscalable when we try writing to hundreds of clients.

Next: Integrating Couchbase to Store Connected Clients
