Tuesday, August 31, 2010
Testable System Architecture

At work we were having a discussion about how we wanted to do SSL termination for a particular web service. We had narrowed the possibilities down to doing hardware SSL termination in our load balancer or doing software SSL termination in an Apache layer sitting in front of our web apps.

During the course of the conversation, we talked about factors like performance (would there be a noticeable effect on latency), capacity (were we already CPU bound on the servers that would run the Apaches), maintainability (is it easier to update configs on a single load balancer or to script config changes across a cluster with 40+ servers), cost (how much does the SSL card cost), and scalability (will we be able to expand the solution out to higher traffic levels easily).

I think this was a pretty typical example of taking a reasoned approach to system design and trying to cover all the potential points of view. However, it ended up that we left a big one off: testability.

The business rules about which URLs need SSL termination and which ones don't (or shouldn't) have to be encoded somewhere, and we'd already ruled out doing the SSL termination in the application itself for other reasons, so that means they'd be encoded in either a load balancer config or an Apache config. Which of these is easier to get under automated test on a developer workstation? For an agile shop where quality and time-to-market are of primary importance, this is a question we can't forget to ask when designing our system architecture.
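
For illustration only: if those rules could be pulled out of (or used to generate) the config, getting them under test on a workstation becomes trivial. Everything below (the requires_ssl function, the URL prefixes, the tests) is a hypothetical sketch, not our actual rules:

    import unittest

    # Hypothetical: the SSL business rules extracted into a form a developer
    # workstation can exercise (the prefixes are made-up examples).
    SSL_REQUIRED_PREFIXES = ("/checkout", "/account")

    def requires_ssl(path):
        """Return True if requests for this path must be SSL terminated."""
        return path.startswith(SSL_REQUIRED_PREFIXES)

    class TestSslTerminationRules(unittest.TestCase):
        def test_sensitive_paths_require_ssl(self):
            self.assertTrue(requires_ssl("/checkout/cart"))
            self.assertTrue(requires_ssl("/account/settings"))

        def test_public_paths_do_not_require_ssl(self):
            self.assertFalse(requires_ssl("/static/logo.png"))

    if __name__ == "__main__":
        unittest.main()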

Friday, August 13, 2010
RESTful Refactor: Combine Resources

I've been spending a lot of time thinking about RESTful web services, particularly hypermedia APIs, and I've started to discover several design patterns as I've begun to play around with these in code. Today, I want to talk about the granularity of resources, which is roughly "how much stuff shows up at a single resource". Generally speaking, RESTful architectures work better with coarser-grained resources, i.e., transferring more stuff in one response, and I'll walk through an example of that in this article.

Now, in my previous article, I suggested taking each domain object (or collection of domain objects) and making it a resource with an assigned URL. While following this path (along with the other guidelines mentioned) does get you to a RESTful architecture, it may not always be an optimal one, and you may want to refactor your API to improve it.

Let's take, for example, the canonical and oversimplified "list of favorite things" web service. There are potentially two resource types:

  • a favorite thing (/favorites/{id})
  • a list of favorite things (/favorites)
All well and good, and I can model all sorts of actions here:
  • adding a new favorite: POST to /favorites
  • removing a favorite: DELETE to the specific /favorites/{id}
  • editing a favorite: PUT to the specific /favorites/{id}
  • getting the full list: GET to /favorites
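
To make those interactions concrete, here's a hedged sketch of a client exercising this fine-grained API (the requests library, the api.example.com service, and its response shapes are all assumptions):

    import requests  # assumed available; any HTTP client works

    BASE = "https://api.example.com"  # hypothetical service

    # adding a new favorite
    resp = requests.post(f"{BASE}/favorites", json={"name": "raindrops on roses"})
    new_id = resp.json()["id"]  # assumes the service returns the new item's id

    # editing a favorite
    requests.put(f"{BASE}/favorites/{new_id}", json={"name": "whiskers on kittens"})

    # removing a favorite
    requests.delete(f"{BASE}/favorites/{new_id}")

    # getting the full list
    favorites = requests.get(f"{BASE}/favorites").json()
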
Fully RESTful, great. However, let's think about cache semantics, particularly the cache semantics we should assign to the GET to /favorites. This is probably the most common request we'd have to serve, and in fact it ought to be quite cacheable, as in practice (as with a lot of user-maintained preferences or data) there are going to be lots of read accesses between writes.

There's a problem here, though: some of the actions that would cause an update to the list don't operate on the list's URL (namely, editing a single entry or deleting an entry). This means an intermediary HTTP cache won't invalidate the cache entry for the list when those updates happen. If we want a subsequent fetch of the list by a user to reflect an immediate update, we either have to put 'Cache-Control: max-age=0' on the list and require validation on each access, or we need the client to remember to send 'Cache-Control: no-cache' when fetching a list after an update.

Putting 'Cache-Control: max-age=0' on the list resource really seems a shame; most RESTful APIs are set up to cross WAN links, so even a 304 Not Modified response costs you most of the latency of a full fetch that returns a 200 OK, especially if you have fine-grained resources that don't carry much data (and a textual list of 10 or so favorite items isn't a lot of data!).

Requiring the client to send 'Cache-Control: no-cache' is also problematic: the cache semantics of the resources are really supposed to be the server's concern, yet we are relying on the client to understand something extra about the relationship between various resources and their caching semantics. This is a road that leads to tight coupling between client and server, thus throwing away one of the really useful properties of a REST architecture: allowing the server and client to evolve largely independently.
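
To make that coupling concrete, here's a hedged sketch (same hypothetical service as above) of what the client is forced to know:

    import requests  # assumed available

    BASE = "https://api.example.com"  # hypothetical service

    # The client just edited a single favorite...
    requests.put(f"{BASE}/favorites/42", json={"name": "bright copper kettles"})

    # ...so it has to remember that the list, at a different URL, is now
    # stale, and bypass any intermediary caches on the next read:
    fresh = requests.get(f"{BASE}/favorites",
                         headers={"Cache-Control": "no-cache"})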

Instead, let me offer the following rule of thumb: if a change to one resource should cause a cache invalidation of another resource, maybe they shouldn't be separate resources. I'll call this a "RESTful refactoring": Combining Resources.

In our case, I would suggest that we only need one resource:

  • the list of favorites
We can still model all of our actions:
  • adding a new favorite: PUT to /favorites a list containing the new item
  • removing a favorite: PUT to /favorites a new list with the offending item removed
  • editing a favorite: PUT to /favorites a list containing an updated item
  • getting the full list: GET to /favorites
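
In code, every write becomes a PUT of the full list to the same URL we GET. Again a hedged sketch against the same hypothetical service:

    import requests  # assumed available

    BASE = "https://api.example.com"  # hypothetical service

    # getting the full list
    favorites = requests.get(f"{BASE}/favorites").json()

    # adding a new favorite: PUT back the whole list with the item appended
    favorites.append({"name": "warm woolen mittens"})
    requests.put(f"{BASE}/favorites", json=favorites)

    # removing a favorite: PUT a new list with the offending item filtered out
    favorites = [f for f in favorites if f["name"] != "warm woolen mittens"]
    requests.put(f"{BASE}/favorites", json=favorites)
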
But now, I can put a much longer cache timeout on the /favorites resource, because if a client does something to change the list's state, it will do a PUT to /favorites, invalidating its own cached copy (assuming the client has its own non-shared/private cache). If the resource represents a user-specific list, then I can probably set the cache timeout by considering:
  • how long am I willing to wait for another user to see the results of this user's updates?
  • if the same user accesses the resource from a different computer, how long am I willing to allow those two views to stay out of sync (bearing in mind that the user can usually, and pretty intuitively, hit refresh on a browser page that looks out of date)?
Probably these values are a lot larger than the zero seconds we were using via 'Cache-Control: max-age=0'. When you can figure out how to assign longer expiration times to your responses, you can get a much bigger win for performance and scale. While revalidating a cached response is probably faster than fetching the resource anew, not having to send a request to the origin at all is waaaaaaay better.
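
On the server side, the answers to those two questions turn into a Cache-Control policy. Here's a minimal runnable sketch using Python's standard wsgiref module; the five-minute lifetime and the canned JSON body are illustrative assumptions, not recommendations:

    from wsgiref.simple_server import make_server

    FAVORITES_JSON = b'[{"name": "raindrops on roses"}]'  # placeholder state

    def app(environ, start_response):
        # Illustrative policy: five minutes, 'private' because the list is
        # user-specific, so only the client's own cache may reuse it.
        start_response("200 OK", [
            ("Content-Type", "application/json"),
            ("Cache-Control", "private, max-age=300"),
        ])
        return [FAVORITES_JSON]

    if __name__ == "__main__":
        make_server("localhost", 8000, app).serve_forever()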

The extreme case, here, of course, would be a web service where a user could just get all their "stuff" in one big blob with one request (as we modelled above). There are many domains where this is quite possible, and when you factor in gzip encoding, you can start to contemplate pushing around quite verbose documents, which can be a big win assuming your server can render the response reasonably quickly.

Wednesday, August 11, 2010
Thoughts on Hypermedia APIs

The REST architectural style is defined in Roy Fielding's thesis, primarily chapter 5, where the style is described as a set of architectural constraints. A quick summary of these constraints is:

  • client-server: The system is divided into client and server portions.
  • stateless: Each request from client to server must contain all of the information necessary to understand the request.
  • cache: Response data is implicitly or explicitly marked as cacheable or non-cacheable.
  • uniform interface: All interactions through the system happen via a standard, common interface, achieved by adhering to four sub-constraints:
      • identification of resources: Domain objects are assigned resource identifiers (e.g. URIs).
      • manipulation via representations: Actions occur by exchanging representations of current or intended resource state.
      • self-descriptive messages: Messages include control data (e.g. cache-related), resource metadata (e.g. alternates), and representation metadata (e.g. media type) in addition to a representation itself.
      • hypermedia as the engine of application state: Clients move from one state to the next by selecting and following state transitions described in the current set of representations.
  • layered system: Components can only "see" the component with which they are directly interacting.
  • code-on-demand (optional): Clients can be dynamically extended by downloading and running code.

Achieving a RESTful architecture with XHTML

Mike Amundsen proposed using XHTML as a media-type of choice for web APIs rather than the ubiquitous Atom (or other application-specific XML) or JSON representations commonly seen. By using XHTML profiles, we are able to define the semantics of the data contained within a particular document, as well as the semantics of contained link relations and form types.

Now, let's throw a few simple rules into the system:

  1. all domain objects (including collections of domain objects) are resources and get assigned a URL
  2. beyond an HTTP GET to the API's "home page", a client simply follows standard XHTML semantics from returned documents; namely, doing a GET to follow a link, and constructing a GET or POST request by filling out and submitting a form.
  3. retrieval (read) of resource state should be accomplished by GET, and modification of resource state should happen with POST (via a form).

Interestingly, this means that in addition to programmatic clients being able to parse XHTML (as a subset of XML) and apply standard XHTML semantics for interactions, it is possible for a human to use a browser to interact with the resources (or, as my colleague Karl Martino put it, "you can surf an API!").
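
For a taste of what a programmatic "surfer" might look like, here's a sketch using only the Python standard library; the entry-point URL is hypothetical, and a real client would dispatch on the link relations and form semantics defined by the XHTML profile:

    from html.parser import HTMLParser
    from urllib.request import urlopen

    class TransitionFinder(HTMLParser):
        """Collects the links and forms an XHTML representation advertises."""
        def __init__(self):
            super().__init__()
            self.links, self.forms = [], []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "a" and "href" in attrs:
                self.links.append((attrs.get("rel"), attrs["href"]))
            elif tag == "form":
                self.forms.append((attrs.get("method", "get"), attrs.get("action")))

    # Hypothetical entry point; everything after this comes from what the
    # returned representations themselves advertise.
    doc = urlopen("http://api.example.com/").read().decode("utf-8")
    finder = TransitionFinder()
    finder.feed(doc)
    print(finder.links)  # (rel, href) pairs the client may GET
    print(finder.forms)  # (method, action) pairs the client may submit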

Evaluation

So how well does this match up against the REST constraints? By leveraging HTTP directly as an application protocol, we can get a lot of constraints for free, namely: client-server, statelessness, caching, layered system, and self-descriptive messages.

Now, we also get a uniform interface, because all of our domain objects are modelled as resources with identifiers, reads are accomplished by retrieving XHTML documents as representations, and writes are accomplished by sending form-encoded inputs as representations. Finally, because a client accomplishes its goals by "clicking links and submitting forms", the hypermedia features of XHTML let us model the available state transitions for the client, who can then select what to do next and knows how to follow one of the available transitions. Also, because an update to a resource is modelled as a POST to the same URL we would use to GET its state, this plays nicely and naturally with standard HTTP/1.1 cache semantics (invalidation on write-through).

Finally, we're not using code-on-demand in our case, although we could include Javascript with our XHTML representations to provide additional functionality for the human "surfing" our API, even though a programmatic client would ignore the Javascript. However, code-on-demand is listed as an optional constraint anyway.

Coming soon...

This is an intentionally high-level post that I'm intending will be the first in a series of posts that go over specific examples and examine some practical considerations and implementation patterns that are useful. Hopefully, we'll also be able to illustrate some of the architectural strengths and weaknesses that the REST architectural style is purported to have. Stay tuned!