2007-10-15

HTTP: what to sign?

There have been quite a number of useful comments on my previous post, and even an implementation.  The main area where there seems to be disagreement is on the issue of what exactly to sign.

It seems to me that you can look at an HTTP interaction at two different levels:

  • at a low level, it consists of request and response messages;
  • at a slightly higher level, it consists of the transfer of the representations of resources.

With a simple GET, there's a one-to-one correspondence between a response message and a representation transfer.  But with fancier HTTP features, like HEAD or conditional GET or ranges or the proposed PATCH method, these two levels start to diverge: the messages aren't independent entities in themselves, they are artifacts of the client attempting to efficiently synchronize the representation of the resource that it has with the current representation defined by the origin server.

The question then arises of whether, at an abstract level, the right thing to sign is messages or resource representations.  I think the right answer is resource representations: those are the things whose integrity is important to applications.  For example, in the response to a HEAD request, the signature wouldn't simply sign the HEAD response message; rather it would cover the entity that would have been returned by a GET. The Signature header would thus be allowed in the same situations as the ETag header, and would correspond to the same thing that a strong entity tag corresponds to.

It's important to remember that the representation of the resource doesn't consist of just the data in the entity body.  It also includes the metadata in the entity headers.  At the very least, I think you would want to sign the Content-Type header. Note that there are some headers that you definitely wouldn't want to sign, in particular hop-by-hop headers.  I don't think there's a single right answer as to which headers to sign, which means that the Signature header will need to explicitly identify which headers it is signing.
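
To make the idea of a Signature header that names the headers it covers a bit more concrete, here is a rough Python sketch. Everything about the syntax is an assumption made up for illustration: the headers="..." and sig="..." parameters, the canonical form, and the function names are hypothetical, and HMAC-SHA256 with a shared key is used only as a stand-in for a real public-key signature.

```python
import hashlib
import hmac

# Hop-by-hop headers from RFC 2616 section 13.5.1. The RFC text says
# "Trailers"; the actual header name "Trailer" is included as well.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "trailers", "transfer-encoding", "upgrade",
}


def canonicalize(headers, signed_names, body):
    """Build the bytes to sign: the chosen headers, then a hash of the body.

    `headers` must have lower-case keys. This canonical form is an
    assumption for the sketch, not a proposed standard.
    """
    lines = []
    for name in signed_names:
        key = name.lower()
        if key in HOP_BY_HOP:
            raise ValueError("refusing to sign hop-by-hop header: " + name)
        lines.append(key + ":" + headers[key].strip())
    lines.append(hashlib.sha256(body).hexdigest())
    return "\n".join(lines).encode("utf-8")


def sign_representation(key, headers, signed_names, body):
    """Return a hypothetical Signature header value listing what it covers."""
    lowered = {k.lower(): v for k, v in headers.items()}
    canonical = canonicalize(lowered, signed_names, body)
    # HMAC is a stand-in here; a real scheme would use a public-key signature.
    sig = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    names = " ".join(n.lower() for n in signed_names)
    return 'headers="' + names + '"; sig="' + sig + '"'
```

Because only the listed headers go into the canonical form, an intermediary that rewrites an unsigned header such as Via leaves the signature intact, while a change to Content-Type invalidates it.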

With this approach the signature doesn't need to cover the request.  However, it does need to relate the representation to a particular resource. Otherwise there's a nasty attack possible: the bad guy can replace the response to a request for one resource with the response to a request for another resource. (Suppose http://www.example.com/products/x/price returns the price of product x; an attacker could completely switch around the price list.)  I think the simplest way to solve this is for the Signature header in the response to include a uri="request_uri" parameter, where request_uri is the URI of the resource whose representation is being signed. This allows the signature verification process to work with just the response headers and body as input, which should simplify plugging this feature into implementations.
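
Here is a minimal Python sketch of how the uri parameter defeats the price-list swap. As before, the parameter syntax and function names are hypothetical, HMAC-SHA256 with a shared key stands in for a real signature, and only Content-Type is signed to keep the example short. Verification needs nothing but the response; comparing the signed URI with the URI that was requested is a separate step for the client.

```python
import hashlib
import hmac


def _mac(key, uri, content_type, body):
    # The URI is part of the signed data, so a signature made for one
    # resource cannot be passed off as a signature for another.
    canonical = "\n".join(
        [uri, content_type, hashlib.sha256(body).hexdigest()]
    ).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def make_signature_header(key, uri, content_type, body):
    """Build a hypothetical Signature header value carrying the URI."""
    return 'uri="' + uri + '"; sig="' + _mac(key, uri, content_type, body) + '"'


def parse_params(header_value):
    # Naive parser: assumes parameter values contain no ';' characters.
    params = {}
    for part in header_value.split(";"):
        name, _, value = part.strip().partition("=")
        params[name] = value.strip('"')
    return params


def verified_uri(key, content_type, body, signature_header):
    """Return the URI the signature covers, or None if the signature is bad.

    Uses only the response headers and body as input.
    """
    params = parse_params(signature_header)
    uri = params.get("uri")
    sig = params.get("sig")
    if uri is None or sig is None:
        return None
    expected = _mac(key, uri, content_type, body)
    return uri if hmac.compare_digest(expected, sig) else None


def response_is_acceptable(key, requested_uri, content_type, body, signature_header):
    """Client-side check: valid signature AND signed for the URI we asked for."""
    return verified_uri(key, content_type, body, signature_header) == requested_uri
```

If an attacker answers a request for http://www.example.com/products/x/price with the correctly signed response for http://www.example.com/products/y/price, verified_uri still returns the y URI, so response_is_acceptable rejects it for the x request.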

Although not including the request headers in the signature simplifies things, it must be recognized that it does lose some functionality. When there are multiple variants, the signature can't prove that you've got the right variant. However, I think that's a reasonable tradeoff.  Even if the request headers were signed, sometimes the response depends on things that aren't in the request, like the client's IP address (as indicated by Vary: *). The response can at least indicate that the response is one of several possible variants, by including Content-Location, Content-Language and/or Vary headers amongst the signed response headers.

The signature will also need to include information about the time during which the relationship between the representation and the resource applies.  I haven't figured out exactly how this should work.  It might be a matter of signing some combination of Date, Last-Modified, Expires and Cache-Control (specifically the s-maxage and max-age directives) headers, or it might involve adding timestamp parameters to the Signature header.
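
Just to illustrate the first of these options, here is a Python sketch of a check a client might make if Date and Cache-Control were among the signed headers. It is one possible reading, not a worked-out design: treating the window as running from Date to Date plus max-age is an assumption, the function names are hypothetical, and s-maxage, Expires and Last-Modified are ignored.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def max_age_seconds(cache_control):
    """Extract the max-age directive from a Cache-Control value, or None."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_within_signed_window(date_header, cache_control, now):
    """True if `now` falls between Date and Date + max-age.

    Assumes both header values were covered by a signature that has
    already been verified, that Date carries an explicit zone such as GMT,
    and that `now` is a timezone-aware datetime.
    """
    age = max_age_seconds(cache_control)
    if age is None:
        return False  # no signed lifetime, so nothing can be asserted
    start = parsedate_to_datetime(date_header)
    return start <= now <= start + timedelta(seconds=age)


# Example use with a one-hour lifetime.
date_value = "Mon, 15 Oct 2007 12:00:00 GMT"
half_past = datetime(2007, 10, 15, 12, 30, tzinfo=timezone.utc)
fresh = is_within_signed_window(date_value, "public, max-age=3600", half_past)
```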

To summarize, the signature in the response should assert that a particular entity is a representation of a particular resource at a particular time.

4 comments:

Anonymous said...

It seems that Etag and Signature have the same lifespan. Maybe signing the Etag would serve two purposes: enforcing the same expiry and no signature dependency on date values?

On the "request-uri" response header I see some similarity with Julian's GET-Location draft. I think it addresses similar issues and also could define how to sign responses by making them first level HTTP resources.

Davanum Srinivas (dims) said...

James,

What needs to be signed in a signed HTTP (say POST or PUT) request?

-- dims

Anonymous said...

You make the argument that it is necessary to sign "{URI, representation}" to prevent an active attacker providing a signed response from a different representation.

This seems like a flawed argument. It is trivial for the active attacker to simply insert a 30x redirect to a different resource, regardless of how the final response is signed.

That is the kind of problem which can only be addressed by doing hop-by-hop mandatory MESSAGE integrity.

I think you need to specify exactly what threat model you're trying to address here.

James Clark said...

If I ask for http://www.example.com/foo and the attacker redirects to http://www.example.com/bar, then I will get something back that is signed as being from http://www.example.com/bar.

Obviously, it would be insecure for me to trust the redirection. On the other hand it would be very inconvenient if redirections couldn't be used with signatures.

I think what your observation shows is that it needs to be possible to sign redirects. A redirection of one resource to another is something that makes perfect sense at a resource-level view.

I am afraid I don't understand how you reach the conclusion that hop-by-hop message integrity is what's needed.