Bytes not infosets

Security is the one area where the WS-* world has developed a set of standards that provide significantly more functionality than has so far been standardized in the REST world. I don't believe that this is an inherent limitation of REST; I'm convinced there's an opportunity to standardize better security for the REST world. So I've been giving quite a lot of thought to the issue of what the REST world can learn from WS-Security (and its numerous related standards).

Peter Gutmann has a thought-provoking piece on his web site in which he argues that XML security (i.e. XML-DSig and XML encryption) are fundamentally broken. He argues that the fundamental causes of this brokenness are as follows:

1. XML is an inherently unstable and therefore unsignable data format. XML-DSig attempts to fix this via canonicalization rules, but they don't really work.

2. The use of an "If it isn't XML, it's crap" design approach that led to the rejection of conventional, proven designs in an attempt to prove that XML was more flexible than existing stuff.
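Gutmann's first point is easy to demonstrate. Here is a minimal sketch (the element and attribute names are invented for illustration) showing that two byte sequences representing the same XML infoset hash differently, which is exactly why XML-DSig needs canonicalization in the first place:

```python
import hashlib

# Two serializations of the same infoset: attribute order, quoting
# style, and the empty-element shorthand differ, but a parser would
# report identical content for both.
a = b'<doc a="1" b="2"/>'
b = b"<doc b='2' a='1'></doc>"

# Hashing the raw bytes treats these as entirely different documents.
print(hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest())  # False
```

Any signature computed over one serialization fails to verify against the other, even though no XML-level information has changed.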

He also complains of the difficulty of supporting XML in a general-purpose security toolkit:

It's impossible to create something that's simply a security component that you can plug in wherever you need it, because XML security is inseparable from the underlying XML processing system.

I would suggest that there are two different ways to view XML:

  1. the concrete view: in this view, interchanging XML is all about interchanging sequences of bytes in the concrete syntax defined by XML 1.0
  2. the infoset view: in this view, interchanging XML is all about interchanging abstract structures representing XML infosets; the syntax used to represent the infoset is just a detail to be specified by a binding (the infoset view tends to lead to bindings up the wazoo)

I think each of these views has its place.  The infoset is an invaluable conceptual tool for thinking about XML processing. However, I think there's been an unfortunate tendency in the XML world (and the WS-* world) to overemphasize the infoset view at the expense of the concrete view.  I believe this tendency underlies a lot of the problems that Gutmann complains of.

  • There's nothing unstable or unsignable about an XML document under the concrete view.  It's just a blob of bytes that you can hash and sign as easily as anything else (putting external entities on one side for the moment).
  • The infoset view makes it hard to accommodate non-XML formats as first-class citizens.  If your central data model is the XML infoset, then everything that isn't XML has to get mapped into XML in order to be accommodated. For example, the WS-* world has MTOM. This tends to lead to reinventing XML versions of things just so they can be first-class citizens in an infoset-oriented world.
  • If you look at everything as an infoset, then it starts to look natural to use XML for things that XML isn't all that good at. For example, if your message body is an XML infoset and your message headers are infosets, then it looks like a reasonable choice to use XML as your envelope format to combine your body with your headers. But using XML as a container format like this leads you into all the complexity and inefficiency of XML Security, since you need to be able to sign the things that you put in containers.  It's much simpler to use a container format that works on bytes, like zip or MIME multipart.
  • The infoset view leads to an emphasis on APIs that work with XML using infoset-level abstractions, such as trees or events, rather than working with XML at a more concrete level using byte streams or character streams. Although infoset-level APIs are needed for processing XML, when you use infoset-level APIs for interchanging XML between separate components, I believe you pay a significant price in flexibility and generality. In particular, using infoset-level APIs at trust boundaries seems like a bad idea.
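Under the concrete view, the first bullet above is the whole story: the document is a byte sequence, and signing it needs no XML machinery at all. A minimal sketch, assuming a hypothetical shared key and an invented document, using an HMAC rather than a public-key signature for brevity:

```python
import hashlib
import hmac

# Hypothetical shared secret for illustration only.
key = b"shared-secret"

# Under the concrete view the document is just a blob of bytes;
# we sign the bytes exactly as transmitted, with no canonicalization.
document = b'<invoice id="42"><total>19.99</total></invoice>'
tag = hmac.new(key, document, hashlib.sha256).digest()

# The verifier recomputes the MAC over the same bytes it received.
ok = hmac.compare_digest(tag, hmac.new(key, document, hashlib.sha256).digest())
print(ok)  # True
```

The precondition, of course, is that the bytes are preserved end to end; the moment an intermediary re-serializes the document, you are back in canonicalization territory.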

My conclusion is this: one aspect of the WS-* approach that should not be carried over to the REST world is the emphasis on XML infosets.


Anonymous said...

I've known Peter for years, and his anti-XML rant is a little off; XML c14n works just fine, for example. :)

You are correct that bytes, not infosets, are the way to go -- even in the WS-* world it all comes down to utf8 xml serialization. Last I checked, you can't feed an infoset into SHA1...
-rich $alz

Hal Lockhart said...

The W3C is interested in correcting problems and making improvements to XML Signature and XML Encryption. As a first step a Public Workshop was held on September 25-26. It was open to the general public, not just W3C members. The program, including links to all the presentations and papers is here:

(The Workshop report will be published soon.)

It is too bad you guys did not participate. (Rich, I know IBM was there, but I don't think your opinions were represented.)

First note that I have also dealt with Peter for many years, but I agree with Rich that he is off base on this one. He claims there is no way to implement XML Signature using existing crypto libraries. This is just a plain factual error. Dozens of such implementations exist, work fine and interop with each other.

It is true you cannot do C14N without being aware of XML, but once you have computed the canonical form, the rest of the processing operates directly on the bytes, just like any other signature.
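[Hal's point can be sketched concretely. Using the C14N implementation that ships in Python's standard library (xml.etree.ElementTree.canonicalize, Python 3.8+) on two invented, differently-serialized but equivalent documents: canonicalization is the only XML-aware step, after which hashing operates on plain bytes.]

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # Python 3.8+

# Two serializations that differ in attribute order, quoting, and
# the empty-element shorthand, but carry the same infoset.
a = '<doc b="2" a="1"></doc>'
b = "<doc a='1' b='2'/>"

# C14N normalizes attribute order, quoting, and empty elements,
# producing identical bytes for both inputs.
ca = canonicalize(a).encode("utf-8")
cb = canonicalize(b).encode("utf-8")

# From here on it is ordinary byte hashing, as in any other signature.
print(hashlib.sha256(ca).hexdigest() == hashlib.sha256(cb).hexdigest())  # True
```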

Ok, so if you believe XML is bad and should never be used for anything, I will not argue with you. Perhaps we should get rid of HTML as well since it is essentially a degenerate form of XML. Probably you would be much happier using ASN.1 or something. ;)

However back in the real world where people want to use XML, we security folks have to do something about supporting it. It does no good to simply say that certain requirements are difficult to meet, therefore we will pretend they are not requirements.

In anything but the simplest applications, message contents, including HTML and XML, will be assembled using multiple independent software components. These components will almost inevitably use toolkits which make changes to the text that have no semantic significance. Unless you are willing to accept a high frequency of spurious validation errors, some form of C14N is required.

However, you are completely wrong that XML Signature only signs XML and does not build upon existing signature mechanisms. Signatures can cover any combination of XML and non-XML objects, and there is even the optimization that if a hash has already been computed over something else, it is not necessary to recompute it. This is true even if the original hash used a different algorithm than the one used for the signature.

There is a lot more I could say. (For example, have you looked at the SWA profile of WS-Security?) But I will simply urge you and anyone else interested in this subject to get involved in the W3C work, which will likely start in a few months.

Hal Lockhart

James Clark said...

Just for the record, I didn't say I agreed with everything Peter wrote. I said that I found his piece "thought-provoking".