James Clark's Random Thoughts

2010-02-12

A tour of the open standards used by Google Buzz

The thing I find most attractive about Google Buzz is its stated commitment to open standards:

We believe that the social web works best when it works like the rest of the web — many sites linked together by simple open standards.

So I took a bit of time to look over the standards involved. I’ll focus here on the standards that are new to me.

One key design decision in Google Buzz is that individuals in the social web should be identifiable by email addresses (or at least strings that look like email addresses).  On balance I agree with this decision: although it is perhaps better from a purist Web architecture perspective to use URIs for this, I think email addresses work much better from a UI perspective.

Google Buzz therefore has some standards to address the resulting discovery problem: how to associate metadata with something that looks like an email address. There are two key standards here:

  • XRD. This is a simple XML format developed by the OASIS XRI TC for representing metadata about a resource in a generic way. This looks very reasonable and I am happy to see that it is free of any XRI cruft. It seems quite similar to RDDL.
  • WebFinger. This provides a mechanism for getting from an email address to an XRD file.  It’s a two-step process based on HTTP.  First of all you HTTP get an XRD file from a well-known URI constructed using the domain part of the email address (the well-known URI follows the Defining Well-Known URIs and host-meta Internet Drafts). This per-domain XRD file provides (amongst other things) a URI template that tells you how to construct a URI for an email address in that domain; dereferencing this URI will give you an XRD representation of metadata related to that email address.  There seem to be some noises about a JSON serialization, which makes sense: JSON seems like a good fit for this problem. 
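The two WebFinger steps described above can be sketched without touching the network. This is only an illustration under stated assumptions: the `/.well-known/host-meta` path, the `{uri}` placeholder, the `acct:` prefix and the example template are assumptions rather than details given in this post, and the function names are made up for the sketch.

```python
from urllib.parse import quote


def host_meta_uri(address: str) -> str:
    """Step 1: build the per-domain well-known URI from the domain part."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email-like address: {address!r}")
    # Assumed path; the post only says "a well-known URI".
    return f"https://{domain}/.well-known/host-meta"


def expand_template(template: str, address: str) -> str:
    """Step 2: fill in the per-domain URI template for one address."""
    # Assumption: the template carries a single "{uri}" placeholder that
    # takes the percent-encoded "acct:" form of the address.
    return template.replace("{uri}", quote("acct:" + address, safe=""))


address = "alice@example.com"
# Hypothetical template, standing in for the one found in the per-domain XRD.
template = "https://example.com/describe?uri={uri}"

print(host_meta_uri(address))
print(expand_template(template, address))
```

A real client would fetch the first URI, pull the template out of the returned XRD, and then dereference the expanded URI to get the per-address XRD.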

One of the many interesting things you can do with such a discovery mechanism is to associate a public key with an individual. There’s a spec called Magic Signatures that defines this. Magic Signatures correctly eschews all the usual X.509 cruft, which is completely unnecessary here; all you need is a simple RSA public key. My one quibble would be that it invents its own format for public keys, when there is already a perfectly good standard format for this: the DER encoding of the RSAPublicKey ASN.1 structure (defined by RFC 3447/PKCS#1), as used by e.g. OpenSSL.
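To show how little is involved in the standard format, here is a minimal sketch of DER-encoding an RSAPublicKey, which is just a SEQUENCE of two INTEGERs (modulus, then public exponent). The helper names are my own, and the key is a toy value far too small for real use.

```python
def der_length(n: int) -> bytes:
    """DER length octets: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_integer(value: int) -> bytes:
    """DER INTEGER (tag 0x02) for a non-negative integer."""
    if value < 0:
        raise ValueError("only non-negative integers are handled here")
    # bit_length() // 8 + 1 bytes is the minimal size that keeps the top
    # bit clear, so the value is not misread as negative.
    body = value.to_bytes(value.bit_length() // 8 + 1, "big")
    return b"\x02" + der_length(len(body)) + body


def rsa_public_key_der(modulus: int, exponent: int) -> bytes:
    """RSAPublicKey ::= SEQUENCE { modulus INTEGER, publicExponent INTEGER }"""
    body = der_integer(modulus) + der_integer(exponent)
    return b"\x30" + der_length(len(body)) + body


# Toy key (n = 61 * 53, e = 17), for illustration only.
print(rsa_public_key_der(3233, 17).hex())  # 300702020ca1020111
```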

Note that for this to be secure, WebFinger needs to fetch the XRD files in a secure way, which means either using SSL or signing the XRD file using XML-DSig; in both these cases it is leveraging the existing X.509 infrastructure. The key architectural decision here is to use the X.509 infrastructure to establish trust at the domain level, and then to use Web technologies to extend that chain of trust from the domain to the individual. From a deployment perspective, I think this will work well for things like Gmail and Facebook, where you have many users per domain. The challenge will be to make it work well for things like Google Apps for your Domain, where the number of users per domain may be few. At the moment, Google Apps requires the domain administrator only to set up some DNS records. The problem is that DNS isn’t secure (at least until DNSSEC is widely deployed). Here’s one possible solution: the user’s domain (e.g. jclark.com) would have an SRV record pointing to a host in the provider’s domain (e.g. foo.google.com); the XRD is fetched using HTTP, but is signed using XML-DSig and an X.509 certificate for the user’s domain. The WebFinger service provider (e.g. Google) would take care of issuing these certificates, perhaps with flags to limit their usage to WebFinger (Google already verifies domain control as part of the Google Apps setup process). The trusted roots here might be different from the normal browser vendor determined HTTPS roots.

The other part of Magic Signatures is billed as a simpler alternative to XML-DSig which also works for JSON. The key idea here is to avoid the whole concept of signing an XML information item and thus avoid the need for canonicalization.  Instead you sign a byte sequence, which is encoded in base64 as the content of an XML element (or as a JSON string).  I don’t agree with the idea of always requiring base64 encoding of the content to be signed: that seems to unnecessarily throw away many of the benefits of a textual format.  Instead, when the byte sequence that you are signing is representing a Unicode string, you should be able to represent the Unicode string directly as the content of an XML element or as a JSON string, using the built-in quoting mechanisms of XML (character references/entities and CDATA sections) or JSON. The Unicode string that results from XML or JSON parsing would be UTF-8 encoded before the standard signature algorithm is applied. A more fundamental problem with Magic Signatures is that it loses the key feature of XML-DSig (particularly with enveloped signatures) that applications that don’t know or care about signing can still understand the signed data, simply by ignoring the signature.  I completely sympathize with the desire to avoid the complexity of XML-DSig, but I’m unconvinced that Magic Signatures is the right way to do so. Note that XRD has a dependency on XML-DSig, but it specifies a very limited profile of XML-DSig, which radically reduces the complexity of XML-DSig processing. For JSON, I think i

There are also standards that extend  Atom. The simplest are just content extensions:

  • Atom Activity Extensions provides semantic markup for social networking activities (such as "liking" something or posting something). This makes good sense to me.
  • Media RSS Module provides extensions for dealing with multimedia content. These were originally designed by Yahoo for RSS. I don't yet understand how these interact with existing Atom/AtomPub mechanisms for multimedia (content/@src, link).

There are also protocol extensions:

  • PubSubHubbub provides a scalable way of getting near-realtime updates from an Atom feed. The Atom feed includes a link to a “hub”. An aggregator can then register with the hub to be notified when a feed is updated. When a publisher updates a feed, it pings the hub and the hub then updates all the aggregators that have registered with it. This is intended for server-based aggregators, since the hub uses HTTP POST to notify aggregators.
  • Salmon makes feed aggregation two-way.  Suppose user A uses only social networking site X and user B uses only social networking site Y. If user A wants to network with B, then typically either A has to join Y or B has to join X.  This pushes the world in the direction of having one dominant social network (i.e. Facebook). In the long-term I don’t think this is a good thing.  The above extensions solve part of the problem. X can expose a profile for A that links to an Atom feed, and Y can use this to provide B with information about A. But there’s a problem.  Suppose B wants to comment on one of A’s entries.  How can Y ensure that B’s comment flows back to X, where A can see it?  Note that there may be another user C on another social networking site Z that may want to see B’s comment on A’s entry. The basic idea is simple: the Atom feed for A exposed by X links to a URI to which comments can be posted.  The heavy lifting of Salmon is done by Magic Signatures.  Signing the Atom entries is the key to allowing sites to determine whether to accept comments.
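The PubSubHubbub flow in the first bullet above (register with the hub, publisher pings, hub fans out) can be modelled in memory. This is a sketch of the shape of the interaction only, not of the wire protocol: the class and method names are hypothetical, plain callbacks stand in for the hub's HTTP POST to each aggregator, and the publisher hands the new content straight to the hub rather than the hub fetching the feed.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# An aggregator is modelled as a callback taking (feed_url, new_content).
Callback = Callable[[str, str], None]


class Hub:
    """In-memory stand-in for a hub; no HTTP involved."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, feed_url: str, callback: Callback) -> None:
        """An aggregator registers interest in one feed."""
        self._subscribers[feed_url].append(callback)

    def publish(self, feed_url: str, content: str) -> int:
        """A publisher pings the hub; the hub notifies every subscriber."""
        callbacks = self._subscribers.get(feed_url, [])
        for callback in callbacks:
            callback(feed_url, content)
        return len(callbacks)


hub = Hub()
inbox: List[str] = []
hub.subscribe("https://x.example/feeds/a", lambda url, body: inbox.append(body))
notified = hub.publish("https://x.example/feeds/a", "new entry from A")
print(notified, inbox)
```

The point of the indirection is that the publisher makes one call however many aggregators are listening.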

Google seems to be planning to use the Open Web Foundation (OWF) for some of these standards. Although the OWF’s list of members includes many names that I recognize and respect, I don’t really understand why we need the OWF. It seems very similar to the IETF in its emphasis on individual participation. What was the perceived deficiency in the IETF that motivated the formation of the OWF?

2010-02-06

Mac Day 1

I decided to dip my toe in the Mac world and buy a Mac mini. If I decide to make the switch, I will probably end up getting a fully tricked out MacBook Pro, but I'm not ready for that yet and I want to wait for the expected MacBook Pro refresh.

I've been using it for 24 hours.

Likes

  • The hardware is beautiful. The attention to detail is fantastic. Somebody has taken the time to think about even something as mundane as the power cord (it's less stiff than normal power cords and curls nicely). The whole package exudes quality.
  • It's reassuring to have something Unix-like underneath.
  • Mostly things "just work".
  • The dock is quite pretty and intuitive.
  • Set up was smooth and simple.

Dislikes

  • The menu bar is an abomination. When you have a large screen, it makes no sense to have the menus always at the top left of the screen, which may well be far from the application window.
  • On-screen font rendering seems worse than on Windows. I notice this particularly in Safari. It's tolerable, but the Mac is definitely a step down in quality here.
  • I was surprised how primitive the application install, update and removal experience was. I miss apt-get. Many updates seem to require a restart.
  • I don't like the wired Apple mouse. Although it looks nice, clicking is not as easy as with a cheap, conventional mouse, plus the lead is way too short.

Minor nits

  • How is a new user supposed to find the web browser? The icon is a compass (like the iPhone icon that gives a real compass) and the tooltip says "Safari".
  • A Safari window with tabs looks ugly to me: there's this big band of gray and black at the top of the window.
  • Not convinced DisplayPort has sufficient benefits over HDMI to justify a separate standard.
  • I couldn't find a way of playing a VCD using the standard applications. I ended up downloading VLC, which worked fine.
  • The Magnification preference on the Dock was not on by default, even though it was enabled in the introductory Apple video.

So far I've installed:

  • NeoOffice
  • Adium (didn't work well with MSN, which is the dominant chat system in Thailand, so I will probably remove it)
  • Microsoft Messenger
  • Emacs
  • Blogo, which I am using to write this. Is there a better free equivalent to Windows Live Writer?
  • VLC
  • Skype

I plan to install

  • Xcode
  • iWork

Any other software I should install? Should I be using something other than Safari as my Web browser?


2010-01-02

XML Namespaces

One of my New Year’s resolutions is to blog more.  I don’t expect I’ll have much more success with this than I usually do with my New Year’s resolutions, but at least I can make a  start.

I have been continuing to have a dialog with some folks at Microsoft about M.  This has led me to do a lot of thinking about what is good and bad about the XML family of standards.

The standard I found it hardest to reach a conclusion about was XML Namespaces. On the one hand, the pain that is caused by XML Namespaces seems massively out of proportion to the benefits that they provide. Yet, every step in the process that led to the current situation with XML Namespaces seems reasonable.

  1. We need a way to do distributed extensibility (somebody should be able to choose a name for an element or attribute that won’t conflict with anybody else’s name without having to check with some central naming authority).
  2. The one true way of naming things on the Web is with a URI.
  3. XML is supposed to be human readable/writable so we can’t expect people to put URIs in every element/attribute name, so we need a shorter human-friendly name and a way to bind that to a URI.
  4. Bindings need to nest so that XML Namespace-generating processes can stream, and so that one document can easily be embedded in another.
  5. XML Namespace processing should be layered on top of XML 1.0 processing.
  6. Content and attribute values can contain strings that represent element and attribute names; these strings should be handled uniformly with names that the XML parser recognizes as element and attribute names.

I would claim that the aspect of XML Namespaces that causes pain is the URI/prefix duality: the thing that occurs in the document (the prefix + local name) is not the same as the thing that is semantically significant (the namespace URI + local name).  As soon as you accept this duality, I believe you are doomed to a significant extra layer of complexity.
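A small illustration of the duality, using Python's standard `xml.etree.ElementTree` (the example URIs and documents are made up). Two documents that differ textually in their prefixes denote the same name, while two that look identical apart from the binding denote different names:

```python
import xml.etree.ElementTree as ET

# Same namespace URI bound to different prefixes.
doc_a = '<a:item xmlns:a="http://example.com/ns">x</a:item>'
doc_b = '<b:item xmlns:b="http://example.com/ns">x</b:item>'
# Same prefix as doc_a, but bound to a different namespace URI.
doc_c = '<a:item xmlns:a="http://example.com/other">x</a:item>'


def expanded_name(xml_text: str) -> str:
    """Return the root element's name as '{namespace-uri}local-name'."""
    return ET.fromstring(xml_text).tag


print(expanded_name(doc_a))  # {http://example.com/ns}item
print(expanded_name(doc_a) == expanded_name(doc_b))  # True
print(expanded_name(doc_a) == expanded_name(doc_c))  # False
```

Every layer that wants to compare names has to work with the expanded form, and every layer that wants to show names to a human has to get back to some prefixed form; that round trip is the extra complexity.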

The need for this duality stemmed from the use of URIs for names. As far as I remember, there was actually no discussion in the XML WG on this point when we were doing XML Namespaces: it was treated as axiomatic that URIs were the right thing to use here. But this is where I believe XML Namespaces went wrong.

From a purely practical point of view, the argument for naming namespaces with URIs is that you can do a GET on the URI and get something human- or machine-readable back that tells you about the semantics of the namespace.  I have two responses to this:

  • This is a capability that is occasionally useful, but it’s not that useful.  The utility here is of a completely different order of magnitude compared to the disutility that results from the prefix/URI duality.  Of course, if you are a RDF aficionado, you probably disagree.
  • You can make names resolvable without using URIs. For example, a MIME-type X/Y can be made resolvable by having a convention that it maps to http://www.iana.org/assignments/media-types/X/Y; or, if you have a dotted DNS-style name (e.g. org.example.bar.foo), you can use DNS TXT records to make it resolvable.

From a more theoretical point of view, I think the insistence on URIs for namespaces is paying insufficient attention to the distinction between instances of things and types of things. The Web works as well as it does because there is an extraordinarily large number of instances of things (i.e. Web pages) and a relatively very small number of types of things (i.e. MIME types). Completely different considerations apply to naming instances and naming types: both the scale and the goals are completely different. URIs are the right way to name instances of things on the Web; it doesn’t follow that they are the right way to name types of things.

I also have a (not very well substantiated) feeling that using URIs for namespaces tends to increase coupling between XML documents and their processing.  An example is that people tend to assume that you can determine the XML schema for a document just by looking at the namespace URI of the document element.

What lessons can we draw from this?

For XML, what is done is done.  As far as I can tell, there is zero interest amongst major vendors in cleaning up or simplifying XML. I have only two small suggestions, one for XML language designers and one for XML tool vendors:

  • For XML language designers, think whether it is really necessary to use XML Namespaces. Don’t just mindlessly stick everything in a namespace because everybody else does.  Using namespaces is not without cost. There is no inherent virtue in forcing users to stick xmlns=”…” on the document element.
  • For XML vendors, make sure your tool has good support for documents that don’t use namespaces.  For example, don’t make the namespace URI be the only way to automatically find a schema for a document.

What about future formats?  First, I believe there is a real problem here and a format should define a convention (possibly with some supporting syntax) to solve the problem. Second, a solution that involves a prefix/URI duality is probably not a good approach.

Third, a purely registry-based solution imposes centralization in situations where there’s no need. On the other hand, a purely DNS-based solution puts all extensions on the same level, when in reality from a social perspective extensions are very different: an extension that has been standardized or has a public specification is very different from an ad hoc extension used by a single vendor.  It’s good if a technology encourages cooperation and coordination.

My current thinking is that a blend of registry- and DNS-based approaches would be nice.  For example, you might have something like this:

  • names consist of one or more components separated by dots;
  • usually names consist of a single component, and their meaning is determined contextually;
  • names consisting of multiple components are used for extensions; the initial component must be registered (the registration process can be as lightweight as adding an entry to a wiki, like WHATWG does for HTML5 rel values);
  • there is a well-known URI for each registered initial component;
  • one registered initial component is “dns”: the remaining components are a reversed DNS name (Mark Nottingham had an ID like this for MIME types); there’s some way of resolving such a name into a URI.
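To make the rules in the list above concrete, here is a rough sketch of how a processor might classify such names. Everything specific in it is hypothetical: the registry contents, the registry base URI, and the choice to treat all the components after “dns” as the reversed DNS name are assumptions for illustration, not part of any proposal.

```python
from typing import Optional, Tuple

# Hypothetical registry of initial components and a hypothetical base URI.
REGISTERED = {"dns", "atom"}
REGISTRY_BASE = "https://registry.example.org/names/"


def classify(name: str) -> Tuple[str, Optional[str]]:
    """Classify a dotted name and return (kind, associated URI or DNS name)."""
    parts = name.split(".")
    if any(not part for part in parts):
        raise ValueError(f"empty component in {name!r}")
    if len(parts) == 1:
        # Single component: the meaning is determined by context.
        return ("contextual", None)
    head, rest = parts[0], parts[1:]
    if head not in REGISTERED:
        raise ValueError(f"unregistered initial component {head!r}")
    if head == "dns":
        # Remaining components are a reversed DNS name:
        # org.example.foo -> foo.example.org
        return ("dns", ".".join(reversed(rest)))
    # Well-known URI for a registered initial component.
    return ("registered", REGISTRY_BASE + head)


print(classify("title"))
print(classify("atom.rating"))
print(classify("dns.org.example.foo"))
```

Note that nothing in a document needs to bind a prefix: the name that appears in the text is the name that is compared.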

Some other people’s thinking on this that I’ve found helpful: Mark Nottingham, Jeni Tennison, Tim Bray (and the rest of that xml-dev thread).

2009-03-23

Getting involved with M

I spent last week in Redmond talking to Microsoft about M and Oslo.  The question at the back of my mind before I went was "Does M really have the potential in the long term to be an interesting and useful alternative to XML?".  My tentative answer is yes.  Here's why:

  • Although M, as it is today, is interesting in a number of ways, it is obviously a long way from being a serious alternative to XML (at least for the kinds of application I am interested in). One of my concerns was that I would hear "It's too late to change that". I never did: I was pleasantly surprised that Microsoft are still willing to make fundamental changes to M.
  • Microsoft recognize that M's long-term potential would be severely limited if it was a proprietary, Microsoft-only technology.  I believe they realize that M needs to end up as a genuinely open standard.  They've already made initial steps towards a more open process for M. On the other hand, they don't believe in design by committee.  (And having seen some of the abominations that design by committee can produce, I can certainly sympathise with that.) There's a senior Microsoft guy that gets to make the final call on all design decisions. In other words, it's a benevolent dictator model.  I'm OK with this in principle (although I like it even better when I'm the benevolent dictator).  I think it's worked really well in a number of cases (C# and Python spring to mind).  But obviously it all depends on the qualities of the particular benevolent dictator.  From my interactions so far, he seems like a really smart guy and he's willing to listen.
  • Microsoft is addressing the whole stack.  An alternative to XML needs to provide not only an alternative to XML itself but also alternatives to XSD/RELAX NG and XQuery/XSLT.
  • Microsoft seem to be designing things in a principled way; they are paying attention to the relevant CS theory. For example, ML seems to be a major influence. They are making an effort to produce something clean, elegant, even beautiful, rather than doing just enough to get a product out.
  • Microsoft seem willing to take documents seriously. This is a make or break issue for me, because the kind of data I care about most is documents and M, as it is today, is not useful for documents. This was probably the issue we spent the most time on; I talked a lot about the importance of mixed content. One of the Microsoft team suggested the goal of using M to do the M specification.  I think this sort of dogfooding will be very helpful in ensuring M works well for documents.

Of course, it's early days yet, and it's hard to tell how much leverage M will get, but there's enough potential to make me want to be involved.

2009-01-17

RELAX NG and xml:id

One part of the vision underlying RELAX NG is that validation should not be monolithic: it is not necessary or desirable to have one schema language that can handle every possible kind of validation you might want to do; it is better instead to have multiple specialized languages, each of which does one kind of validation really well. Consistent with this vision, RELAX NG provides only grammar-based validation. There's no implicit claim that other kinds of validation aren't useful and important.

One kind of validation that is clearly useful and important and that can't be done by grammars is checking of cross-references. One possibility is to use Schematron for this. The designers of RELAX NG anticipated that there would be a little schema language specialized to this, which would be created as part of the ISO DSDL effort (as part 6); this wouldn't be a million miles from the kind of thing that XSD provides with xs:key/xs:unique/xs:keyref. Unfortunately this hasn't happened yet.

Since DTDs provide ID/IDREF checking and we wanted people to be able to move easily from DTDs to RELAX NG, we felt we had to provide some transitional support for ID/IDREF checking while awaiting the ultimate "right" solution. We therefore provided a separate, optional spec called RELAX NG DTD Compatibility. Amongst other things, this defines a way in which RELAX NG processors can optionally provide DTD-compatible ID/IDREF checking based on the datatypes of attributes declared in the schema. Note that this can't be handled by the XSD datatypes library for RELAX NG, because assignment of types in the schema to values in the instance is not part of the RELAX NG model of validation.

When defining RELAX NG DTD compatibility, we took a fairly hard line about being DTD-compatible. In particular, we made it a requirement that you should be able to generate a DTD subset from the RELAX NG schema that would perform the same type assignment that the process defined by the spec would perform. This creates some problems when you use DTD Compatibility in conjunction with wildcards (which of course aren't a DTD feature). For example:

start = element doc { p* }
p = element p { id?, any* }
id = attribute id { xsd:ID }
any = element * { attribute * { text }*, (any|text)* }

will get an error about conflicting ID-types for p/@id. This is because the schema allows <p> to contain a <p> element with an id attribute that doesn't have type ID. Instead you would have to write:

start = element doc { p* }
p = element p { id?, any* }
id = attribute id { xsd:ID }
any = element * - p { attribute * { text }*, (any|text)* }

Several years after the DTD compatibility spec was finished, the W3C came out with the xml:id Recommendation. The spec mentions RELAX NG in a non-normative appendix and encourages authors "to declare attributes named xml:id with the type xs:ID". Now on the face of it, this seems pretty reasonable advice. Unfortunately, from the point of view of the RELAX NG DTD Compatibility spec it's precisely the wrong thing to do. For example, this

start = element doc { p* }
p = element p { id?, any* }
id = attribute xml:id { xsd:NCName }
any = element * { attribute * { text}*, (any|text)* }

will work perfectly with RELAX NG with or without DTD compatibility. The XML processor does the xml:id checking, and RELAX NG can ignore ID/IDREFs. But if instead you follow the xml:id Recommendation's suggestion and do:

start = element doc { p* }
p = element p { id?, any* }
id = attribute xml:id { xsd:ID }
any = element * { attribute * { text}*, (any|text)* }

a RELAX NG validator that implements RELAX NG DTD compatibility will give you an error about conflicting ID-types for p/@xml:id. You might think you could do

start = element doc { p* }
p = element p { id?, any* }
id = attribute xml:id { xsd:ID }
any = element * { attribute * - xml:id { text}*, id?, (any|text)* }

but that won't work either, because although you can now write a DTD subset that does equivalent type assignment for p, you can't do it for the other elements.

(The xml:id Recommendation also says in the RELAX NG section that "A document that uses xml:id attributes that have a declared type other than xs:ID will always generate xml:id errors.". I don't see why: the xml:id processor is quite likely to be part of the XML parser, which doesn't know anything about RELAX NG, nor does RELAX NG know anything about xml:id.)

Back when the RELAX NG DTD Compatibility spec came out, I implemented support for the ID/IDREF checking part of DTD Compatibility in Jing. I also decided to make Jing enforce this by default. There's a -i switch to turn it off. Before xml:id came along, this seemed to work OK: if a schema author specifies ID/IDREF in a RELAX NG schema then they usually want ID/IDREFs to be checked and RELAX NG DTD Compatibility was the only thing that could do this checking. With xml:id this no longer works well: if you

  • use xml:id
  • declare xml:id attributes as type xsd:ID in the RELAX NG schema
  • use wildcards in your RELAX NG schema
  • don't use any special options to Jing

you are very likely to get an error from Jing.

At first, my plan was simply to change Jing not to enforce DTD Compatibility by default. However, Alex Brown pointed out that this isn't completely satisfactory: people who are coming from DTDs and aren't using xml:id lose the sensible ID/IDREF checking that they might reasonably expect to happen by default. So now I'm thinking that a better solution might be to add two boolean options to Jing, both of which would be enabled by default.

The first option would be to make it a warning rather than an error if the schema does not use ID/IDREF in a DTD-compatible way. (If the schema is DTD-compatible, then duplicate IDs or IDREFs to non-existent IDs would still be errors.)

The second option would tell Jing to be "xml:id aware". This would have several effects.

  • It would require attributes named xml:id to be declared with type xsd:ID (or with the ID type from the datatype library defined by the DTD compatibility spec). This isn't strictly necessary, but it would seem to minimize confusion and be in keeping with the spirit of the xml:id Recommendation. It's slightly tricky to decide what this means with various unusual RELAX NG wildcards. It is obvious that attribute xml:id { text } is an error. But the following are not all obvious to me:
    • attribute xml:id|id { text }
    • attribute * { text }
    • attribute xml:* { text }
    • attribute *|xml:id { text }
  • When checking whether you can generate an equivalent DTD subset, xml:id attributes would be ignored. In the terms defined by the RELAX NG DTD Compatibility spec, you would ignore xml:id attributes when determining whether the schema is compatible with the ID/IDREF feature.
  • When checking uniqueness of IDs, and when checking IDREFs, an attribute named xml:id would always be treated as an ID attribute.

It might also be a good idea to revise the RELAX NG DTD compatibility spec to be xml:id aware in this way.
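The last effect (always treating xml:id as an ID when checking uniqueness and IDREFs) can be sketched as follows. This is not how Jing works internally: it is a standalone illustration in Python, and it assumes the other ID and IDREF attributes are simply handed in as sets of attribute names, where a real validator would derive the type assignment from the schema per element and attribute.

```python
import xml.etree.ElementTree as ET
from typing import FrozenSet, List

# ElementTree's expanded name for the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def check_ids(
    xml_text: str,
    id_attrs: FrozenSet[str] = frozenset(),
    idref_attrs: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Report duplicate IDs and IDREFs that point at no ID."""
    seen = set()
    refs = []
    errors = []
    for elem in ET.fromstring(xml_text).iter():
        for name, value in elem.attrib.items():
            # xml:id counts as an ID whatever the schema says about it.
            if name == XML_ID or name in id_attrs:
                if value in seen:
                    errors.append(f"duplicate ID {value!r}")
                seen.add(value)
            elif name in idref_attrs:
                refs.append(value)
    # IDREFs are checked last so that forward references are allowed.
    errors.extend(f"IDREF to missing ID {ref!r}" for ref in refs if ref not in seen)
    return errors


doc = '<doc><p xml:id="a"/><p xml:id="a"/><q ref="b"/></doc>'
print(check_ids(doc, idref_attrs=frozenset({"ref"})))
```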
