2007-12-09

HTTPbis

Mark Nottingham explains the work being done in the IETF to revise HTTP. It sounds to me like they're doing exactly the right thing, focusing on producing a better spec that brings light to some of the darker corners of the protocol and reduces the gap between what the spec says and what you actually need to implement to achieve interoperability. It's good to see that capable people have stepped up to put in the not inconsiderable time and effort that's needed for this unglamorous but very useful work.

2007-12-07

Thai personal names

There's an election coming up in Thailand on December 23rd and the streets are lined with election posters.  As a bit of an i18n geek, I find it interesting that the posters almost all make the candidates' first names at least twice as big as their last names.  If you're also an i18n geek, your reaction might well be: "it must be because Thais write their family name first, followed by their given name". But you would be wrong.  Thais have a given name and a family name; the given name is written first, and the family name last.

The correct explanation is that given names play a role in Thai culture that is similar to the role that family names play in many Western cultures. The polite way to address somebody is with an honorific followed by their given name. The Thai telephone book is sorted with given names as the primary key and family names as the secondary key.
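To make the telephone book ordering concrete, here is a minimal sketch in Python. The Person type and the given names are made up for illustration (the family names are borrowed from the news sample further down), and it compares romanized strings by code point; a real application would need locale-aware collation, particularly for names in Thai script.

```python
from typing import List, NamedTuple


class Person(NamedTuple):
    """Hypothetical record: one given name and one family name."""
    given: str
    family: str


def phone_book_order(people: List[Person]) -> List[Person]:
    # Given name is the primary key; family name only breaks ties.
    return sorted(people, key=lambda p: (p.given, p.family))


people = [
    Person("Somchai", "Tantiwittayapitak"),
    Person("Anan", "Leophairatana"),
    Person("Somchai", "Boonyaratkalin"),
]

for person in phone_book_order(people):
    print(person.given, person.family)
```

A Western-style phone book would swap the two elements of the sort key.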

(I have to say that this has led me to question what I perceive to be the i18n orthodoxy that it's more i18n-ly correct to talk of given name/family name than first name/last name. Why does it matter whether a name is a family name or a given name? Surely what matters is the cultural role that the name plays.)

I guess that historically the main reason for the dominance of given names in Thai culture is that family names are a relatively recent innovation: they were introduced by King Rama VI towards the beginning of the 20th century. Family names were allocated to families systematically and the use of family names is still controlled by the government. Any two people in Thailand with the same family name are related. This leads to Thai family names being quite a mouthful.  Here's a sample from people in the news over the past couple of days: Leophairatana, Tantiwittayapitak, Boonyaratkalin. Even Thais have difficulty remembering each other's family names.

If you become a Thai citizen, you have to choose a new, unused family name.  Just as with domain names, all the good, short names have gone. So the more recently your family has become Thai, the longer and more unwieldy your family name is likely to be.

Thai given names usually have at least two or three syllables. There aren't any given names that are as commonly used in Thai culture as the most popular given names in Western cultures.  I've never come across a situation where two living Thais share the same given name and family name. You would certainly never get the situation of hundreds of people having the same given name and family name (like "James Clark").

Thais rarely use the First.Last@domain convention for email.  It would be too unwieldy. The conventions I've seen most often are First.La@domain and First.L@domain (i.e. use only the first one or two characters of the last name).
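The two conventions are easy to express in code. This is only a sketch: the function name and the family_chars parameter are my own invention, and it assumes romanized names that are already safe to use in an email address.

```python
def email_local_part(given: str, family: str, family_chars: int = 2) -> str:
    """Build 'First.La' (family_chars=2) or 'First.L' (family_chars=1)."""
    if family_chars < 1:
        raise ValueError("family_chars must be at least 1")
    # Keep the whole given name; keep only the start of the family name.
    return f"{given}.{family[:family_chars]}"


print(email_local_part("Somchai", "Leophairatana"))     # First.La style
print(email_local_part("Somchai", "Leophairatana", 1))  # First.L style
```

Since it is so rare for two people to share both names, the truncated family name mostly serves to disambiguate people with the same given name inside one organization.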

Another i18n wrinkle is that Thais' official given and family names are in Thai script, not in Roman script. But in many situations Thais use romanized versions of their names.  And while there is a standard way (actually several standard ways) of romanizing Thai, the convention is that the correct romanization of any personal name is what the holder of the name wishes it to be. (Thus, your application may need to store two versions of names: the Thai script version and the romanized version.)
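One way to store both versions is sketched below. The ScriptedName type and its field names are hypothetical. The point it illustrates is that the romanized form is stored as supplied by the holder of the name and is never computed from the Thai form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScriptedName:
    """A single name held in Thai script, plus an optional romanization."""
    thai: str
    # The holder's own preferred spelling; deliberately not derived
    # from `thai` by any romanization scheme.
    roman: Optional[str] = None

    def display(self, prefer_roman: bool) -> str:
        # Fall back to the Thai form when no romanization was supplied.
        if prefer_roman and self.roman:
            return self.roman
        return self.thai


name = ScriptedName(thai="สมชาย", roman="Somchai")
print(name.display(prefer_roman=True))
print(name.display(prefer_roman=False))
```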

With honorifics, I think the nastiest gotcha from an i18n perspective is that, while the given and family name are conventionally written separated by a space, there is no separator between the honorific and the given name. (Words in Thai are normally not separated by spaces.) This applies only in Thai script.  When romanized, you would need a space between the honorific and the given name.
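Here is a small sketch of that gotcha. The function and its thai_script flag are invented for illustration, and the caller is assumed to know which script the strings are in. The example uses the common polite honorific "khun".

```python
def with_honorific(honorific: str, given: str, thai_script: bool) -> str:
    """Join an honorific to a given name using the script's convention."""
    # Thai script: no separator. Romanized: a single space.
    separator = "" if thai_script else " "
    return f"{honorific}{separator}{given}"


print(with_honorific("คุณ", "สมชาย", thai_script=True))
print(with_honorific("Khun", "Somchai", thai_script=False))
```

The awkward consequence is that code going the other way, splitting a stored Thai-script string back into honorific and given name, cannot just split on whitespace.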

Since given names are used in Thai culture somewhat like family names are used in some Western cultures, you might be wondering what serves the role that given names serve in Western cultures.  All Thais have a name referred to as a "chue len". This is typically translated as "nickname", but it has a more important role in Thai culture than a nickname does in Western culture.  I think it would be more accurate to describe it as an "informal given name". Parents give each of their children a chue len, in addition to a formal given name.  You would typically use a chue len to address somebody in contexts where in England you might use their first name.

Whereas formal given names are restricted to names that the bureaucrats of the interior ministry deem appropriate, parents can and do follow their personal whims when it comes to the chue len. For example, a former employee of mine was called "Mote", which was abbreviated from "remote", as in TV remote control. (This illustrates another interesting aspect of Thai culture: words are commonly shortened by omitting all except the last syllable. For example, a kilo is often referred to as a "lo".)

In perhaps 80% of cases the chue len is a single syllable. It's often very difficult to romanize these.  Thai has tones as well as one of the richest collections of vowels of any language. Most romanization schemes don't preserve subtle differences in tones and vowels.  Whereas this is workable with formal given names and family names, which usually have many syllables and some redundancy, if you don't get the vowel or tone of a chue len exactly right, it becomes another name. For example, another of my employees has a name that sounds like the second syllable of the word "apple", but with the "l" changed to an "n", and pronounced in an emphatic (falling) tone. I can write that sound unambiguously in Thai, but I've no idea how to write it in English.

Occasionally the chue len is a shortened version of the given name, but more often it is completely unrelated.  If you know somebody only in a relatively informal social context, it is quite likely that you will know only their chue len and not their formal given name or family name.

I think it would be quite challenging to design an address book application that deals with all this naturally.  No application I've used does a good job and indeed it's not immediately obvious to me what the right approach to handling this is.  (However, I suspect an approach based on adding markup to the display name will work better than trying to figure out a set of database fields.)
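To give a rough idea of what I mean by markup on the display name, here is one possible sketch; I haven't tried it in a real application. The display name is stored exactly as it should be shown, as a sequence of spans, and some spans carry a role. The Span type and the role names ("honorific", "given", "family", "informal") are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class Span(NamedTuple):
    """A piece of a display name; role is None for separators and the like."""
    text: str
    role: Optional[str] = None


def display(spans: List[Span]) -> str:
    # The display form is just the spans concatenated, so spacing
    # (or its absence, as in Thai script) is preserved as entered.
    return "".join(span.text for span in spans)


def part(spans: List[Span], role: str) -> Optional[str]:
    # Return the first span marked with the requested role, if any.
    for span in spans:
        if span.role == role:
            return span.text
    return None


formal = [
    Span("Khun", "honorific"), Span(" "),
    Span("Somchai", "given"), Span(" "),
    Span("Leophairatana", "family"),
]
casual = [Span("Mote", "informal")]  # known only by chue len

print(display(formal), "/", part(formal, "given"))
print(display(casual), "/", part(casual, "family"))
```

Nothing here forces every contact to have a given name or a family name, which is what a fixed set of database fields tends to do.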

Of course, it becomes even more difficult if you want to deal with complexities that arise in other cultures. I'm sure that just as personal names in Thai culture have some features that are surprising from a Western perspective, there must be many other cultures where personal names have equally surprising features.  I would love to learn more about these. If anybody can blog or comment with additional information, that would be great.

(Any Thais reading this, please feel free to add comments correcting anything I've got wrong or adding any important points I've missed.)