This is part two of our writeup and perspective on IDNs, especially double byte Japanese domain names. See Part One (where we went into some detail explaining the history of the technology surrounding IDNs) before reading on. Without going into too much more history or spouting much more technical jargon, let’s explore some day to day aspects of double byte domains, how they are (or rather aren’t) used, and why they just aren’t the glowing future of Japanese internet real estate that they may initially seem to be.
As explained in part one, IDNs rely completely on your PC’s local Internet browser software to convert a Japanese domain name into a real single byte domain name. In other words, アニメ.com (anime.com) is not actually アニメ.com at all, it is xn--cck5dwc.com. Yes, that’s right, it gets converted into our beloved combination of 37 Latin characters, even before it leaves your computer! If you are not using the latest internet browser software, then sorry, you are left at the front gate, because only the latest versions translate アニメ.com properly for you… that is, unless you are one of the lucky few among us who are more comfortable with remembering xn--1sqt31d.com as a homepage address! (Take a look at the site 価格.jp. Strangely, they display their domain name in Japanese AND the real domain name – www.xn--1sqt31d.jp – in the title of their page!)
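To make that conversion concrete, here is a minimal sketch using the "idna" codec from Python's standard library. This is only an illustration: browsers ship their own implementations, the Python codec follows the older IDNA 2003 rules, and the helper names to_ascii and to_unicode are made up for this example.

```python
# Convert an internationalized domain name to the ASCII ("xn--") form
# that is actually sent over the network, and back again.
# Assumption: Python's built-in "idna" codec stands in for whatever
# conversion code a given browser uses.

def to_ascii(domain: str) -> str:
    """Return the single byte form of a (possibly Japanese) domain name."""
    return domain.encode("idna").decode("ascii")


def to_unicode(ascii_domain: str) -> str:
    """Return the human-readable form of an "xn--" style domain name."""
    return ascii_domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    # Only the ASCII forms are printed, so this runs in any terminal.
    for name in ("アニメ.com", "価格.jp", "kakaku.com"):
        print(to_ascii(name))
```

A plain romaji name such as kakaku.com passes through unchanged, which is why the conversion step only ever matters for the double byte names.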
On the other side of the “internet tubes”, at the destination of your command (or message), it is servers that form the foundations of internet infrastructure. We are all working on servers whether we realise it or not. 70% of servers on the internet run unix. You are using unix now when you read this homepage, and are likely to be when you check your email, or stream some (raunchy?) video down through bittorrent. Those who have ever dabbled in setting up their own unix internet server (or Windows server for that matter!) know that one of the first things you have to input during initial setup is the server’s domain name. A computer has to know who it is before others can talk to it, and before it can send out information to you. Servers will not accept double byte characters as their own name. They, too, will only accept the hallowed 37 characters (26 letters, 10 digits and the hyphen). There is a detailed but cumbersome JPRS guide to setting up a 日本語.jp domain name on your web server here (PDF).
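As a rough sketch of what “only the 37 characters” means in practice, the check below accepts a name only if every dot-separated part is made of letters, digits and hyphens. The function name is_valid_hostname is hypothetical, and the length limits (63 characters per part, 253 overall) are the usual DNS limits rather than anything specific to one server product.

```python
import re

# One dot-separated part of a hostname: 1 to 63 characters drawn from
# A-Z, a-z, 0-9 and "-", neither starting nor ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(name: str) -> bool:
    """Return True if name uses only the 37 permitted characters."""
    if not name or len(name) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))


if __name__ == "__main__":
    print(is_valid_hostname("kakaku.com"))      # True
    print(is_valid_hostname("xn--1sqt31d.jp"))  # True
    print(is_valid_hostname("価格.jp"))          # False
```

The converted “xn--” name passes because it is itself built from nothing but those 37 characters, while the kanji form is rejected.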
So, let’s get this straight – your computer doesn’t actually look for internet servers using double byte domain names; it converts them to single byte first (if you are lucky – this person commenting on part one found one example where the conversion doesn’t go quite as expected), and servers won’t recognize their “internationalized nickname” except in its garbled 37 character state. But Japanese people like using their own language on computers, right? Well, it seems not, at least when it comes to typing domain names.
The point being missed here is that Japanese people simply don’t use kanji domain names. Internationalized Domain Names, or “IDNs”, have now been around for just over 4 years. Kakaku.com (a huge electronics price comparison site) was one of the first to stake its claim by snapping up 価格.com (actually xn--1sqt31d.com) in the fear that someone would take it from them. But the vast majority of Japanese computer users kept using the “romaji” – kakaku.com. Why? Well, in Japan, apart from the dying breed of post baby boomers with their yokomoji allergies, young people (the majority of computer users) here are actually extremely comfortable with typing in domain names in romaji – as opposed to the Chinese, who feel that using their own double-byte language is easier.
An interesting article on whether or not Japanese people actually want kanji IDNs found at Japaninc notes:
…Japanese who don’t use the Internet don’t do so because they are not familiar with computers, not because they are unfamiliar with English. The Internet itself is already Japanese capable — software menus and commands have long since been localized, and a universe of online information is already available in Japanese. You don’t really have to know English to use the Internet, you just need to know how to type. In fact, entering words in Japanese is often more complicated than entering them in English…
Although the article itself is not so new, the point remains. People in Japan who are not net savvy, generally the elderly and young kids, are that way because they do not understand the concept of the Internet, and computers in general, not because they cannot tackle the 26 characters of the alphabet.
For many Japanese who do consider themselves Internet capable, the factor that holds them back from IDNs is the henkan, or conversion factor. To actually get to 価格.com using its IDN, you have to type the following key strokes into your browser’s address bar:
1. “kakaku” (typed with the input method switched to Japanese)
2. SPACE (to convert “kakaku” into 価格)
3. ENTER (to tell the computer that 価格 is actually the correct kanji out of several possibilities. Thank god 価格 was the first choice.)
4. HANKAKU/ZENKAKU key (to convert input method back into romaji to enter the single byte TLD, the .com part)
5. and then finally the “.com” on the end.
You end up tapping the keyboard that many more times to get to exactly the same site, not to mention that your computer then has to translate what you typed back into a convoluted single byte web address! Less tapping is better and surely more user friendly – romaji here is clearly the better choice.
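The key presses above can be tallied with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for this example, and it assumes one extra press of the HANKAKU/ZENKAKU key at the start to switch into Japanese input before typing the reading.

```python
def romaji_keystrokes(address: str) -> int:
    """One key press per character of the plain single byte address."""
    return len(address)


def idn_keystrokes(reading: str, tld: str,
                   space_presses: int = 1, switch_in: bool = True) -> int:
    """Count key presses needed to enter a kanji domain name."""
    keys = 0
    if switch_in:
        keys += 1             # HANKAKU/ZENKAKU into Japanese input (assumed)
    keys += len(reading)      # type the reading, e.g. "kakaku"
    keys += space_presses     # SPACE until the right kanji appears
    keys += 1                 # ENTER to confirm the kanji
    keys += 1                 # HANKAKU/ZENKAKU back to single byte input
    keys += len(tld)          # type the TLD, e.g. ".com"
    return keys


if __name__ == "__main__":
    print(romaji_keystrokes("kakaku.com"))   # 10
    print(idn_keystrokes("kakaku", ".com"))  # 14
```

The space_presses parameter is there because the right kanji is not always the first candidate offered; every extra press of the space bar adds one to the total.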
The above example using 価格.com is an easy one though. Imagine if you owned 孝治.com (孝治 is read koji, a common name in Japan). Step 2 above took thirteen taps of the space bar for me to find the correct kanji characters. There are tens of thousands of Japanese words with the same reading, but different characters.
What about non-Japanese people, living outside of Japan? In Western countries the hurdle is not just having to tap the keyboard excessively, or a browser compatibility issue. It is a Japanese language input issue. Double byte characters cannot be typed without an input method editor (IME), which needs to be installed (it is not automatically included with English Windows). Even when you do have IDN support in IE6 (yes, IE6 also sketchily supports kanji domain names in my tests), IE7, or maybe Firefox, if you are unable to TYPE Japanese, which is the case on most English Windows machines, then you can’t type double byte characters and therefore you are blocked from accessing IDN websites… If you are a domain owner, it really doesn’t seem to be the brightest thing to do for your international users. This is probably a no-brainer anyway, as most smart companies would make sure they have the romaji version of the site (kakaku.com) registered before they embarked on trying to woo the yokomoji-phobic generation with its double byte cousin (価格.com), or there would be no point, right? Who in their right mind would try to build a business with 会社.com without also owning kaisha.com, risking losing a large percentage of their business to Japanese users who are going to enter it in romaji first anyway?
There is only one point that should be considered here: owning 会社.com along with kaisha.com would likely increase the SEO value of your content in the eyes of search engines, as the big names like google.com and baidu.com do appear to be indexing the IDNs as double byte words. The Google search results in the picture to the left clearly show Japanese domain name URLs in Japanese script. Whether this is beneficial for SEO (Search Engine Optimisation) remains a mystery however, with a Google search for 新宿駅 (“shinjuku station”) showing 7 romaji domain names above 新宿駅.com in my browser.
Tim Romero, a Tokyo-based serial entrepreneur, however, has the final say on this topic. He wrote this piece about the dirtier side of these so-called Internationalized Domain Names way back in Feb 2001, only months after double byte domain names became widely available for general registration. In hindsight, how very right he was. The bottom line – which is something that cannot be ignored – is that Internationalized Domain Names are another means for the huge domain registries like ICANN (that sell domain names to companies which in turn sell them to you and me) to play on the fear of companies and individuals in order to line their own pockets with cash. They charge what the market can bear, and will sell you anything they can. .jp domain names, for example, cost around 10 times more than .com ones, for no particular reason. Internationalized Domain Names are superficial eye candy, patched on top of a solid Internet foundation to lure short sighted consumers, with the proverbial corporate noose around their necks. I will leave you with Tim’s full article. Despite being old, it really is a great read considering how right his predictions turned out to be:
Last December domain registrars the world over began accepting registrations for double-byte domain names. Until now, domain names have been restricted to standard ASCII characters, but the new domains allow names in Japanese, Korean, and Traditional and Simplified Chinese characters as well.
Proponents claim that double-byte domain names will help break down language barriers, increase the amount of multilingual content on the Internet, and make it easier for non-English speakers to use the Net. I admit that I am rather cynical by nature, but as far as I can tell, these new domains serve only to line the pockets of the domain registrars and provide no substantive benefit to the Internet community.
Contrary to the claims of some proponents, restricting domain names to ASCII letters, numbers and a few punctuation marks is not a linguistic or cultural issue. It is simply a way of ensuring interoperability. Just about every international standard in existence, from ISO country designations to airport call letters, restricts itself to similar characters.
The characters in question can be entered using almost any computer system on the planet and, as such, they represent a least common denominator. The characters do, of course, come from the English alphabet, but you would be hard pressed to convince any Japanese that “Mitsubishi” is not a Japanese word, or that entering www.mitsubishi.com into a browser is anything but trivial. In fact, to enter Mitsubishi’s double-byte domain requires that it first be entered in ASCII letters and then converted to Japanese characters. The new domains are actually harder to use in that sense.
The winners here are not non-English-speaking Internet users, but the domain name registrars. Companies who held ASCII domains are now forced to register multiple double-byte variants to protect their brands. As a result, the registration of these new domains has proceeded with all the calm and order of a gold rush.
Over a million domains were registered in the first month at prices ranging from $35 to $100/year. Keep in mind that these figures are not a one-time charge. The domain holders will have to pay a like amount each and every year to maintain the rights to those domains. And, if that were not enough guaranteed annual revenue, Network Solutions has announced that it will soon be accepting domain registrations in Portuguese, Spanish and Arabic. More languages will follow whenever they feel the need to add a few hundred million dollars in recurring revenue to their bottom line.
It’s rather tempting to shrug this whole thing off as one big corporation squeezing other big corporations for relatively paltry sums. However, there is a bigger issue here. We might be seeing the beginning of the Balkanization of the Internet.
Removing the least common denominator requirement effectively partitions off portions of the Internet. I can’t even give examples of the new domain names because most of my readers will be unable to display them, let alone visit them. Sites using a double-byte domain will be effectively unreachable by the majority of Internet users.
It can be argued that this is not a problem since a web site with a Japanese domain name will be in Japanese, but current trends speak against that claim. More and more sites are multilingual. I suppose a different domain could be used for each language supported, but I fail to see any advantage in such a scheme.
The Internet, however, is more than the Web, and it certainly seems likely that employees of a Japanese company with a double-byte domain name will need to communicate with someone whose computer does not support Japanese. Likewise, there will be those outside Japan who will want to download a file from a Japanese FTP server. Double-byte domains will make this difficult. Fortunately, for the moment, the new domain names do not work with email or FTP.
The most amazing aspect of the Internet, that from which all else springs, is ease and freedom of communication: the ability for a person in Minsk to communicate with someone in Osaka, Dallas, Seoul or Johannesburg. Double-byte domain names hinder this ability since they can only be entered by computers running a specific language. Extensive use of these new domains will effectively prevent communication between individuals who find themselves behind the walls of their national domain name schemes. Hardly the World-Wide Web we have come to know.
Any questions or comments are welcome below. I am especially looking forward to people on the other side of the Japanese IDN fence telling me why they think I am wrong. Apart from the potential SEO value of IDNs (mentioned in the article), what else do Japanese kanji domain names have going for them? Can they be used in email addresses yet? I couldn’t test this, as my email client (Thunderbird) told me that the address I entered was invalid when I tried to send a mail to info@ヒルズ族.com to get some pricing information.