Japanese writing

A large part of the difficulty of learning Japanese has to do with learning its writing system. Put simply, it is a lovely mess.

Kanjis (漢字)

There is no original Japanese writing system. Instead, around the 6th century, the Japanese adapted the Chinese characters to their own language. These borrowed characters are called kanjis (漢字), which means ‘Han characters’, the Han being the Chinese people. Kanjis represent ideas, e.g., 人 represents a man (it looks like a standing man), while 川 represents a river (it looks like a stream). Hence, kanjis are ideograms. Kanjis are used to write nouns, the stems of verbs and adjectives, and adverbs.

日本語: ni-hon-go (the Japanese language)

In the image, the first kanji, 日 (ni), stands for ‘sun’; the second one, 本 (hon), means ‘origin’; and the third one, 語 (go), means ‘language’. 日本 (nihon) means the place where the sun originates (or rises), i.e., Japan, so 日本語 (nihongo) is ‘the language of Japan’, i.e., the Japanese language.

Many kanjis are difficult to tell apart, most kanjis are difficult to write, and there are many kanjis; to be able to read a newspaper, we need to learn about 2,000 kanjis, but there are thousands more, around 40,000-50,000. Daunting as all of this sounds, the number of kanjis is not the main difficulty of learning them. What makes them challenging is that their meaning and sound change depending on the context. For example, 日, the kanji for ‘sun’, is read ‘ni’ when we use it to write 日本 (nihon, Japan); however, 日 appears twice in the word 日曜日 (nichi-you-bi, Sunday), and it is not read ‘ni’ in either place: the first time it is read ‘nichi’ and the second time ‘bi’. In other contexts 日 will have other meanings and sounds. These changes are not just simple variations in pronunciation but completely different sounds.

Part of the reason for these variations is that some pronunciations come from the original Japanese sound of the word, while others come from the Chinese sound of the word. The ‘native reading’ of a kanji, or kun-yomi, is the Japanese sound of the thing that the character represents, while its ‘sound reading’, or on-yomi, is the Chinese sound. For example, ‘person’ in Japanese is ‘hito’, and in Chinese is ‘ren’; hence, the kanji for person, 人, has the kun-yomi ‘hito’, and the on-yomis ‘jin’ and ‘nin’, which sound like ‘ren’ [jisho.com]. We usually use kun-yomis when the word is by itself, e.g., ‘ano hito’ (あの人 – that person), and on-yomis when the kanji is part of a compound word, e.g., ‘amerika-jin’ (アメリカ人 – American person), or ‘san-nin’ (三人 – three people). This is similar to English having words from different origins meaning the same thing, e.g., ‘water’ (Germanic), ‘aqua’ (Latin), and ‘hydro’ (Greek) all mean water, but in different contexts, and ‘aqua’ and ‘hydro’ only appear as parts of compound words.

Another reason for a change in pronunciation is to make the sound ‘get along’ with the sounds that precede it or follow it. For example, 本, hon, which means ‘origin’ when we write Japan, is also used as the counter for long thin objects (like bottles, or cigarettes), but depending on the number of items it is pronounced ‘hon’, ‘ppon’, or ‘bon’, e.g., ‘ippon’ (一本) might be one bottle, ‘nihon’ (二本) two bottles, and ‘sanbon’ (三本) three bottles. Notice that now we have two ways to write ‘nihon’: 日本, meaning ‘Japan’, and 二本, meaning ‘two long thin objects’. Hence, we really cannot write the word ‘nihon’ using kanjis unless we know the context in which we are writing it.
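The ambiguity of ‘nihon’ can be sketched as a small lookup. This is only an illustration: the word list below contains just the four spellings mentioned above, and the names WORDS and spellings_by_reading are made up for the example.

```python
from collections import defaultdict

# Kanji spelling -> reading, using only the examples from the text.
WORDS = {
    "日本": "nihon",   # Japan
    "一本": "ippon",   # one long thin object
    "二本": "nihon",   # two long thin objects
    "三本": "sanbon",  # three long thin objects
}

def spellings_by_reading(words):
    """Invert the table: reading -> list of kanji spellings with that reading."""
    index = defaultdict(list)
    for kanji, reading in words.items():
        index[reading].append(kanji)
    return dict(index)

INDEX = spellings_by_reading(WORDS)

# One reading, two possible spellings: only context can pick between them.
print(INDEX["nihon"])
print(INDEX["sanbon"])
```

Going from kanji to reading is a plain lookup here, but going from the reading back to kanji returns a list, which is the point of the paragraph above.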

This variation of the meaning and sound of a kanji depending on the context in which it appears is what makes kanjis really challenging to learn. In essence, we have to memorize each different kanji in each different context in which it may appear.

Hiragana (ひらがな)

The kanjis were (and are) difficult to learn, and only the very educated people were able to read and write them. However, the Japanese language has only about 100 sounds, so someone figured out that we could take kanjis that ‘sounded’ like one of the native Japanese sounds, and make a new writing system with simplified versions of those few kanjis. This was the origin of hiragana.

にほんご: ni-ho-n-go (the Japanese language)

Hiragana characters are curvilinear. Unlike the English alphabet, which encodes individual sounds (e.g., ‘n’ and ‘i’), hiragana characters encode syllables (e.g., ‘ni’), so hiragana is not an alphabet but a syllabary. The only sounds that are encoded by themselves are the five vowels – a, i, u, e, o (as they are ordered in Japanese), and the ‘n’.

We use hiragana to write all the particles (things like ‘to’, ‘from’, ‘with’, etc.) and the conjugations of verbs and adjectives. We also use it to write many words of frequent use, like greetings. However, hiragana covers every sound of the Japanese language so, in theory, there is no longer a need for kanjis. Still, kanjis remain, in part because of the deep respect that the Japanese people have for tradition and heritage.

Katakana (カタカナ)

ニホンゴ: ni-ho-n-go (the Japanese language)

A second syllabary, called katakana, is mostly a mirror image of hiragana so, in theory, either syllabary could be used to write anything in Japanese. These two syllabaries, hiragana and katakana, are together called ‘the kanas’. Unlike the curvilinear characters of hiragana, katakana characters tend to be angular.
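The one-to-one correspondence between the two syllabaries is also visible in how computers store them. In Unicode, the basic hiragana occupy U+3041 to U+3096 and each katakana counterpart sits exactly 0x60 code points later, so a rough conversion is a fixed shift. The function name below is made up for this sketch, and the shift is a property of the Unicode layout, not of the Japanese language.

```python
HIRAGANA_START = 0x3041  # ぁ
HIRAGANA_END = 0x3096    # ゖ
KANA_OFFSET = 0x60       # distance from a hiragana to its katakana counterpart

def hiragana_to_katakana(text):
    """Shift every basic hiragana character to katakana; leave the rest alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if HIRAGANA_START <= cp <= HIRAGANA_END:
            out.append(chr(cp + KANA_OFFSET))
        else:
            # Kanji, punctuation, the long-vowel mark ー, etc. pass through.
            out.append(ch)
    return "".join(out)

print(hiragana_to_katakana("にほんご"))  # ニホンゴ
```

The same function leaves kanji untouched, so hiragana_to_katakana("日本語") returns the input unchanged.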

Some katakana characters resemble their hiragana counterparts because both might have been derived from the same kanji:

sound hiragana katakana

However, other characters of the syllabaries were derived from different kanjis and don’t resemble each other:

sound hiragana katakana

We use katakana to write

  • foreign words, e.g., ramen (of Chinese origin) is ラーメン (ra-a-me-n)
  • onomatopoeias, e.g., the meow of a cat is ニャーニャー (nya-a-nya-a)
  • mimetic words, e.g., the sound of a heart pounding is ドキドキ (do-ki-do-ki)
  • scientific names, e.g., a quasar is クエーサー (ku-e-e-sa-a)
  • and to add emphasis

Romaji (ローマ字)

nihongo (the Japanese language)

People who are learning Japanese sometimes write Japanese words using the Roman alphabet, e.g., we would write ‘nihongo’ (the Japanese language) instead of にほんご, or 日本語; this use of the Roman alphabet to write Japanese sounds is called ‘romaji’, lit. ‘Roman character’.

Romaji is a reencoding of the way that the word is written in hiragana or katakana. For example, Tokyo is written in hiragana as とうきょう, so the romaji version of Tokyo is the syllable-by-syllable transliteration of とうきょう, i.e., to-u-kyo-u. To get the correct pronunciation of the word, i.e., to-o-kyo-o, we also have to know the rule that in Japanese an ‘o’ followed by a ‘u’ means that the ‘o’ has to be doubled, i.e., ‘ou’ is pronounced ‘oo’. Still, the use of Roman characters in romaji goes a long way to indicate how a Japanese word would sound.
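The two steps described above, reencoding the kana and then applying the ‘ou’ rule, can be sketched as follows. The table is deliberately tiny and only covers the kana needed for the examples in this article; the function names are invented for the illustration, and the rule is applied exactly as stated here, without the exceptions that real Japanese has.

```python
# Hiragana -> romaji, limited to the syllables used in this article.
KANA_TO_ROMAJI = {
    "と": "to", "う": "u", "きょ": "kyo",
    "に": "ni", "ほ": "ho", "ん": "n", "ご": "go",
    "お": "o", "さ": "sa", "か": "ka", "こ": "ko", "べ": "be",
}

def to_romaji_syllables(kana):
    """Reencode kana syllable by syllable, trying two-character syllables first."""
    syllables = []
    i = 0
    while i < len(kana):
        pair = kana[i:i + 2]
        if len(pair) == 2 and pair in KANA_TO_ROMAJI:
            syllables.append(KANA_TO_ROMAJI[pair])
            i += 2
        elif kana[i] in KANA_TO_ROMAJI:
            syllables.append(KANA_TO_ROMAJI[kana[i]])
            i += 1
        else:
            raise KeyError(f"kana not in the sample table: {kana[i]}")
    return syllables

def pronounce(syllables):
    """Apply the rule from the text: a 'u' right after an 'o' sound is read 'o'."""
    out = []
    for s in syllables:
        if s == "u" and out and out[-1].endswith("o"):
            out.append("o")
        else:
            out.append(s)
    return out

written = to_romaji_syllables("とうきょう")
print("-".join(written))             # to-u-kyo-u
print("-".join(pronounce(written)))  # to-o-kyo-o
```

Keeping the written form and the pronounced form as two separate steps mirrors the distinction in the text: romaji records how the word is spelled in kana, and the pronunciation rule is applied on top of that.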

The problem is that the Japanese do not use romaji at all, so learning Japanese using romaji can be a hindrance. Romaji indicates how to pronounce a word without having to learn the kanas but, if we are serious about learning Japanese, learning the kanas is a small investment for a large return.

Roman alphabet

Japanese people use the Roman alphabet when they want to write the original foreign version of a word. Sometimes they do this to create an effect, other times for clarification. Likewise, Arabic numerals (1, 2, 3, 4, …) have become as common as the kanji numerals (一, 二, 三, 四, …).

The Roman alphabet is used often at airports, train stations, and other places frequented by tourists. Sometimes vowels have a macron, i.e., ā, ī, ū, ē, ō; this indicates that we need to double the vowel. For example, Tokyo, written in romaji as toukyou, is pronounced to-o-kyo-o, so the English version of the word would be written on the sign as Tōkyō.

Unlike the romaji version of the word, the ō does not give a hint of how the word is actually written in Japanese, but only of how it is pronounced. This ō sound might come from a duplication of the o vowel, like in oosaka (ōsaka); or from an o followed by a u, like in the sign’s yuurakuchou (Yūrakuchō), or like both ō’s in toukyou (Tōkyō). Yūrakuchō also shows a ū representing a double u sound, i.e., yuurakuchou.

As a second example, Osaka is actually written and pronounced o-o-sa-ka (おおさか), and Kobe is written as ko-u-be (こうべ), so it is pronounced ko-o-be.
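The step from romaji to the macron spelling used on signs can be sketched as a plain text substitution. This is a naive illustration that only covers the ō and ū cases discussed above; the names LONG_VOWELS and add_macrons are invented here, and a blind replacement like this would go wrong on words where the two vowels belong to different parts of the word.

```python
# Romaji vowel pairs and the macron vowel that replaces them on signs.
# Both 'ou' and 'oo' collapse to the same 'ō', so the result no longer
# tells us which kana spelling was behind it.
LONG_VOWELS = [("ou", "ō"), ("oo", "ō"), ("uu", "ū")]

def add_macrons(romaji):
    """Replace doubled-vowel spellings with macron vowels (naive substitution)."""
    for plain, long_vowel in LONG_VOWELS:
        romaji = romaji.replace(plain, long_vowel)
    return romaji

for word in ["toukyou", "oosaka", "koube", "yuurakuchou"]:
    print(word, "->", add_macrons(word))
```

Because ‘ou’ and ‘oo’ both map to ō, the conversion cannot be reversed, which is why the macron spelling shows the pronunciation but not the kana spelling.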

Selecting the script

Sometimes, the writer might choose to write a word that is usually written in one script, using another one; this choice is similar to the choice of writing an English word in upper-case, or italics, or bold, or in a different font, to convey something. As an example, in English, robot-talk might be written in block letters, to indicate something ‘mechanical’ and ‘digital’. The following is a relevant quote from ‘kandyman’:

(The) choice of script in Japanese writing is often a stylistic attempt by a writer to convey some non-standard subliminal meaning. There have been some interesting analyses on this topic done over the years. Just to give you an idea of some of the findings, one study found that certain characteristics were often attributed to script usage, as follows:

Hiragana: feminine, soft, smooth, round, tender, simple, childish, lovely, elegant, etc.
Katakana: novel, foreign, emphasizing, hard, fake, male, futuristic, sharp, jarring, angular, etc.
Kanji: scientific, rigid, masculine, formal, hard, difficult, intellectual, visual, substantial, etc.