Japanese pronunciation

Character sounds

Japanese pronunciation is very regular. The five vowels – a, i, u, e, o – are always pure sounds, i.e., there is no transition from one vowel to another.

  • the ‘a’ (あ, ア) sounds like the a in ‘spa’
  • the ‘i’ (い, イ) sounds like the e in ‘me’
  • the ‘u’ (う, ウ) sounds like the o in ‘who’
  • the ‘e’ (え, エ) sounds like the e in ‘men’
  • the ‘o’ (お, オ) sounds like the o in ‘so’
  • the ‘r’ is soft, like the Spanish ‘r’ in ‘cara’, not like the Spanish ‘r’ in ‘rama’ nor the English ‘r’ in ‘ram’.
  • a small ‘tsu’ (っ, ッ) makes a short pause before the consonant that follows it
    • hiragana:
    • kekkou (けっこう – it’s fine) sounds ‘ke-koo’
    • chotto (ちょっと – a little) sounds ‘cho-to’
    • kippu (きっぷ – ticket) sounds ‘ki-pu’
    • katakana:
    • kicchin (キッチン – kitchen) sounds ‘ki-chin’
    • poketto (ポケット – pocket) sounds ‘poke-to’
    • kukkii (クッキー – cookie) sounds ‘ku-kii’
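
The small ‘tsu’ rule can be sketched in a few lines of Python. This is only an illustration working on romanized text: the helper name `mark_pause` is hypothetical, and it assumes that a doubled consonant other than ‘n’ always stands for a small ‘tsu’.

```python
import re

# Assumption: in romaji, a doubled consonant (other than 'n') marks a small 'tsu'.
_DOUBLED = re.compile(r"([bcdfghjkprstwz])\1")


def mark_pause(romaji: str) -> str:
    """Replace the first letter of a doubled consonant with a hyphen (the pause)."""
    return _DOUBLED.sub(r"-\1", romaji)


for word in ["kekkou", "chotto", "kippu", "kicchin", "poketto", "kukkii"]:
    print(word, "->", mark_pause(word))
```

The sketch leaves vowels untouched, so kekkou comes out as ‘ke-kou’; lengthening the vowel is a separate rule.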

Doubling vowels

In English, two vowels often combine into a single syllable, but in Japanese the additional vowel counts as an additional syllable, e.g., the English word ‘too’ (also) has one syllable, while the Japanese word ‘too’ (far, distant) has two syllables.

  • in hiragana, an ‘i’ after an ‘e’ sound repeats the ‘e’ sound
    • eigo (えいご – the English language) sounds ‘eego’
    • eiga (えいが – movie) sounds ‘eega’
    • sensei (せんせい – teacher) sounds ‘sensee’
  • in hiragana, a ‘u’ after an ‘o’ sound repeats the ‘o’ sound
    • ohayou (おはよう – good morning) sounds ‘ohayoo’
    • doumo (どうも – very) sounds ‘doomo’
    • kouen (こうえん – park) sounds ‘kooen’
    • arigatou (ありがとう – thanks) sounds ‘arigatoo’
  • in katakana, a ‘ー’ (dash) repeats the previous vowel
    • apaato (アパート – apartment) sounds ‘apaato’
    • biiru (ビール – beer) sounds ‘biiru’
    • keeki (ケーキ – cake) sounds ‘keeki’
    • nyuuyooku (ニューヨーク – New York) sounds ‘nyuuyooku’
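
The two hiragana rules above amount to a simple substitution on the romanized word. The following is a rough sketch, not a complete pronunciation tool: the helper name `lengthen_vowels` is hypothetical, and the plain string replacement ignores cases where the two vowels belong to different parts of a word.

```python
def lengthen_vowels(romaji: str) -> str:
    """Naive substitution: 'ei' is read as 'ee' and 'ou' is read as 'oo'."""
    return romaji.replace("ei", "ee").replace("ou", "oo")


for word in ["eigo", "sensei", "ohayou", "kouen", "arigatou"]:
    print(word, "->", lengthen_vowels(word))
```

Katakana words written with the dash, such as keeki, already show the doubled vowel in romaji and pass through unchanged.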

Pronunciation tips

  • ha (は) is pronounced ‘wa’ when used as a particle
  • he (へ) is pronounced ‘e’ when used as a particle
  • wo (を) is pronounced ‘o’ when used as a particle
  • the ‘u’ (う) sound is often faint or omitted, especially at the end of a word:
    • ..desu (..です) sounds ‘..des’
    • ..masu (..ます) sounds ‘..mas’
    • sukoshi (すこし – a little) sounds ‘skoshi’
  • fu (ふ) sounds like the English word ‘who’
    • fune (ふね – boat) sounds ‘who-ne’
    • futon (ふとん) sounds ‘who-ton’
  • the ‘i’ (い) sound in ‘shi’ (し) and ‘chi’ (ち) is often faint or omitted:
    • watashi (わたし – I, me) sounds ‘watash’
    • ashita (あした – tomorrow) sounds ‘ashta’
  • the ‘n’ (ん) before a ‘b’, ‘m’, or ‘p’ sounds like an ‘m’:
    • tonbo (とんぼ – dragonfly) sounds ‘tombo’
    • sanpo (さんぽ – stroll) sounds ‘sampo’
    • sanmai (さんまい – 3 flat things) sounds ‘sammai’

    • tenma (てんま) sounds ‘temma’
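
The rule for ‘n’ before ‘b’, ‘m’, or ‘p’ is regular enough to express as a one-line substitution on romanized text. This is a minimal sketch; the helper name `assimilate_n` is hypothetical and not part of any library.

```python
import re


def assimilate_n(romaji: str) -> str:
    """Turn 'n' into 'm' when the next letter is 'b', 'm', or 'p'."""
    return re.sub(r"n(?=[bmp])", "m", romaji)


for word in ["tonbo", "sanpo", "sanmai", "tenma"]:
    print(word, "->", assimilate_n(word))
```

An ‘n’ before any other letter, as in sensei, is left as it is.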


The pitch, or accent, is the vowel that is stressed in a word. Some Japanese words truly have no accent (they are flat), but many words have a specific one [wikipedia]:




[Table of example words and their pitch patterns; only these entries survive: ‘god, deity, spirit’; ‘paper; hair’ (紙; 髪); ‘hard candy’; ‘living room’.]


Neither kana script has marks that indicate pitch, and the kanji give no clue either; thus, there is no alternative but to listen to a native speaker and memorize the pitch. Still, there are a few hints that can help in certain cases.

In English, when we put together two or more words to form a compound word, the compound word preserves the tones of its component words, e.g.:

belly + button → belly-button
carry + over → carryover

This works the same in Japanese, i.e., the word should be pronounced like two different words:

kami (God) + sama (lord) → kami-sama (God)
hachi (8) + hyaku (100) → hachi-hyaku (800)
ashi (foot) + kubi (neck) → ashi-kubi (ankle)

If the component words happen to be one syllable long, then we might end up with what appear to be different pronunciations of the same word, when in reality all we are doing is stressing one of the component words. In English, suppose that we have the word ‘twenty-five’. We could stress ‘twenty’ or ‘five’ to draw attention to that particular component of the word, or pronounce them flat. This is more difficult to see in Japanese, where the compound words can be so short that we tend to think of them as non-compound words:


go (honorific) + han (cooked rice) → go-han (meal)
kon (this) + ban (evening) → kon-ban (this evening)
ten (sky) + ki (spirit) → ten-ki (weather)
den (electricity) + wa (talk) → den-wa (telephone)

However, I have the impression that Japanese takes this a bit further. In English we can modify a word and end up changing the location of its pitch:

pretty → prettification

but in Japanese the root word and the modifiers preserve their individual pitches:

nomi + masu → nomi-masu (I drink)
nomi + masen → nomi-masen (I don’t drink)
nomi + tai → nomi-tai (I want to drink)
nomi + taku + nai → nomi-taku-nai (I don’t want to drink)

The pronunciation is correct if we treat the components of the word as separate words (e.g., nomi-masu), each with its own pitch, instead of considering the word as a single unit (e.g., nomimasu).

Finally, as if Japanese pitch wasn’t already difficult enough, native speakers from different regions of Japan often pronounce words in different ways. For example, the Tokyo dialect tends to stress the first syllable, while the Kansai dialect tends to stress the last one:

[Table comparing pronunciations in the Tokyo dialect and the Kansai dialect; the entries did not survive.]