Wednesday, 13 April 2022

Unicode Trivia U+1090

Codepoint: U+1090 "MYANMAR SHAN DIGIT ZERO"
Block: U+1000..109F "Myanmar"

The "Myanmar" Unicode block contains glyphs used in various regional writing systems including the Burmese and Shan scripts. In this post, I'm going to play fast and loose with the script names and just call them "Burmese" and "Shan". Unicode Technical Note 11 describes some of the intricacies involved with the various scripts, weighing in at a healthy 67 pages.

The "Myanmar" block contains two sets of digits: one for Burmese and one for Shan.

If you have appropriate fonts installed, these are the Burmese digits (U+1040..1049):

၀၁၂၃၄၅၆၇၈၉

and Shan digits (U+1090..1099):

႐႑႒႓႔႕႖႗႘႙
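
Both ranges are ordinary decimal digits as far as Unicode is concerned, so they can be generated and parsed programmatically. Here is a minimal Python sketch, assuming only the standard `unicodedata` module; the helper names `digits_from` and `to_script` are my own, invented for illustration:

```python
import unicodedata

BURMESE_ZERO = 0x1040  # MYANMAR DIGIT ZERO
SHAN_ZERO = 0x1090     # MYANMAR SHAN DIGIT ZERO


def digits_from(zero_codepoint):
    """Return the ten consecutive digits starting at the given zero."""
    return "".join(chr(zero_codepoint + i) for i in range(10))


def to_script(n, zero_codepoint):
    """Render a non-negative integer using the digit set starting at the given zero."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    return "".join(chr(zero_codepoint + int(d)) for d in str(n))


burmese = digits_from(BURMESE_ZERO)
shan = digits_from(SHAN_ZERO)

# List each digit with its code point, official name and decimal value.
for ch in burmese + shan:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} = {unicodedata.decimal(ch)}")

# Python's int() accepts any Unicode decimal digit, so this round-trips.
print(int(to_script(2022, SHAN_ZERO)))
```

Whether the output is legible still depends on the fonts you have installed, but the conversion itself does not.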

Burmese digits have the advantage (over Hindu-Arabic and Shan digits) of having ascenders and descenders which help to differentiate them. They are very similar to "Tai Tham Hora" digits (U+1A80..1A89).

The Shan script supposedly evolved from the Burmese, but their digits are markedly different. To my eyes, they appear to resemble the Hindu-Arabic digits, but the "8" and "9" are inexplicably similar.

The Burmese language has words for very large numbers: powers of ten up to 10^7, and then increasing multiplicatively by factors of 10^7 up to 10^140 ("athinche", which fittingly is a synonym for "countless number"). I could find no reason why the names progress in multiples of 10^7. Most other languages use 10^3 (e.g. English "thousand", "million", "billion", etc.) or sometimes 10^2 (e.g. Indian "lakh", "crore", "arab", etc.).
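
To get a feel for what grouping by 10^7 means compared with the familiar grouping by 10^3, here is a rough Python sketch that splits a number into coefficients of successive powers of a chosen base. The function name `split_by_power` is my own, and it deliberately uses no number names: it only models the upper part of the scheme (the steps of 10^7), not the individual names for the powers of ten below that.

```python
def split_by_power(n, exponent):
    """Split n into coefficients of (10**exponent)**k, least significant first."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    base = 10 ** exponent
    groups = []
    while True:
        n, remainder = divmod(n, base)
        groups.append(remainder)
        if n == 0:
            break
    return groups


number = 12_345_678_901
print(split_by_power(number, 3))  # [901, 678, 345, 12]
print(split_by_power(number, 7))  # [5678901, 1234]

# 10^140 is (10^7)^20, so it is a 1 in group index 20 with zeros below it.
top = split_by_power(10 ** 140, 7)
print(len(top) - 1)
```

With steps of 10^7, twenty named steps are enough to reach 10^140, whereas steps of 10^3 would need forty-six full steps and still land at 10^138.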

Another curiosity is that the tonal pronunciation of digits changes depending on the denary position of the digit within the number. The tone generally changes from "low" to "creaky" (no, really!) for digits in the 10^1, 10^2 and 10^3 places.
