Saturday, 29 January 2022

Unicode Trivia U+08C8

Codepoint: U+08C8 "ARABIC LETTER GRAF"
Block: U+08A0..08FF "Arabic Extended-A"

Unicode blocks are not allocated sequentially. Consequently, "Arabic Extended-A" (U+08A0..08FF, originating in Unicode 6.1, January 2012) comes numerically after "Arabic Extended-B" (U+0870..089F, originating in Unicode 14.0, September 2021).
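As a rough illustration, the two blocks mentioned above can be put in a small table and sorted two ways: by position and by the version they originated in. This is only a sketch. The `BLOCKS` list and the `block_of` helper are names invented for this example, and the data is just the two ranges and versions quoted in the paragraph above.

```python
# Block data copied from the paragraph above:
# (first codepoint, last codepoint, block name, Unicode version of origin).
# BLOCKS and block_of are hypothetical names used only for this sketch.
BLOCKS = [
    (0x08A0, 0x08FF, "Arabic Extended-A", (6, 1)),
    (0x0870, 0x089F, "Arabic Extended-B", (14, 0)),
]


def block_of(codepoint):
    """Return the name of the listed block containing codepoint, or None."""
    for first, last, name, _version in BLOCKS:
        if first <= codepoint <= last:
            return name
    return None


# Sort the same blocks by numeric position and by version of origin.
by_position = [b[2] for b in sorted(BLOCKS, key=lambda b: b[0])]
by_version = [b[2] for b in sorted(BLOCKS, key=lambda b: b[3])]

print("By codepoint:", by_position)
print("By version:  ", by_version)
print("U+08C8 is in:", block_of(0x08C8))
```

The two orderings come out reversed: Extended-B sorts first by codepoint, while Extended-A sorts first by version.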

Even more confusing, codepoints within a block can be allocated at different times. For example, U+08C8 "ARABIC LETTER GRAF" was assigned in Unicode 14.0 (September 2021), but its neighbour, U+08C7 "ARABIC LETTER LAM WITH SMALL ARABIC LETTER TAH ABOVE", was assigned in Unicode 13.0 (March 2020).
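One way to see this in practice is to ask Python's standard `unicodedata` module for the names of the two neighbours. This is a sketch under one assumption: the result depends on which Unicode version your Python build bundles (reported by `unicodedata.unidata_version`), so an older interpreter may know U+08C7 but not U+08C8. The helper name `name_or_placeholder` is made up for this example.

```python
import unicodedata


def name_or_placeholder(codepoint):
    """Look up a character name, falling back if this database lacks it."""
    return unicodedata.name(chr(codepoint), "<not in this Unicode database>")


# The bundled database version, e.g. "14.0.0", parsed for comparison.
db_version = tuple(int(part) for part in unicodedata.unidata_version.split("."))

print("Unicode database version:", unicodedata.unidata_version)
for cp in (0x08C7, 0x08C8):
    print(f"U+{cp:04X}", name_or_placeholder(cp))
```

If the database is older than 14.0, the second line shows the placeholder instead of a name.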

U+08C8 "ARABIC LETTER GRAF" is an addition to the Arabic script for writing the Balti language:

Isolated form of U+08C8 [source]

For example, the Balti word for knife (U+08C8 U+06CC):

[source]
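For readers without the image, the same two-codepoint sequence can be built directly in code. A minimal sketch: the variable names are illustrative, and whether the word displays correctly depends on your font supporting U+08C8.

```python
# The word from the text: U+08C8 followed by U+06CC.
word = "\u08C8\u06CC"

# List the codepoints back out, and show the UTF-8 bytes they encode to.
codepoints = [f"U+{ord(ch):04X}" for ch in word]
utf8_bytes = word.encode("utf-8")

print(word)
print(codepoints)           # ['U+08C8', 'U+06CC']
print(utf8_bytes.hex(" "))  # e0 a3 88 db 8c
```

U+08C8 takes three bytes in UTF-8 and U+06CC takes two, so the word is five bytes long.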

U+08C8 is the only specific addition required for the Arabic script to be able to write Balti, although two other codepoints (U+0F6B "TIBETAN LETTER KKA" and U+0F6C "TIBETAN LETTER RRA", added in Unicode 5.1, April 2008) are needed to write Balti using the Tibetan script.

Balti is spoken in Baltistan, or Little Tibet, a mountainous region in the Gilgit-Baltistan part of Pakistan-administered Kashmir. This region may be the origin of the balti curry dish, which has been popular in the UK since the 1970s.
