Thursday 20 January 2022

Unicode Trivia U+0640

Codepoint: U+0640 "ARABIC TATWEEL"
Block: U+0600..06FF "Arabic"

The FIFA World Cup Qatar 2022 logo applies kashida to the Latin script word for "Qatar":

[source]

The kashida here is the elongated connection between the "t" and the "a". In Unicode, U+0640 "ARABIC TATWEEL" (alias "kashida") can be used to represent this elongation. Tatweels are normally used only in Arabic and related scripts, so it's a nice cross-cultural reference in this context.

Here's an extreme example in the form of an Arabic script basmala:

[source]

Tatweels could be considered typographical formatting rather than text. However, because a tatweel character was part of ISO/IEC 8859-6 at position 0xE0, it was "inherited" by Unicode as a graphic character with its own codepoint.
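The legacy mapping can be checked with a minimal Python sketch. It assumes the standard library's "iso8859_6" codec as the ISO/IEC 8859-6 table; the variable names are just for illustration.

```python
import unicodedata

TATWEEL = "\u0640"

# Decode the single legacy byte 0xE0 using the ISO/IEC 8859-6 table.
legacy_byte = bytes([0xE0])
decoded = legacy_byte.decode("iso8859_6")

# Show which Unicode codepoint the legacy byte maps to.
print(f"U+{ord(decoded):04X}", unicodedata.name(decoded))
# prints: U+0640 ARABIC TATWEEL

# The mapping also round-trips in the other direction.
round_trip = TATWEEL.encode("iso8859_6")
print(round_trip == legacy_byte)
```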

Arabic tatweels are similar to Latin hyphens when used for text justification, but the rules are obviously very different. An excellent history of the topic is given by Titus Nemeth.

[At this point, my complete lack of understanding of Arabic will shine through. Apologies.]

Like its Unicode block-neighbour Hebrew, Arabic script is a right-to-left abjad. The name "Qatar" in Arabic is made up of three Arabic consonants:

  • U+0642 "ARABIC LETTER QAF"
  • U+0637 "ARABIC LETTER TAH"
  • U+0631 "ARABIC LETTER REH"

قطر
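As a small illustration, the same word can be assembled from the three character names in Python. This is only a sketch using the standard unicodedata module; the variable names are my own. Note that the string is stored in logical order (qaf first), and it is the renderer that lays it out right-to-left.

```python
import unicodedata

# The three consonants, in logical (reading) order.
names = [
    "ARABIC LETTER QAF",
    "ARABIC LETTER TAH",
    "ARABIC LETTER REH",
]

# Look up each character by its Unicode name and join them into one word.
qatar = "".join(unicodedata.lookup(name) for name in names)

# List the codepoints that make up the word.
for ch in qatar:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
# prints:
# U+0642 ARABIC LETTER QAF
# U+0637 ARABIC LETTER TAH
# U+0631 ARABIC LETTER REH
```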

If, as part of text justification or for aesthetic effect, we want to widen the word, we could insert a tatweel between the tah and reh:

قطـر

In fact, we can add more tatweels in sequence:

قطـــــر

This is, of course, an artificial example; words of only three consonants are rarely stretched.
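The insertion above can be sketched in Python. The helper names stretch and strip_tatweels are hypothetical, and the sketch makes no attempt to check whether a tatweel is typographically valid at the chosen position (that depends on the joining behaviour of the neighbouring letters); it only does the string manipulation.

```python
TATWEEL = "\u0640"


def stretch(word: str, index: int, count: int = 1) -> str:
    """Insert `count` tatweels before the character at `index` (logical order)."""
    return word[:index] + TATWEEL * count + word[index:]


def strip_tatweels(text: str) -> str:
    """Remove every tatweel, e.g. before searching or comparing text."""
    return text.replace(TATWEEL, "")


qatar = "\u0642\u0637\u0631"  # qaf, tah, reh

# Index 2 is the reh, so the tatweels land between the tah and the reh.
stretched = stretch(qatar, 2, count=5)
print(stretched)

# Stripping the tatweels recovers the original word.
print(strip_tatweels(stretched) == qatar)
```

Stripping tatweels before comparison is a common reason to handle U+0640 in code at all, since the stretched and unstretched forms are the same word.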

Straight-line tatweels are not the only mechanism that can be used to justify Arabic text. Others include:

  1. Whitespace
  2. Letterform lengthening/shortening
  3. Ligature variation


