Codepoint: U+02DB "OGONEK"
Block: U+02B0..02FF "Spacing Modifier Letters"
The English language doesn't really have diacritics except in loanwords (such as "café", "naïve", "façade" and "piñata") or in poetry (as with "belovèd"). As a consequence, many English speakers struggle with the whole concept.
The many scripts and languages supported by Unicode make diacritics a thorny issue here too. Trawling through the UCD turns up the following major instances:*
- ACUTE
The acute accent:
Ó U+00D3 "LATIN CAPITAL LETTER O WITH ACUTE"
- DOUBLE ACUTE
The double acute accent (sometimes called the hungarumlaut):
Ő U+0150 "LATIN CAPITAL LETTER O WITH DOUBLE ACUTE"
- GRAVE
The grave accent:
Ò U+00D2 "LATIN CAPITAL LETTER O WITH GRAVE"
- DOUBLE GRAVE
The double grave accent (mainly used in Serbo-Croatian and Slovenian):
Ȍ U+020C "LATIN CAPITAL LETTER O WITH DOUBLE GRAVE"
- CIRCUMFLEX
The circumflex (easily confused with the inverted breve):
Ô U+00D4 "LATIN CAPITAL LETTER O WITH CIRCUMFLEX"
- TILDE
The tilde (in the Estonian alphabet "õ" is an independent letter):
Õ U+00D5 "LATIN CAPITAL LETTER O WITH TILDE"
- DIAERESIS
The diaeresis or umlaut:
[In Unicode, the term "DIAERESIS" is preferred over "UMLAUT"]
Ö U+00D6 "LATIN CAPITAL LETTER O WITH DIAERESIS"
- STROKE
The stroke (in some Scandinavian alphabets "ø" is an independent letter):
Ø U+00D8 "LATIN CAPITAL LETTER O WITH STROKE"
- MACRON
The macron or line above:
Ō U+014C "LATIN CAPITAL LETTER O WITH MACRON"
- BREVE
The breve (easily confused with the caron or háček):
Ŏ U+014E "LATIN CAPITAL LETTER O WITH BREVE"
- INVERTED BREVE
The inverted breve or arch (easily confused with the circumflex):
Ȏ U+020E "LATIN CAPITAL LETTER O WITH INVERTED BREVE"
- HORN
The horn (used in Vietnamese):
Ơ U+01A0 "LATIN CAPITAL LETTER O WITH HORN"
- CARON
The caron or háček (easily confused with the breve):
[Since Unicode 1.1, the term "CARON" is preferred over "HACEK"]
Ǒ U+01D1 "LATIN CAPITAL LETTER O WITH CARON"
- DOT ABOVE
The dot above or overdot:
Ȯ U+022E "LATIN CAPITAL LETTER O WITH DOT ABOVE"
- DOT BELOW
The dot below or underdot:
Ọ U+1ECC "LATIN CAPITAL LETTER O WITH DOT BELOW"
- HOOK ABOVE
The hook above (used in Vietnamese):
Ỏ U+1ECE "LATIN CAPITAL LETTER O WITH HOOK ABOVE"
- LONG STROKE OVERLAY
The long stroke overlay ("ꝋ" was a medieval abbreviation for the Latin obiit "he died"):
Ꝋ U+A74A "LATIN CAPITAL LETTER O WITH LONG STROKE OVERLAY"
- LOOP
The loop ("ꝍ" is used for transliterating medieval Nordic vowels):
Ꝍ U+A74C "LATIN CAPITAL LETTER O WITH LOOP"
- BELT
The belt ("ɬ" is used in IPA for the voiceless alveolar lateral fricative):
Ɬ U+A7AD "LATIN CAPITAL LETTER L WITH BELT"
- LINE BELOW
The line below (or macron below):
Ḻ U+1E3A "LATIN CAPITAL LETTER L WITH LINE BELOW"
- STROKE
The stroke ("ł" is a Polish dark L):
Ł U+0141 "LATIN CAPITAL LETTER L WITH STROKE"
- CEDILLA
The cedilla:
Ç U+00C7 "LATIN CAPITAL LETTER C WITH CEDILLA"
- RING ABOVE
The ring above or overring (used in many Scandinavian languages):
Å U+00C5 "LATIN CAPITAL LETTER A WITH RING ABOVE"
- RING BELOW
The ring below or underring:
Ḁ U+1E00 "LATIN CAPITAL LETTER A WITH RING BELOW"
- OGONEK
The ogonek (usually applied to vowels):
Ǫ U+01EA "LATIN CAPITAL LETTER O WITH OGONEK"
The Polish ogonek (literally "little tail") is applied to the letters "A" and "E":
Ąą Ęę
According to Adam Twardoch, the Polish ogonek isn't simply an accent...
It's much more a character element, just like a stem, a serif or a descent. In a vast majority of cases ogonek should be smoothly connected with the base glyph, it should be a part of the glyph.
If you search the UCD, you'll find 18 references to "ogonek":
- U+0104 "LATIN CAPITAL LETTER A WITH OGONEK"
- U+0105 "LATIN SMALL LETTER A WITH OGONEK"
- U+0118 "LATIN CAPITAL LETTER E WITH OGONEK"
- U+0119 "LATIN SMALL LETTER E WITH OGONEK"
- U+012E "LATIN CAPITAL LETTER I WITH OGONEK"
- U+012F "LATIN SMALL LETTER I WITH OGONEK"
- U+0172 "LATIN CAPITAL LETTER U WITH OGONEK"
- U+0173 "LATIN SMALL LETTER U WITH OGONEK"
- U+01EA "LATIN CAPITAL LETTER O WITH OGONEK"
- U+01EB "LATIN SMALL LETTER O WITH OGONEK"
- U+01EC "LATIN CAPITAL LETTER O WITH OGONEK AND MACRON"
- U+01ED "LATIN SMALL LETTER O WITH OGONEK AND MACRON"
- U+02DB "OGONEK"
- U+0328 "COMBINING OGONEK"
- U+04BE "CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER"
  - Was named "CYRILLIC CAPITAL LETTER IE HOOK OGONEK" in Unicode 1.0
- U+04BF "CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER"
  - Was named "CYRILLIC SMALL LETTER IE HOOK OGONEK" in Unicode 1.0
- U+1AB7 "COMBINING OPEN MARK BELOW"
  - Notes include "see also combining ogonek - 0328"
- U+1DCE "COMBINING OGONEK ABOVE"
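If you'd rather not trawl by hand, here is a minimal Python sketch of a similar search using the standard `unicodedata` module. The helper name `find_by_name` is made up for illustration. Note that `unicodedata` only knows each character's current name, so the entries above that mention "ogonek" only in a Unicode 1.0 name or in an annotation (U+04BE, U+04BF and U+1AB7) should not show up, and the exact results depend on the Unicode version bundled with your Python.

```python
import sys
import unicodedata


def find_by_name(fragment):
    """Return (codepoint, name) pairs whose current character name contains fragment.

    Hypothetical helper: it only sees current names, not Unicode 1.0 names
    or the annotations in the names list.
    """
    fragment = fragment.upper()
    hits = []
    for cp in range(sys.maxunicode + 1):
        # The empty-string default avoids a ValueError for unnamed codepoints.
        name = unicodedata.name(chr(cp), "")
        if fragment in name:
            hits.append((cp, name))
    return hits


ogoneks = find_by_name("ogonek")
for cp, name in ogoneks:
    print(f"U+{cp:04X} {name}")
```

The full scan covers every codepoint up to U+10FFFF, so it takes a moment, but it keeps the sketch free of any downloaded data files.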
Codepoint U+02DB is interesting. It sits in the "Spacing clones of diacritics" column of the "Spacing Modifier Letters" block. This column contains six codepoints (including U+02DB) that fill in the gaps for "standalone" diacritics: codepoints that display a diacritic in its own space, without the combining equivalent having to be applied to a letter.
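As a rough illustration of what "spacing clone" means in the data, the sketch below prints the compatibility decomposition of each of the six codepoints. It assumes the six are the contiguous range U+02D8..U+02DD, which is my reading of the code chart rather than something the UCD hands you directly; the dictionary name `spacing_clones` is just for this example.

```python
import unicodedata

# Assumption: the six spacing clones are the contiguous range U+02D8..U+02DD.
spacing_clones = {
    cp: unicodedata.decomposition(chr(cp)) for cp in range(0x02D8, 0x02DE)
}

for cp, decomposition in spacing_clones.items():
    # Each decomposition should read "<compat> 0020 XXXX":
    # a space followed by the combining form of the diacritic.
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {decomposition}")
```

For U+02DB this should report a space (U+0020) followed by U+0328 "COMBINING OGONEK", so the standalone ogonek is, as far as compatibility normalization is concerned, just a combining ogonek sitting on a space.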
So, if you're talking about ogoneks in general and want to include them in text without being attached to another glyph, you can just use U+02DB:
An ogonek looks like “˛”
There will be visible differences between this and a combining ogonek with a standard space (U+0020):
An ogonek looks like “ ̨”
A combining ogonek with a non-breaking space (U+00A0):
An ogonek looks like “ ̨”
And a combining ogonek with a dotted circle (U+25CC):
An ogonek looks like “◌̨”
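Since the four variants are hard to tell apart by eye (and your renderer may disagree with mine), here is a small Python sketch that builds each one from explicit escapes and prints its codepoints. The labels and the `codepoints` helper are only there for illustration.

```python
import unicodedata

COMBINING_OGONEK = "\u0328"

# The four ways of showing an ogonek on its own, spelled out with escapes
# so that nothing depends on how this page renders them.
variants = {
    "spacing clone": "\u02DB",
    "space + combining": "\u0020" + COMBINING_OGONEK,
    "no-break space + combining": "\u00A0" + COMBINING_OGONEK,
    "dotted circle + combining": "\u25CC" + COMBINING_OGONEK,
}


def codepoints(text):
    """Format a string as a space-separated list of U+XXXX codepoints."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for label, text in variants.items():
    print(f"{label:28} {codepoints(text)}")

# Compatibility normalization folds the spacing clone into space + combining.
folded = unicodedata.normalize("NFKD", variants["spacing clone"])
print(folded == variants["space + combining"])
```

Only the first variant is a single codepoint; the other three are a base character followed by U+0328, which is why fonts and text engines have more room to render them differently.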
One of the few times you depict ogoneks in isolation is when you're talking about how to depict ogoneks in isolation.
* Good luck with the text rendering in your browser here!