Unicode

From Glossing Ancient Languages

Revision as of 16:21, 28 July 2016 by Werningd (Required Unicode characters: Clarification of recommendation for Egyptological Alif)

Transliterations and transcriptions of ancient languages usually use some special characters, e.g. Greek letters like alpha (α), ‘s’ with caron (š), ‘h’ with breve below (ḫ), ‘t’ with dot below (ṭ), Semitic Ain (ʿ), and Egyptological Alif (ꜣ).

To produce a robust file, e.g. for a publisher, that is displayed correctly on any electronic device, encoders should enter their data strictly with Unicode fonts. Characters entered this way cannot be confused with other characters on another device, because the file contains the unique Unicode number (code point) of the intended character, e.g. U+1E6D. This number is reserved for this very character worldwide, in every Unicode font. With Unicode fonts, only two things can go wrong: a) the character cannot be displayed at all on the receiver’s device, because the installed fonts lack it; b) the style of the character may differ, depending on the Unicode font that is installed on the receiver’s device.
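To illustrate that a file stores Unicode numbers rather than font-specific shapes, the following minimal Python sketch lists the code point and official Unicode name of each character in a string. The helper name `describe` is hypothetical and only serves this illustration; it relies on the standard library module `unicodedata`.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character in text."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# 't' with dot below and 's' with caron, written as escapes so that the
# example does not depend on the encoding of this page.
for ch, code_point, name in describe("\u1E6D\u0161"):
    print(ch, code_point, name)
```

Whatever font the receiver has installed, the numbers reported here stay the same; only the rendered shape may differ.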

Recommended fonts

For glossing in general or for editing specific languages, we recommend installing the following fonts:

Language | Fonts | Character examples
Glossing in general | Charis SIL | diacritics, IPA symbols, ā ē ī ō ū, ⌈ ⌉ 〈 〉
Akkadian | Charis SIL, New Athena Unicode, Antinouu | ʾ ʿ ḫ ṣ š ṭ
Coptic | Antinouu | ⲁ ⲉ ⲙ ϣ ϩ ϭ ϯ ⳉ ⳁ
Ancient Egyptian[1] | Charis SIL, New Athena Unicode | Ꜣ ʾ ı͗ i̯ ï Ꜥ u̯ ḥ ḫ ẖ h̭ ś š ḳ č ṯ ṭ ṱ č̣ ḏ
Greek | New Athena Unicode | α ζ ς ή ω ᾆ ἧ ὧ (but the font uses the non-standardized Private Use Area for a few characters)
Hittite | Arial Unicode MS, Charis SIL, New Athena Unicode | ḫ š
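Because characters from the Private Use Area are not standardized, they may display differently, or not at all, with another font. As a minimal sketch, a text can be checked for such characters before it is passed on. The function name `private_use_characters` is hypothetical; the check assumes that Private Use characters are exactly those with the Unicode general category "Co", as reported by Python's standard `unicodedata` module.

```python
import unicodedata


def private_use_characters(text):
    """Return (index, code point) for each Private Use Area character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"  # "Co" = Other, private use
    ]


# U+E000 is the first code point of the Private Use Area in the
# Basic Multilingual Plane; it stands in for a font-specific character here.
sample = "\u03B1\uE000\u03C9"
print(private_use_characters(sample))
```

An empty result means the text contains only standardized characters in this respect.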

Recommended programs and links

Keyboard layout creators

Unicode character programs and links

Required Unicode characters

Character | Unicode number | Character | Unicode number | Fonts | Languages
ʿ | U+02BF |  |  | Charis SIL, New Athena Unicode | Akkadian
ʾ | U+02BE |  |  | Charis SIL, New Athena Unicode | Akkadian, Ancient Egyptian
ꜥ [2] | U+A725 | Ꜥ | U+A724 | Charis SIL, New Athena Unicode | Ancient Egyptian
ꜣ [2] | U+A723 | Ꜣ | U+A722 | Charis SIL, New Athena Unicode | Ancient Egyptian
č | U+010D | Č | U+010C | Charis SIL, New Athena Unicode | Ancient Egyptian
č̣ | U+010D&U+0323 | Č̣ | U+010C&U+0323 | Charis SIL, New Athena Unicode | Ancient Egyptian
ḏ | U+1E0F | Ḏ | U+1E0E | Charis SIL, New Athena Unicode | Ancient Egyptian
ḥ | U+1E25 | Ḥ | U+1E24 | Charis SIL, New Athena Unicode | Ancient Egyptian
ḫ | U+1E2B | Ḫ | U+1E2A | Charis SIL, New Athena Unicode | Akkadian, Ancient Egyptian, Hittite
ẖ | U+1E96 | H̱ | 'H'&U+0331 | Charis SIL, New Athena Unicode | Ancient Egyptian
h̭ | 'h'&U+032D | H̭ | 'H'&U+032D | Charis SIL, New Athena Unicode | Ancient Egyptian
ı͗ | U+0131&U+0357 | I͗ | 'I'&U+0357 | Charis SIL, New Athena Unicode | Ancient Egyptian
i̯ | 'i'&U+032F |  |  | Charis SIL, New Athena Unicode | Ancient Egyptian
ï | U+00EF |  |  | Charis SIL, New Athena Unicode | Ancient Egyptian
ḳ | U+1E33 | Ḳ | U+1E32 | Charis SIL, New Athena Unicode | Ancient Egyptian
 | U+032E |  | U+032D | Charis SIL, New Athena Unicode | Akkadian
ś | U+015B | Ś | U+015A | Charis SIL, New Athena Unicode | Ancient Egyptian
š | U+0161 | Š | U+0160 | Charis SIL, New Athena Unicode | Akkadian, Ancient Egyptian, Hittite
ṯ | U+1E6F | Ṯ | U+1E6E | Charis SIL, New Athena Unicode | Ancient Egyptian
ṭ | U+1E6D | Ṭ | U+1E6C | Charis SIL, New Athena Unicode | Akkadian, Ancient Egyptian
u̯ | 'u'&U+032F |  |  | Charis SIL, New Athena Unicode | Ancient Egyptian
⸗ | U+2E17 |  |  | New Athena Unicode | Ancient Egyptian
⌈ | U+2308 | ⌉ | U+2309 | Charis SIL, New Athena Unicode | text critical markup
〈 | U+2329 | 〉 | U+232A | Charis SIL, New Athena Unicode | text critical markup
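Several entries in this table are not single code points but sequences: a base letter followed by a combining diacritic (written with "&" above). The following minimal Python sketch builds a few of these sequences and prints their code points before and after NFC normalization. The names `SEQUENCES` and `code_points` are hypothetical and chosen for this illustration only. The sketch assumes that the software handling the file may apply NFC normalization, which can replace a sequence with a precomposed character where Unicode defines one.

```python
import unicodedata

# Base letter plus combining diacritic, taken from the table above.
SEQUENCES = {
    "c with caron and dot below": "\u010D\u0323",
    "h with line below": "h\u0331",
    "capital H with line below": "H\u0331",
    "h with circumflex below": "h\u032D",
    "dotless i with right half ring above": "\u0131\u0357",
    "i with inverted breve below": "i\u032F",
    "u with inverted breve below": "u\u032F",
}


def code_points(text):
    """Return the code points of text in U+XXXX notation, space-separated."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for label, sequence in SEQUENCES.items():
    composed = unicodedata.normalize("NFC", sequence)
    print(f"{label}: {code_points(sequence)} -> NFC: {code_points(composed)}")
```

For example, 'h' followed by U+0331 is composed into the single code point U+1E96 under NFC, whereas 'H' followed by U+0331 remains a two-code-point sequence because Unicode defines no precomposed capital. This matches the two different notations used in the row for ẖ.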

References

  1. Daniel A. Werning, Egyptological transliteration in Unicode; Wikipedia, Ägyptische Hieroglyphen. In der elektronischen Datenverarbeitung.
  2. The characters U+A725 LATIN SMALL LETTER EGYPTOLOGICAL AIN and U+A723 LATIN SMALL LETTER EGYPTOLOGICAL ALEF are not supposed to be smaller than the respective capitals. This seems to be a mistake in the original Unicode definition. For print typesetting (but not for database encoding), we recommend using the respective CAPITAL letters instead.