Glossing Rules
From Glossing Ancient Languages
Core rules
Alignment
- (1) Alignment Rule
- An object language word and its gloss need to be arranged vertically left-aligned. [1]
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the.day.before.yesterday | am | I | out | a | short.vacation | back.come |
But not:
Vorgestern bin ich aus einem Kurzurlaub zurückgekommen. |
the.day.before.yesterday am I out a short.vacation back.come |
The best way to set this up in a word processor is to use invisible tables, i.e. tables without border lines. (Whitespace characters such as spaces or tabs are not very helpful for this purpose.)
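Where only monospaced plain text is available (for example in a terminal or a code listing), the vertical alignment can be approximated by padding each column. The following is a minimal sketch under that assumption; the function name `align_interlinear` and the `gap` parameter are illustrative only, and the width calculation assumes that every character occupies exactly one column.

```python
def align_interlinear(words, glosses, gap=2):
    """Pad each word/gloss pair to a shared column width.

    Assumes monospaced output and one column per character
    (no combining marks or double-width characters).
    """
    if len(words) != len(glosses):
        raise ValueError("each object language word needs exactly one gloss")
    # Column width = the longer of word and gloss, plus a gap.
    widths = [max(len(w), len(g)) + gap for w, g in zip(words, glosses)]
    top = "".join(w.ljust(n) for w, n in zip(words, widths)).rstrip()
    bottom = "".join(g.ljust(n) for g, n in zip(glosses, widths)).rstrip()
    return top + "\n" + bottom


print(align_interlinear(
    ["Vorgestern", "bin", "ich"],
    ["the.day.before.yesterday", "am", "I"],
))
```

In a word processor with proportional fonts, this padding does not line up, which is why invisible tables remain the better choice there.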
One-to-Many Correspondences
One object language word = many gloss elements
- (2a) Standard Joining Rule (for the gloss)
- Within a pair of an object language word and its gloss, neither the word nor the gloss may contain any whitespace (spaces, tabs).
- If one object language word corresponds to two or more elements in the gloss, these elements have to be joined by a punctuation mark.
- The standard punctuation mark for joining elements in the gloss is the period “.”. [2]
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the.day.before.yesterday | am | I | out | a | short.vacation | back.come |
But not:
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the day before yesterday | am | I | out | a | short vacation | back come |
- Standard exception to the Standard Joining Rule
- The sequence PERSON – NUMBER is usually spelled simply without a period “.”, i.e. abbreviated as e.g. “3PL” (instead of “3.PL”). [3]
For another meaning of the period “.” in cases where other punctuation marks such as colons “:” or hyphens “-” are also used, see the section The Period in the Expert Mode below.
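The Standard Joining Rule and its PERSON – NUMBER exception can be sketched as a small helper. This is an illustration only: the names `join_gloss` and `check_pair` are hypothetical, and the set of number labels is limited to those that occur in this text.

```python
import re

PERSONS = {"1", "2", "3"}
NUMBERS = {"SG", "PL"}  # assumption: only the number labels used in this text


def join_gloss(elements):
    """Join the gloss elements of one word with periods (rule 2a).

    A person digit directly followed by a number label is written
    without a period, e.g. "3PL" instead of "3.PL".
    """
    out = ""
    for prev, cur in zip([None] + list(elements[:-1]), elements):
        if prev is None:
            out = cur
        elif prev in PERSONS and cur in NUMBERS:
            out += cur
        else:
            out += "." + cur
    return out


def check_pair(word, gloss):
    """Return a list of rule (2a) violations for one word/gloss pair."""
    problems = []
    if re.search(r"\s", word):
        problems.append("whitespace in object language word")
    if re.search(r"\s", gloss):
        problems.append("whitespace in gloss")
    return problems


print(join_gloss(["the", "day", "before", "yesterday"]))
print(check_pair("Kurzurlaub", "short vacation"))
```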
Compact translation phrases in the gloss
- (2b) Compact phrase joining recommendation
- If one object language word corresponds to a compact multi-word phrase translation in the gloss, these elements should rather be joined by an underscore “_” than by a period “.”. [4]
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the_day_before_yesterday | am | I | out | a.SG.M.DAT | short.vacation.M.SG.DAT | back.come.PTCP.PRF |
Rather than
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the.day.before.yesterday | am | I | out | a.SG.M.DAT | short.vacation.M.SG.DAT | back.come.PTCP.PRF |
- FAQ
- What is the difference between the case of “vorgestern – the_day_before_yesterday” and “Kurzurlaub – short.vacation”?
- In the case of “Kurzurlaub – short.vacation”, the object language word “Kurzurlaub” actually contains the separate elements “short” (“kurz”) and “vacation” (“Urlaub”) – and only these elements. In the case of “vorgestern – the_day_before_yesterday”, on the other hand, the object language word “vorgestern” does not contain the elements “the”, “day”, “before”, and “yesterday” as four separate units. “The day before yesterday” is rather a fixed combined phrase.
- But “vorgestern” does contain the elements “before” and “yesterday”!
- In the spirit of the Compact Phrase Joining Rule, one may therefore gloss “vorgestern” either as “before.yesterday” or “the_day_before_yesterday”.
Many object language words = one gloss element
- (2c) Standard Joining Rule for object language words
- If two or more object language words correspond to one element in the gloss, these words have to be joined by a punctuation mark. The standard punctuation mark for joining object language words that correspond to one single gloss is the underscore “_” (rather than the period “.”). [5]
- Example
I | came | back | from | a | short | vacation | the_day_before_yesterday |
ich | kam | zurück | von | ein | kurz | Urlaub | vorgestern |
But not:
I | came | back | from | a | short | vacation | the day before yesterday |
ich | kam | zurück | von | ein | kurz | Urlaub | vorgestern |
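Both the compact phrase recommendation (2b) and the joining rule for object language words (2c) replace the blanks of a multi-word unit with underscores. A minimal sketch, with the hypothetical helper name `join_with_underscore`:

```python
def join_with_underscore(phrase):
    """Replace every run of whitespace in a multi-word unit with "_"."""
    return "_".join(phrase.split())


print(join_with_underscore("the day before yesterday"))
```

Deciding whether a unit is a fixed phrase (underscore) or a combination of separate elements (period) remains a judgment for the encoder; the helper only performs the joining.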
Analyzing grammatical categories
- (3) Categories Markup Rule
- Grammatical categories marked on or inherent to the object language word may be analyzed in the gloss. These grammatical categories have to be typeset in small caps (small capital letters), or else – though less elegantly – in normal capital letters.
- For the sake of space, frequent grammatical categories are usually abbreviated. [6]
For common glossing abbreviations, see the Glossing Abbreviations section.
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | ART.INDF.SG.M.DAT | short.vacation.M.SG.DAT | back.come.PTCP.PRF |
Alternatively, one might want to leave some elements unanalyzed:
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the_day_before_yesterday | am | I | out | a.SG.M.DAT | short.vacation.M.SG.DAT | back.come.PTCP.PRF |
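To illustrate how category labels differ from lexical elements in a period-joined gloss, the following sketch sorts the elements of one gloss into the two groups. The label set `KNOWN` is an assumption that covers only abbreviations appearing in the examples of this section, not the full Glossing Abbreviations list. An explicit set is used because capitalization alone is not sufficient: the English gloss “I” is capitalized but is not a category.

```python
import re

# Assumption: only abbreviations that occur in this section's examples.
KNOWN = {"ART", "INDF", "SG", "PL", "M", "F", "NOM", "GEN", "ACC", "DAT",
         "PRS", "PTCP", "PRF"}
PERSON_NUMBER = re.compile(r"^([123])(SG|PL)$")


def classify(gloss):
    """Split a period-joined gloss into lexical elements and categories."""
    lexical, categories = [], []
    for element in gloss.split("."):
        match = PERSON_NUMBER.match(element)
        if match:
            # "1SG" is the PERSON-NUMBER sequence written without a period.
            categories.extend([match.group(1), match.group(2)])
        elif element in KNOWN:
            categories.append(element)
        else:
            lexical.append(element)
    return lexical, categories


print(classify("be.PRS.1SG"))
print(classify("short.vacation.M.SG.DAT"))
```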
Optional expert recommendations
Inflection markup recommendations
Affixes and clitics
- (4a) Affix markup recommendation
- If one object language word contains a clearly and neatly separable affix (suffix or prefix), this affix should be attached to its stem in both the transcription and the gloss by a hyphen “-” (rather than by a period “.”). [7]
- (4b) Clitic morpheme markup recommendation
- If an object language morpheme attaches to another word as a clitic (enclitic or proclitic), this clitic should be attached to its base in both the transcription and the gloss by an equals sign “=” (rather than by a hyphen “-”). [8]
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
Vorgestern | bin | ich | aus | ein-em | Kurz=urlaub | zurück=ge-komm-en. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a-M.DAT | short=vacation.M.SG.DAT | back=PTCP.PRF-come-PTCP.PRF |
Rather than
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a.M.DAT | short.vacation.M.SG.DAT | back.come.PTCP.PRF |
For “einem”, cf. the following paradigm:
Gender, case | Word | Glossing transcription | Gloss |
---|---|---|---|
M, NOM | ein | ein | ART.INDF.SG.M.NOM (or rather ART.INDF.SG[M.NOM], see below) |
M, GEN | eines | ein-es | ART.INDF.SG-M.GEN |
M, ACC | einen | ein-en | ART.INDF.SG-M.ACC |
M, DAT | einem | ein-em | ART.INDF.SG-M.DAT |
F, NOM | eine | ein-e | ART.INDF.SG-F.NOM |
... | ... | ... | ... |
Note that e.g. “bin” cannot neatly be separated into different morphemes. Therefore, all its semantic elements are still fused by periods ‘.’ in the gloss.
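Because hyphens and equals signs appear in the same positions in the transcription and in the gloss, each morpheme can be paired with its gloss mechanically. The following sketch assumes that only “-” and “=” are used as boundary marks and that sentence punctuation at the end of a word can be discarded; the function name `pair_morphemes` is illustrative.

```python
import re

BOUNDARY = re.compile(r"[-=]")


def pair_morphemes(transcription, gloss):
    """Pair each morpheme of a segmented word with its gloss element."""
    word = transcription.rstrip(".,;:!?")  # drop sentence punctuation
    morphs = BOUNDARY.split(word)
    glosses = BOUNDARY.split(gloss)
    if len(morphs) != len(glosses):
        raise ValueError("transcription and gloss have different segment counts")
    return list(zip(morphs, glosses))


for morph, label in pair_morphemes("zurück=ge-komm-en.", "back=PTCP.PRF-come-PTCP.PRF"):
    print(morph, label)
```

Elements that remain fused by a period, such as “PTCP.PRF”, stay together as one gloss element in this pairing.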
Circumfixes and other circum-morphemes
- (4c) Circum-morpheme markup recommendation
- If a split object language morpheme encircles another word from both sides as a circumfix or ‘circum-clitic’, or if two object language words encircle other words, we recommend simply repeating the same gloss for both elements and marking both glosses with the same superscript index. [9]
- Examples
šipram | taštaprī |
šipr-am | ta-štapr-ī |
writing(M)-ACC.SG | 2SG.F1-write.PRF-2SG.F1 |
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
Vorgestern | bin | ich | aus | ein-em | Kurz=urlaub | zurück=ge-komm-en. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a-M.DAT | short=vacation.M.SG.DAT | back=PTCP.PRF1-come-PTCP.PRF1 |
Je | ne | sais | pas | pourquoi. |
1SG | NOT1 | know.PRS.1SG | NOT1 | why |
Infixes
- (4d) Infix markup recommendation
- If one object language word contains a clearly and neatly separable infix, this infix may optionally be marked in both the transcription and the gloss by angle brackets “< >” (rather than by a period “.” or hyphens “-”). In the gloss, the brackets may either follow or precede the gloss of the element containing the infix. [10]
- Example
šipram | taštaprī |
šipr-am | ta-š<ta>pr-ī |
letter(M)-ACC.SG | 2SG.F1-write<PRF>-2SG.F1 |
Reduplication phenomena
- (4e) Reduplication markup recommendation
- Categories that are expressed by a regular reduplication phenomenon in a paradigm may optionally be marked in both the transcription and the gloss with a tilde “~” (rather than with a hyphen “-” or period “.”). [11]
Cf. the following Egyptian paradigm:
Verbal Number | Word | Glossing transliteration | Gloss | Translation |
---|---|---|---|---|
(unmarked) | jrt | jr-t | do.PTCP-F | ‘(she) who does/did’ |
DISTR | jrrt | jr~r-t | do~PTCP.DISTR-F | ‘(she) who (repeatedly, ...) does/used to do’ |
Correct sequential alignment rule
If affixes, clitics, reduplications and/or infixes are marked by “-”, “=”, “~”, and “< >”, respectively, it is mandatory to obey the following rule:
- Correct sequential alignment rule
- The number and sequence of hyphens “-”, equal signs “=”, tildes “~”, and angle brackets “< >” must always be exactly the same in the object language transliteration and the gloss.
Cf. for example:
Correct | Wrong | Wrong | Correct | Wrong |
---|---|---|---|---|
jrrtf | jrrtf | jrrtf | ambulabam | ambulabam |
jr~r-t=f | jrr-t=f | jr~r.t=f | ambula-ba-m | |
do~DISTR.REL-F=3SG.M | do~DISTR.REL-F=3SG.M | do~DISTR.REL-F=3SG.M | walk-IPFV-1SG | walk-IPFV-1SG |
‘what he used to do’ | ‘what he used to do’ | ‘what he used to do’ | ‘I walked’ | ‘I walked’ |
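The correct sequential alignment rule lends itself to an automatic check: extract the boundary marks from both lines in order and compare the two sequences. A minimal sketch, with the hypothetical function names `marker_sequence` and `obeys_sequential_alignment`:

```python
import re

# Hyphen, equals sign, tilde, and both angle brackets.
MARKERS = re.compile(r"[-=~<>]")


def marker_sequence(text):
    """Return the boundary marks of a line in order of appearance."""
    return MARKERS.findall(text)


def obeys_sequential_alignment(transliteration, gloss):
    """True if both lines contain the same marks in the same order."""
    return marker_sequence(transliteration) == marker_sequence(gloss)


gloss = "do~DISTR.REL-F=3SG.M"
for candidate in ["jr~r-t=f", "jrr-t=f", "jr~r.t=f"]:
    print(candidate, obeys_sequential_alignment(candidate, gloss))
```

Periods, colons, backslashes, parentheses, and square brackets are deliberately ignored, since they are not expected to match between the two lines.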
Missing inflection
- (5) Missing inflection markup recommendation
- If one can determine a grammatical category of a word only by the fact that a morpheme (ending, affix, infix, ...) is missing, this category should be attached to the gloss in square brackets “[ ]” (rather than by a period “.” or another mark). [12]
Cf. the following paradigm:
Number, case | Word | Glossing transliteration | Gloss |
---|---|---|---|
SG, NOM | Urlaub | Urlaub | vacation[SG.NOM] (or vacation[SG.NGEN]) |
SG, GEN | Urlaubs | Urlaub-s | vacation-SG.GEN |
SG, ACC | Urlaub | Urlaub | vacation[SG.ACC] (or vacation[SG.NGEN]) |
SG, DAT | Urlaub | Urlaub | vacation[SG.DAT] (or vacation[SG.NGEN]) |
PL, NOM | Urlaube | Urlaub-e | vacation-PL.NOM (or vacation-PL.NOM;GEN) |
PL, GEN | Urlaube | Urlaub-e | vacation-PL.GEN (or vacation-PL.NOM;GEN) |
PL, ACC | Urlaube | Urlaub-e | vacation-PL.ACC |
PL, DAT | Urlauben | Urlaub-en | vacation-PL.DAT |
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
Vorgestern | bin | ich | aus | ein-em | Kurz=urlaub | zurück=ge-komm-en. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a-M.DAT | short=vacation.M[SG.DAT] | back=PTCP.PRF1-come-PTCP.PRF1 |
Rather than
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
Vorgestern | bin | ich | aus | ein-em | Kurz=urlaub | zurück=ge-komm-en. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a-M.DAT | short=vacation.M.SG.DAT | back=PTCP.PRF1-come-PTCP.PRF1 |
Alternatively, one might want to explicitly mark the paradigmatic absence of a morpheme (ending, affix, infix, ...) in the transcription by means of a ‘zero-morpheme’ affix “-ø”. (Cf. the Affix Markup Recommendation above.)
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
Vorgestern | bin | ich | aus | ein-em | Kurz=urlaub-ø | zurück=ge-komm-en. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a-M.DAT | short=vacation.M-SG.DAT | back=PTCP.PRF1-come-PTCP.PRF1 |
Covert, inherent categories
- (6) Inherent categories markup recommendation
- Categories that are never expressed by a morpheme in a paradigm, i.e. categories that are rather inherent to a lexeme, may optionally be attached to the respective gloss in parentheses “( )” (rather than by a period “.” or square brackets “[ ]”). [13]
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
Vorgestern | bin | ich | aus | ein-em | Kurz=urlaub | zurück=ge-komm-en. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a-M.DAT | short=vacation(M)[SG.DAT] | back=PTCP.PRF1-come-PTCP.PRF1 |
Rather than
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
Vorgestern | bin | ich | aus | ein-em | Kurz=urlaub | zurück=ge-komm-en. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a-M.DAT | short=vacation.M[SG.DAT] | back=PTCP.PRF1-come-PTCP.PRF1 |
Caution: This rule may raise some difficult questions, such as whether “ich” should be glossed as “1SG.NOM” or “1SG(NOM)”.
Ablaut phenomena
- (7) Ablaut markup recommendation
- Categories that are expressed by a regular ablaut phenomenon in a paradigm may optionally be attached to the gloss with a backslash “\” (rather than with a period “.” or colon “:”). [14]
Cf. the following paradigm:
Number | Word | Gloss |
---|---|---|
SG | Vater | father(M).SG or better father(M)[SG] |
PL | Väter | father(M):PL or better father(M)\PL |
Root-and-pattern morphology
In some languages, like many Afro-Asiatic languages, one can neatly separate (a) a word root and (b) a vowel pattern, although they are intertwined like two cogwheels.
- (8) Root-and-pattern morphology markup recommendation
- Categories that are expressed by a complex regular vocalic pattern applied to a (consonantal) root in a paradigm may either be marked as an ablaut phenomenon (“\”; see above) or – preferably – left unspecified (“:”; see below).
Cf. the following examples from Akkadian:
šapārum | šapār-um | write:INF-NOM.SG | ‘(to) write; (to) send’ |
ašpur | a-špur | 1SG-write:PST | ‘I sent’ |
ašappar | a-šappar | 1SG-write:IPFV | ‘I send, I will send’ |
aštapar | a-š<ta>par | 1SG-write<PRF> | ‘I have sent’ |
šiprum | šipr-um | writing(M)-NOM.SG | ‘message, writing; work’ |
Leaving inflection type unspecified
- (9) Unspecified inflection markup recommendation
- If one object language word corresponds to two or more elements in the gloss that can in theory be distinguished, but the encoder is unable or unwilling to specify the type of inflection or the morpheme boundary, these elements may be joined (or rather separated) in the gloss by a colon “:” (rather than by a period “.”). [15]
Consequently, the encoder may choose to use the colon “:” instead of any of the other indications of separable morphemes (“-”, “=”, “< >”, “~”, “\”), but not for Portmanteau morphemes. Note that, unlike “-”, “=”, “< >”, and “~”, the colon “:” in the gloss is not supposed to match a colon “:” in the glossing transliteration line. Note also that a separate glossing transcription line is not necessary if the encoder uses only periods “.”, colons “:”, backslashes “\”, parentheses “( )”, and square brackets “[ ]”.
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | ART.INDF.SG:M.DAT | short:vacation.M:SG.DAT | back:come:PTCP.PRF |
Rather than
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a-M.DAT | short=vacation(M)[SG.DAT] | back=PTCP.PRF1-come-PTCP.PRF1 |
The period in the expert mode
- (10) Portmanteau morpheme rule
- If one chooses to mark affixes, clitics, reduplications and/or infixes by “-”, “=”, “~”, and “< >”, respectively, or if one chooses to mark separable morphemes by a colon “:” (leaving the type of inflection unspecified), elements in a gloss should only be joined by a period “.” if they are inseparably fused in the object language word (Portmanteau morpheme).
- Example
Vorgestern | bin | ich | aus | einem | Kurzurlaub | zurückgekommen. |
Vorgestern | bin | ich | aus | ein-em | Kurz=urlaub | zurück=ge-komm-en. |
the_day_before_yesterday | be.PRS.1SG | 1SG.NOM | out | a-M.DAT | short=vacation(M)[SG.DAT] | back=PTCP.PRF1-come-PTCP.PRF1 |
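The division of labour between the punctuation marks in the expert mode can be summarized in a small tokenizer that reports, for every element of a gloss, the mark that introduces it. This is a simplified, flat sketch: the names `RELATION` and `analyze_gloss` are hypothetical, closing brackets are simply skipped, and superscript indices are not handled.

```python
import re

RELATION = {
    "-": "affix boundary",
    "=": "clitic boundary",
    "~": "reduplication",
    "<": "infix",
    "[": "non-overt category",
    "(": "inherent category",
    "\\": "ablaut",
    ":": "unspecified boundary",
    ".": "fused (portmanteau)",
}
# An optional opening mark followed by a run of non-mark characters.
TOKEN = re.compile(r"([-=~<\[(\\:.])?([^-=~<>\[\]()\\:.]+)")


def analyze_gloss(gloss):
    """Return (element, relation) pairs for one gloss, left to right."""
    return [(label, RELATION.get(marker, "first element"))
            for marker, label in TOKEN.findall(gloss)]


for element, relation in analyze_gloss("short=vacation(M)[SG.DAT]"):
    print(element, "->", relation)
```

In this reading, a period is reported as a fusion only because the other marks are available for separable morphemes, which is the point of the Portmanteau morpheme rule.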
References
- ↑ LGR (2008): rule 1.
- ↑ Cf. LGR (2008): rule 4.
- ↑ Cf. LGR (2008): rule 5.
- ↑ LGR (2008): rule 4a.
- ↑ Kutscher & Werning (forthc.): xxv. This rule is not part of the LGR (2008); but cf. LGR (2008): rule 4.
- ↑ Cf. LGR (2008): rule 3.
- ↑ LGR (2008): rule 2.
- ↑ LGR (2008): rule 2.
- ↑ Cf. LGR (2008): rule 8 (there without index) with some alternative suggestions.
- ↑ LGR (2008): rule 9.
- ↑ LGR (2008): rule 10.
- ↑ LGR (2008): rule 6.
- ↑ LGR (2008): rule 7.
- ↑ LGR (2008): rule 4d.
- ↑ LGR (2008): rule 4c.
Bibliography
- Di Biase Dyson, Camilla, Frank Kammerzell & Daniel A. Werning (2009). Glossing Ancient Egyptian. Suggestions for Adapting the Leipzig Glossing Rules. In: Lingua Aegyptia. Journal of Egyptian Language Studies 17: 243–266.
- Kutscher, Silvia & Daniel A. Werning (eds.) (forthc.). On Ancient Grammars of Space: Linguistic Research on the Expression of Spatial Relations and Motion in Ancient Languages, Topoi. Berlin Studies of the Ancient World, Berlin: de Gruyter, ISBN 978-3110311358.
- LGR (2008) = The Leipzig Glossing Rules: Conventions for Interlinear Morpheme-by-Morpheme Glosses, ed. by the Department of Linguistics of the Max Planck Institute for Evolutionary Anthropology (Bernard Comrie, Martin Haspelmath) and by the Department of Linguistics of the University of Leipzig (Balthasar Bickel), http://www.eva.mpg.de/lingua/resources/glossing-rules.php, Leipzig, 12. Sept. 2008.