This is a summary of Arppe, Cox, Hulden et al. (2017) (Arppe_Cox_Hulden_et_al_DLC_2017.pdf) providing some general points for consideration in generalizing the Cree model to other Indigenous languages such as Tsuut'ina (Dene).
Tsuut'ina (and other Dene languages) has disjoint morphology at both the lexical and inflectional levels (like Arabic, but with interdigitation of syllables). The lexical tier consists of a stem and up to three lexical prefixes (potentially disjoint from the stem), which together specify the meaning. Both the stem and any of the lexical prefixes can show allomorphy across the various aspects/moods (see below). The inflectional tier consists of three disjoint slots/chunks that are inserted within the lexical tier. The morphotax is as follows:
LEX(outer) + INFL(outer) + LEX(middle) + INFL(middle) + LEX(inner) + INFL(inner) + STEM
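The interleaving of the two tiers can be sketched in a few lines of Python. This is only an illustration of the slot order (outer, middle, and inner lexical prefixes, each followed by its inflectional slot, then the stem), not the FST itself: the class and function names are hypothetical, and plain concatenation ignores any morphophonology that the real model handles.

```python
from dataclasses import dataclass


@dataclass
class LexicalTier:
    """Stem plus up to three lexical prefixes (hypothetical container)."""
    outer: str = ""
    middle: str = ""
    inner: str = ""
    stem: str = ""


@dataclass
class InflectionalTier:
    """Three inflectional chunks inserted within the lexical tier."""
    outer: str = ""
    middle: str = ""
    inner: str = ""


def interleave(lex: LexicalTier, infl: InflectionalTier) -> list:
    """Return (slot label, morph) pairs in slot order, skipping empty slots."""
    slots = [
        ("LEX(outer)", lex.outer), ("INFL(outer)", infl.outer),
        ("LEX(middle)", lex.middle), ("INFL(middle)", infl.middle),
        ("LEX(inner)", lex.inner), ("INFL(inner)", infl.inner),
        ("STEM", lex.stem),
    ]
    return [(label, morph) for label, morph in slots if morph]


if __name__ == "__main__":
    # Lexical tier of the imperfective of 'jump-down'; the inflectional
    # string is a placeholder, not real Tsuut'ina morphology.
    lex = LexicalTier(outer="nà", middle="gu", inner="di", stem="tłod")
    infl = InflectionalTier(inner="<INFL-inner>")
    for label, morph in interleave(lex, infl):
        print(f"{label:12} {morph}")
```

Keeping the slot labels next to the morphs makes it easy to show the lexical structure alongside an analysis, which is the dictionary-side requirement discussed further down.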
This is implemented as an FST using a regular-looking lemma (the third person imperfective form; N.B. no tense, but rather aspect and mode), an internal representation of the disjoint four-part lexical structure, and some clever but quite simple FST calculus with flag diacritics to put the whole thing together. As a result, inflected forms can be specified as a lemma + features (mostly suffixed now as far as I know, though there could potentially be some dynamic prefixing).
Examples of the lemma + lexical structure are below. The non-alphabetic characters '.', '_', and '=' in the internal representation (for now) mark where the inner, middle, and outer inflectional morphology goes. In addition, a rough gloss of the lexical meaning can be given after the lemma in square brackets [...].
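To make the marker convention concrete, here is a minimal sketch that splits an internal string at the three markers and fills the inflectional slots back in. The internal string `nà=gu_di.tłod` is an assumption built from the lexical prefixes of nàgudiitłod, not a string taken from the actual FST source, and both function names are hypothetical.

```python
# Marker -> inflectional slot that follows the lexical prefix before it.
# Assumed mapping: '=' outer, '_' middle, '.' inner.
MARKERS = {"=": "outer", "_": "middle", ".": "inner"}


def split_internal(internal: str) -> dict:
    """Split an internal form into its lexical prefixes and stem."""
    parts = {"outer": "", "middle": "", "inner": "", "stem": ""}
    buffer = ""
    for ch in internal:
        if ch in MARKERS:
            # Everything read since the previous marker is the lexical
            # prefix that precedes this inflectional slot.
            parts[MARKERS[ch]] = buffer
            buffer = ""
        else:
            buffer += ch
    parts["stem"] = buffer  # whatever follows the last marker
    return parts


def fill_slots(internal: str, outer: str = "", middle: str = "", inner: str = "") -> str:
    """Replace each marker with the given inflectional string (single pass)."""
    table = str.maketrans({"=": outer, "_": middle, ".": inner})
    return internal.translate(table)


if __name__ == "__main__":
    hypothetical = "nà=gu_di.tłod"  # assumed internal form, for illustration
    print(split_internal(hypothetical))
    print(fill_slots(hypothetical, inner="<INFL-inner>"))
```

Using `str.translate` keeps the substitution to a single pass, so an inflectional string that itself contained one of the marker characters would not be re-expanded.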
What the above means is that FST analysis and generation work quite nicely with the following format:
lemma[gloss]+POS+ASPECT+SUBJECT(+OBJECT)
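On the consuming side, an analysis string in this format can be taken apart with a small parser. The sketch below assumes that `+` never occurs inside the lemma and that the tags appear in exactly the order shown; the tag names used in the example (`V`, `Ipfv`, `SbjSg3`) are made up for illustration and are not the FST's actual tag set.

```python
import re
from typing import NamedTuple, Optional


class Analysis(NamedTuple):
    lemma: str
    gloss: Optional[str]
    pos: str
    aspect: str
    subject: str
    obj: Optional[str]


# lemma, optional [gloss], then '+'-separated tags
_PATTERN = re.compile(
    r"^(?P<lemma>[^\[\]+]+)(?:\[(?P<gloss>[^\]]*)\])?\+(?P<tags>.+)$"
)


def parse_analysis(text: str) -> Analysis:
    """Parse lemma[gloss]+POS+ASPECT+SUBJECT(+OBJECT)."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a valid analysis string: {text!r}")
    tags = match.group("tags").split("+")
    if len(tags) not in (3, 4):
        raise ValueError(f"expected 3 or 4 tags, got {len(tags)}: {text!r}")
    pos, aspect, subject = tags[:3]
    obj = tags[3] if len(tags) == 4 else None
    return Analysis(match.group("lemma"), match.group("gloss"),
                    pos, aspect, subject, obj)


if __name__ == "__main__":
    # Hypothetical tags; only the lemma and gloss come from the example entry.
    print(parse_analysis("nàgudiitłod[jump-down]+V+Ipfv+SbjSg3"))
```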
On the other hand, on the Tsuut'ina dictionary side, we would want to be able to link to the lemma and represent the different allomorphs (lexical prefixes + stem) for the different aspects. This structure will be coded using some form of XML-style specification, to be implemented by Chris Cox.
nàgudiitłod 'jump-down'
-> Outer: nà + Middle: gu + Inner: di + Stem: tłod (Imperfective)
-> Outer: nà + Middle: gu + Inner: di + Stem: tłòt (Perfective)
-> Outer: nà + Middle: gu + Inner: di + Stem: tłíł (Progressive)
-> Outer: nàná + Middle: gu + Inner: di + Stem: tłiizh (Repetitive)
-> Outer: nìná + Middle: gu + Inner: di + Stem: tłiizh (Repetitive)
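As a rough illustration of how such an entry could be stored and serialized, the sketch below holds the per-aspect lexical tiers listed above and writes them out as XML. The element and attribute names (`entry`, `lexicalTier`, `outer`, and so on) are placeholders invented here; the actual XML-style specification is still to be defined.

```python
import xml.etree.ElementTree as ET

# Lexical tiers for nàgudiitłod 'jump-down', copied from the listing above.
# Each tuple is (aspect, outer, middle, inner, stem).
ENTRY = {
    "lemma": "nàgudiitłod",
    "gloss": "jump-down",
    "tiers": [
        ("Imperfective", "nà", "gu", "di", "tłod"),
        ("Perfective", "nà", "gu", "di", "tłòt"),
        ("Progressive", "nà", "gu", "di", "tłíł"),
        ("Repetitive", "nàná", "gu", "di", "tłiizh"),
        ("Repetitive", "nìná", "gu", "di", "tłiizh"),
    ],
}


def to_xml(entry: dict) -> ET.Element:
    """Build a hypothetical XML element for one dictionary entry."""
    root = ET.Element("entry", lemma=entry["lemma"], gloss=entry["gloss"])
    for aspect, outer, middle, inner, stem in entry["tiers"]:
        tier = ET.SubElement(root, "lexicalTier", aspect=aspect)
        ET.SubElement(tier, "outer").text = outer
        ET.SubElement(tier, "middle").text = middle
        ET.SubElement(tier, "inner").text = inner
        ET.SubElement(tier, "stem").text = stem
    return root


def tiers_for(entry: dict, aspect: str) -> list:
    """All lexical tiers recorded for one aspect (there may be several)."""
    return [tier for tier in entry["tiers"] if tier[0] == aspect]


if __name__ == "__main__":
    print(ET.tostring(to_xml(ENTRY), encoding="unicode"))
    print(tiers_for(ENTRY, "Repetitive"))
```

Storing the tiers as a list rather than a mapping keyed by aspect keeps both Repetitive variants, which a plain aspect-to-tier dictionary would silently collapse into one.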
For an inflected form, we would like its analysis to show not just the lemma nàgudiitłod, but also that its lexical structure consists of the stem tłod together with the lexical prefixes nà, gu, and di. These lexical prefixes might be given some meaning, similar to preverbs in Cree/Algonquian, but that would likely come from the dictionary source. However, note that the meaning of the lexical prefixes can vary according to morphological context: while nà often indicates a repetitive form, sometimes it does not, so the meanings of the lexical prefixes might need to be determined for each lemma. Conveniently, the stems will allow for the creation of sets of semantically related lemmas. Finally, recall that the lexical tier will vary according to aspect.
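Grouping lemmas by stem, as suggested above, amounts to building an index from stem to lemmas. In the sketch below only the pair nàgudiitłod / tłod comes from the example entry; the other items are placeholders rather than real Tsuut'ina words, and keying on the imperfective stem is an assumption, since the stem itself varies by aspect.

```python
from collections import defaultdict


def group_by_stem(pairs):
    """Map each stem to the lemmas built on it, preserving input order."""
    groups = defaultdict(list)
    for lemma, stem in pairs:
        groups[stem].append(lemma)
    return dict(groups)


if __name__ == "__main__":
    # (lemma, imperfective stem); bracketed items are placeholders.
    pairs = [
        ("nàgudiitłod", "tłod"),
        ("<lemma-2>", "tłod"),
        ("<lemma-3>", "<stem-2>"),
    ]
    for stem, lemmas in group_by_stem(pairs).items():
        print(stem, "->", lemmas)
```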
In sum, the treatment of the FST analysis is relatively straightforward, and whatever is implemented for crk (Plains Cree) should work for srs (Tsuut'ina). On the dictionary content side, however, one should be able to present the structure of the lexical tier (stem + lexical prefixes), as well as provide specific information on the meaning of the lexical prefixes (and the stem), which can vary according to aspect/mode.
Moving UAlbertaALTLab/morphodict#189 here where it belongs:
LEX(outer) + INFL(outer) + LEX(middle) + INFL(middle) + LEX(inner) + INFL(inner) + STEM