Phonotactics and Morphotactics

If you're new to these arts, this is the place to ask "stupid" questions and get directions!
eldin raigmore
korean
korean
Posts: 6352
Joined: 14 Aug 2010 19:38
Location: SouthEast Michigan

Post by eldin raigmore »

Phonotactics
Once you’ve established your language’s phonemic inventory, the next thing to do is to establish its phonotactics.
Syllables
The first thing usually asked about phonotactics is, what is the maximal syllable-skeleton?
Usually this means things like:
* can syllables have onsets?
* must syllables have onsets?
* can onsets be clusters of more than one consonant?
* what’s the longest onset-cluster allowed?
* can syllables have codas?
* must syllables have codas?
* can codas be clusters of more than one consonant?
* what’s the longest coda-cluster allowed?
* can nuclei be clusters of more than one vowel?
* what’s the longest vowel-cluster allowed in a nucleus?

Note that the usual limits are:
* no tautosyllabic consonant-cluster can be longer than four consonants
* if onsets can be four consonants long then codas can’t be more than three consonants long
* if codas can be four consonants long then onsets can’t be more than three consonants long
* nuclei can’t be more than four vowels long.

Two-vowel nuclei are called diphthongs; three-vowel nuclei are called triphthongs; and four-voweled nuclei are called tetraphthongs.
A word can contain vowel-clusters longer than that; but a syllable can’t. If a vowel-cluster is longer than the maximum allowed nucleus-length, a syllable-boundary intervenes between an open (i.e. codaless) syllable and a following onsetless syllable. Such an intervocalic syllable-boundary is called a hiatus.

Likewise a word can contain a consonant-string longer than the longest legal onset-cluster and longer than the longest legal coda-cluster; but if it does, that string must cross a syllable-boundary.

Some diphthongs are phonemes. As far as I know no language has phonemic triphthongs nor longer phonemic polyphthongs.
If a language has diphthongs that doesn’t mean any of them are phonemic.
If a language has no phonemic diphthongs that doesn’t mean it doesn’t have diphthongs.

Sometimes the answers vary depending on where in the word the syllable occurs.
For instance, at one time the French Wikipedia said the maximal syllables of French were as follows:
* (C)(C)(C)V(C)(C)(C) for one-syllable words
* (C)(C)(C)V(C)(C) for first syllables
* (C)(C)V(C)(C)(C) for last syllables
* (C)(C)V(C)(C) for word-internal syllables
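
A skeleton like these can be turned into a pattern and used to test candidate syllable shapes. This is only a minimal sketch: it assumes a syllable has already been reduced to a string of C's and V's, and the position labels ("monosyllable", "first", and so on) and function names are made up for the illustration.

```python
import re

def skeleton_to_regex(skeleton: str) -> re.Pattern:
    """Turn a skeleton such as '(C)(C)V(C)' into a regex over C/V strings."""
    # '(C)' marks an optional consonant, so it becomes 'C?'; bare symbols stay as they are.
    pattern = re.sub(r"\((\w)\)", r"\1?", skeleton)
    return re.compile(pattern)

# The four skeletons quoted above, keyed by hypothetical position labels.
FRENCH = {
    "monosyllable": "(C)(C)(C)V(C)(C)(C)",
    "first": "(C)(C)(C)V(C)(C)",
    "last": "(C)(C)V(C)(C)(C)",
    "internal": "(C)(C)V(C)(C)",
}

def fits(cv_shape: str, position: str) -> bool:
    """True if the C/V shape is allowed by the skeleton for that position."""
    return skeleton_to_regex(FRENCH[position]).fullmatch(cv_shape) is not None

if __name__ == "__main__":
    print(fits("CCCVC", "first"))     # three-consonant onset in a first syllable
    print(fits("CCCVC", "internal"))  # same shape word-internally
```

The same helper works for any skeleton written in the (C)…V…(C) notation, so you can swap in your own language's skeletons.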

Note that some languages have some syllables without vowels; or with consonants in their nuclei.
In general the rules are:
* a consonantal syllable has to be unstressed and so it can’t be a whole word.
* if a nucleus contains a vowel it can’t contain a consonant; if a nucleus contains a consonant it can’t contain a vowel.
* a nucleus can never contain more than one consonant.
* if a syllable has a consonantal nucleus, its onset (if it has one) can’t be a cluster.
* if a syllable has a consonantal nucleus, its coda (if it has one) can’t be a cluster.
* if a syllable has a consonantal nucleus, it can’t have both an onset and a coda.
* if a syllable has a consonantal nucleus, it can’t be longer than two phonemes.

What phonemes can go where
The next question, usually, is which phonemes can be the only, or first, or the last, or some interior, phoneme in a word, or a morpheme, or a syllable, or a syllable-body (the part that’s not the coda), or a syllable-rime (the part that’s not the onset), or a syllable-onset, or a syllable-nucleus, or a syllable-coda.

What phoneme-pairs can go where
The next question, usually, is which pairs of consecutive phonemes can be all of, or the first two, or the last two, or some interior pair, of consecutive phonemes in a word, or a morpheme, or a syllable, or a syllable-body (the part that’s not the coda), or a syllable-rime (the part that’s not the onset), or a syllable-onset, or a syllable-nucleus, or a syllable-coda.

Note that if we’re talking about onsets or codas we’re necessarily talking about pairs of consonants, and if we’re talking about nuclei we’re necessarily talking about pairs of vowels.
If we’re talking about bodies we can discuss CV pairs, and if we’re talking about rimes we can discuss VC pairs.
And of course if we’re discussing words or morphemes or whole syllables we can discuss any kind of phoneme pairs.

Usually anything about which phonemes and/or phoneme-pairs can go where in words is left to be deduced from the answers we’ve given to those same sorts of questions about morphemes and syllables and syllable-parts, or else is handled individually.

What strings of three or more phonemes can go where
Usually anything we want to say about strings of three or more consecutive phonemes is left to be deduced from what we’ve already said above, or is handled individually.
There are only two general rules that are both consistent enough and simple enough to put here.
* if a cluster longer than a single consonant occurs as an onset, then either the one-consonant-shorter cluster obtained by omitting the first consonant, or the one obtained by omitting the last consonant, also occurs as an onset; usually both.
* if a cluster longer than a single consonant occurs as a coda, then either the one-consonant-shorter cluster obtained by omitting the first consonant, or the one obtained by omitting the last consonant, also occurs as a coda; usually both.
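
If you keep your legal onsets (or codas) as a list, the two rules above are easy to check mechanically. The following is a rough sketch, assuming each cluster is stored as a tuple of phoneme symbols; the function name and the sample inventories are invented for the illustration.

```python
def clusters_breaking_rule(clusters):
    """Return every cluster of two or more consonants for which neither
    the cluster minus its first consonant nor the cluster minus its last
    consonant is also in the inventory."""
    inventory = set(clusters)
    bad = []
    for cluster in sorted(inventory):
        if len(cluster) < 2:
            continue  # single consonants are not covered by the rule
        without_first = cluster[1:]
        without_last = cluster[:-1]
        if without_first not in inventory and without_last not in inventory:
            bad.append(cluster)
    return bad

if __name__ == "__main__":
    # Made-up onset inventory that obeys the rule.
    onsets = [("s",), ("t",), ("r",), ("s", "t"), ("t", "r"), ("s", "t", "r")]
    print(clusters_breaking_rule(onsets))
    # Made-up inventory with a three-consonant onset but no two-consonant ones.
    print(clusters_breaking_rule([("p",), ("s", "p", "l")]))
```

This only tests the "either ... or" half of each rule; the "usually both" half is a tendency, so the sketch does not enforce it.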

There are other interesting universals but they’re in my opinion too conditional or too statistical to digress upon here.

==========================================

Morphotactics
Once you’ve decided what morphemes are in your language’s lexicon, the distribution of single morphemes is a lot like the distribution of single phonemes.
Namely, which morphemes can be all of, or the first-but-not-last morpheme of, or the last-but-not-first morpheme of, or some interior or medial morpheme of, a word?

(Questions about pairs of morphemes, which I will talk about later, don’t at all resemble the questions about pairs of phonemes above!)

If we pretend, as a simplification, that inflectional morphology is all affixation, we might get the following.

A morpheme that can be a whole word but can’t be a part of a word is a particle.
Technically a particle is a word that can’t take any inflection. If we leave out tonemes and chronemes and stressemes and infixes and circumfixes and transfixes as forms of inflection (for instance, because our language doesn’t have them, or because the morphology they accomplish is called derivation instead of inflection), then these words which can’t take prefixes or suffixes must be particles.

A morpheme which can be a whole word or the first part of a word or the last part of a word or an interior part of a word, must be a root morpheme.

A morpheme which can be the first or a medial part of a word but can’t be a whole word and can’t be the last part of a word must be a prefix.

A morpheme which can be the last or a medial part of a word but can’t be a whole word and can’t be the first part of a word must be a suffix.

There may be morphemes which can’t be whole words nor last morphemes nor medial morphemes but can be first morphemes. These are prefixes that can only be the first or only prefix in any word they’re part of.

There may be morphemes which can’t be whole words nor first morphemes nor medial morphemes but can be last morphemes. These are suffixes that can only be the last or only suffix in any word they’re part of.

There may be morphemes that can be whole words or the first morpheme of a word, but can’t be the last nor a medial morpheme of any word with other morphemes. These would be roots that can’t take prefixes.

There may be morphemes that can be whole words or last morphemes of words, but can’t be first morphemes or medial morphemes of words that also contain other morphemes. These would be roots that can’t take suffixes.

.....

Any morpheme of any of the seven other positional-types of morphemes is, I’m guessing, pretty rare.
I doubt there are many that can be first and can also be last but can’t be only or can’t be medial.
And I doubt there are many that can be only and can also be medial but can’t be first or can’t be last.
There’s another mathematical possibility, if I haven’t miscounted, but I can’t figure out what it is.

Only or first or last or medial — a root

Only only — particles
First only — prefixes that have to be first
Last only — suffixes that have to be last
*Medial only — *weird type not encountered before

*Only or first or last — *a root that can take prefixes without suffixes or suffixes without prefixes but not both together
*Only or first or medial — *a root that can take suffixes without prefixes but can’t take prefixes unless it also has a suffix
*Only or last or medial — *a root that can take prefixes without suffixes but can’t take suffixes unless it also has a prefix
First or last or medial — an ambifix

Only or first — a root that can take suffixes but not prefixes
Only or last — a root that can take prefixes but not suffixes
*Only or medial — *a root that can take a prefix with a suffix or a suffix with a prefix, but not just a prefix and not just a suffix

*First or last — *an extremal kind of ambifix that has to be the first prefix or the last suffix
First or medial — a prefix
Last or medial — a suffix.

The ones marked with asterisks are the weird ones I’ve never encountered before.
I do not claim they’re impossible.
I also do not claim that all those not so marked are common.
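
The count in that list can be checked by brute force: four positions (only, first, last, medial) give 2^4 - 1 = 15 non-empty combinations. Here is a small sketch that enumerates them and attaches the labels from the list; the shortened label wording and the True/False "starred" flag are my own encoding of the asterisks.

```python
from itertools import combinations

POSITIONS = ("only", "first", "last", "medial")

# (short label, starred?) for each combination, following the list above.
LABELS = {
    frozenset({"only", "first", "last", "medial"}): ("root", False),
    frozenset({"only"}): ("particle", False),
    frozenset({"first"}): ("prefix that has to be first", False),
    frozenset({"last"}): ("suffix that has to be last", False),
    frozenset({"medial"}): ("medial-only type", True),
    frozenset({"only", "first", "last"}): ("root, prefixes or suffixes but not both", True),
    frozenset({"only", "first", "medial"}): ("root, prefix only alongside a suffix", True),
    frozenset({"only", "last", "medial"}): ("root, suffix only alongside a prefix", True),
    frozenset({"first", "last", "medial"}): ("ambifix", False),
    frozenset({"only", "first"}): ("root, suffixes but not prefixes", False),
    frozenset({"only", "last"}): ("root, prefixes but not suffixes", False),
    frozenset({"only", "medial"}): ("root, bare or with both affix types", True),
    frozenset({"first", "last"}): ("extremal ambifix", True),
    frozenset({"first", "medial"}): ("prefix", False),
    frozenset({"last", "medial"}): ("suffix", False),
}

def positional_types():
    """Every non-empty set of positions a morpheme could be allowed in."""
    return [frozenset(combo)
            for size in range(1, len(POSITIONS) + 1)
            for combo in combinations(POSITIONS, size)]

if __name__ == "__main__":
    for ptype in positional_types():
        label, starred = LABELS[ptype]
        mark = "*" if starred else " "
        print(mark, " or ".join(p for p in POSITIONS if p in ptype), ":", label)
```

Every one of the fifteen combinations gets a label, so the list is complete as far as this way of counting goes.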

........................................

Pairs of Morphemes

For any two morphemes in the language, ask the following questions:
* can these two morphemes both co-occur in the same word, or does the presence of one preclude the presence of the other?
* if these two morphemes occur in the same word, must they occur contiguously, or may another morpheme intervene between them?
* if these two morphemes occur in the same word, may they occur contiguously, or must another morpheme intervene between them?
* if these two morphemes occur in the same word, may they occur in either order, or must one always occur before the other?
** if one must occur before the other, which must come before and which must come after?
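
For a language you already have words for, those questions can be answered, at least provisionally, from a word list. A minimal sketch follows, assuming every word is stored as a list of morphemes and that the two morphemes being compared are different; the function name and the English-looking sample words are just illustration data.

```python
def pair_profile(words, a, b):
    """Summarise how two different morphemes a and b behave together.
    Each word is a list of morphemes in order."""
    profile = {
        "co_occur": False,    # both found in some word
        "adjacent": False,    # found next to each other
        "separated": False,   # found with something in between
        "a_before_b": False,
        "b_before_a": False,
    }
    for word in words:
        positions_a = [i for i, m in enumerate(word) if m == a]
        positions_b = [i for i, m in enumerate(word) if m == b]
        for i in positions_a:
            for j in positions_b:
                profile["co_occur"] = True
                if abs(i - j) == 1:
                    profile["adjacent"] = True
                else:
                    profile["separated"] = True
                if i < j:
                    profile["a_before_b"] = True
                else:
                    profile["b_before_a"] = True
    return profile

if __name__ == "__main__":
    words = [["un", "happy", "ness"], ["happy", "ness"], ["un", "happy"]]
    print(pair_profile(words, "un", "ness"))
    print(pair_profile(words, "happy", "ness"))
```

A word list only shows what is attested, so a False here means "not seen", not "forbidden"; for a conlang you would decide the answers rather than discover them.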

Among examples would be a morpheme that only occurs in nouns and another that only occurs in verbs, in a language in which nouns are never verbs and verbs are never nouns. They could never co-occur in the same word.

Or, a prefix and a suffix. They can never occur contiguously, for the root must always intervene between them. Furthermore any word that has both affixes must have the prefix before it has the suffix.

It’s theoretically possible, though I can’t think of an example, that we could have morphemes A and B that can co-occur in the order AB but only contiguously, and could co-occur in the order BXA but only if a string X of one or more other morphemes intervened between them.

It might be possible to divide morphemes into classes, such that, given any class, no two morphemes of that class can co-occur in the same word; and, given any two classes, either every morpheme of the one class can co-occur with every morpheme of the other class, or else no morpheme of the one class could ever co-occur with any morpheme of the other class.

I need to think about that.
Last edited by eldin raigmore on 11 Nov 2020 03:07, edited 1 time in total.

Re: Phonotactics and Morphotactics

Post by eldin raigmore »

My last few remarks were about classifying the morphemes in a found language.

It would be much easier to create the morphemes in classes to begin with, and then describe the morphotactics in terms of these classes.

For instance:
For each part of speech, we could have a regular expression, or finite-state automaton, that would generate some set of strings of morpheme-classes.
Then replace each class in that string with a morpheme from that class.
The morpheme-string thus formed would be a word from that part-of-speech.
No other strings of morphemes than those would be words.
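
One cheap way to try the regular-expression idea is to give each morpheme-class a one-letter code and run an ordinary regex over the string of codes. What follows is only a sketch under assumptions: the class codes (P, R, T), the pattern, and the handful of morphemes are all invented, and it assumes every morpheme belongs to exactly one class.

```python
import re

# Hypothetical lexicon: morpheme -> one-letter class code.
# P = prefix class, R = root class, T = tense/aspect suffix class.
CLASS_OF = {"re-": "P", "walk": "R", "talk": "R", "-ed": "T", "-ing": "T"}

# Hypothetical verb pattern: optional prefix, one root, optional suffix.
VERB_PATTERN = re.compile(r"P?RT?")

def is_word(morphemes, pattern=VERB_PATTERN):
    """True if the class-string of the morphemes matches the pattern."""
    try:
        classes = "".join(CLASS_OF[m] for m in morphemes)
    except KeyError:
        return False  # unknown morpheme, so not a word of this language
    return pattern.fullmatch(classes) is not None

if __name__ == "__main__":
    print(is_word(["re-", "walk", "-ed"]))
    print(is_word(["-ed", "walk"]))
```

With more than a couple of dozen classes, single letters run out, and a hand-written finite-state automaton over class names would be the more natural tool.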
....
For the languages I was thinking of, we could have one chain of distinct morpheme-classes per part-of-speech.
No class would be repeated in any one string, and no two classes would occur in both orders in different strings.
Few of the classes would be mandatory in any given string, but at least one would. The rest would be optional.
Now replace each class in the string by one member of that class, or, provided the class is optional, by nothing.
The morpheme-string so generated would be a word of the part-of-speech specified.
No morpheme-string not so generated would be a word.
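
The chain scheme is simple enough to generate from directly. Here is a small sketch, assuming a chain is a list of (class name, morphemes, mandatory?) entries in their fixed order; the class names and morphemes in the example are placeholders, not a proposal for any actual language.

```python
from itertools import product

def words_from_chain(chain):
    """Yield every morpheme-string the chain allows.
    chain: list of (class_name, morphemes, mandatory) in fixed order."""
    options = []
    for _name, morphemes, mandatory in chain:
        choices = list(morphemes)
        if not mandatory:
            choices.append(None)  # None means the optional class is left out
        options.append(choices)
    for combo in product(*options):
        yield [m for m in combo if m is not None]

if __name__ == "__main__":
    # Hypothetical noun chain: optional negation, mandatory root, optional number.
    noun_chain = [
        ("negation", ["na-"], False),
        ("root", ["kat", "dom"], True),
        ("number", ["-i"], False),
    ]
    for word in words_from_chain(noun_chain):
        print("".join(word))
```

Because each class appears once and in one place, no class is ever repeated and no two classes ever appear in both orders, which is the property described above.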

.....

I’m estimating at most about a hundred parts-of-speech, of which at most about two dozen would be “major”. I don’t know what “major” will mean in this instance; but one thing it could mean is large, open word-classes. I don’t know what “large” would mean either; but “open” would mean word-coinage and word-borrowing are going on synchronically.

...

I’m also estimating at most about a hundred morpheme-classes, of which at most about two dozen could be “major” morpheme-types.
I don’t know what “major” would mean here, either.
But it could mean “is a root morpheme”.
And/or it could mean “might be a component of more than one part-of-speech”.

......

The language so created would have an extremely simple morphology.
Or at least I guess it would.