AMA on Indonesian

Post by **Omzinesý** » 10 Feb 2023 21:15

Creyeditor wrote: ↑26 Aug 2022 15:12 The next post will be on vowels but the one after that will be on syllable structure and or phonotactics.

I'm btw still interested in the topic.

Post by **Creyeditor** » 10 Feb 2023 21:26

Yes, I started writing the post on syllable structure several times, but I find it so baffling in Indonesian. Syllable structure is crucial for stuff like stress and root shapes, yet it is very variable across varieties. Might take some time to make up my mind. But thanks for reminding me.

Post by **Omzinesý** » 10 Feb 2023 21:31

Creyeditor wrote: ↑10 Feb 2023 21:26 Yes, I started writing the post on syllable structure several times, but I find it so baffling in Indonesian. Syllable structure is crucial for stuff like stress and root shapes, yet it is very variable across varieties. Might take some time to make up my mind. But thanks for reminding me.

No stress. Take your time.
Knowing Indonesian syllable structure is not the most urgent thing in my life.

Post by **Creyeditor** » 04 May 2023 23:02

Syllable structure
So, Indonesian syllable structure kind of makes my head explode. Here is why. It varies wildly between speakers, yet it is crucial to understand things like root shapes.

Basic syllable structure
The basic syllable structure varies between dialects from CCCVVCC to CVC. In all dialects, coda consonants can be any consonant except voiced obstruents and palatals. So no syllable ends in <ny c b d j g z> /ɲ tʃ b d dʒ g z/. For palatals this also holds for morphemes, so no morpheme can end in a palatal consonant. Loaned roots can end in voiced obstruents, but these only surface as voiced when a suffix follows. Take for example jawab /dʒawab/ [dʒawap] 'to answer', from Arabic. If you add the concrete nominalizer -an, you get jawaban [dʒawaban] 'answer', with a voiced obstruent.
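The jawab/jawaban alternation can be sketched in a few lines of Python. This is only a minimal illustration: the function name `surface` and the devoicing table are made up here, only the b → p pair is attested in the example above, and the sketch looks at the word-final consonant only.

```python
# Hypothetical devoicing table; only b -> p is attested in the jawab example.
DEVOICE = {"b": "p", "d": "t", "g": "k", "z": "s"}


def surface(underlying: str) -> str:
    """Devoice a word-final voiced obstruent; leave everything else untouched."""
    if underlying and underlying[-1] in DEVOICE:
        return underlying[:-1] + DEVOICE[underlying[-1]]
    return underlying


root = "dʒawab"
print(surface(root))         # bare root: the final /b/ is in coda position
print(surface(root + "an"))  # suffixed: /b/ is no longer word-final, stays voiced
```

The point of the sketch is only that the voiced obstruent is stored in the root and devoiced on the surface, which is why it reappears before -an.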
So far, so easy. Now, the variation with regard to consonant clusters is part of what drives me crazy. It is best illustrated by loanwords like strom 'electric shock' (from Dutch). In Papua Indonesian, consonant clusters are commonly kept and not simplified, so the word is [strom]. This is consistent with clusters in other loanwords like knalpot and knop 'button', but also with native words like kenapa [knapa] 'why' and kenal [knal] 'to know s.o.', where the schwa is often elided compared to other varieties. In more Standard varieties, a single schwa is inserted into strom [sətrom]. These varieties often allow stop+liquid clusters in native words like berapa [brapa] 'how much' and Dutch loanwords such as preman 'gangster' and antri 'to queue'. Not much more is allowed, and schwas break up other consonant clusters in native words such as kenal. Loanwords with such clusters are rarer. Complex coda clusters are also avoided by schwa insertion, e.g. filem from Dutch <film>.
Yet other varieties, including rural dialects on Java, seem to tolerate no consonant clusters inside a syllable at all. This leads to [sətərom] for strom. This also applies to native words, such as berapa [bərapa] 'how much'. Indonesian orthography varies between the second and the third type.
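The three-way pattern for word-initial clusters can be sketched as below. This is a rough illustration under several assumptions: the variety labels ("papuan", "standard", "no_clusters") and the function names are invented here, only word-initial clusters are handled, and "stop+liquid" is the only cluster type the middle variety is assumed to allow.

```python
STOPS = set("pbtdkg")
LIQUIDS = set("lr")
VOWELS = set("aeiouə")


def cluster_ok(c1: str, c2: str, variety: str) -> bool:
    """Is the two-consonant sequence c1+c2 tolerated in this (hypothetical) variety?"""
    if variety == "papuan":        # clusters are kept as they are
        return True
    if variety == "standard":      # only stop+liquid clusters survive
        return c1 in STOPS and c2 in LIQUIDS
    return False                   # "no_clusters": every cluster is broken up


def repair_onset(word: str, variety: str) -> str:
    """Insert a schwa inside the word-initial cluster wherever the variety disallows it."""
    out = []
    i = 0
    while i < len(word) and word[i] not in VOWELS:
        out.append(word[i])
        nxt = word[i + 1] if i + 1 < len(word) else None
        if nxt is not None and nxt not in VOWELS and not cluster_ok(word[i], nxt, variety):
            out.append("ə")
        i += 1
    return "".join(out) + word[i:]


for variety in ("papuan", "standard", "no_clusters"):
    print(variety, repair_onset("strom", variety), repair_onset("brapa", variety))
```

The input forms here are the cluster-maximal ones (strom, brapa, knal); a deletion-based account would start from the schwa-ful forms instead and run the logic in reverse.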
Diphthongs are another issue where there is some variation, but apart from the variation described in the post on vowels, this is mostly notational. Some people argue that diphthongs in open syllables are vowel-glide sequences, others say they are diphthongs. Note that most people generally agree that diphthongs (or vowel-glide sequences) only exist in open syllables, e.g. bau [baw]~[bau] 'smell', and not in closed syllables, e.g. laut [la.ut] 'sea'. To be honest, I don't know what all the fuss is about.

Root shapes
Now why should syllable structure matter at all? As mentioned above, stress or accent can't really be the issue, as it is highly variable as well. The problem is root shapes: native Indonesian roots are generally bisyllabic. Monosyllabic roots are very rare and trisyllabic roots usually include a fossilized affix, especially an infix, e.g. gemetar 'to shake, to shiver', related to getar 'to vibrate, to trill, to shake'. The interesting thing is that this partially applies to loanwords and is therefore at least partially productive. A typewriter is a mesin tik, lit. 'tick machine', but 'to type' is actually ketik, with an epenthetic syllable. Similarly, verbal monosyllabic loan roots are often expanded, so 'to bomb' is mengebom, as if the root was kebom. These loan roots then end up bisyllabic. So, how is that supposed to work if varieties differ wildly in what they consider a licit syllable? I don't know. A recent paper argues that some of these effects might be cumulative and they might be on the right track, but I can't really say more about this. I hope this post was still interesting enough and sorry for the long wait.
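The root-shape repair described in this post can be sketched as follows. This is a toy version with loud assumptions: syllables are counted by counting orthographic vowel letters (so diphthongs would be miscounted), the helper names are invented, and the only prefix rule modelled is the nasal replacing the initial k of the augmented root, which is what "as if the root was kebom" amounts to.

```python
VOWELS = set("aeiou")  # orthographic vowels; <e> covers both /e/ and schwa


def syllable_count(root: str) -> int:
    """Crude syllable count: one syllable per vowel letter (an assumption)."""
    return sum(1 for ch in root if ch in VOWELS)


def augmented_root(root: str) -> str:
    """Pad a monosyllabic root with the dummy syllable ke-; leave longer roots alone."""
    return "ke" + root if syllable_count(root) == 1 else root


def active_verb(root: str) -> str:
    """Active form of a monosyllabic root, built as if the root began with ke-."""
    if syllable_count(root) != 1:
        raise ValueError("this sketch only covers monosyllabic roots")
    # the prefix nasal (spelled ng) takes the place of the initial k of ke-
    return "meng" + augmented_root(root)[1:]


print(augmented_root("tik"))  # padded to two syllables
print(active_verb("bom"))     # verb form built on the padded root
```

Note that the sketch deliberately says nothing about whether the schwa in ke- is pronounced in a given variety, which is exactly the part that is hard to pin down.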

Post by **Omzinesý** » 07 May 2023 20:53

Interesting!

The syllable structure reminds me of different Arabics. Basically CVC but can be very horrible in practice.
The second question is what the schwa is, but I can start with Wikipedia and your post on stressing.

Post by **Creyeditor** » 07 May 2023 21:41

Just to be sure: What do you mean by 'what the schwa is'? Its phonetic quality? Its phonological status? Its IPA symbol?

Post by **Omzinesý** » 09 May 2023 15:44

Creyeditor wrote: ↑07 May 2023 21:41 Just to be sure: What do you mean by 'what the schwa is'? Its phonetic quality? Its phonological status? Its IPA symbol?

I mean its phonological status.
If it is a phoneme, an allophone of some phoneme (in the orthography it seems to be <e>), or just something epenthetic, i.e. phonemically zero, or some combination of them depending on the phonemic context.

Post by **Creyeditor** » 09 May 2023 16:01

I think in structuralist terms it's either a phoneme or epenthetic in Standard Indonesian, depending on the context. It can be considered to be epenthetic if it breaks up otherwise illegal consonant clusters, e.g. kenal [k@nal] to know someone. It also appears in word-initial or word-final position, though, e.g. eke [Ek@] I, me (certain sociolects), elang [@laN] eagle, where it has to be phonemic.

Post by **Salmoneus** » 09 May 2023 19:46

I take it it's not possible to just see it the other way around?

Your description of syllable structure to me sounds like Indonesian is just (archi)phonemically limited to CVC syllables, and that different dialects have different rules to delete schwa in different contexts - particularly as you say that Papuan elides these schwas even when they're not derived from epenthesis in loanwords?

How that would interact with word shape rules depends on, you know, how different dialects deal with these words. Are words like 'knop' and 'strom' augmented by affixes to turn them into bisyllables in all dialects? Or some? Or none? There doesn't seem to be a rule against having MORE than two syllables (a statistical tendency isn't the same thing), so as long as the schwa is still counted as being there there shouldn't be any problems, even if schwa is sometimes elided in speech in some dialects. [elision of schwa is a very common feature of languages]

Or are there words that have pronounced inherited schwa in contexts where epenthetic schwa is elided?

Post by **Creyeditor** » 10 May 2023 09:08

@Sal: I will need some time to think about your post and have a look at some of the literature again. It's an interesting idea and the reason people haven't thought of it might be due to certain linguistic traditions.

Post by **Creyeditor** » 08 Jul 2023 23:49

I thought about it for some time now and I haven't really had an epiphany, but I feel like I won't have one anyway. I suspect that the reason that no one has proposed such an analysis is covert structuralist-generative thinking in most branches of theoretical and descriptive phonology. If a segment A (or more generally any phonological property) is predictable on the surface, then it does not need to be memorized and therefore is not part of the underlying representation. (If you are looking for literature on this, I think Morris Halle's 1959 The Sound Pattern of Russian is the earliest clear example of this in generative phonology, but it's hard to digest, at least for me). At least some schwas are predictable in most dialects, so they must be epenthetic.
Additionally, syllable structure is generally predictable and not contrastive and therefore usually assumed to be absent in underlying forms and present in phonological surface forms (and absent again in phonetic outputs). (This is actually a hotly debated topic in Optimality Theory literature because of, well, certain misunderstandings and covert structuralist-generative thinking in an interactionist framework.) If syllable structure is absent from underlying forms, Indonesian cannot be limited to CVC syllables in underlying forms and therefore schwas are only present if needed for surface syllable structure.

I am not saying that I buy into the argumentation (or that it is coherent or logically solid) at all, I just think that this might be on the right track if one wants to figure out why no linguist has proposed such an analysis before. Of course, this is related to the fact that most people writing about schwas in Indonesian were strongly influenced by structuralist-generative frameworks.

Post by **Salmoneus** » 09 Jul 2023 02:24

EDIT: apologies if this comes across as a little aggressive! My criticisms are directed at the theory you seem to be relating from another source - which does not make sense - rather than at you personally!

Creyeditor wrote: ↑08 Jul 2023 23:49 I thought about it for some time now and I haven't really had an epiphany, but I feel like I won't have one anyway. I suspect that the reason that no one has proposed such an analysis is covert structuralist-generative thinking in most branches of theoretical and descriptive phonology. If a segment A (or more generally any phonological property) is predictable on the surface, then it does not need to be memorized and therefore is not part of the underlying representation.

This is logically incoherent, though.
There is no such thing as, in the abstract, a feature being 'predictable' - one feature can only be predictable from (or relative to) another feature or set of features. But in order for the first feature to be predictable from the second feature (or set of features), there must be a 1:1 equivalence between the two features (or sets of features). Otherwise, there would be cases where one feature would be present without the other, or vice versa, making its appearance not truly predictable from the other. However, equivalence is not directional, by definition: if A and B are in a 1:1 equivalence, then B and A are in a 1:1 equivalence also. That means that if A can be predicted from B, then necessarily B can be predicted from A. [this may not casually seem true if 'B' is incorrectly defined - eg as a single feature rather than a compound feature].

So if you can create an "underlying" (i.e. superimposed) representation (the term 'representation' should be enough to show that the imagery of it being 'underlying' is misleading! Representations do not underlie the things they are representing!) in which all cases of A are deleted, because mentioning A is superfluous if you're mentioning B, then equally you can also create a representation in which all cases of B are deleted instead, and only A is mentioned. There is never only one possible representation. [this is a pretty basic and familiar thought experiment in philosophy of science in general, not just in linguistics - facts always underdetermine theory, and even given infinitely detailed facts there would still be an infinite number of incompatible theories that would equally explain those facts]. It's actually used actively in logic and computing and so forth. A nice example is the way that description of propositions using 'and', 'or' and 'not' can be rewritten with either only the Sheffer stroke OR only the Peirce arrow, and whichever you choose then becomes predictable (as the only possible operator) and can be deleted, leaving only bracketing. So P->Q can be written EITHER as P(QQ) (using and then deleting the stroke) OR as ((PP)Q)((PP)Q) (using and then deleting the arrow). Alternatively if a directional convention for reading the scope of either the stroke or the arrow is used, then the operator can be retained and the brackets, which are now predictable from operator placement, can be elided, so that the former, for instance, could be rewritten PQQ||. All these representations are incompatible with one another, but all of them are equivalent to one another.
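For anyone who wants to check the logic example, here is a small truth-table sketch in Python. The function names are made up for illustration, and reading PQQ|| as postfix notation (operands first, operator after) is an assumption about which "directional convention" is meant.

```python
from itertools import product


def nand(a: bool, b: bool) -> bool:   # Sheffer stroke
    return not (a and b)


def nor(a: bool, b: bool) -> bool:    # Peirce arrow
    return not (a or b)


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


def via_stroke(p: bool, q: bool) -> bool:
    return nand(p, nand(q, q))        # P(QQ)


def via_arrow(p: bool, q: bool) -> bool:
    x = nor(nor(p, p), q)             # (PP)Q
    return nor(x, x)                  # ((PP)Q)((PP)Q)


def eval_postfix(expr: str, env: dict) -> bool:
    """Evaluate a bracket-free postfix string where '|' is the Sheffer stroke."""
    stack = []
    for tok in expr:
        if tok == "|":
            b = stack.pop()
            a = stack.pop()
            stack.append(nand(a, b))
        else:
            stack.append(env[tok])
    return stack.pop()


all_agree = all(
    implies(p, q) == via_stroke(p, q) == via_arrow(p, q)
    == eval_postfix("PQQ||", {"P": p, "Q": q})
    for p, q in product([False, True], repeat=2)
)
print(all_agree)
```

All four notations pick out the same truth function, even though as strings they look nothing alike.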

Or, in linguistic terms: we don't usually mark delayed voicing in vowels as phonemic in English, because it's predictable from consonant aspiration. A representation including aspiration, therefore, doesn't need to mark delayed voicing in vowels. However, it's equally valid to mark delayed voicing in vowels and NOT mark aspiration, because aspiration can be predicted from delayed voicing (in at least some dialects - I'm guessing there are some where it gets complicated with final consonants not followed by a vowel).

So the argument "you shouldn't notate A in your representation because it's predictable from X" is a bad one, because it could equally be used as a reason to notate A and not X.

In this case, you're also assuming that the presence of schwas is a feature. It is equally conceptually valid, however, to say that the absence of schwas is a feature! If you can predict the presence of schwas, you can also predict their absence. It cannot be more inherently wrong, therefore, to not mark the schwas than to mark all schwas and not mark their absence!

[in concrete terms, an example: if only stressed syllables in a language can have vowels, there are (more than) three equally valid and equivalent ways to represent the words [kbat] and [kabt] - you could say that these are /kbat/ and /kabt/; you could say that these are /ka'bat/ and /'kabat/, or you can say that they are /k'bt/ and /'kbt/. In one system, stress is not marked, as it is predictable. In another, stress is marked, and all syllables are shown with vowels; in the third, stress is marked, and all syllables are shown without vowels (in this third system there would have to be an alternative way to mark non-/a/ vowels, of course). The choice between systems is not meaningful or significant - there can definitionally be no truth of the matter as to which is 'correct' or 'underlying'. In a sense, all three systems ARE the SAME system, just with different notational conventions, which purely comes down to the convenience of the individual user!]
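The toy language in the bracketed example can be made concrete with a short Python sketch that converts between the three notations. Everything beyond the four forms given above is an assumption: the only vowel is /a/, the stress mark ' sits directly before the stressed syllable, that syllable has a single-consonant onset, and the function names are invented.

```python
STRESS = "'"
VOWEL = "a"


def full_to_surface(full: str) -> str:
    """/ka'bat/ -> [kbat]: keep only the first vowel after the stress mark."""
    before, after = full.split(STRESS)
    i = after.index(VOWEL)
    kept = after[: i + 1] + after[i + 1:].replace(VOWEL, "")
    return before.replace(VOWEL, "") + kept


def full_to_bare(full: str) -> str:
    """/ka'bat/ -> /k'bt/: drop every vowel, keep the stress mark."""
    return full.replace(VOWEL, "")


def bare_to_surface(bare: str) -> str:
    """/k'bt/ -> [kbat]: put the vowel after the onset consonant of the stressed syllable."""
    before, after = bare.split(STRESS)
    return before + after[0] + VOWEL + after[1:]


def surface_to_bare(surface: str) -> str:
    """[kbat] -> /k'bt/: mark stress before the consonant preceding the vowel, drop the vowel."""
    i = surface.index(VOWEL)
    return surface[: i - 1] + STRESS + surface[i - 1] + surface[i + 1:]


for full in ("ka'bat", "'kabat"):
    surface = full_to_surface(full)
    print(full, full_to_bare(full), surface, surface_to_bare(surface))
```

Each notation can be computed from the others, and the two words stay distinct in all three, which is the sense in which the choice between them is a matter of convention.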

Or, short version: assuming no schwas and creating rules to predict epenthesis is no more valid, no more logical, no more true, and no more 'generative' than assuming many schwas and creating rules to predict syncope. And, indeed, descriptions of countless other languages do include rules for syncope, which often vary depending upon the dialect.

[imagine how awkward it would be to insist on describing English with no schwas at all, and giving each dialect (and sometimes speaker) predictable rules for 'epenthesis'... it would be possible, of course, but utterly counterintuitive and unnecessarily awkward!]

-----------

(If you are looking for literature on this, I think Morris Halle's 1959 The Sound Pattern of Russian is the earliest clear example of this in generative phonology, but it's hard to digest, at least for me). At least some schwas are predictable in most dialects, so they must be epenthetic.

This, with respect, is VERY dodgy reasoning! "Some instances of X are predictable for many speakers" is just a long way of saying "X is NOT predictable". Either it is or isn't predictable - it can't be sometimes or partially predictable!

Additionally, syllable structure is generally predictable and not contrastive and therefore usually assumed to be absent in underlying forms and present in phonological surface forms (and absent again in phonetic outputs). (This is actually a hotly debated topic in Optimality Theory literature because of well, certain misunderstandings and covert structuralist-generative thinking in an interactionist framework.)

In Indonesian, you mean? Because obviously that's not true of, say, English, in which syllable structure is contrastive (as it is in countless languages). Indeed, in English syllable structure is incredibly important, as a huge array of other rules depend on it. That is to say: if we refuse to notate syllable structure, we need to notate a whole bunch of other things instead.

[for instance: normally English is notated with two stop series. But aspiration depends on syllable structure, so without syllables you need three series, not two, in order to distinguish /plVmp.aI/ from /plVm.paI/ (or /p_hlVmpaI/ from /p_hlVmp_haI/, if you insist!). But you'd also have to distinguish length in vowels - or, for some dialects, THREE vowel lengths! - because phonetic vowel length is also determined by syllable structure. So normally we could write /sElf.IS/ and /SEl.fIS/, but without syllable structure you'd need to say /sElfIS/ and /SE:lfIS/, as the latter has a phonetically longer vowel than the former. You'd also, likewise, have to notate two values of /r/, two values of /l/, multiple values of /t/ and /d/ and a length distinction in nasals, and you'd also have to throw a bunch of additional /t/ and /d/ into some words (eg writing 'fence' as /fEnts/, but 'inside' as /InsaId/ - whereas normally we can just show this through syllable structure). That's a whole bunch of things you'd have to add to English phonology - some of which would vary wildly with dialect - when you could just notate syllable structure instead]

[[these examples and this argument are from Wells, fwiw. Wells actually argues that syllable structure itself can be done away with in favour of merely a complicated set of rules based on recognising five phonemic degrees of stress, but I'm not sure that that's an improvement, and that still requires marking morpheme structure independently]]
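The aspiration part of this argument can be sketched in Python. This is a deliberately simplified illustration: it assumes X-SAMPA strings with "." as the syllable boundary, treats every syllable-initial voiceless stop as aspirated (ignoring stress, /s/-clusters and everything else real English does), and the function name is invented.

```python
VOICELESS_STOPS = ("p", "t", "k")


def aspirate(syllabified: str) -> str:
    """Mark syllable-initial voiceless stops with _h and drop the syllable boundaries."""
    out = []
    for syl in syllabified.split("."):
        if syl.startswith(VOICELESS_STOPS):
            syl = syl[0] + "_h" + syl[1:]
        out.append(syl)
    return "".join(out)


print(aspirate("plVmp.aI"))  # boundary after the second p
print(aspirate("plVm.paI"))  # boundary before the second p
```

With the boundaries in the input, two stop series are enough and the aspiration falls out; without them, the _h marks would have to be written into the transcription by hand.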

And again: in all these examples, yes, English syllable structure is predictable from the values of other phonemic contrasts. But equally and oppositely, the values of those phonemic contrasts are predictable from syllable structure. So it makes no sense to say that this means we mustn't notate syllable structure!

If syllable structure is absent from underlying forms, Indonesian cannot be limited to CVC syllables in underlying forms and therefore schwas are only present if needed for surface syllable structure.

But here we're venturing, surely, into realms of the absurd? The idea of limitations in permissible syllable structure is central to the description of countless languages, if not all of them. Immense - and completely pointless - efforts would have to be gone to to explain real languages without such rules.

I find it difficult to believe that even generative linguists would have so little awareness of, or interest in, linguistic realities.

----------------------

To return to the point itself, though, I find myself a little hamstrung because you've still not explained what you see the problem as being in Indonesian. You've given what seems like a straightforward account of schwa elision - which you call the absence of schwa insertion - but then you've said "so how is that supposed to work?"...

...but as I don't know Indonesian, I don't know what you're pointing to when you say 'that'. What actual words are you struggling to explain? Even if you want to talk about the absence of schwa insertion, what exactly is the problem if different dialects possess an absence of schwa insertion in different words?

The three examples you give seem very straightforward: 'bom' and 'tik' violated the word shape rules, so are expanded with dummy syllables to kebom and ketik. [or k.bom and k.tik, if you insist]. Whether the schwa is phonetically elided - or, if you prefer, whether the absence of schwa is phonetically unelided by epenthesis - can then vary depending on the dialect, as elision rules (or epenthesis rules) often do. 'mesin tik' is already more than one syllable (counted as a phonological word), so doesn't need any expansion.

Post by **Salmoneus** » 10 Jul 2023 13:48

I should apologise again - that was probably both rude and unclear. Maybe I should restate my points more concisely!

1: it would be helpful if you gave examples of some of the dialectally variable wordforms that you feel are difficult to explain

2: the argument that if a feature, A, is predictable from another feature, B, or set of features [C,D], then it does not need to be memorised by speakers and therefore is not part of the underlying representation, must be wrong. This is because predictability is symmetrical: if A is predictable from B, then B is predictable from A (when A and B are defined suitably), or at least from a set of features [A, X] containing A. Any time it could be true to say "A does not need to be memorised, and therefore is not part of the underlying representation", it would be equally true to say "B does not need to be memorised, and therefore is not part of the underlying representation". And since both of these cannot be true at the same time, neither of them can be true.

3: the reason for the apparent paradox is of course that the idea of an 'underlying representation' is inaccurate and misleading. Phonemic representations do not underlie reality; they only represent it - a phonemic transcription of a language is for the convenience of linguists. There is no one correct such transcription; preferred transcriptions will vary with the needs and tastes of the linguist who happens to be transcribing. In this case, if A is predictable from B, linguists may omit it from a phonemic representation; but they do not have to, and the result is not the phonemic representation. It would alternatively be possible to instead omit B from their representations, if that were more to their taste (this may be influenced by which transcription is more convenient to type, or which transcription allows more intuitive comparisons to a related language, for instance).

4: to the extent that phonemic analyses may have some reflection in the linguistic processes of speakers - that is, to the extent that speakers themselves are continually subconsciously analysing their own language in a way similar to a linguistic researcher - it is evident that speakers probably individually, and certainly as a whole, operate multiple rival phonemic analyses. After all, another term for 'predictability' is 'redundancy', which is an important design feature of language. If A and B always go together, there is no need for a speaker to think of one as more 'underlying' than the other. It is only when they hear a speaker use A and not B, or vice versa, that they are forced to decide for themselves if this is or isn't an example of the word that normally has both A and B in it - that is, then they must decide which of A and B is more criterial. But there is no reason why a speaker must have a definite and consistent opinion in this regard (especially as in practice sub-phonemic features E, F and G are usually present to provide guidance). There is absolutely no reason why a population as a whole would have a consistent and definite opinion. In fact, we know that they do not, as "re-analysis" of criterial vs incidental features, with opinions that change across time or between different parts of the language community, is essentially how we get language change - how we get languages, in other words. In any case, an investigation of how actual speakers analyse their language is psychological, or at the very least depends on detailed study of how a given speaker responds to abnormal linguistic prompts (i.e. whether they notice deviant wordforms and which correct wordform they interpret them as being instances of) - it could not possibly be deduced from a description of the language itself.

5: in this particular example, it seems as though schwa is explicitly not predictable, in any case, as its presence varies between dialects and seems to be what you are having difficulty explaining (although, to repeat the first point, I'm not entirely clear on what words you're finding hard to explain). We cannot simultaneously argue that A is predictable and thus can be ignored and that it is unpredictable and problematic. It is not enough to say that A is sometimes predictable, since this means that it is sometimes not predictable, and this unpredictable residue could not therefore be factored out in any valid representation.

6: you seem to suggest that the syllabic structure of words should not be transcribed in phonemic descriptions of a language. Obviously this is a matter of personal choice, as I've said. However, it is important to say that refusing to accept syllable structure as phonemic will often require grotesquely baroque and inconvenient phonemic descriptions, requiring a great profusion of phonemes, many of which may have perplexingly varied realisations transdialectically. The basic principles of science and rationality strongly discourage such analyses, in favour of the far simpler analyses that incorporate syllable structure as phonemic, unless there is a very strong motivation why such descriptions would be preferred in a certain paper. I can't see any serious reasons - conceptual or practical - for intentionally preferring the less parsimonious descriptions. And indeed descriptions of English that I've seen have always preferred the simple description (two stop series, no phonemic length, two degrees of stress, syllable structure) rather than the complex one (no syllable structure, three or more stop series, phonemic length on vowels and resonants, five degrees of stress, etc).

...and I see now that the brief restatement is itself foolishly long, so feel free to ignore both posts, I guess. But hopefully this has at least made my thinking clearer to any future passers-by.

Post by **Creyeditor** » 10 Jul 2023 19:54

Just wanted to write a short reply to say that I do not feel personally attacked at all. I don't have the time to write a long and detailed reply now and I don't feel joy defending a theory that I am not convinced by myself. I will write a response to the parts addressed at my presentation of the data but I will not manage any earlier than in two weeks. Just one last tidbit that was implicit in my last post. I personally am convinced by your deletion analysis and I cannot see any good argument speaking against it.

Post by **sevenorbs** » 16 Jul 2023 21:19

Hey, as an L1 speaker of Indonesian, I just want to say that my heart is filled with sprinkles when I see this post. As of the latest chapter, I have some comments and additions.

The elision of ə is characteristic of some Indonesian lects. In Bahasa Baku (formal Indonesian), the ə is often retained. In contrast, I agree that in regions that don't allow some complex C clusters, such as most of Java (where, among others, Javanese, Madurese, and Sundanese are spoken), the insertion of ə is often intensified, even in loanwords. So stroom is [sətrum] (as in the dictionary word setrum, synonymous with listrik "electricity") and knop and knalpotten are [kənop] and [kənalpot]. But, at least where I currently live, the realizations [knapa] and [knal] are afaik not commonly heard; instead the words are often realized as [napa] or simply [kənapa], albeit with a very short ə (but never missing entirely), as the result of the influence of Colloquial Indonesian, mostly of the Jakartan dialect (the loss of the initial syllable with nucleus ə before a liquid, as in bəlum "have not" [blɔm]), or the influence of other languages (e.g. Sundanese [naha] (I don't personally know the etym, but it means the same thing), where the initial unstressed syllable from kənapa (kəna + apa "be affected/touched by what") is missing). In some speakers of Indonesian where ə is not part of the language's phonemic inventory, [ə] may be realized as [ɛ], or as [ə] in learned communities, but is rarely reduced to nothing.

As for diphthongs, PUEBI (the spelling guide issued by the National Board of Indonesian Language) lists 4 diphthongs, [ai] [au] [ei] (in loanwords) [oi] (p. 4), but in colloquial speech the diphthongs are often variably realized and fused(?), as in sə.mai "sprouts" as [sə.me] and da.nau "landlocked body of water" as [da.no]; some have more steps in variability, as in cə.rai "to divorce, to separate" as [cə.rɛj] and ultimately [cə.re], and pan.dai "skillful, clever" as [pan.dɛj] and ultimately as doublet to pandai "master of the work", in some languages [pan.de] "skilled" as in pande besi "metalsmith". In some situations, the diphthong may be immune to fusion or in some cases may be morphed into two separate syllables. In your examples, bau "smell" can be realized as [ba.u] but rarely [baw] and instead fused to [bɔ]. By the way, the ⟨au⟩ in laut "sea" is not a diphthong.

As an addition to the mengebom part: it's an interesting phenomenon, thanks for the article. Ketik is not considered a proper word, but it's really common in everyday usage and often synonymous with the more proper counterpart tik. Other examples are mengecat "to paint", mengembik "to bleat", and mengelas "to weld", but the roots *kecat, *kembik, *kelas are never heard.

Although my knowledge of linguistics is very poor and of course is very far from the polish of the academy, I hope you all don't mind that I enjoy the discussion as well.

Post by **Creyeditor** » 16 Jul 2023 22:56

Wow, thanks for your response. It's a pleasure to hear from an L1 Indonesian conlanger.

And thank you for the additional information, I did not know about mengelas for example. Feel free to jump in anytime if I am talking rubbish, of course. Or if you have anything else to add.

Post by **Creyeditor** » 02 Aug 2023 00:18

Okay, this is the first of the two posts that will hopefully help answer some of the questions raised by Sal.
First, let me make clear my intentions. I am confused by the syllable structure patterns and the associated phonological alternations in Indonesian. I do not have a full cross-dialectal dataset and I doubt that anyone has when it comes to schwa epenthesis. Sal's idea of all schwas being underlying sounds very convincing to me, which I found surprising since there has been some work on the patterns. Therefore, I tried to look for the (historical) reason for the absence of such an analysis. This is what my short post on July 8th was about. I claimed that there are theory-internal reasons why nobody proposed this. I also said that I do not think that these historical and theory-internal reasons are somehow logically solid or convince me in any way. Adding to this now, there is a strong urge in Austronesianist traditions to adhere to the idea that all roots are bisyllabic at some level of representation because this unifies the assumptions made about different Austronesian languages.
I initially planned not to answer the questions posed by Sal on the generative-structuralist tradition here because I do not enjoy defending a point that I am not convinced of. The idea of letting this slide gave me a bit of a belly ache, so I thought I'll just try to be maximally helpful by trying to clarify stuff that I said and linking to publicly available online literature that makes the claims that I said that people make. I hope this is somehow helpful to some people. If you want to ask questions on any of the literature linked, please do so in this thread. There will be a second post where I will try to rectify the ways in which I presented the data. This might include more speculation than I like but I'll try to quantify my confidence on the data. This is really not a university lecture, just anecdotal evidence for the most part.

Salmoneus wrote: ↑09 Jul 2023 02:24
Creyeditor wrote: ↑08 Jul 2023 23:49 I thought about it for some time now and I haven't really had an epiphany, but I feel like I won't have one anyway. I suspect that the reason that no one has proposed such an analysis is covert structuralist-generative thinking in most branches of theoretical and descriptive phonology. If a segment A (or more general any phonological property) is predictable on the surface, then it does not need to be memorized and therefore is not part of the underlying representation.
This is logically incoherent, though.

Yes, I agree.

Salmoneus wrote: ↑09 Jul 2023 02:24 There is no such thing as, in the abstract, a feature being 'predictable' - one feature can only be predictable from (or relative to) another feature or set of features. But in order for the first feature to be predictable from the second feature (or set of features), there must be a 1:1 equivalence between the two features (or sets of features). Otherwise, there would be cases where one feature would be present without the other, or vice versa, making its appearance not truly predictable from the other. However, equivalence is not directional, by definition: if A and B are in a 1:1 equivalence, then B and A are in a 1:1 equivalence also. That means that if A can be predicted from B, then necessarily B can be predicted from A. [this may not casually seem true if 'B' is incorrectly defined - eg as a single feature rather than a compound feature].

There seems to be some sort of misunderstanding here. The basic idea for predictable schwa epenthesis is: certain consonant clusters (and monosyllabic roots in certain contexts) do not exist in Indonesian. They are absent. If we assume that these are repaired by schwa epenthesis, the schwa becomes predictable in these contexts. If we know what the consonant cluster is in the underlying form, we can predict if schwa epenthesis would apply or not. Schwa epenthesis is thus predictable from the underlying consonant sequence you look at. Similarly, in your account you could say that the application or non-application of schwa-deletion is predictable from the consonant cluster that would result.

Salmoneus wrote: ↑09 Jul 2023 02:24 So if you can create an "underlying" (i.e. superimposed) representation (the term 'representation' should be enough to show that the imagery of it being 'underlying' is misleading! Representations do not underlie the things they are representing!) in which all cases of A are deleted, because mentioning A is superfluous if you're mentioning B, then equally you can also create a representation in which all cases of B are deleted instead, and only A is mentioned. There is never only one possible representation. [this is a pretty basic and familiar thought-experiment in philosophy of science in general, not just in linguistics - facts always underdetermine theory, and even given infinitely detailed facts there would still be an infinite number of incompatible theories that would equally explain those facts]. It's actually used actively in logic and computing and so forth. A nice example is the way that description of propositions using 'and', 'or' and 'not' can be rewritten with either only the Sheffer stroke OR only the Peirce arrow, and whichever you choose then becomes predictable (as the only possible operator) and can be deleted, leaving only bracketing. So P->Q can be written EITHER as P(QQ) (using and deleting out the stroke) OR as ((PP)Q)((PP)Q) (using and then deleting the arrow). Alternatively if a directional convention for reading the scope of either the stroke or the arrow is used, then the operator can be retained and the brackets, which are now predictable from operator placement, can be elided, so that the former, for instance, could be rewritten PQQ||. All these representations are incompatible with one another, but all of them are equivalent to one another.
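As a side check on the logic example in the quoted paragraph: the three notations can be compared against P->Q by brute force over the truth table. This is only a small Python sketch; the function names are made up here, and reading PQQ|| as postfix notation is my assumption about the "directional convention" mentioned in the quote.

```python
from itertools import product

def nand(a: bool, b: bool) -> bool:
    """Sheffer stroke: false only when both inputs are true."""
    return not (a and b)

def nor(a: bool, b: bool) -> bool:
    """Peirce arrow: true only when both inputs are false."""
    return not (a or b)

def implies(p: bool, q: bool) -> bool:
    return (not p) or q

def via_stroke(p: bool, q: bool) -> bool:
    # P(QQ) with the deleted operator restored: P | (Q | Q)
    return nand(p, nand(q, q))

def via_arrow(p: bool, q: bool) -> bool:
    # ((PP)Q)((PP)Q) with the deleted operator restored
    left = nor(nor(p, p), q)
    return nor(left, left)

def eval_postfix(tokens: str, env: dict) -> bool:
    # Assumed reading of "PQQ||": postfix, with | as the Sheffer stroke
    stack = []
    for tok in tokens:
        if tok == "|":
            b = stack.pop()
            a = stack.pop()
            stack.append(nand(a, b))
        else:
            stack.append(env[tok])
    return stack.pop()

# Compare all four formulations on every row of the truth table
all_agree = all(
    implies(p, q)
    == via_stroke(p, q)
    == via_arrow(p, q)
    == eval_postfix("PQQ||", {"P": p, "Q": q})
    for p, q in product([False, True], repeat=2)
)
print(all_agree)
```

If the four formulations really are equivalent, `all_agree` comes out true, which is the sense in which the notations differ while the content stays the same.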

Or, in linguistic terms: we don't usually mark delayed voicing in vowels as phonemic in English, because it's predictable from consonant aspiration. A representation including aspiration, therefore, doesn't need to mark delayed voicing in vowels. However, it's equally valid to mark delayed voicing in vowels and NOT mark aspiration, because aspiration can be predicted from delayed voicing (in at least some dialects - I'm guessing there are some where it gets complicated with final consonants not followed by a vowel).

So the argument "you shouldn't notate A in your representation because it's predictable from X" is a bad one, because it could equally be used as a reason to notate A and not X.

In this case, you're also assuming that the presence of schwas is a feature. It is equally conceptually valid, however, to say that the absence of schwas is a feature! If you can predict the presence of schwas, you can also predict their absence. It cannot be more inherently wrong, therefore, to not mark the schwas than to mark all schwas and not mark their absence!

[in concrete terms, an example: if only stressed syllables in a language can have vowels, there are (more than) three equally valid and equivalent ways to represent the words [kbat] and [kabt] - you could say that these are /kbat/ and /kabt/; you could say that these are /ka'bat/ and /'kabat/, or you can say that they are /k'bt/ and /'kbt/. In one system, stress is not marked, as it is predictable. In another, stress is marked, and all syllables are shown with vowels; in the third, stress is marked, and all syllables are shown without vowels (in this third system there would have to be an alternative way to mark non-/a/ vowels, of course). The choice between systems is not meaningful or significant - there can definitionally be no truth of the matter as to which is 'correct' or 'underlying'. In a sense, all three systems ARE the SAME system, just with different notational conventions, which purely comes down to the convenience of the individual user!]

Or, short version: assuming no schwas and creating rules to predict epenthesis is no more valid, no more logical, no more true, and no more 'generative' than assuming many schwas and creating rules to predict syncope. And, indeed, descriptions of countless other languages do include rules for syncope, which often vary depending upon the dialect.

[imagine how awkward it would be to insist on describing English with no schwas at all, and giving each dialect (and sometimes speaker) predictable rules for 'epenthesis'... it would be possible, of course, but utterly counterintuitive and unnecessarily awkward!]

You might be surprised that the idea of having competing analyses be equally 'real' or 'true' within the mental representation of one and the same speaker is a relatively new idea in theoretical phonology (Smolensky & Goldrick 2016). Of course, it is well-known that several representations or analyses can describe the same set of data. But somehow, the solution was usually not to say that all of these are 'equally true'. Instead, people came up with so-called evaluation metrics in order to decide between different analyses. Since these usually talk about the length of all underlying representations in some way or another, they usually favor epenthesis analyses. Here is a recent paper on such evaluation metrics applied to Optimality Theory: Rasin & Katzir 2016. Here is a recent paper that talks about the difficulty of deciding between the two kinds of analyses and reanalyses many patterns as deletion: Morley 2015.
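To make the point about length-based metrics concrete, here is a deliberately crude Python sketch. It is not the metric from Rasin & Katzir or any other paper (those also weigh the grammar itself); it just counts stored segments. The function name and the two toy lexicons are assumptions for illustration, using word forms that come up elsewhere in this thread.

```python
def lexicon_cost(underlying_forms):
    """Toy cost: total number of segments stored across the lexicon."""
    return sum(len(form) for form in underlying_forms)

# Epenthesis analysis: schwas are absent underlyingly and inserted by rule
epenthesis_lexicon = ["strom", "skripsi", "film"]

# Deletion analysis: all schwas are stored and removed by rule where allowed
deletion_lexicon = ["sətərom", "səkəripsi", "filəm"]

epenthesis_cost = lexicon_cost(epenthesis_lexicon)
deletion_cost = lexicon_cost(deletion_lexicon)
print(epenthesis_cost, deletion_cost)
```

Any metric of this shape will come out lower for the schwa-less lexicon, whatever the rules look like, which is the bias I was describing.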

(If you are looking for literature on this, I think Morris Halle's 1959 The Sound Pattern of Russian is the earliest clear example of this in generative phonology, but it's hard to digest, at least for me). At least some schwas are predictable in most dialects, so they must be epenthetic.
This, with respect, is VERY dodgy reasoning! "Some instances of X are predictable for many speakers" is just a long way of saying "X is NOT predictable". Either it is or isn't predictable - it can't be sometimes or partially predictable!

I think this might be a terminology issue again. See my above comment that might clarify things. The other thing is that people often treat different speakers as different languages. I was basing my ideas mainly on Cohn (1989), which is unfortunately behind a paywall. But even other authors who assume that schwas are underlying in some contexts, like Lapoliwa (1981) and Adisasmito (1993), assume that schwas are epenthetic only in some contexts. For Lapoliwa they are only epenthetic if they serve to rectify root minimality violations after certain prefixes and for Adisasmito they are only epenthetic in loanwords. Both assume that schwas are underlying in all other contexts.

Additionally, syllable structure is generally predictable and not contrastive and therefore usually assumed to be absent in underlying forms and present in phonological surface forms (and absent again in phonetic outputs). (This is actually a hotly debated topic in Optimality Theory literature because of well, certain misunderstandings and covert structuralist-generative thinking in an interactionist framework.)
In Indonesian, you mean? Because obviously that's not true of, say, English, in which syllable structure is contrastive (as it is in countless languages). Indeed, in English syllable structure is incredibly important, as a huge array of other rules depend on it. That is to say: if we refuse to notate syllable structure, we need to notate a whole bunch of other things instead.

[for instance: normally English is notated with two stop series. But aspiration depends on syllable structure, so without syllables you need three series, not two, in order to distinguish /plVmp.aI/ from /plVm.paI/ (or /p_hlVmpaI/ from /p_hlVmp_haI/, if you insist!). But you'd also have to distinguish length in vowels - or, for some dialects, THREE vowel lengths! - because phonetic vowel length is also determined by syllable structure. So normally we could write /sElf.IS/ and /SEl.fIS/, but without syllable structure you'd need to say /sElfIS/ and /SE:lfIS/, as the latter has a phonetically longer vowel than the former. You'd also, likewise, have to notate two values of /r/, two values of /l/, multiple values of /t/ and /d/ and a length distinction in nasals, and you'd also have to throw a bunch of additional /t/ and /d/ into some words (eg writing 'fence' as /fEnts/, but 'inside' as /InsaId/ - whereas normally we can just show this through syllable structure). That's a whole bunch of things you'd have to add to English phonology - some of which would vary wildly with dialect - when you could just notate syllable structure instead]

[[these examples and this argument are from Wells, fwiw. Wells actually argues that syllable structure itself can be done away with in favour of merely a complicated set of rules based on recognising five phonemic degrees of stress, but I'm not sure that that's an improvement, and that still requires marking morpheme structure independently]]

And again: in all these examples, yes, English syllable structure is predictable from the values of other phonemic contrasts. But equally and oppositely, the values of those phonemic contrasts are predictable from syllable structure. So it makes no sense to say that this means we mustn't notate syllable structure!

Again, I want to make it very clear that I do believe that syllable structure can be contrastive. For me, the easiest argument comes from reduplications of different sizes. If a language contrasts reduplications of different syllabic sizes these are contrasts that should be encoded as syllabic sizes. But again, you might be surprised that there is a large and vocal majority in theoretical phonology that considers contrastive syllable structure to be absent from all languages. This boils down to an argument similar to yours. In all languages, syllable structure is predictable (in the sense I tried to clarify above) from other phonological contrasts. Segmental contrasts on the other hand can be contrastive independent of syllable structure. If we try to restrict our universal gamut of phonological objects for underlying forms (which is a goal in and of itself in many branches of generative phonology, related to the idea of a Universal Grammar) and we want to kick out either segmental contrasts or syllabic contrasts, we can only kick out syllables. Here is a somewhat older textbook chapter arguing for the absence of underlying syllable structure: Blevins 1995 (argument starts on page 90).

If syllable structure is absent from underlying forms, Indonesian cannot be limited to CVC syllables in underlying forms and therefore schwas are only present if needed for surface syllable structure.
But here we're venturing, surely, into realms of the absurd? The idea of limitations in permissible syllable structure is central to the description of countless languages, if not all of them. Immense - and completely pointless - efforts would have to be gone to to explain real languages without such rules.

I find it difficult to believe that even generative linguists would have so little awareness of, or interest in, linguistic realities.

Again, you might be surprised.

Salmoneus wrote: ↑10 Jul 2023 13:48 2: the argument that if a feature, A, is predictable from another feature, B, or set of features [C,D], and therefore does not need to be memorised by speakers, and therefore is not part of the underlying representation, must be wrong. This is because predictability is symmetrical: if A is predictable from B, then B is predictable from A (when A and B are defined suitably), or at least from a set of features [A, X] containing A. Any time it could be true to say "A does not need to be memorised, and therefore is not part of the underlying representation", it would be equally true to say "B does not need to be memorised, and therefore is not part of the underlying representation". And since both of these cannot be true at the same time, neither of them can be true.

As I said, this might be a terminology issue or my wording was particularly unclear. See also my comments on competing analyses above.

Salmoneus wrote: ↑10 Jul 2023 13:48 3: the reason for the apparent paradox is of course that the idea of an 'underlying representation' is inaccurate and misleading. Phonemic representations do not underlie reality; they only represent it - a phonemic transcription of a language is for the convenience of linguists. There is no one correct such transcription; preferred transcriptions will vary with the needs and tastes of the linguist who happens to be transcribing. In this case, if A is predictable from B, linguists may omit it from a phonemic representation; but they do not have to, and the result is not the phonemic representation. It would alternatively be possible to instead omit B from their representations, if that were more to their taste (this may be influenced by which transcription is more convenient to type, or which transcription allows more intuitive comparisons to a related language, for instance).

Again, I am not stating this as my own belief, but generative phonologists generally believe that some underlying representations (or lexicons of inputs) are 'better models' or 'more mentally real' than other underlying representations. Evaluation metrics are just one way of comparing these but there sure must be others.

Salmoneus wrote: ↑10 Jul 2023 13:48 4: to the extent that phonemic analyses may have some reflection in the linguistic processes of speakers - that is, to the extent that speakers themselves are continually subconsciously analysing their own language in a way similar to a linguistic researcher - it is evident that speakers probably individually, and certainly as a whole, operate multiple rival phonemic analyses. After all, another term for 'predictability' is 'redundancy', which is an important design feature of language. If A and B always go together, there is no need for a speaker to think of one as more 'underlying' than the other. It is only when they hear a speaker use A and not B, or vice versa, that they are forced to decide for themselves if this is or isn't an example of the word that normally has both A and B in it - that is, then they must decide which of A and B is more criterial. But there is no reason why a speaker must have a definite and consistent opinion in this regard (especially as in practice sub-phonemic features E, F and G are usually present to provide guidance). There is absolutely no reason why a population as a whole would have a consistent and definite opinion. In fact, we know that they do not, as "re-analysis" of criterial vs incidental features, with opinions that change across time or between different parts of the language community, is essentially how we get language change - how we get languages, in other words. In any case, an investigation of how actual speakers analyse their language is psychological, or at the very least depends on detailed study of how a given speaker responds to abnormal linguistic prompts (i.e. whether they notice deviant wordforms and which correct wordform they interpret them as being instances of) - it could not possibly be deduced from a description of the language itself.

Again, I am stating someone else's belief here. If evaluation metrics evaluate some kind of quality of an analysis, then they are independent of psycholinguistic evidence. Also, see the Smolensky and Goldrick paper for how recently the idea of a single speaker not being sure about representations has entered the realm of theoretical phonology. Also, I think most linguists would not say that Indonesian dialects have some kind of uniform phonological grammar; instead, most would probably argue that they have potentially different underlying forms and different phonological rules (or constraints or whatever). See the Adisasmito paper cited above for an example of how this might work.

Salmoneus wrote: ↑10 Jul 2023 13:48 5: in this particular example, it seems as though schwa is explicitly not predictable, in any case, as its presence varies between dialects and seems to be what you are having difficulty explaining (although, to repeat the first point, I'm not entirely clear on what words you're finding hard to explain). We cannot simultaneously argue that A is predictable and thus can be ignored and that it is unpredictable and problematic. It is not enough to say that A is sometimes predictable, since this means that it is sometimes not predictable, and this unpredictable residue could not therefore be factored out in any valid representation.

Again, this might be terminology or bad wording on my part. The schwa is not generally predictable but the absence of certain consonant clusters is (and some kind of minimality). The schwa is then inserted as a repair.

Salmoneus wrote: ↑10 Jul 2023 13:48 6: you seem to suggest that the syllabic structure of words should not be transcribed in phonemic descriptions of a language. Obviously this is a matter of personal choice, as I've said. However, it is important to say that refusing to accept syllable structure as phonemic will often require grotesquely baroque and inconvenient phonemic descriptions, requiring a great profusion of phonemes, many of which may have perplexingly varied realisations transdialectically. The basic principles of science and rationality strongly discourage such analyses, in favour of the far simpler analyses that incorporate syllable structure as phonemic, unless there is a very strong motivation why such descriptions would be preferred in a certain paper. I can't see any serious reasons - conceptual or practical - for intentionally preferring the less parsimonious descriptions. And indeed descriptions of English that I've seen have always preferred the simple description (two stop series, no phonemic length, two degrees of stress, syllable structure) rather than the complex one (no syllable structure, three or more stop series, phonemic length on vowels and resonants, five degrees of stress, etc).

I am not suggesting this at all. This is close to putting words in my mouth, but I can see where the confusion is coming from. The scope of my warning was not clear in the July 8th post. I was trying to convey the idea that there are many people that argue in favor of the absence of underlying syllable structure, see the Blevins textbook chapter cited above. Even then, generative phonology is largely based on the idea that there is no intermediate 'phonemic' level between underlying structure (what structuralists called archiphonemic) and surface structure. Don't ask me what the exact difference between underlying and phonemic structure is though, I am still trying to figure that out. But maybe this helps you if you journey into the wild internet in order to try to understand why generative phonologists do what they do.

I hope this was somewhat helpful and not rude. Again, I really tried my best to convey other people's beliefs. I do not usually agree with them but I think it's easier for you to understand them if I try to present them in a somewhat objective way. I purposefully left out some stuff that you (Sal) were commenting on and I will try to come back to it as soon as possible in a second post. RL has been crazy, so it might take some time until I find two hours to compose another long-ish post. You might see me posting shorter stuff across the CBB. This does not mean that I forgot this thread.

Even though I was kind of nervous before composing this post, I am happy with how it turned out and I might kind of enjoy this discussion after all.

Visions1 · Post by **Visions1** » 02 Aug 2023 04:33

Someone here recommended (several years ago...) that I ask about the TAM of Indonesian. So how does that work?
Don't worry about typing too much. I want to know as much as I can.

Post by **Creyeditor** » 02 Aug 2023 22:26

The post after the next post will be about TAM. Thank you for the suggestion. Standard Indonesian TAM is something I wanted to talk about before. Even though there is no morphological marking, the system is relatively complex.

Post by **Creyeditor** » 21 Aug 2023 00:04

Okay, so I will try to write a post that better conveys why Indonesian syllable structure is interesting to me. I think this boils down to two parts:

Syllable structure is highly variable between different speakers, different contexts and different dialects.
I am not sure how minimal word requirements interact with syllable structure.

I also thought I'd provide a table-like thing with words in different dialects to show how much variation there really is. I will refer to the varieties as A, B, C following Adisasmito and say something about what I think about the contexts varieties are used in at the end.

First, onset complexity varies dramatically between the three varieties. In variety A, onsets can be /s/+obstruent+liquid. In variety B, /s/ is not allowed before complex onsets, so you maximally get obstruent+liquid, whereas in variety C only simple onsets are allowed. Note that the variation is attested in loanwords and native words.

A, B, C
stri.ka, sət.ri.ka, sə.tə.ri.ka `iron'
strom, sə.trom, sə.tə.rom `electricity'
skrip.si, sə.krip.si, sə.kə.rip.si `thesis'
brang.kat, bə.rang.kat, bə.rang.kat
cla.ka, cə.la.ka
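The loanword rows in the table above can be stated as a repair rule, roughly sketched in Python below. Everything in it is an assumption for illustration: the function names, working on unsyllabified strings, and treating the last two consonants of an initial cluster as the obstruent+liquid part in variety B. It only covers the loanword rows (strika, strom, skripsi). The native rows do not follow it, since B has a schwa in bə.rang.kat even though obstruent+liquid should be fine there.

```python
VOWELS = set("aeiouə")
SCHWA = "ə"

def split_initial_cluster(word):
    """Split a word into its word-initial consonant cluster and the rest."""
    i = 0
    while i < len(word) and word[i] not in VOWELS:
        i += 1
    return word[:i], word[i:]

def repair_onset(word, variety):
    """Toy schwa epenthesis for word-initial clusters in loanwords."""
    cluster, rest = split_initial_cluster(word)
    if variety == "A" or len(cluster) <= 1:
        # A keeps /s/+obstruent+liquid; single consonants need no repair
        return word
    if variety == "B":
        # Assumption: the last two consonants form a legal obstruent+liquid
        # onset, so only the consonants before them get a schwa
        head, tail = cluster[:-2], cluster[-2:]
        return "".join(c + SCHWA for c in head) + tail + rest
    if variety == "C":
        # Only simple onsets: a schwa after every non-final cluster consonant
        return SCHWA.join(cluster) + rest
    raise ValueError(f"unknown variety: {variety}")

for loan in ["strika", "strom", "skripsi"]:
    print([repair_onset(loan, v) for v in "ABC"])
```

Note that the C branch inserts a schwa after every non-final consonant by brute force. It does not derive that from what codas C allows, so it restates the pattern without explaining it.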

Only variety A allows complex codas, namely a liquid followed by a nasal or a consonant followed by /s/. Neither B nor C allows complex codas, but both allow simple codas.

A, B, C
helm, heləm, heləm `helmet'
film, filəm, filəm `film'
kom.pleks, kom.plek, kom.pə.lek `complex'

This maybe leads to my first problem. If C allows simple codas, why does it insert two schwas inside complex onsets? [sət.ri.ka] should be okay for variety C if it allows CVC syllables. Maybe there is a solution, but I really don't see it now.

Second, the variation described above does not apply to all schwas. Some schwas show up in all three varieties. Note that these words would be well-formed in all three varieties, even without the schwa. If we assume that all schwas are underlying and are deleted between two consonants if this would lead to a valid cluster, these forms are not expected. Similarly, if we assume that schwas are only inserted to break up illegal consonant clusters, these forms are unexpected.

A, B, C
ga.mə.lan, ga.mə.lan, ga.mə.lan `gamelan'
u.pə.ti, u.pə.ti, u.pə.ti `tax'
e.kə, e.kə, e.kə `I, me (slang)'
ə.lang, ə.lang, ə.lang `eagle'
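Why these forms are a problem for a deletion account can be shown with a naive syncope rule, sketched in Python. The rule is my own simplification and not anyone's published analysis: delete a schwa in V C ə C V, so that the result only has a simple coda and a simple onset, which all three varieties allow. The coda restriction (no voiced obstruents or palatals) is taken from the earlier post on basic syllable structure and is checked on single orthographic letters only, so digraphs like ny and ng are ignored.

```python
VOWELS = set("aeiouə")
# Orthographic consonants that cannot close a syllable (digraph "ny" ignored)
BAD_CODAS = set("bdgzjc")

def naive_syncope(word):
    """Delete schwa in V C ə C V when the first C is a possible coda."""
    out = list(word)
    i = 1
    while i < len(out) - 1:
        if (
            out[i] == "ə"
            and i >= 2
            and i + 2 < len(out)
            and out[i - 1] not in VOWELS
            and out[i + 1] not in VOWELS
            and out[i - 2] in VOWELS
            and out[i + 2] in VOWELS
            and out[i - 1] not in BAD_CODAS
        ):
            del out[i]  # stay at the same index and re-check
        else:
            i += 1
    return "".join(out)

for attested in ["gaməlan", "upəti", "ekə", "əlang"]:
    print(attested, "->", naive_syncope(attested))
```

For gaməlan and upəti the rule yields schwa-less forms that are not what the table shows for any variety, so a deletion analysis needs something extra to protect these schwas. The word-initial and word-final schwas in əlang and ekə are simply outside the rule's context.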

Third, and this became much clearer to me after reading the Adisasmito paper again, the tendencies for schwas to surface are opposite in loanwords and native roots. In native roots, higher register and slow speech tempo condition more schwas, whereas in loanwords it's the other way round. Loanwords have more schwas in lower registers and higher speech tempo. Of course, loanword phonology can be different from native phonology, but I think this is still surprising.
Maybe this is also the place to talk about what varieties A, B, and C are. Adisasmito posits that most speakers actually mix the three based on the language-external factors mentioned above. In my own experience, C is found in rural areas of Java, B is close to the standard, and A is what is used in Papua (which is by no means an exhaustive list).

Fourth, how does syllable structure interact with the rest of the phonology? I will ignore stress, since it varies between varieties and generally ignores schwas in most dialects. Root minimality though is interesting in its own right. It seems to hold for certain affixed verbs but not for nouns. bom is a possible noun but its active verb form is mengebom, as mentioned in a previous post. Adisasmito mentions in a footnote that schwas are more likely to occur in otherwise monosyllabic roots, but even for speakers that do not have a schwa here, no problem seems to occur for root minimality. It seems that even though syllable structure varies between different speakers and contexts, the number of syllables is the same for all speakers. Which of course is elegantly solved if we assume that all speakers have access to some representation where all schwas are present, as Sal suggested. I also have to admit that I haven't seen/heard a lot of data that could be relevant here, since Papua Indonesian, my go-to variety for complex clusters without schwas, uses less morphology than the other varieties and so we have fewer forms like mengebom in actual speech.
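The minimality effect for affixed verbs can be put as a tiny Python sketch. The assumptions are loud ones: syllables are counted as vowel letters (so diphthongs would be miscounted), the function names are made up, and for roots with more than one syllable the nasal assimilation of the prefix is left unresolved as an abstract meN-.

```python
VOWELS = set("aeiouə")

def syllable_count(root):
    """Crude syllable count: one syllable per vowel letter."""
    return sum(1 for ch in root if ch in VOWELS)

def active_form(root):
    """Toy active-voice form: monosyllabic roots take the longer prefix."""
    if syllable_count(root) == 1:
        # Root minimality: the extra syllable comes with the prefix
        return "menge" + root
    # Nasal assimilation is out of scope here, so the prefix stays abstract
    return "meN-" + root

print(active_form("bom"))
print(active_form("jawab"))
```

The interesting question from the paragraph above is what `syllable_count` should look at for a speaker who says [knal] for kenal. If it counts surface vowels, the root is monosyllabic, and if it counts a representation with all schwas present, it is not.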

I hope the data presentation was slightly better now and the problems that I had with the data became clearer. If not, feel free to ask more questions of course.
