CALS vs WALS: A Comparison
Hi, I'm PTSnoop. You may vaguely remember me from the previous CALS vs WALS thread almost two years ago.
I originally set out to do a statistical comparison of the conlangs on CALS and the natlangs on WALS, to see what intriguing features of natural languages we conlangers tend to pay more/less attention to. Then I found out that the CALS numbers I was using had natlangs mixed in with them, realised I'd have to redo all my graphs, ran out of round tuits, and got thoroughly distracted by other things. Some time later I noticed I had a whole bunch of Conlangery episodes sitting in my RSS feed reader, and got thoroughly distracted back into the world of conlangs once again.
So here's the much-delayed more-accurate version of the CALS vs WALS study, complete with the two extra years of CALS and WALS data. This time round, I'm armed with Python web scraping and matplotlib, rather than just copy-paste and LibreOffice Calc - so if I find out I'll have to redo all my numbers again, then it should be a much quicker process!
In the graphs below, green represents CALS/conlangs, blue represents WALS/natlangs. The darker-coloured bars represent the difference between the two, showing what percentage of conlangs "should" have this feature and don't, or "shouldn't" have this feature and do (as it were).
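For anyone who wants to reproduce the numbers behind a graph like this, here's a rough sketch of how the two percentage distributions and the "difference" bar could be computed. It's only an illustration: the function names and the toy data are made up, and this isn't necessarily how the actual graphs were generated.

```python
from collections import Counter


def percentages(values):
    """Map each feature value to the percentage of languages that have it."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: 100.0 * n / total for value, n in counts.items()}


def difference(conlang_pct, natlang_pct):
    """Conlang percentage minus natlang percentage, for every value seen in either set."""
    keys = sorted(set(conlang_pct) | set(natlang_pct))
    return {k: conlang_pct.get(k, 0.0) - natlang_pct.get(k, 0.0) for k in keys}


# Made-up toy data, NOT real CALS or WALS figures.
conlangs = ["tonal", "no tones", "no tones", "no tones"]
natlangs = ["tonal", "tonal", "no tones", "no tones"]

diff = difference(percentages(conlangs), percentages(natlangs))
for value, delta in diff.items():
    if delta > 0:
        note = "more common in conlangs"
    elif delta < 0:
        note = "less common in conlangs"
    else:
        note = "equally common"
    print(f"{value}: {delta:+.1f} points ({note})")
```

A positive difference corresponds to a feature that conlangs "shouldn't" have and do; a negative one to a feature they "should" have and don't.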
As before, I'll be splitting things up into multiple posts.
---
PART 1: PHONOLOGY AND MORPHOLOGY
Phonology
Starting off with a surprise - compared to the numbers from two years ago, the phoneme inventory sizes for conlangs and natlangs match up really well.
In general, as I've been looking through these graphs, it seems like phonology's matched up pretty well, morphosyntax reasonably well, and the big statistical differences only really show up on the clause or sentence level questions. I've generally concentrated here on the graphs showing big differences between conlangs and natlangs - if I've missed out a graph, then that's likely to be because conlangs and natlangs have this feature in pretty similar distributions.
Slight tendency here towards large inventories. Possibly the larger more-interesting Indo-European vowel systems are pulling things off balance here.
I'd be tempted to describe this one as a tendency towards being consistent - if you've got voiced plosives, then you'll be voicing the fricatives too. But that wouldn't explain the lack of no-voicing-distinction languages... Maybe people just like voicing.
Here we see more conlangers going for the more "interesting" vowel systems.
Last time, I remember being surprised that the tonal-conlang and tonal-natlang values weren't further apart. Since then, it looks like the values have got even closer - apparently there are a lot more tonal conlangs than I'm aware of. We could still do with some more, though!
Conlangers seem to favour irregular stress patterns here, with "none", "both" and "don't know" all favoured at the expense of the humble trochee. Iambs seem to be holding their own, though.
Unsurprisingly, the big winner here is the conspicuously-English dental fricative /θ/, turning up in over 30% of conlangs surveyed but only 7% of natlangs.
Morphology
Not what I was expecting here. I'd have thought conlangs would generally be straightforwardly concatenative, but apparently here we see the opposite - more isolation and ablaut making things more interesting.
In general, conlangs have fewer categories per verb than natlangs - until we hit the extreme heights of 12-13 per word, where the kitchen-sink conlangs outnumber the natlangs. It seems conlangs are generally tending towards the extremes here.
Here, we see conlangs being neater and more regular than natlangs. We've got an excess of normal straightforward dependent-marking languages, and a lack of "inconsistent or other".
One of the clearest trends so far - conlangs don't really do reduplication anywhere near as much as natlangs do.
Two things to take away from this graph. First, that conlangers like explicit case-marking - possibly trying to be non-English again. Secondly, even if we look only at the languages that do have cases, we can see that the conlangs are weighted much more towards "no syncretism" than the natlangs.
---
Coming soon: Nominal Categories and Nominal Syntax.
Re: CALS vs WALS: A Comparison
PART 2: NOUNS
Nominal Categories
Conlangs are distinctly lacking associative plurals. Apparently I'm not the only person who doesn't really understand what they are...
Here, we see conlangs being generally more interesting than natlangs. We're lacking some straightforward interrogative-based indefinite pronouns, in favour of our own special roots or custom systems.
And similarly for reflexives, with (perhaps) more conlangers preferring new words for new systems instead of already-existing ones.
Here we see conlangs strongly favouring regular ordinal systems ("one, two, three" and "oneth, twoth, threeth"), while natlangs prefer "inconsistent" mixed systems.
Here's that no-reduplication conlang tendency again.
Similarly to the reflexive pronouns, here we're seeing conlangers being more likely to create new words than use preexisting similar ones.
Possibly conlangers prefer not to use affixes to mark possession. Or possibly we're just seeing a bias in the natlang sample data here - the WALS chapter notes "that languages of this sort are proportionally underrepresented on the map; they are much more common than their frequency on the map might suggest".
Nominal Syntax
It seems that conlangers don't like multiple possessive classes. Possibly we're trying to get away from English here, with its "of the" vs "'s".
A lot more natlangs than conlangs are perfectly okay with unmarked bare adjectives. Also, when we do have nouned adjectives, we'd rather have it marked on the adjective itself than have separate dummy-nouns.
And, once again, we see conlangers making new words rather than reusing old ones.
Coming soon: Verbs and Word Order.
Re: CALS vs WALS: A Comparison
Thank you for making this thread, it's quite interesting to get this perspective, especially in such an accessible format!
Re: CALS vs WALS: A Comparison
Aszev wrote: Thank you for making this thread, it's quite interesting to get this perspective, especially in such an accessible format!
i second this, really amazing and helps a lot with avoiding a relex by making me think about grammar more deeply!
eventually i'll work out a good conlang :)
- k1234567890y
- mayan
- Posts: 2402
- Joined: 04 Jan 2014 04:47
- Contact:
Re: CALS vs WALS: A Comparison
Besides trying to make a more "logical" and "ideal" language (which probably explains why conlangers tend to make more new words; another possible reason is that they simply didn't notice the nature of certain words), much of the biased use of features is probably because conlangers tend to use features from the languages of Eurasia, especially European languages, and the languages of Eurasia tend to be dependent-marking and tend to use cases.
Also, it seems unlikely for a person who is neither a linguist nor an anthropologist nor a member of certain ethnic communities to know a non-mainstream language, which could be a reason for the biased use of certain features.
I prefer to not be referred to with masculine pronouns and nouns such as “he/him/his”.
Re: CALS vs WALS: A Comparison
How does WALS count languages?
It needs criteria to distinguish language and dialect.
Some facts may make some unusual features appear common. That includes proto-languages that were successfully spread to a large area, and sprachbunds that have lots of languages.
Another database is Phoible. It tells how common a phoneme is. But it does not show the coexistence of multiple phonemes and how frequent or limited a phoneme is in a language.
I am not a linguist, but it seems that conlangers are more familiar with English or European languages.
The IPA charts inspire conlangers to include more phonemes in their conlang. That explains a large number of conlangs that have front rounded vowels.
Most conlangers think that tones are weird or do not know how tones work; for that reason, tones are not popular.
The th sounds are popular because of the English language.
The chart of size of vowel inventories is very generic. It would be better if it separated each quantity instead of grouping them into intervals. The interval between 7 and 14 has lots of differences.
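To make the point concrete, here is a small sketch of the difference between counting by interval and counting each inventory size separately. The sizes are invented for illustration, and the interval edges are an assumption for this example rather than something taken from the chart itself.

```python
from collections import Counter

# Hypothetical vowel-inventory sizes for a handful of languages (illustrative only).
sizes = [3, 5, 5, 6, 7, 9, 14]


def bin_label(n):
    """Assign an inventory size to a coarse interval (edges assumed for this sketch)."""
    if n <= 4:
        return "2-4"
    if n <= 6:
        return "5-6"
    return "7-14"


binned = Counter(bin_label(n) for n in sizes)  # what an interval-based chart shows
exact = Counter(sizes)                         # one bar per inventory size

print("By interval:", dict(binned))
print("By exact size:", dict(sorted(exact.items())))
```

In the interval view, languages with 7, 9 and 14 vowels all land in the same bar; the per-size view keeps them apart.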
English is not my native language. Sorry for any mistakes or lack of knowledge when I discuss this language.
Re: CALS vs WALS: A Comparison
Thanks for the kind words, guys!
k1234567890y: That's what I'd have expected - and yes, there are plenty of cases explained by "people write what they're used to". But we've got some other cases that - to my mind - are better explained by the opposite, "people try to write what they're not used to", even when the European / English way is actually more widely common (number of cases, for example). Pesky truth, resisting simplicity again.
Squall: yeah, more detailed conlang-natlang phonology comparison would be nice (and I had a go at something similar, back in the day - I expect it's dropped off the internet by now). Something for me to do once I've finished going through the WALS data, I guess!
---
PART 3: VERBS
Verbal Categories
Quite a marked difference here. The majority of natlangs are generally fine with having no grammatical perfective/imperfective distinction, but the majority of conlangs have the distinction marked.
Similarly, lots more natlangs than conlangs don't bother with the past/nonpast distinction.
Natlangs are more likely than not to have some sort of special system for prohibitives. Conlangs prefer to use the same grammatical systems they already have, and keep things regular.
Would that fewer conlangs had an optative!
For "can" and "may", apparently conlangers are less likely to use verbal constructions. Maybe we're trying to get away from English again.
Not entirely surprising here - we're seeing a split between "nothing marked" and "everything marked", with "only indirect evidentials marked" taking a back seat.
Word Order
Most natlangs have Genitive->Noun; most conlangs have Noun->Genitive.
But for adjectives, it's the reverse - natlangs are more likely to be Noun->Adjective, while conlangs are more undecided, with a roughly 50-50 split.
Conlangs are less likely to have demonstratives after the noun, though they seem undecided on what they'd prefer instead.
Similar to adjectives - conlangers would rather put them before the noun, natlangs would rather put them after.
Most natlangs either have question particles at the end of the clause, or just don't have them at all. But conlangers prefer a range of more interesting options - or maybe we just like to know whether we're hearing a question before we get to the end of it.
Natlangs prefer to give people more freedom in where they put their interrogative phrases - but conlangs are more likely to restrict things to the start of the clause, either all of the time or some of the time.
Coming soon: Clauses.
- eldin raigmore
- korean
- Posts: 6356
- Joined: 14 Aug 2010 19:38
- Location: SouthEast Michigan
Re: CALS vs WALS: A Comparison
This is a great thread! Or, at any rate, I'm very interested. Thanks! I don't want to forget to finish reading it.
My minicity is http://gonabebig1day.myminicity.com/xml
Re: CALS vs WALS: A Comparison
PART 4: CLAUSES
The case-marking alignment graphs look similar for noun-phrases and pronouns - strong trend away from "neutral" (no case marking), with the excess split between nom-acc and active systems.
Conlangs here are more likely than natlangs to have subject-position pronouns, either obligatory or optional. For natlangs, subject-marking on the verb is much more common.
Here we see the opposite trend to some of the earlier graphs - natlangs prefer the more complicated polypersonal agreement, while conlangs are perfectly happy to go for the simple boring no-marking option.
And while that tendency towards no-person-marking throws this next graph a bit, there's still another clear trend here - conlangs with person marking are more likely to choose the slightly-off-the-wall "some third person singular verb-markings are zero while others aren't". Why this is the case, I'm not sure.
Conlangs are more likely than not to have a passive voice. Natlangs, on the other hand, are more likely not to have one.
Here, again, we see conlangs tending towards either "none of this feature" or "all of this feature", whereas more natlangs have a more moderate approach.
Conlangs are more likely to give negative clauses the same structure as indicative clauses, eschewing the asymmetrical systems - and even more eschewing the weirder inconsistent systems - that lots of natlangs have.
Natlangs are overwhelmingly more likely to have their negative indefinite pronouns require the clauses to be negated too. Conlangs, on the other hand, shy away from these "double negatives". I blame the Victorian grammarians.
Natlangs here have a whole bunch of different ways to say "he has the book". Conlangs, though, tend to avoid using topic markers or conjunctions for this kind of phrase, preferring a separate verb "has" or some sort of genitive construction.
Conlangers like European-style comparative particles ("than"), at the expense of locational markers or conjoined noun phrases.
Coming soon: Sentences.
- k1234567890y
Re: CALS vs WALS: A Comparison
one thing more to say, some people may not really have read what the corresponding WALS article says when they try to answer the question about a feature for their conlangs in CALS
Re: CALS vs WALS: A Comparison
You mean there is a higher likelihood of people misrepresenting their conlangs' grammar?
- k1234567890y
Re: CALS vs WALS: A Comparison
clawgrip wrote: You mean there is a higher likelihood of people misrepresenting their conlangs' grammar?
such a chance does exist, and school grammar may sometimes be misleading in my opinion, but I don't know the likelihood, and it is possible that most conlangers interpret the choices offered by WALS/CALS accurately and I am wrong.
Re: CALS vs WALS: A Comparison
I think one has got to be pretty sweaty to put one's lang on CALS in the first place, so the risks are likely to be low.
- eldin raigmore
Re: CALS vs WALS: A Comparison
Prinsessa wrote: I think one has got to be pretty sweaty to put one's lang on CALS in the first place, so the risks are likely to be low.
Sorry, but:
(1) what does "sweaty" mean in this context?
(2) risks of what are likely to be low?
Re: CALS vs WALS: A Comparison
eldin raigmore wrote: Sorry, but: (1) what does "sweaty" mean in this context?
Nerdy! I.e. knowledgeable. Seasoned conlanger. WALS isn't interesting to a lot of language lovers and CALS probably isn't interesting to a lot of conlangers except those who would use it themselves. Is my theory. I might be wrong!
eldin raigmore wrote: (2) risks of what are likely to be low?
Of people misunderstanding, which was discussed right above.
Last edited by Prinsessa on 16 May 2015 09:17, edited 1 time in total.
Re: CALS vs WALS: A Comparison
Brilliant! You've put a lot of work into this and it's appreciated.
I'm going through these and answering for my conlang Ngolu. I want to see if I'm Doing My Part for conlangers so I'm going to count how many features are overrepresented versus underrepresented. Not that it matters in any case as I like my lang how it is, but I've had to read a lot of the WALS descriptions to get a lot of it, so it's been a learning experience. Most of it is probably only of interest to me, so I'll hide most of it. There's a couple I have questions on though.
Ngolu:
Periphrastic Causative Constructions
Both (???) - Overrepresented
I think? I read the description on WALS over and over and couldn't quite get my head around it. Is the difference simply whether there is some marker of purpose associated with the effect clause? Also, they say it has to be bi-clausal and then they use examples like (7) which is pretty clearly one clause.
I think (1) represents sequential and (2) represents purposive, yeah?
(1)
Kue ju jo hu nu.
cause he.NOM that.ACC go I.NOM
"He cause(s/d) me to go."
(2)
Go ju kuajo hu nu.
do.something he.NOM that.BEN go I.NOM
"He act(s/ed) so that I (would) go." (The benefactive is also used for purposes.)
The two sentences differ semantically. In (1), you know that I went. In (2) you only know that that was his aim; whether he was successful or not is not mentioned. And if I'm correct in saying (1) is sequential and (2) is purposive, why is English given as having only sequential???
Negative Indefinite Pronouns And Predicate Negation
Predicate negation also present - underrepresented
PTSnoop wrote: Natlangs are overwhelmingly more likely to have their negative indefinite pronouns require the clauses to be negated too. Conlangs, on the other hand, shy away from these "double negatives". I blame the Victorian grammarians.
Ngolu, like many of the languages mentioned, doesn't have specifically negative indefinite pronouns, so there is clause negation without there being a double negative.
Kka kaus u.
not eat NOM.3s.ANIM.NSPC
"Nobody ate." (Literally kind of like "Anybody didn't eat." The -s u can also be dropped.)
Kka kau mu.
not eat NOM.3s.ANIM.SPEC
"Somebody didn't eat." (Literally: "A specific-person didn't eat." With sufficient context, the mu can be dropped.)
Result - slightly more overrepresented than underrepresented. I'm a conlanger.
___________________________________________
In Intensifiers And Reflexive Pronouns, why the hell does Swedish get counted as identical but German not?!?!?!
Jag gjorde det själv.
Ich habe es selbst gemacht.
I did it myself.
Jag hatar mig (själv).
Ich hasse mich (selbst).
I hate myself.
It is much more common to use själv in Swedish than selbst in German, and maybe it's even becoming obligatory in normal Swedish, but even then, you don't say Jag gjorde det *mig själv, so the reflexives mig/dig/sig/oss/er själv(a) are still different from the intensifier själv(a).
EDIT: Added more.
Last edited by Imralu on 21 May 2015 15:16, edited 1 time in total.
Glossing Abbreviations: COMP = comparative, C = complementiser, ACS / ICS = accessible / inaccessible, GDV = gerundive, SPEC / NSPC = specific / non-specific, AG = agent, E = entity (person, animal, thing)
________
MY MUSIC | MY PLANTS
- eldin raigmore
- korean
- Posts: 6356
- Joined: 14 Aug 2010 19:38
- Location: SouthEast Michigan
Re: CALS vs WALS: A Comparison
Prinsessa wrote: I think one has got to be pretty sweaty to put one's lang on CALS in the first place, so the risks are likely to be low.
eldin raigmore wrote: Sorry, but: (1) what does "sweaty" mean in this context?
Prinsessa wrote: Nerdy! I.e. knowledgeable. Seasoned conlanger. WALS isn't interesting to a lot of language lovers and CALS probably isn't interesting to a lot of conlangers except those who would use it themselves. Is my theory. I might be wrong!
Thanks! Now I know.
eldin raigmore wrote: (2) risks of what are likely to be low?
Prinsessa wrote: Of people misunderstanding, which was discussed right above.
Huh. I missed it the first time through. Thanks.
My minicity is http://gonabebig1day.myminicity.com/xml
Re: CALS vs WALS: A Comparison
What would you need to make this easier? I'm the (sole) developer of CALS. A read only REST-API is planned, but not high up on the long TODO.
Re: CALS vs WALS: A Comparison
kaleissin: A CALS REST API would be a very useful thing to have. For me personally, though, I don't think it'd help all that much - I've already written a web scraper to pull all the raw numbers off the website. Also, this is the final set of graphs, so I'm pretty much done for now. Thanks for the offer, though!
---
PART 5: SENTENCES
Complex Sentences
Big trend here: conlangs are quite a lot more likely to use relative pronouns for relative clauses, and less likely to use the "gap strategy" of leaving no overt trace of the head noun inside the relative clause.
Natlangs prefer their purpose clauses to have verb forms different to normal declarative clauses ("deranked"). Conlangs are more likely to use the same verb forms ("balanced").
Lexicon
Conlangs are much more likely to have distinct words for "hand" and "arm". New words for new things.
Conlangs are overwhelmingly more likely than natlangs to have strange and interesting numeral bases. Also, there are a fair few natlangs with restricted numeral systems - systems that don't have numbers much higher than 20 - but we hardly have any conlangs matching this category. This is probably just as well.
Conlangs strongly tend towards having more non-derived colour terms.
This graph shows the colour-word tendency more clearly - a strong bias towards "all the colour terms!"
And the same again here - the overwhelming majority of conlangs have separate words for green and blue, even though grue is significantly more popular among natlangs.
I'm surprised the conlang "other" category isn't larger here, to be honest.
And finally: lots more conlangs don't have para-linguistic clicks. Tut, tut!
---
If anyone's interested in a category that I've not covered here, the complete set of graphs can be found at http://sasha.sector-alpha.net/~ptsnoop/calswals2a/ .
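For anyone wanting to redo this kind of comparison on their own numbers, here is a minimal sketch of how the per-feature percentages and the "difference" bars could be worked out once the raw counts are in hand. It is an illustration only, not the actual script behind these graphs: the function name `compare_feature` is hypothetical, and the counts are placeholders, not real CALS or WALS figures.

```python
def compare_feature(conlang_counts, natlang_counts):
    """Compare one feature's value distribution between conlangs and natlangs.

    Both arguments map a feature value to the number of languages having it.
    Returns {value: (conlang_pct, natlang_pct, difference)}, where difference
    is conlang_pct - natlang_pct in percentage points.
    """
    con_total = sum(conlang_counts.values())
    nat_total = sum(natlang_counts.values())
    if con_total == 0 or nat_total == 0:
        raise ValueError("each dataset needs at least one language")

    result = {}
    # Cover values that appear in only one of the two datasets as well.
    for value in sorted(set(conlang_counts) | set(natlang_counts)):
        con_pct = 100.0 * conlang_counts.get(value, 0) / con_total
        nat_pct = 100.0 * natlang_counts.get(value, 0) / nat_total
        result[value] = (con_pct, nat_pct, con_pct - nat_pct)
    return result


# Placeholder counts, NOT real CALS or WALS numbers.
placeholder_conlangs = {"value A": 30, "value B": 10}
placeholder_natlangs = {"value A": 20, "value B": 20}

for value, (con, nat, diff) in compare_feature(
        placeholder_conlangs, placeholder_natlangs).items():
    print(f"{value}: conlangs {con:.1f}%, natlangs {nat:.1f}%, difference {diff:+.1f}")
```

A positive difference means the value is over-represented among conlangs relative to natlangs, a negative one means it is under-represented. Working in percentages rather than raw counts matters because the two databases hold very different numbers of languages.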
-
- roman
- Posts: 1500
- Joined: 16 May 2015 18:48
Re: CALS vs WALS: A Comparison
When I get my language done it's going to help balance things out a little, due to my strategy of "let's do this just to be normal because I don't really care much about this part of the language". I do have one question that I'll have to be clear about once I have enough done before I put it in CALS: does WALS group epiglottals with pharyngeals? I thought they'd know the difference, but they talk about a "pharyngeal stop" or something like that so I don't know. My language has epiglottal sounds but not actual pharyngeal.
Also, the reason a lot of natlangs don't have a passive voice is because a lot of natlangs aren't nominative-accusative, which means they'll have some other voice like antipassive, not that you have to say every dang thing in active voice. The reason a lot of conlangs require relative pronouns in relative clauses is because a lot of conlangs don't conjugate verbs for person, number, gender, or whatever to show what role the person/thing would have in the relative clause. A lot of these things are correlative even in conlangs, there are just different tendencies in conlangs.
No darkness can harm you if you are guided by your own inner light