Tips on making xenolangs

A forum for guides, lessons and sharing of useful information.
lurker
roman
roman
Posts: 1281
Joined: 28 Jul 2023 14:08
Location: The City of Eternal Noon

Tips on making xenolangs

Post by lurker »

I thought others might benefit from my personal approach to developing Commonthroat's phonology, so here's a post detailing how I did that.

Someone's bound to mention this Artifexian video, and while it does touch on a few things I did, it still takes a primarily articulatory approach and still tries to make the language approximately pronounceable by humans, which for me is a big no-no.

Now in the case of Commonthroat, it's still using an oral medium that humans can interpret, even if they can't reproduce it. It's perfectly possible to come up with a language that uses a medium that humans have no access to at all, like modulated radio waves or releasing pheromones, but if you're working on something that humans can at least perceive, it makes it a little easier.

The first thing to do is to come up with a very high-level qualitative impression of the language. How would a person who is hearing or otherwise perceiving this language for the first time describe how it sounds? For example, Mandarin is chock-full of sibilants and is famously tonal, Russian palatalizes everything, and (American) English is abundantly nasal. I wanted Commonthroat to sound like the noises a dog makes when it's dreaming.

OK, so we have a high-level description. The next question is: how do you come up with a phonology that gives that impression? The video linked above asks "How would a speaker of this language make this particular sound?" complete with IPA-esque charts listing manners and places of articulation. This was my first approach as well. I spent some time googling "dog vocal tract", but didn't get much I could use, especially since I'm neither a veterinarian nor a linguist.

The light bulb moment for me was when I realized that I was asking the wrong question. Instead of asking "How does a dog make a particular sound?" I should ask "what does each phoneme sound like?" without worrying about the anatomy needed to generate those sounds. That's a much simpler question to answer. Instead of slogging through scientific journals that I can't understand, I just had to listen to my dog as he slept, only supplementing that information with a few popular articles on dog vocalization to tie up loose ends.

So, for example, if you're creating a language for sapient neutron stars, you merely need to do some light research into what sorts of things neutron stars do, and ask yourself, at a high level, which of those things you could hammer out into a language. Neutron stars have jets of X-rays that they emit from their poles, and also experience star-quakes, so you could potentially turn those phenomena into a language. The key isn't to ask, "how could a sentient neutron star produce such and such?" but "How would an observer describe the patterns of X-rays the star is emitting?" or similar. No need to worry about the articulatory mechanisms at play.

Back to my doggo, my next step was to listen to the sorts of sounds he made while sleeping. He's quiet to a fault while awake, to the point that I've forgotten about him in the back yard for hours because he wouldn't bark to be let back in, just sit silently at the door. (Don't worry, I've installed a camera pointing at the door so I can tell if he's ready to come in.) Anyway, he may be a mime when awake, but he's extremely vocal while asleep. After a few nights of observation, I came up with the following different noises: whines, yips, growls, and sighs through the nose.

I didn't feel that was quite enough to go on, so I thought about other sounds I've heard dogs make. My first guide dog, a golden retriever, would make these happy grunting noises whenever she greeted a human she recognized. That sounded like it could fit into the overall gamut of sounds without compromising the "dreaming dog" quality of the language, so I decided to add grunts as a category of sound. Of course, yinrih aren't just dogs. They're aliens that happen to look sort of like dogs, so I wasn't strictly limited to only sounds dogs can make. Tigers make a sound called a chuff that I find pleasant, so I decided to add that to the list.

Our bird's-eye view of the phonology now consists of six sounds: whines, growls, grunts, sighs (which I call "huffs"), chuffs, and yips. But six sounds isn't a lot. That's still just over half the size of the smallest phoneme inventory for a human language. (Pirahã and Rotokas, I believe, have something like 10 or 11 phonemes, depending on who's counting.) Herein lies the other aha moment for me. Instead of thinking in terms of atomic segments, think in terms of a feature space.

The first step here is admittedly a bit of a lazy shortcut: I decided to think in terms of syllables, specifically which sounds can serve as syllable nuclei (vowels) and which cannot (consonants). There's no reason why a xenolang would even have the concept of syllables. Indeed, it seems even in human linguistics the concept of a syllable is a "You know it when you see it" kind of thing. Anyway, I decided that huffs, chuffs, and yips shall serve as consonants, and whines, growls, and grunts shall serve as vowels.

Huffs, chuffs, and yips shall therefore be considered atomic, with no internal features beyond a vague qualitative description. Huffs are a sigh through the nose, chuffs are like huffs, but trilled, and yips are quiet little barks. Here's where some people may find my approach a little unsatisfying, especially if you want to produce audio samples of your language. We know what yipping sounds like in general, but at some point an obvious question comes up: how does a yip affect the sounds around it? And how is it affected in turn? This technique doesn't really help answer that question, and I'm left with the somewhat disappointing fact that I honestly can't say how a yip sounds on a technical level.

On to the vowels. We've got three broad vowel qualities, which I call "phonations": whines, growls, and grunts. The vowels are where the concept of a feature space really comes into play. What do I mean by feature space? Think of how you specify colors on a computer. The most common way is to specify how much red, green, and blue a particular color contains. Theoretically, you can define any color by specifying values for these three axes. So we need to think of axes that would define our vowels. Phonation itself can be considered an axis with three values: whine, growl, and grunt. Dogs can also change the pitch and volume of their vocalizations, so we can add two more axes to our feature space: tone (pitch) and strength (volume). You can have as many values on each axis as you like, but I decided to go with a rather coarse two values for each, with high and low tones, and strong and weak volumes (strengths). Remember, we're going for a high level qualitative approach. Don't worry about exactly how high or how loud. In the yinrih's case, they're fairly quiet even at their loudest, so even strong (loud) vowels are quiet by human standards. But is there some other feature we could add as an axis? Of course, length! It's everywhere in human language, and it's trivial to toss it in as a feature, with two values of short and long.

The feature space now has four axes: two lengths (short and long), two tones (low and high), two strengths (weak and strong), and three phonations (whine, growl, and grunt). This gives us a grand total of 2 × 2 × 2 × 3 = 24 vowels, with our three consonants (huffs, chuffs, and yips) pushing us up to a total inventory of 27 phonemes. Not too shabby!
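To make the feature-space idea concrete, here is a minimal Python sketch that enumerates every vowel as one combination of axis values. The axis values come straight from the description above; the variable names are made up for illustration.

```python
from itertools import product

# Axis values as named in the post; the variable names are illustrative.
LENGTHS = ("short", "long")
TONES = ("low", "high")
STRENGTHS = ("weak", "strong")
PHONATIONS = ("whine", "growl", "grunt")
CONSONANTS = ("huff", "chuff", "yip")

# Every vowel is one point in the feature space: one value per axis.
vowels = list(product(LENGTHS, TONES, STRENGTHS, PHONATIONS))

print(len(vowels))                    # 24 vowels (2 * 2 * 2 * 3)
print(len(vowels) + len(CONSONANTS))  # 27 phonemes in total
print(" ".join(vowels[0]))            # short low weak whine
```

Adding a new axis, or a third value on an existing one, is a one-line change, which is much of the appeal of working with features rather than a flat list of segments.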

But as the late Billy Mays would say: "I'M NOT DONE YET!" We've got our phoneme inventory, so it's time to start thinking about phonotactics. Let's circle back to the concept of syllables. Internally, syllables consist of an onset, a nucleus, and a coda. That nucleus need not be a single solitary vowel. We can dramatically increase our syllable count by using diphthongs, or as I call them in Commonthroat, contours. There's no reason you can't just say any two vowels can form a contour. Indeed, there's no reason you have to limit it to two vowels, but I wanted to be able to easily describe qualitatively how a syllable sounds, even if I can't tell you the nitty-gritty of how the sound is generated. I decided to come up with some phonotactic constraints to limit the number of possible contours. You can use whatever criteria you want when coming up with constraints, but my goal was to make it easy to programmatically generate a list of every possible syllable. With that in mind, I decided that there are two rules that govern which vowels can form contours. First, two vowels may not form a contour if they differ only in length. A short low weak growl and a long low weak growl cannot form a contour. Second, the two vowels must have the same phonation type. A short low weak growl and a short high weak growl can form a contour, but a whine and a growl cannot.
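The two contour rules can be sketched as a small predicate over pairs of vowels. This is only an illustration, with vowels written as hypothetical (length, tone, strength, phonation) tuples. It also assumes that a vowel cannot form a contour with itself, which the rules don't say outright but which seems implied by the idea of a contour.

```python
from itertools import product

LENGTHS = ("short", "long")
TONES = ("low", "high")
STRENGTHS = ("weak", "strong")
PHONATIONS = ("whine", "growl", "grunt")
vowels = list(product(LENGTHS, TONES, STRENGTHS, PHONATIONS))

def can_form_contour(first, second):
    """Check a pair of (length, tone, strength, phonation) vowels against the two rules."""
    _, tone1, strength1, phonation1 = first
    _, tone2, strength2, phonation2 = second
    if phonation1 != phonation2:
        return False  # rule 2: both vowels need the same phonation
    # Rule 1: a pair differing only in length is out. Requiring a change in
    # tone or strength also rejects identical pairs (an assumption, see above).
    return tone1 != tone2 or strength1 != strength2

contours = [(a, b) for a in vowels for b in vowels if can_form_contour(a, b)]
print(len(contours))  # 144, i.e. 48 ordered pairs per phonation
```

The three cases from the text come out as expected: the short and long low weak growls are rejected, the low and high short weak growls are accepted, and any whine paired with any growl is rejected.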

Since this process is all about getting the general vibe of how a language sounds, we should probably come up with a concise way of describing contours as well as simple vowels. We have two vowels, and each vowel has four features, one of which (phonation) will always be the same between them. So let's say that if the two vowels have the same value for a particular feature, we can simply describe them like a simple vowel with that feature. If both vowels are short, the contour can simply be described as short. If both vowels are high, the whole contour is high, and so on. If we want to be nitpicky, we could clarify that a long contour is probably quantitatively longer than a long simple vowel, but the key to this approach is to use broad strokes, not get into the phonological weeds.

Contours with different tones are trivial to describe, since human linguistics already has a way of describing them. Low to high is rising, and high to low is falling. It took me a bit to think of simple descriptors for contours of the other axes. I dropped the terms "volume", "quiet", and "loud" for describing the loudness of a vowel because I wanted to maintain the impression that the language is always spoken at a comparatively quiet volume. So the category is called "strength", with quiet instead being called "weak" and loud being called "strong". With new qualitative terms for the strength axis, we can extrapolate words for contours along that axis: weakening and strengthening. But the two vowels of a contour can also have different lengths. This was the hardest axis to describe. Eventually it occurred to me that if the first vowel is short and the second is long, that means that the change from one end of the contour to the other occurs earlier in the syllable. So a contour consisting of a short vowel followed by a long vowel can be called early. A contour consisting of a long vowel followed by a short vowel can thus be called late, since the change from one vowel to the other occurs later in the syllable.

Now we have nice, qualitative descriptions for our simple vowels and contours, their timing, tone, strength, and phonation, from short low weak whine to long high strong grunt, and early falling weakening growl to late rising strengthening whine.
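The naming scheme is regular enough to sketch as a lookup table. In the snippet below, the function and table names are hypothetical and vowels are assumed to be (length, tone, strength, phonation) tuples. A feature that stays the same keeps its plain name, and a feature that changes gets the matching contour term.

```python
# Names for a feature that changes across a contour, keyed by (first, second) value.
CHANGE_NAMES = {
    ("short", "long"): "early",
    ("long", "short"): "late",
    ("low", "high"): "rising",
    ("high", "low"): "falling",
    ("weak", "strong"): "strengthening",
    ("strong", "weak"): "weakening",
}

def describe_contour(first, second):
    """Describe a contour of two (length, tone, strength, phonation) vowels."""
    words = []
    # Compare length, tone, and strength; phonation is handled separately.
    for a, b in zip(first[:3], second[:3]):
        words.append(a if a == b else CHANGE_NAMES[(a, b)])
    words.append(first[3])  # phonation is always shared by both vowels
    return " ".join(words)

print(describe_contour(("short", "high", "strong", "growl"),
                       ("long", "low", "weak", "growl")))
# early falling weakening growl
print(describe_contour(("short", "low", "weak", "growl"),
                       ("short", "high", "weak", "growl")))
# short rising weak growl
```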

Nuclei aren't the only part of a syllable; we still need to think about our consonants--onsets and codas. Since my goal is to keep things simple to program, I've settled on a very simple syllable structure of (C)V(C). Since I can't imagine what a yip would sound like at the end of a syllable, I'll restrict yips to onsets only. So we have three possible onsets (huff, chuff, and yip), with an empty onset bumping it up to four, and two codas (huff and chuff), three counting open syllables.

One quick Python script later and I have a list of every possible syllable in Commonthroat. 2016, it turns out.
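The script itself isn't shown, but a minimal reconstruction from the rules above might look like the following. It assumes that a vowel can't form a contour with itself, and every name in it is made up for illustration. Under that assumption it lands on the same total: 24 simple vowels plus 144 contours gives 168 nuclei, times 4 onsets and 3 codas.

```python
from itertools import product

LENGTHS = ("short", "long")
TONES = ("low", "high")
STRENGTHS = ("weak", "strong")
PHONATIONS = ("whine", "growl", "grunt")

# Simple vowels: one value from each axis.
vowels = list(product(LENGTHS, TONES, STRENGTHS, PHONATIONS))

# Contours: same phonation (index 3) and a change in tone or strength
# (indices 1 and 2), which rules out pairs that differ only in length.
contours = [
    (a, b)
    for a, b in product(vowels, repeat=2)
    if a[3] == b[3] and a[1:3] != b[1:3]
]

nuclei = [(v,) for v in vowels] + contours
ONSETS = (None, "huff", "chuff", "yip")  # None stands for an empty onset
CODAS = (None, "huff", "chuff")          # yips are not allowed as codas

syllables = list(product(ONSETS, nuclei, CODAS))
print(len(nuclei), len(syllables))  # 168 2016
```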

That's our phonology done and dusted. The TL;DR is that you want to think of how a language sounds to the listener, not how it's produced by the speaker. You also want to keep a high-level qualitative view of the phonology--what impression does it give to the listener overall, and you want to think in terms of an abstract space of feature axes that combine to make a phoneme, and not simply limit yourself to atomic segments.
Last edited by lurker on 06 Oct 2024 02:13, edited 2 times in total.

Tips on making xenolangs: lexicon

Post by lurker »

Once I was done figuring out the phonology, and once I had a list of valid syllables, I needed to match those syllables to basic terms. I decided to start with the Swadesh list as a base. I've read many claims about the Swadesh list: that the terms in the list are the least likely to be replaced over time, that the list represents universal concepts that every language will have simple native words for, etc. I'm not concerned with the linguistic validity of these claims, since this is a constructed world, not a dissertation on the real one.

In my case, I decided to go with the angle that the list represents simple concepts that any language spoken by the yinrih would differentiate. However, there's a problem. The Swadesh list was designed with the perfectly reasonable assumption that the languages it would be used to analyze were spoken by humans living on Earth, not a spacefaring civilization of arboreal monkey foxes. So the first thing to do is go through the list and cull terms that wouldn't make sense for the yinrih.

This is also a good time to mention that you should have a few ideas for some high-level features you want your language to have, as this may affect what words on the list are valid for your language as much as the species' environment and biology. In Commonthroat's case, the big feature is the complete and utter lack of pronouns. If you're using the large 207-word list available on Wikipedia, that means that the first 15 entries are no good.

After that we can trundle merrily along until we get to entries 36 to 43 (woman, man, person, child, wife, husband, mother, and father). It is at this point that you need to look deeply within yourself and ask some very important questions--how is babby formed? How girl get pragnent? In other words, what is your speakers' reproductive strategy and lifecycle? Do they use sexual reproduction? If so, do they use a two-gender system like terrestrial vertebrates? Any weird phenomena like parthenogenesis? Do they have extreme sexual dimorphism? Do they even reproduce at all, or are they immortal gods for whom such biological niceties are utterly meaningless? This is going to have a huge impact on your language's vocabulary. If your speakers are asexual sponges that reproduce by budding, then all those entries go out the window. In the yinrih's case, they do have analogs to males and females, so "woman" and "man" can stay.

You may decide to tweak the meaning of an entry rather than remove it outright, so "child" becomes "pup". This is mostly for flavor, as yinrih also refer to human children as "pups"; the change is simply a reminder that the yinrih are canine. The next four entries--wife, husband, mother, and father--also warrant scrutiny. If your culture has no concept of marriage, husband and wife are out the window. Mother and father similarly hang on your species' reproductive strategy. The yinrih do not have a sex drive, and consequently don't have a concept of marriage. A litter of pups can also be the product of up to twelve genetic parents. With that in mind, we need to axe "husband" and "wife" altogether. I also decided to tweak the meanings of "mother" and "father", as a pup usually has more than one of each, so the terms become "dam" and "sire", respectively. We'll have to circle back to family life later, but let's move on for now.

The next several entries (44 through 60) have to do with flora and fauna. Here you're going to have to put some thought into the ecology of your conworld. You can get really creative with things like aeroplankton and giant tardigrades, or you could be lazy like me and dust off the uncreative conworlder's favorite tool: Convergent Evolution! Does it really make sense for things like fish, birds, dogs, and lice to have emerged on a completely different planet? Probably not, but it's an easy way to make your speakers more relatable by giving them a familiar environment. In the case of Yih, I decided to chuck the terms for fauna (except for the generic "animal"), but keep all the terms for plants. You can't be arboreal if you don't have trees, after all.

The next entry, 61 "rope", forces us to ask if our critters are tool-users, and if so, is rope a thing? In the yinrih's case, the answer is yes, so rope gets to stay.

Entries 62 through 91 are all parts of the body, human or otherwise, so we need to think about our speakers' anatomy, and should you be so inclined, to circle back to the fauna of your conworld to come up with some non-sophont body parts that you deem important enough to include. Going down the list, everything seems reasonable enough until we get to "egg". Is oviparity present in your world? If so, are your speakers themselves oviparous? The yinrih are, so while "egg" gets a direct translation, it has much weightier cultural connotations. Male yinrih also lay eggs, since their reproductive strategy is somewhat like broadcast spawning, but I decided not to differentiate between male and female eggs in common speech, even though medically they're very different. This implies that, to the layman, the two types of eggs at least look superficially similar enough to share a common word.

Here is where I had to come up with a term from whole cloth. Yinrih aren't just oviparous, they're "exovoviviparous". After they lay their eggs, they gather the male and female eggs together in a safe place. Once the eggs are together, a protective membrane forms over the clutch and grows into what I can only describe as an external uterus. This is called a "womb-nest", and is vital enough that a simple, single morpheme word ought to exist for it. You may need to make similar additions for your own list as the need arises.

Continuing on, we eventually reach a cluster of nonhuman body parts: tail, horn, and feather. Yinrih themselves have tails, so the word gets to stay. I decided to exclude horn and feather, but you will need to decide based on your speakers' anatomy and that of other relevant animals in your world whether these terms stay or go, or whether more need to be added.

"Hair" gets a slightly expanded meaning, as the yinrih are covered in fur, so the word now becomes "pelage" or "coat". "Head", "eye" and "ear" translate more or less exactly, but here's an instance where I decided the yinrih would make a distinction not present in normal human languages. Like the canids that inspired them, yinrih have muzzles and wet noses. They don't have a single term for "nose", but a word for "muzzle", which includes the jaw and lips, and a term for "rhinarium", the wet tip of the nose.

"Mouth", "tooth", and "tongue" stay as-is, although since the tongue doesn't move to modulate yinrih speech, it isn't associated with language. Instead, I inserted the word "throat", since it's almost solely responsible for producing their vocalizations, it gets the honor of being associated with speech, with the word doubling as "language" in a similar way to how "tongue" does in many human cultures. This is why the language is called Commonthroat. If your speakers use a signed language, the word for "hand" or similar may serve the same function.

Moving on, "fingernail" becomes "claw". "Foot" and "hand" merge to become "paw", as the yinrih use all four paws for both locomotion and manipulation. Both their front and rear paws look very hand-like; they are monkey foxes, after all. "Leg" now refers to any of the four legs, and "knee" becomes the more general "joint". While not part of the original list or the list of basic words I derived from it, "finger" and "toe" also merge to become "digit", and "palm" and "sole" are both referred to as "palms".

"Wing" gets the axe, as I didn't think it essential enough to include. Once we get to "breast", we need to circle back to reproduction. Are your speakers mammals? While the yinrih do produce milk, they sweat it out like monotremes. Yinrih milk is produced from a patch of skin located on the forepaws of the female. It looks like an undifferentiated patch of bare skin, which is translated as "lactation patch". Here I add another basic term, "ink", as the yinrih produce a musky blue-black excretion from one of their claws that once served for scent marking, but evolved into a written language, so I've also added "write" to the list.

Now comes a series of basic actions performed by the body. Hopefully the earlier body part terms helped you come up with your critters' anatomy, and this series will detail basic actions that they can perform with that body. We start off with "drink", which gets swapped out for "lap", meaning to draw liquid into the mouth with the tongue, as the yinrih drink like dogs. "Suck" relates to "breast". Since kits draw their tongues across their dams' paws to lick up the milk, "suck" becomes "lick", in the sense of drawing the tongue across a surface.

The next entry worth mentioning is "laugh", which I swapped out for "pant". As it happens, both chimps and dogs use panting similarly to human laughter.

A series of words denoting sensory and mental processes comes next. Here you need to think about your critters' psychology and how they perceive the world around them. The yinrih are all good until we get to "to smell". They have ridiculously sensitive noses, so while the term comes over as-is, it gains extra connotations related to perceiving the emotions of others, as yinrih use pheromones that tell those around them how they're feeling.

The next bump in the road is "sleep". Yinrih are incapable of losing consciousness, and the closest thing they have to human sleep is a period of reduced activity and dulled awareness called "torpor". So "sleep" goes and "torpor" stays.

Now we come up to a series of more dynamic actions, including some body postures and terms describing different types of motion. So we need to think about how your critters get around. Everything looks fine for the yinrih, but "to walk" specifically refers to walking on a level surface on four legs, and I added a term "to brachiate", meaning to swing hand over hand, since that's how the yinrih move through the trees as well as how spacers pull themselves along using paw cabling in microgravity. As for static postures, "to sit" gets split into "to perch", meaning to straddle a branch lying on the belly, and "to squat", meaning to sit like a dog. Yinrih also differentiate between lying on the back and lying flat on the belly, and I also added a term meaning to stand on the hind feet.

The next entry of interest for me is "sing". Yinrih language relies very heavily on timing, volume, and pitch to distinguish meaning, so they can't put words to a melody as that would make the words unintelligible. They can, however, howl, rather tunefully, in fact, so "howl" replaces "sing".

The next couple terms relate to weather and environment. If your speakers live underground, they probably lack words like "sun" and "rain". For the yinrih, everything looks good with the exception of "moon". Yih has no moon, but it does have a ring, so "moon" gets dropped in favor of "planetary ring". Note, however, that from the surface of the planet, it doesn't look like an annular shape, but an arch, so the word doesn't get used for circular objects as the word "ring" would imply, but rather bow- and arch-like ones.

After that, there are some color words mixed in with the other environmental entries. This is an aspect of worldbuilding that deserves more attention. If your speakers can't see, they probably don't have simple words denoting colors. If, like the yinrih, their visual system works dramatically differently from humans', then it's likely their color vocabulary will work very differently as well. Monkey foxes' eyes are more like radio receivers than cameras. Behind their normal eyelids are four pairs of "bandpass membranes" that filter incoming light. The eyes proper are patches made of billions of quarter-wave dipole nanoantennas sitting on a shared ground plane. They look normal as long as their outer eyelids are closed, but when fully open, it looks like they have empty eye sockets behind their eyelids. Yinrih have a much wider visible spectrum, although they can't perceive the entire range all at once. They use their bandpass membranes and signal processing in the brain to "tune" to different spectra, so an object may appear different depending on what spectrum they're currently tuned to.

The end result is that their color vocabulary works like English's odor vocabulary. That is, there are no words for basic colors, only descriptive terms that relate to objects that are so colored. Conversely, yinrih's odor vocabulary works (or will work, once I'm satisfied with my research) like English's color vocabulary, with words denoting abstract subjective experiences not tied to specific objects.

If you want to make your critters more unique, think about how their perception would affect their way of thinking about the world, and how that would in turn impact their language.

We're coming up on the end of the list, and I don't have much of substance to add. If your speakers aren't bilaterally symmetrical, they likely don't have words for "right" and "left".

TL;DR: think about your speakers, how they look and move, their environment, how they reproduce, how they sense and think about the world. Use that to come up with a few hundred words to use as a very basic lexicon. You can make the process easier by using the Swadesh list, or a similar list, as a base, adding, removing, and altering the entries to meet your needs.
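If you like keeping this sort of thing in code, the culling process boils down to three kinds of edit on a base list: remove, replace, and add. Here is a minimal sketch of that idea. It uses only a small excerpt of glosses mentioned in this post rather than the full Swadesh list, and the function and variable names are hypothetical.

```python
# A small excerpt of base-list glosses mentioned in the post, not the full list.
base_list = ["woman", "man", "child", "wife", "husband", "mother", "father",
             "rope", "egg", "tail", "horn", "feather",
             "laugh", "sleep", "sing", "moon"]

# Edits described in the post for the yinrih.
removed = {"wife", "husband", "horn", "feather"}
replaced = {"child": "pup", "mother": "dam", "father": "sire",
            "laugh": "pant", "sleep": "torpor", "sing": "howl",
            "moon": "planetary ring"}
added = ["womb-nest", "throat", "ink", "write", "brachiate"]

def adapt(base, removed, replaced, added):
    """Drop removed terms, swap replaced ones, then append any new terms."""
    kept = [replaced.get(term, term) for term in base if term not in removed]
    return kept + [term for term in added if term not in kept]

lexicon = adapt(base_list, removed, replaced, added)
print(len(lexicon))  # 17
print(lexicon[:5])   # ['woman', 'man', 'pup', 'dam', 'sire']
```

Keeping the edits separate from the base list makes it easy to reuse the same base for a different species with a different set of removals and replacements.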

Now you have a phonology that allows you to construct valid syllables, and a lexicon of basic terms. The next step is to assign valid sequences of phonemes to each of those terms, and bam! you've poured the foundation for your very own xenolang! You can use this foundation, as well as the high-level ideas you likely have in mind for the grammar, to start teasing out things like morphology and syntax. My personal approach is to use that basic lexicon and start writing simple glosses. I see what I like and what I don't like, adding, removing, and tweaking things as I go along.
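One quick way to get a first draft of that assignment is to hand each term a randomly chosen syllable and then adjust by taste. The sketch below is only an illustration: it rebuilds the syllable inventory under the assumption that a vowel can't form a contour with itself, gives every term exactly one syllable (a simplification, since words can be longer), and uses a hypothetical handful of terms from this post.

```python
import random
from itertools import product

# Rebuild the syllable inventory from the phonology described earlier.
vowels = list(product(("short", "long"), ("low", "high"),
                      ("weak", "strong"), ("whine", "growl", "grunt")))
contours = [(a, b) for a, b in product(vowels, repeat=2)
            if a[3] == b[3] and a[1:3] != b[1:3]]
nuclei = [(v,) for v in vowels] + contours
syllables = list(product((None, "huff", "chuff", "yip"),
                         nuclei,
                         (None, "huff", "chuff")))

# Hypothetical handful of lexicon entries taken from the post.
terms = ["pup", "dam", "sire", "throat", "paw", "womb-nest"]

rng = random.Random(42)  # fixed seed so reruns give the same assignment
# sample() draws without replacement, so no two terms share a syllable.
forms = dict(zip(terms, rng.sample(syllables, len(terms))))

for term, (onset, nucleus, coda) in forms.items():
    print(term, "=", onset, nucleus, coda)
```

From there it's a matter of swapping out anything that doesn't feel right, which is where the glossing described below comes in.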

While not strictly related to xenolangs, I've found that glossing is an extremely powerful tool for conlanging in general. You can write glosses to tease out grammatical features even when you don't have your lexicon handy.


dog-ERG man-ABS bite-1SG.ACT

man-ABS bite-1SG.PAS


I do this all the time at work when I'm away from my notes.

From here on out it's not much different from any other conlang, so have at it. I hope to see more xenolangs on this board in the future.
Man in Space
roman
roman
Posts: 1395
Joined: 03 Aug 2012 08:07
Location: Ohio

Re: Tips on making xenolangs: lexicon

Post by Man in Space »

lurker wrote: 29 Jun 2024 17:52
Once we get to "breast", we need to circle back to reproduction. Are your speakers mammals? While the yinrih do produce milk, they sweat it out like monotremes. Yinrih milk is produced from a patch of skin located on the forepaws of the female. It looks like an undifferentiated patch of bare skin, which is translated as "lactation patch".
I know it’s a thing in Bantu (at least) to differentiate between ‘breast (male)’ and ‘breast (female)’, so it could be seen as a strictly anatomical term.