Gen, and other links

Most of this post is targeted at people who have dabbled in the art of language invention, aka conlanging, and in places it will be a little technical. For everyone else, let me recommend links on miscellaneous subjects like parasites that may affect your thoughts, the science of forgetting, the history of sleep, or violin strings made by spiders. Or if you’re interested in language and understand some basic phonetics, you can read on.

Two weeks ago I wrote a blog post full of nonsense words, which I hoped would get people pondering a bit. This was output from a web page by Mark Rosenfelder called “Gen”, which generates nonsense text based on rules specified by the user. In other words, it produces the illusion of a language tweaked to your own aesthetic taste. Anyway, one day I was playing around with it when suddenly I noticed my name, adrian, in the output text. Here is the screenshot from when this took place.

You could take this in any number of ways. You could suggest that the program is alive and trying to communicate with me (Rosenfelder suggested this). You could interpret it as a sign that I’m destined to create a language based on this particular configuration. If you have absolutely no sense of humour, you could dismiss it as a coincidence.

Before we go further, let’s review the rules that govern this particular stream of gibberish. (I’ve made a couple of cosmetic changes that don’t affect the output.)

Categories:
C=dgbtkp
D=nsx
R=ryw
V=auoieòè
F=nlh

Rewrite rules:
ò|ou
è|ei
nr|dr
gw|kw

Syllable types:
VF
CV
DV
RV
CRV
CVF
DVF
RVF
CRVF

Dropoff: Medium
Monosyllables: Less frequent

Some notes on interpreting this:

  • The capitals C, D, R, V, F are categories of letters that can appear in the generated text. The designations are arbitrary, but basically I set it up so that “C” designates the majority of consonants, “D” designates consonants that cannot be followed by an “R” in the same syllable, “R” designates consonants that can appear in between the first consonant and the vowel, “V” designates vowels, and “F” designates consonants that can appear at the end of a syllable.
  • The rule “C=dgbtkp” tells us that the letters included in category C are ‘d’, ‘g’, ‘b’, ‘t’, ‘k’ and ‘p’. But more than that, the letters are listed in decreasing order of frequency, so the rule also says that ‘d’ is the most common of these options, ‘g’ the next most common, and ‘p’ the least common of the six. The definitions of the other categories, D, R, V and F, are interpreted in the same way.
  • The types of syllables that can be generated are listed under “syllable types”, and again, they are listed in decreasing order of frequency, so “VF” is the most common type of syllable (consisting of a letter from category V followed by a letter from category F) and “CRVF” is the least common (consisting of two consonants, a vowel, and a final consonant, picked from categories C, R, V and F respectively). The other possible types are listed in between.
  • The rewrite rules describe situations in which text is automatically replaced with other text, either within or between syllables. This can be used to create digraphs: the categories section specifies ‘ò’ and ‘è’ as members of the V category, but the rewrite section says that ‘ò’ should be replaced with ‘ou’ and ‘è’ with ‘ei’. It can also be used to create other constraints: the rule “gw|kw” tells us that a ‘g’ is automatically replaced with a ‘k’ whenever it is followed by a ‘w’. (One current fault in the program is that the rewrite rules don’t handle capital letters properly. One would like words with a capital to retain it, so that if ‘ò’ becomes ‘ou’ then ‘Ò’ becomes ‘Ou’, but I don’t believe there’s a way to make that happen automatically.)
  • The dropoff setting specifies whether the most common letters and syllable types vastly outnumber the least common ones, are only somewhat more common, or all equal. The monosyllables setting governs the typical length of words. I left both settings on their defaults.
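
To make the notes above concrete, here is a minimal Python sketch of a generator driven by the same categories, rewrite rules and syllable types. It is not Gen’s actual code. The names pick, make_syllable, apply_rewrites and make_word are my own, the geometric weighting (ratio=0.7) is only a stand-in for the “Medium” dropoff, and the word length is a plain random choice rather than Gen’s monosyllables setting.

```python
import random

# Configuration copied from the rules listed above.
CATEGORIES = {
    "C": "dgbtkp",
    "D": "nsx",
    "R": "ryw",
    "V": "auoieòè",
    "F": "nlh",
}
REWRITES = [("ò", "ou"), ("è", "ei"), ("nr", "dr"), ("gw", "kw")]
SYLLABLES = ["VF", "CV", "DV", "RV", "CRV", "CVF", "DVF", "RVF", "CRVF"]


def pick(options, rng, ratio=0.7):
    """Pick one option, favouring earlier entries.

    Assumption: each entry is `ratio` times as likely as the one before it.
    Gen's real dropoff formula may well differ.
    """
    options = list(options)
    weights = [ratio ** i for i in range(len(options))]
    return rng.choices(options, weights=weights, k=1)[0]


def make_syllable(rng):
    """Choose a syllable type, then one letter for each category in it."""
    pattern = pick(SYLLABLES, rng)
    return "".join(pick(CATEGORIES[c], rng) for c in pattern)


def apply_rewrites(text):
    """Apply every rewrite rule in order, within and across syllables."""
    for old, new in REWRITES:
        text = text.replace(old, new)
    return text


def make_word(rng, max_syllables=4):
    """Join 1..max_syllables syllables (an assumed length rule) and rewrite."""
    count = rng.randint(1, max_syllables)
    return apply_rewrites("".join(make_syllable(rng) for _ in range(count)))


if __name__ == "__main__":
    rng = random.Random()
    print(" ".join(make_word(rng) for _ in range(12)))
```

Under these rules, one way my name can arise is as the syllables an + ri + an (types VF, RV, VF): joined they give “anrian”, and the rule “nr|dr” turns that into “adrian”, which is what apply_rewrites("anrian") returns.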

Now, I was just messing around, not paying too much attention to whether the rules I created are realistic for a hypothetical language, but after my name appeared I figured I ought to do something with it. Not create a language. I’ve messed about with conlangs before and have no inclination to invent another one. But I can at least decide how the nonsense text ought to be pronounced.

We can deal with the vast bulk of letters by keeping everything as simple as possible.

  • Plosives: ‘p’ = /p/, ‘b’ = /b/, ‘t’ = /t/, ‘d’ = /d/, ‘k’ = /k/, ‘g’ = /ɡ/
  • Nasals: Only ‘n’ = /n/.
  • Fricatives: ‘s’ = /s/, ‘x’ = /x/. Because there are few fricatives, we can expect variation. For example, /s/ might well be realised as [ʃ] in some circumstances.
  • Approximants: ‘y’ = /j/, ‘r’ = /r/, ‘w’ = /w/, ‘l’ = /l/. The rule that ‘n’ becomes ‘d’ before ‘r’ strongly suggests to me that /r/ is trilled.
  • Vowels: ‘a’, ‘e’, ‘i’, ‘o’, ‘u’ = /a/, /ɛ/, /i/, /ɔ/, /u/ respectively. As for the digraphs, let’s say ‘ei’ designates /e/ and ‘ou’ designates /o/. A glottal stop [ʔ] separates two adjacent vowels within a word. I tend to realise /i/ and /u/ as [ɪ] and [ʊ] respectively in most positions but as [i] and [ʉ] at the end of a word or before another vowel. This is a consequence of my own native speech and should probably be resisted, but variations of one sort or another would exist. Update: I’m thinking of moving ‘u’ to /ɵ/ and realising ‘e’ as more like [æ].
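
The letter values above can be collected into a small lookup. The following Python sketch is only an illustration: the function name to_phonemes is my own, it ignores the allophonic variation and the update mentioned above, and it assumes that every written ‘ei’ or ‘ou’ is a digraph, which the spelling alone cannot guarantee (a CV syllable ending in ‘e’ followed by a VF syllable starting with ‘i’ looks identical). Letters with no assigned value are passed through unchanged.

```python
# Digraphs are checked before single letters.
DIGRAPHS = {"ei": "e", "ou": "o"}

SINGLE = {
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "ɡ",
    "n": "n", "s": "s", "x": "x",
    "y": "j", "r": "r", "w": "w", "l": "l",
    "a": "a", "e": "ɛ", "i": "i", "o": "ɔ", "u": "u",
}

# Phoneme symbols that count as vowels for glottal stop insertion.
VOWELS = set("aɛiɔueo")


def to_phonemes(word):
    """Transcribe a generated word, inserting [ʔ] between adjacent vowels."""
    phonemes = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            # Assumption: 'ei' and 'ou' are always read as single vowels.
            symbol = DIGRAPHS[pair]
            i += 2
        else:
            # Unmapped letters are kept as written.
            symbol = SINGLE.get(word[i], word[i])
            i += 1
        if phonemes and phonemes[-1] in VOWELS and symbol in VOWELS:
            phonemes.append("ʔ")
        phonemes.append(symbol)
    return "".join(phonemes)


print(to_phonemes("adrian"))  # prints adriʔan
```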

This covers all letters except ‘h’, which I am unsure about. The usual realisation [h] doesn’t make much sense given that it appears only at the end of syllables. Another option, that it modifies the preceding vowel in some way, also doesn’t make sense given that the rules allow only two normal vowels to be adjacent (as when a CV syllable is followed by a VF syllable), but any number of vowels in a row if each is followed by a ‘h’.

I’m open to suggestions as to what ‘h’ should represent; there are several possibilities but I haven’t settled on one I like. Meanwhile I’ve been taking it to indicate creaky voice on the preceding vowel. As I just explained, it’s very unlikely to be a vowel modifier, so the interpretation is provisional, but at least this gives me a chance to practise using my creaky voice.

I’m inclined to place stress on the penultimate syllable.


4 Responses to “Gen, and other links”

  1. Irina Says:

    Must play with it, though I’m rather short on spoons at the moment (this back/leg nerve thing is taking up a large part of my mental CPU).

  2. Flesh-eating Dragon Says:

    Well, if you come up with a configuration that pleases you, I’ll be interested to see it. :-) All the best with the back. The expression “short on spoons” is new to me.

  3. Irina Says:

    http://www.fibroaction.org/Articles/The-Spoon-Theory.aspx (not quite the same context, but it works well enough to explain how I’m not 100% at the moment)

  4. Flesh-eating Dragon Says:

    Thanks for the link. I dimly recall reading about that on a website once but didn’t make the connection. My mother suffers from back problems so I know it’s not very nice.

