I have had a fascination for languages ever since I read Tolkien and discovered that he had created the Elvish languages from scratch. It was this hobby of “conlang” design that ultimately led me to become a programmer.
My first programs (written in line-oriented GW-BASIC) were intended to generate random words according to specific rules. I would then pick the most interesting ones for inclusion in whatever language it was I was designing at the time. Each program was language-specific. That is, I would choose a set of phonemes, create a set of rules for their combination, develop an orthography, and then write a program to generate valid words.
When I discovered REBOL, I realized that its dialecting capabilities would be perfect for this task. Instead of having to create a program per language, or struggle with other less-than-ideal techniques such as XML configuration files, I could simply create a REBOL dialect to describe the task.
To illustrate my Conlang Dialect, we’ll go through the process of creating a brutally simple language. Obviously we won’t bother with grammar and syntax. We are dealing just with the sounds. We will call this simple language Na. Na has only four consonants, s k t n, and three vowels, a i u. The consonants are as one would expect, and the vowels are pronounced more or less as in Spanish.
A syllable in Na must always end with a vowel, and must begin with at least one and at most two consonants. Thus a is not a valid syllable in Na, but ta and tki are. Words can consist of any number of syllables, but we will stick with those between two and five.
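The syllable rule above is compact enough to state as a regular expression. The following is a minimal Python sketch, not part of the Conlang Dialect itself; the function names `is_syllable` and `is_word` are made up for illustration, and the two-to-five syllable limit is taken from the paragraph above.

```python
import re

CONSONANTS = "sktn"
VOWELS = "aiu"

# One syllable: one or two consonants followed by exactly one vowel.
SYLLABLE = f"[{CONSONANTS}]{{1,2}}[{VOWELS}]"


def is_syllable(text):
    """Return True if text is a single valid Na syllable."""
    return re.fullmatch(SYLLABLE, text) is not None


def is_word(text, min_syllables=2, max_syllables=5):
    """Return True if text splits into an allowed number of Na syllables."""
    pattern = f"(?:{SYLLABLE}){{{min_syllables},{max_syllables}}}"
    return re.fullmatch(pattern, text) is not None


for candidate in ["a", "ta", "tki", "sata"]:
    print(candidate, is_syllable(candidate), is_word(candidate))
```

A checker like this is handy later on for confirming that generated words really do obey the rules they were generated from.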
The Conlang Dialect consists of three verbs: rand, rept and join. Verbs are followed by one or more arguments. Our first verb, rand, can be followed either by a string or by a REBOL block in a specific format. When followed by a string, rand instructs the Conlang Dialect to randomly choose one of the characters from the string.
rand "aiu"
When the expression above is evaluated in the Conlang Dialect, it will result either in "a", "i", or "u". However, sometimes we want to say that one choice can occur more frequently than other choices. In the Na language, "a" is more common than either "i" or "u", which are about equally common. We can express this as follows.
rand [ 3 "a" 1 rand "iu" ]
This means that 3 out of 4 times, the evaluation of the whole statement will result in "a"; otherwise it will result in the evaluation of the expression rand "iu". Thus, Conlang Dialect expressions can be nested within each other, and often are.
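To make the weighting concrete, here is a rough Python model of what rand is described as doing. It is only a sketch of the semantics, not the dialect's REBOL implementation: the block form is represented as a list of (weight, option) pairs, and a nested expression is represented as a zero-argument function, both of which are assumptions made for this illustration.

```python
import random


def rand(choices, rng):
    """Model of the rand verb.

    A string means "pick one character uniformly". A list of
    (weight, option) pairs means "pick one option in proportion to its
    weight"; an option that is callable stands for a nested expression
    and is evaluated only if it is picked.
    """
    if isinstance(choices, str):
        return rng.choice(choices)
    weights = [weight for weight, _ in choices]
    options = [option for _, option in choices]
    picked = rng.choices(options, weights=weights, k=1)[0]
    return picked() if callable(picked) else picked


def vowel(rng):
    # Mirrors: rand [ 3 "a" 1 rand "iu" ]
    return rand([(3, "a"), (1, lambda: rand("iu", rng))], rng)


rng = random.Random()
print([vowel(rng) for _ in range(12)])
```

Under this model "a" turns up with probability 3/4, while "i" and "u" each turn up with probability 1/8, since they split the remaining quarter evenly.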
Now that we have a way to randomly choose strings, we need a way to stitch them together. This is performed by the join verb, which takes a block containing the expressions we want to join.
join [ rand "sktn" rand [ 3 "a" 1 rand "iu" ] ]
This instructs the Conlang Dialect to take the result of the two expressions inside of the block and combine them. So, for instance, we could get results like "ki" and "su" from the whole expression above.
The last verb in our repertoire is rept. This verb takes three arguments, and is best illustrated with an example.
rept 1 3 join [ rand "sktn" rand [ 3 "a" 1 rand "iu" ] ]
The first two arguments of rept tell the Conlang Dialect how many times to repeat the evaluation of the expression given in the third argument: in this case, from one to three times. In other words, pick a random number between 1 and 3 and execute the join expression that number of times, stitching the results together. The result of this expression could be words such as "ka", "kasiki", "sita", and so on.
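The important detail is that rept evaluates its expression afresh on every repetition rather than repeating a single result. A small Python sketch of join and rept, under the assumption that an expression is modelled as a zero-argument function (this is an illustration of the behaviour described above, not the dialect's actual code):

```python
import random


def join(parts):
    """Evaluate each expression once, in order, and concatenate the results."""
    return "".join(part() for part in parts)


def rept(low, high, expr, rng):
    """Evaluate expr between low and high times (inclusive) and concatenate."""
    count = rng.randint(low, high)
    # expr() is called anew each time, so every repetition can differ.
    return "".join(expr() for _ in range(count))


rng = random.Random()


def consonant():
    return rng.choice("sktn")


def vowel():
    # Same distribution as rand [ 3 "a" 1 rand "iu" ]: 6/8, 1/8, 1/8.
    return rng.choices("aiu", weights=[6, 1, 1], k=1)[0]


def syllable():
    return join([consonant, vowel])


# Mirrors: rept 1 3 join [ rand "sktn" rand [ 3 "a" 1 rand "iu" ] ]
print(rept(1, 3, syllable, rng))
```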
These three verbs are all we need to generate words, but writing a whole language as one deeply nested expression quickly becomes unwieldy. It would be nice if we could assign expressions to names and reuse them, and so we can:
na: [
    consonant: rand "sktn"
    onset: rand [ 3 consonant 1 join [ consonant consonant ] ]
    vowel: rand [ 3 "a" 1 rand "iu" ]
    syllable: join [onset vowel]
    main: rept 1 5 syllable
]
Here we have a full specification written in the Conlang Dialect. Expressions are assigned to names using a standard REBOL set-word. It should be fairly obvious from the above example how they are used. Assigning expressions in this way is not required, with one exception: The main expression is required, as it serves as the entry point into the specification. Using named expressions makes the Conlang Dialect much more usable, so I highly encourage their use.
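One way to picture how named expressions behave is as a lookup table that is consulted every time a name is referenced, starting from main. The Python sketch below models the Na specification that way. The tuple encoding, the `Ref` class, and the `evaluate` and `generate` functions are all inventions for this illustration and say nothing about how the REBOL library is actually written.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A reference to a named expression (a REBOL word in the dialect)."""
    name: str


def evaluate(expr, spec, rng):
    """Evaluate one expression against a table of named expressions."""
    if isinstance(expr, str):          # a literal string such as "a"
        return expr
    if isinstance(expr, Ref):          # a name: look it up and evaluate it anew
        return evaluate(spec[expr.name], spec, rng)
    verb, *args = expr
    if verb == "rand":
        (choices,) = args
        if isinstance(choices, str):
            return rng.choice(choices)
        weights = [weight for weight, _ in choices]
        options = [option for _, option in choices]
        picked = rng.choices(options, weights=weights, k=1)[0]
        return evaluate(picked, spec, rng)
    if verb == "join":
        (parts,) = args
        return "".join(evaluate(part, spec, rng) for part in parts)
    if verb == "rept":
        low, high, body = args
        count = rng.randint(low, high)
        return "".join(evaluate(body, spec, rng) for _ in range(count))
    raise ValueError(f"unknown verb: {verb!r}")


# The Na specification, transcribed into the hypothetical tuple encoding.
NA = {
    "consonant": ("rand", "sktn"),
    "onset": ("rand", [(3, Ref("consonant")),
                       (1, ("join", [Ref("consonant"), Ref("consonant")]))]),
    "vowel": ("rand", [(3, "a"), (1, ("rand", "iu"))]),
    "syllable": ("join", [Ref("onset"), Ref("vowel")]),
    "main": ("rept", 1, 5, Ref("syllable")),
}


def generate(spec, rng=None):
    """Produce one word by evaluating the required main expression."""
    if rng is None:
        rng = random.Random()
    return evaluate(spec["main"], spec, rng)


print([generate(NA) for _ in range(5)])
```

Because a name is re-evaluated at each reference, the two occurrences of consonant in the onset are independent picks, which is what allows clusters of two different consonants such as "nk" or "sk".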
Each verb has a much shorter synonym, and I tend to use these exclusively.
verb | synonym
rand | ?
rept | *
join | &
Here is a full program using the Conlang Dialect to produce words in our imaginary Na language. This uses the abbreviated synonyms given above.
REBOL [
    needs: [
        2.100.99.2.5
        http://r3.revolucent.net/net.revolucent.conlang.v1.r 1.0.2
    ]
    license: 'mit
]

na: [
    consonant: ? "sktn"
    onset: ? [ 3 consonant & [ consonant consonant ] ]
    vowel: ? [ 3 "a" ? "iu" ]
    syllable: & [onset vowel]
    main: * 1 5 syllable
]

; Seed the random number generator from the current date and time.
random/seed now
generator: parse-conlang na

; Keep generating until we have ten distinct words.
words: []
while [greater? 10 length? words] [
    unless find words word: generator/eval [
        append words word
    ]
]
print words
And here is a list of ten words generated by executing the program. Of course, subsequent executions are almost certain to produce a different set of words.
nkasanatana ntusnana sata sanuna kasakukana skannatu suknakaka naka natasasa kanika