(Originally appeared in Lognet 90/3)

In Defense of Loglan Morphology

James Cooke Brown

Part 2 — Critique of "Rexlan"

Let’s have a look, now, at Rexlan itself. Phonologically it is quite like Loglan except that it doesn’t have the irregular phonemes x y w q, or rather it has their letters but has reassigned them to other phonemes. It assigns y to the [y] allophone of Loglan i (naturally), w to the [w] allophone of Loglan u (naturally again), q to the [ng] allophone of Loglan n, and x is not a Rexlan phoneme at all but a sort of “digraph-maker” which allows us to introduce 25 additional irregular phonemes, as was proposed in Rex’s “X-Rating Loglan” article in last December’s Lognet. But how the irregulars are to be used in Rexlan is not explained. Schwa apparently exists, but is "invisible", i.e., not represented by a letter in text.

For morphological purposes, the 25 regular phonemes Rex does discuss may be most usefully sorted into four categories (though he doesn’t sort them this way): (1) The stops, fricatives and the aspirate (all of which I’ll represent by ‘s’ or ‘S’ in my analysis and call “stops”); these are b c d f g h j k p s t v z. (2) The nasals and liquids (which I’ll represent by ‘n’), which are l m n r q. (3) The semivowels (shown by ‘y’), which are w y. And (4) the vowels (shown by ‘v’), which are a e i o u.

The morphology is then delightfully simple. A morph is anything that begins with one or more s’s followed by a string of n’s, y’s and v’s in any order so long as there is at least one v among them. We could write this rule as follows: s(s)(n/y/v)v(n/y/v), where the expressions within parentheses may occur zero or more times, and equally permissible alternatives are separated by slashes. Thus the minimum morph is sv, for example da; but long polysyllabic morphs like banana (svnvnv) and stereo (ssvnvv) are also possible. (The formula defines a superset because not all expressions that satisfy it are actually pronounceable morphs; however, all morphs satisfy it.) The subset of morphs which have exactly one s followed by v, vv, yv, or vy—words having the shapes of sa seo swi saw, for example—are reserved for little words. All the rest are “big morphs” used for forming predicates and names indiscriminately. These may be either monomorphic “primitives” or polymorphic “compounds” depending on certain other features of the speechstream, like stress and/or the incidence of hyphen particles.
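
The shape rule just given can be sketched in a few lines of Python. This is only an illustration of the formula as stated here, not anything Rex has published: the names CLASS_OF, MORPH_SHAPE, LITTLE_SHAPES, shape and classify are my own, the digraph-maker x is deliberately left out (so a word containing x raises a KeyError), and pronounceability is not checked.

```python
import re

# Phoneme classes as sorted in the text: s = "stops", n = nasal-type
# consonants (l m n r q), y = semivowels, v = vowels. The letter x is
# intentionally absent; its use in Rexlan is not explained.
CLASS_OF = {}
for letters, symbol in (("bcdfghjkpstvz", "s"), ("lmnrq", "n"),
                        ("wy", "y"), ("aeiou", "v")):
    for letter in letters:
        CLASS_OF[letter] = symbol

# One or more s, then any mix of n/y/v that contains at least one v.
MORPH_SHAPE = re.compile(r"s+[nyv]*v[nyv]*")
# Little-word shapes: exactly one s followed by v, vv, yv or vy.
LITTLE_SHAPES = {"sv", "svv", "syv", "svy"}


def shape(word):
    """Map a spelled word to its s/n/y/v shape string."""
    return "".join(CLASS_OF[ch] for ch in word.lower())


def classify(word):
    """Return 'little', 'big', or None if the word is not morph-shaped."""
    sh = shape(word)
    if not MORPH_SHAPE.fullmatch(sh):
        return None
    return "little" if sh in LITTLE_SHAPES else "big"


print(shape("banana"), classify("banana"))  # svnvnv big
print(shape("da"), classify("da"))          # sv little
```

Note that "big" is decided exactly as the text describes, as the residue left once the little-word shapes have been set aside.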

All morphs big and little begin with s. Little ones must end with either v or y and must never contain n’s. Big ones are frequently polysyllabic and may contain n’s in addition to anything that little morphs may contain. Big morphs are thus a residual category: you resolve a morph; you ask whether it’s a little one; if it isn’t, it’s a big one. The invariable sign that one morph is over and a new one is beginning is the appearance of another s. Thus, /daBAnana/ resolves as Da banana because it has exactly two stops (or groups of stops) in it, d and b. It may be rewritten as svSvnvnv. Note that before the first s or S in every group of one or more stops there is a morpheme boundary. I’ve used a capital ‘S’ for the stop that heads banana to indicate that the syllable which it initiates is in this case stressed. It is one of Rex’s rules that polysyllabic morphs must always be initially stressed, but that is quite unnecessary for morpheme resolution. The positions of the stops are enough to separate the morphs. Stress becomes important only when the resolution of polymorphic words (compound words such as bananafawl = banana-bird) is being considered. Apparently the first syllable of such compounds must be stressed: /BAnanafawl/. I will write this compound word as S---s-, for in addition to the stops, we now need indications of which stops start stressed syllables (indicated by capital ‘S’s), and, while not essential for resolution, it will be helpful to be able to examine the number of syllables between the stops (I will indicate these by hyphens). In contrast to the compound word, the phrase banana fawl may be pronounced entirely without stress /bananafawl/, or with just the second morph stressed /bananaFAWL/, or even with initial stress on both morphs /BAnanaFAWL/, and still it will resolve as the two-word phrase banana fawl. 
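
The claim that the positions of the stops are enough to separate the morphs can be tried out with a small Python sketch. It assumes an unspaced stream of ordinary letters with stress ignored; the function name split_morphs is hypothetical, and the sketch does nothing about compounds, which depend on stress or hyphens.

```python
STOPS = set("bcdfghjkpstvz")  # the class written 's' in the analysis


def split_morphs(stream):
    """Split an unspaced letter stream into morphs.

    A new morph begins at the first stop of every group of one or
    more adjacent stops; stress (capitalization) is ignored.
    """
    morphs = []
    prev_was_stop = False
    for ch in stream.lower():
        is_stop = ch in STOPS
        if is_stop and not prev_was_stop:
            morphs.append("")  # morph boundary before a new stop group
        if not morphs:
            raise ValueError("a morph stream must begin with a stop")
        morphs[-1] += ch
        prev_was_stop = is_stop
    return morphs


print(split_morphs("daBAnana"))    # ['da', 'banana']
print(split_morphs("tcaqsribun"))  # ['tcaq', 'sri', 'bun']
```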
The rule for terminating a compound once we know one has begun, i.e., for preventing it from incorporating the next big morph, is to speak any little morph (other than a hyphen particle) first, or stress the first syllable of the next big morph; for these two signs will always mark the beginning of a new word. So if bananafawl is followed by three monomorphic predicate words—and let us assume they are monosyllabic ones as well, say bananafawl tcaq sri bun—then all three of these words must be stressed in speech: S---s-|S-|S-|S- (I am now indicating word-juncture by ‘|’). This is because any other stress pattern will produce a false resolution: S---s-|S-s-|S- or bananafawl tcaqsri bun; S---s-|S-s-s- or bananafawl tcaqsribun; S---s-s-s-|S- or bananafawltcaqsri bun; and so on. Thus the use of compounds in Rexlan speech will be frequently attended by a considerable penalty, namely the necessity of stressing all the syllables in a multisyllabic string. These will not only be tedious to pronounce—for equally stressed monosyllables are usually separated from one another by the very pauses that Rex dislikes—but any long series of adjacent stressed syllables will break up the rhythm of speech, causing it to be unnaturally ponderous in the vicinity of compounds. Apparently strings of unstressed syllables are permissible in Rexlan, as they are in Loglan and Spanish. But strings of stressed syllables will probably not survive the rigors of brisk speech. As Rex apparently realizes, the only way to avoid the arhythmia problem—which I’m afraid is an inevitable by-product of any attempt to resolve compounds by stress in this language—is to hyphenate the elements of compounds. Resolution would then depend entirely on the positions of the stops and hyphens. Compound resolution would thus be independent of stress, and any natural stress rhythm whatever could be adopted by the speaker. 
Rex suggests the little word zu for the Rexlan hyphen, and using it to make compounds does seem to be a better solution to the resolution problem than using stress. However, hyphenation is not without its own problems. For using hyphens to glue strings of morphs into compound words will lead to the paradox that a compound word will always be longer than its deriving metaphor: /baNAnazufawl/ vs. /baNAnafawl/, say. Hyphenated sequences are also not going to be very isomorphic; for apparently these compound expressions will be written with the hyphen-word suppressed, that is, as bananafawl vs. banana fawl...despite the fact that the first expression takes substantially longer to say than the second.
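
The stress-based resolution of compounds examined above can be sketched in Python as follows. This encodes my reading of the rule, not a procedure Rex has stated: a stressed big morph always opens a new word, and an unstressed big morph joins the preceding word only if that word was opened by a stressed morph. Little words and the hyphen zu are left out, and the name group_words is hypothetical.

```python
def group_words(morphs):
    """Group big morphs into words from (morph, stressed) pairs.

    A stressed morph always opens a new word. An unstressed morph
    joins the word before it only when that word was opened by a
    stressed morph; otherwise it stands alone.
    """
    words = []
    open_compound = False  # True while the last word began with stress
    for morph, stressed in morphs:
        if stressed or not open_compound:
            words.append(morph)
            open_compound = stressed
        else:
            words[-1] += morph
    return words


# S---s-|S-|S-|S- : every word after the compound must be stressed.
print(group_words([("banana", True), ("fawl", False), ("tcaq", True),
                   ("sri", True), ("bun", True)]))
# S---s-|S-s-|S- : dropping one stress gives a false resolution.
print(group_words([("banana", True), ("fawl", False), ("tcaq", True),
                   ("sri", False), ("bun", True)]))
```

The first call yields bananafawl, tcaq, sri, bun; the second yields bananafawl, tcaqsri, bun, which is the false resolution described in the text.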

While resolution will be simple and elegant in this neat little language, its un-Zipfean compounds, as well as its severely circumscribed word-space—there will be no v-, y- or n-initial morphs, for example, and no morphs with medial stops will be possible—will mean, first, that its little-word space will be much smaller than that of Loglan, so that in places where we logli are able to use one- and two-letter words, three-letter or even longer words (i.e., compounds) will be necessary in Rexlan. This will inevitably lengthen the Rexlan utterance. (One of the remarkable things about Loglan is how short its utterances are despite its extraordinary explicitness. This still somewhat mysterious achievement is not likely to be matched in Rexlan.) Second, it may be difficult to find enough short natural words that are of Rexlan’s “big morph” shape to provide an adequate pool of acceptably short primitives. This, too, may well lengthen the utterances of the language. But while these stern morphological constraints may turn out to be fairly difficult for the word-finder to live with, a far more serious problem for Rexlan is its low learnability, internationally considered. It will surprise Rex that I say this, and possibly some other readers as well. For on the surface, Rex’s proposed method of “finding” Rexla-form words in the natural languages seems an eminently fair one, and we naturally think that fairness in this matter will also mean high learnability. Not so. Let me demonstrate this conclusion arithmetically on a tiny language for a tiny world. Everything we discover will apply to big languages for big worlds as well.

Suppose our international language has exactly 100 primitive words. Suppose we confine our interest in the learnability of this language to the learnability of its words by 100 people. These people will, accordingly, constitute the world. Suppose the native languages spoken by the people in this microworld are distributed according to the scoring weights of the Loglan composite primitive given on page 415 of the 4th Edition. Thus, there’ll be 28 Englishmen, 25 Chinese, 11 Hindi-speaking Indians, and so on. We’ll use this table despite the fact that its coefficients are based on 1950 demographic data, and that the real world has changed in the meantime... although in ways that are unimportant for the use to which we’ll put this table. Thus, in addition to the English, Chinese, and Indians, there will be 10 Russians, 9 Spaniards, 6 Frenchmen, 6 Japanese, and 5 Germans. To simplify things, we will imagine that no one speaks Swahili or Innuit in this world. To simplify things even further, we will assume that none of these people knows any part of any language other than da’s native one.

Now let’s consider three ways of making the 100 international words. First, we’ll use the Loglan method; then, we’ll use what we’ll call the “English method”; and finally we’ll explore the Rexlan method. We commence by building Loglan-type composite words. We’ll give all our 100 words a 5-letter form and, using the proportionality principle, we’ll try to pack as much memorability into each word for as many people in this tiny world as we can manage to cram in. With a word like matma we’ll get practically everybody experiencing one-shot learning. That is, practically everyone in the test population will learn matma more or less permanently from one short learning exposure. The precise mathematical expectation is that 94 of the 100 people will learn matma easily, given the proportionality principle. For matma contains overlapping instances of words from all eight languages, and most of the clue-words are present in their entirety. But with a word like dzoru, which has only a Chinese word and part of a Japanese word in it—all of Chinese dzou and two-thirds of Japanese aru(ku)—only 29 persons in our micropopulation can be expected to learn it easily. The other 71 will probably take some time to learn dzoru. The 29 rapid learners will include the 25 Chinese, none of whom will, according to the proportionality principle, have any difficulty remembering dzoru once their own word dzou in it has been pointed out to them (this is an exaggeration, of course; some Chinese may forget that strange embedded r, and we must later make allowances for that), plus 4 of the 6 Japanese. These are included in the portion of the population which is likely to learn dzoru quickly, because, for each of the 6 Japanese, the chance of short-exposure learning succeeding is .67. Thus we may predict that some four of them will learn dzoru quickly and that the other two will require more time.
This is because 2/3rds of the Japanese word aru(ku)—ku is a standard affix—is also found in dzoru, and the probability of learning it quickly is apparently well-estimated by that fraction. It is true that the [ROO] of [ah-ROO] doesn’t appear to be a very useful clue. But is it going to be less useful to the Japanese than the [eek] of [week] is to us? The two clue-words belong to the same proportionality class for their respective languages, and presumably they will provide about the same amount of help.
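
The arithmetic behind the figure of 29 rapid learners for dzoru can be laid out as a short Python sketch. The population counts and the clue fractions (1.0 for Chinese, .67 for Japanese) are the ones given in the text; the names POPULATION and expected_rapid_learners are mine, and the function simply applies the proportionality principle as stated, with no allowance for the exaggerations already admitted.

```python
# The 100-person microworld, by native language.
POPULATION = {"English": 28, "Chinese": 25, "Hindi": 11, "Russian": 10,
              "Spanish": 9, "French": 6, "Japanese": 6, "German": 5}


def expected_rapid_learners(clue_fractions):
    """Expected number of one-shot learners of a word.

    clue_fractions maps a language to the proportion of its clue-word
    that appears in the composite; languages not listed contribute 0.
    """
    return sum(POPULATION[lang] * fraction
               for lang, fraction in clue_fractions.items())


# dzoru: all of Chinese dzou, two-thirds of Japanese aru(ku).
dzoru = {"Chinese": 1.0, "Japanese": 0.67}
print(round(expected_rapid_learners(dzoru)))  # 29
```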

In this way we will be able to maximize the memorability of each word for our population of 100 learners, and will end up with 100 composite primitives of the Loglan type for them to learn. If the words are typical of the 800-odd Loglan words that have actually been built in this way—and for a mathematically equivalent target population, note—they will have an average memorability score of .45. Let us call this number the “learnability quotient” for this method of constructing international words for this particular population. What that number means precisely is that if we pick one of these 100 people at random, and then pick one of those 100 words at random, then the expected value of the probability of that person’s learning that word quickly—that is, of remembering it for a reasonably long time after a reasonably brief exposure to it—is .45. In other words, we can expect rapid learning of these words to take place in this heterogeneous population about 45% of the time. The learnability quotient of a language is an inverse measure, obviously, of how long it will take any member of its population—or its whole population, for that matter—to learn all its words. So much for the Loglan method. If the proportionality principle is correct—and that is what allows us to pack in a maximum amount of learnability for a given population of people into a set of composite words—the Loglan method produces a vocabulary with a learnability quotient of .45.

The next method we’ll examine is the “English method”. This method occurs to us because we notice that English is the language with the largest number of speakers in our model population, namely 28. That means that the probability of picking an English-speaker at random from it is .28. Our strategy now is to represent each primitive predicate in the model international language by its English equivalent. Let’s call this the “mono-clue” method, for in each word we adopt there will now be just one clue to its meaning, namely that English word itself. Obviously the probability of short-exposure learning of each of these English words by an English-speaking “learner” is 1.00, that is, certainty. But let us also assume that it is zero for everybody else (that, too, isn’t strictly true and will have to be allowed for later). Thus the learnability quotient for the English method is the sum of .28 X 1.00 + .72 X 0, or .28. (We are pretending here that words like German Mann, which is virtually identical to English man, or French homme, which can easily be related to it, do not exist. Taking them into account would mean that the quotient for the English Method would be slightly higher than .28, because in fact the method is not really a monoclue one. But we can safely neglect this fact in order to make the next point we wish to make.) We now notice that including even one Chinese word, say, in the 100 primitives reduces the quotient for this method. For we will then need to calculate with 99 English words and one Chinese one, which gives .28 X .99 + .25 X .01 + .47 X 0 = .2797 for the learnability quotient, a figure that is not much lower than the .28 we got for the “pure” English Method, but it is lower. From this we learn that no departure from the “take all words from the most populous language” strategy can improve the learnability quotient of the monoclue method.
Already this tells us that the Rexlan strategy is hopeless, but let’s calculate a likely quotient for it anyway.
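
The two monoclue calculations above can be checked with a brief Python sketch. It rests on the same simplifying assumption as the text, namely that a word is learned in one shot by native speakers of its source language and by nobody else; the names SHARES and monoclue_quotient are illustrative only.

```python
# Share of the microworld speaking each language natively.
SHARES = {"English": 0.28, "Chinese": 0.25, "Hindi": 0.11,
          "Russian": 0.10, "Spanish": 0.09, "French": 0.06,
          "Japanese": 0.06, "German": 0.05}


def monoclue_quotient(word_shares):
    """Learnability quotient of a monoclue vocabulary.

    word_shares maps a source language to the fraction of the
    vocabulary taken from it. A random person learns a random word
    quickly only when the word comes from that person's language.
    """
    return sum(SHARES[lang] * share for lang, share in word_shares.items())


pure_english = monoclue_quotient({"English": 1.0})
one_chinese = monoclue_quotient({"English": 0.99, "Chinese": 0.01})
print(round(pure_english, 4))  # 0.28
print(round(one_chinese, 4))   # 0.2797
```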

The Rexlan strategy is to take our primitive concepts one at a time and start looking in the most widely spoken language for a word for each of them that also fits the Rexlan “big morph” template. If we find one, we adopt it for that concept; otherwise we look in the next most populous language for a “rexla”-form word for it; and so on. The result will probably be, as Rex expects, a distribution of words among the source languages in which large numbers of primitives will have come from the more populous languages and smaller numbers from the less populous ones. Let’s suppose that the distribution of chosen words among the eight nationalities in our test population ends up being exactly proportional to the number of speakers we have assumed for each language. Thus, there will end up being 28 English words in the language, 25 Chinese words, 11 Hindi words, and so on down to 5 German words. (There is no reason to suppose that a distribution even approximately like this one is likely to happen by chance. What will really govern the productivity of a language, given Rex’s method, will be the proportion of its basic words that are “rexla”-form, as well as how early it appears on the search list. Expected values of such a distribution are impossible to calculate beforehand, although I expect as Rex does that one of the more concordant, and therefore more productive, languages will turn out to be Chinese.) Given this arbitrarily symmetrical outcome—which from one point of view seems the fairest one (analogous to proportional seating in a world assembly, for example), and therefore the one which fair-minded word-finders might strive hard to achieve—we can now calculate the learnability quotient as follows.
It will be the sum of .28 X .28 (the chances of choosing an English-speaker from the population times the chances of choosing an English word from the language for da to learn) + .25 X .25 (the chance of a Chinese speaker being both chosen and given a Chinese word to learn) + ..., or what turns out to be the sum of the squares of the eight proportions. This comes to .1808 for the average probability of one-shot learning by the Rexlan method, a rather lower figure than I expected and probably much less than Rex had hoped for, too.
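
The sum of squares is easily verified. The Python sketch below assumes, as the text does, a vocabulary distributed exactly in proportion to the eight speaker counts and one-shot learning only by native speakers of a word's source language; the variable names are mine.

```python
# Speakers per language in the 100-person microworld; under the "fair"
# outcome the same numbers are also the word counts per source language.
COUNTS = {"English": 28, "Chinese": 25, "Hindi": 11, "Russian": 10,
          "Spanish": 9, "French": 6, "Japanese": 6, "German": 5}

total = sum(COUNTS.values())  # 100 people, and 100 words

# P(person speaks L) * P(word comes from L), summed over languages,
# which is the sum of the squared proportions.
quotient = sum(n * n for n in COUNTS.values()) / (total * total)
print(quotient)  # 0.1808
```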

In other words, the Rexlan method will score abysmally low on learnability if it is fair. In fact its best learnability score will come from “unfairly” taking as many words as possible from the world’s major language...in which case it will probably be unacceptable politically. But putting political considerations aside, we see that we can maximize learnability under the Rexlan method only by letting the great languages dominate the picture. If we do that, its learnability quotient will approach, and may even reach, the one that would be achieved by taking all the primitive words of the international language from the single most populous language on Earth. By most people’s reckoning—one which takes non-native speakers of a language into account—the world’s most populous language is English, but not apparently by Rex’s. But that doesn’t matter. Whether it is English or Chinese—and they have no other rivals for that distinction—the highest learnability quotient for a monoclue language will be obtained by drawing its words exclusively from the world’s most populous tongue...in other words, by using the English method. So the fairer a monoclue system is, the less learnable its words will be, and vice versa. This may be, in fact, the most grievous defect of monocluing...a defect amounting almost to disaster.

But if the principle of monocluing is effectively abandoned and multiclue words are deliberately manufactured, and if the goodness of a clue in promoting learning does turn out to be, as the experience of most sutori language learners and my own early research suggests, a function of the proportion of some natural clue-word that is found in each word to be learned, then sets of more than one such clue-word can co-exist in a manufactured word, and much higher learnability quotients will then be achieved. For example, for Loglan I obtained .45, which exceeds .28 by far; but slightly higher values are probably also possible.

Thus on the empirical principles I found in my early word-making work, I have achieved about 60% more learnability for Loglan (.45) than would be obtained by using only English-derived primitives (.28); and that figure, in turn, is the upper bound of the learnability that can be achieved by using the Rexlan method. Indeed, if its words are “fairly distributed” among the target languages, then, as we have seen, the Rexlan method will produce a much lower learnability than even the English one, namely .18. It is true that the “proportionality principle” on which my work with the Loglan multiclue primitive has been based is still a fuzzy one, based on one-language samples and small ones at that, and still scientifically “soft” in some other particulars as well. But even on the assumption that its claim to superiority is sure to be scaled down when its data-base has been suitably hardened, the learnability index of that method now is so much greater than the figures obtained for the monoclue method—60% greater, in fact—that there is plenty of room for that index to diminish, and the monoclue figures to grow—as they certainly will grow when we take into account the man/Mann phenomenon, for example—and still come out ahead of its rivals. That includes Esperanto, note, which also uses monocluing. Indeed, it includes any proposed international auxiliary that does not use compositely-derived, multiclue words that have been systematically maximized for learnability over the world’s population.

Sixty percent is no small margin. It practically guarantees that the Loglan Advantage will prove a robust one, that it is more than merely likely to survive whatever hardening we or others give these numbers. All of this means that for those of us who are still Loglanists—and I trust that that number has grown, not diminished, during the course of this exercise—one of the most important scientific tasks ahead of us is to replicate those early learning experiments of mine using our now much richer pool of lexical materials as well as large-sample, cross-cultural experimental designs. With the results from such experiments in hand—and they’ll be more sensitive than my early engineering tests were, if only because they’ll involve larger samples—we can calculate the true learnability of Loglan to any degree of precision we desire (and can afford). Indeed, we can compare Loglan systematically to other candidate languages on these and other crucial matters.

I thank Rex for bringing the importance of the learnability question to our attention again. Involved as I have been in engineering other aspects of Loglan in the last decade or so, I had not thought about learnability for some time. And Rex’s inspired antithesis to my ancient composite-word making thesis brought about precisely the sort of surprising synthesis that Hegel conjectured was the creative force that drives the history of ideas. Hegel’s own thesis seems oversimple now, in the last years of the 20th Century. But however that may be, I can personally attest that I would most likely never have made these calculations, or stumbled onto the happy conclusion to which they apparently lead—obvious as it may be to my daughter Jenny, who, as the world’s first native-born logli (I was writing the first edition of Loglan 1 as she lay there chortling in her Paris crib, and that does make her Loglandia’s first native, I suppose), may very well have a leg up on the rest of us (soi smile)—had I not had the pleasantly antithetical phenomenon of Rexlan to deal with. I am sorry that the figures come out so badly for the procedure Rex favors. Perhaps prima facie fairness is more important than learnability; I do not know. But with or without Rex’s word-making procedure, the language whose outlines he has put before us is certainly an intriguing one. I hope he will continue to work on it, and report back to us from time to time. Rex’s new address is PO Box 9, Bellvue, CO 80512, in case anyone would like to communicate directly with him about what may very well become a new international language project.