Languages behave in many ways as if they were alive. For one thing, all languages grow. For another, each language seems to grow in its own way, as if it had its own unique pattern of openness and resistance which causes it to change quite readily in some directions but not at all in others. For example, one of the ways in which English seems to grow is by swallowing the words of other languages whole. As Otto Jespersen, the late 19th, early 20th century Danish linguist, once put it, English eats other languages alive. Not so with French, for example, which prefers to redefine old words to meet new needs or coin new metaphors from old roots. Chinese also shies away from foreign loan-words and grafts new ideas onto ancient stock. Japanese, on the other hand, is more like English, adopting and repronouncing foreign words with gay abandon. Changes in the pronunciation of words over time seem also to follow special patterns which are unique to individual languages, or to small groups of closely related tongues. And it is a major, but so-far untested, hypothesis of linguistic theory that each language tends to develop its own metaphysical outlook on the world. All this has suggested to nearly all students of language that languages are rather idiosyncratic creatures, disorderly in externals but internally autonomous, and in any case very much alive.
Now if you plant something that you thought was a tree and it doesn't grow, then you were mistaken. What you planted was a stick: a poor wooden thing, stiff and unyielding, and without the essential expansiveness, that ability to stretch and swell and fill its niche, that is the very stuff of life. So with languages; if they are alive, they grow. Some partisans of the international language movement have argued that one of the advantages of a constructed language over a natural one, as a universal second tongue, would be that the constructed language wouldn't change. They may be right; but certainly it is a mistake to hope so. For just to the extent that such hopes are fulfilled, they fail. An unchanging language would be a useless corpse, suitable for ritual incantations, perhaps, but quite incapable of serving the changing needs of humans. Latin is said to be a "dead language"; and so it may be. But not because it isn't spoken. For it is obediently intoned in churches and schoolrooms even yet. If it is dead, it is because it has stopped growing. Nobody lives in the house of Latin anymore. Its vast, still rooms no longer echo to the sounds of scuffling life.
Now if Loglan is a language, one of the ways we will know it is that it will grow. The ways in which Loglan has been built to grow are a very carefully considered part of its structure. We have realized from the outset that if the language is to be easily learned, its initial vocabulary must be very small. But even though small, that initial vocabulary must be capable of very rapid growth...growth both in the mind of the learner, as da encompasses the language, and in the community of speakers as they continually expand its corpus to match their expanding need. The machinery we have provided for rapid, spontaneous and yet orderly growth is the main topic of this chapter.
But first a warning. Natural languages grow rather slowly. The proportion of people who now speak English who will contribute a living word to it during their lifetimes is rather small. A few scholars will do it; a journalist now and then; a poet to implement a new perception; an inventor gazing fondly at the new thing da has made; or an occasional gifted urchin who, searching for a word de doesn't know, coins one in desperation that happens to stick among de's fellows and then lives on as slang. But the frequency of these events reckoned over the populations who speak these modern languages is very small. Most of us live with our native languages as we find them. To make new words ("neologisms," as they are called by pedants) is not approved in college classrooms. Word-building is a game we are taught not to play.
Not so with Loglan. If Loglan lives, it will be in part because its very structure stimulates the poetic impulse in a substantial proportion of its speakers. Or because the word-making game in Loglan turns out to be easy and fun to play. In most of this chapter we will be describing the rules of that word-making game. But first let us review the resources with which we start the play.
Let us take stock of what we have. At the present time (1989) the vocabulary of Loglan stands at around 9,000 words. Of these, about 300 are simple little words, the 1-, 2- and 3-letter structure words such as a, ia, ta and tai with which we have been mainly concerned in describing the grammar of Loglan; another 500 or 600 are compound little words, the less commonly used, longer structure words, such as pana and anoi, with which we have had less to do in this book; another few hundred are illustrative names such as Djan and Frans (the number of names in any language is of course unlimited; the set provided in the current Loglan lexicon simply illustrates the naming process); and the remaining 8,000, or more than 90% of the present vocabulary, are predicate words of various kinds.
Among the little words are the hard-worked common words, like le, la, da, pa and ba, some of which occur in nearly every utterance. Such words lay out the grammatical and logical structure of ideas. But if grammar and logic, through their little words, provide the bones of a language, predicates provide its infinitely varied flesh. Grammars, and the little words that impart grammatical structure, are necessarily finite affairs; the vocabulary of predicates, in contrast, is in principle infinite in size. This is true in any language. In theory, and perhaps also in practice, there may be many learned persons who know the whole grammar of English. No one knows the whole predicate vocabulary of English, and now that it has passed the half-million mark, no one ever will. Obviously, if we are concerned with how Loglan might grow, it is with the mechanisms for augmenting its predicate vocabulary that we will be primarily concerned.
Of the 8,000 predicate words that are currently defined in Loglan, approximately 1,500 are borrowings--most of those from the vocabulary of science--and another 1,300 are "primitive" in the sense that they are not derived from any other word or words in the language. The largest subset of the primitive predicates are the basic semantic building-blocks of the language, the 860 "composite" words like mrenu and fumna. Recall that these have been derived as broadly as possible from eight natural languages. But the 5,200 current Loglan predicates which are neither borrowed nor primitive, and which we accordingly call "complex," form by far the largest category of words in the language. These have nearly all been derived from two or more of the composite primitives...with an occasional assist from a little word like ne, no or nu. In English, 'ease' is a primitive predicate: 'easy' and 'uneasy' are complex. More obviously, 'black' and 'berry' are primitive; 'blackberry' is complex.
With relatively few exceptions, the predicates used in the specimen sentences in this volume have been composite primitives. Indeed, only about a hundred of the 860 composites have been used to write the specimens used in this book. These are the words like kamla and godzi, matma and farfu, ditca and kicmu, which we have encountered over and over again. Occasionally a complex predicate has been used to illustrate a grammatical point; but these have been introduced metaphorically as we went along. For example, faltaa [fahl-TAH-ah] ('liar') was introduced as 'false-talker'. The word is a complex formed of parts of the two primitive predicates falji ('false') and takna ('talk'), from which we learn that 'false-talker' is the basic metaphor underlying the Loglan conception of a liar. Similarly, mormao [MAWR-mough] ('kill') is made from mor- and -mao; and these parts come from the primitives morto ('dead') and madzo ('make'). So to kill something is to "make it dead" in Loglan. By a similar metaphor, a scientist is a "science-maker", someone who contributes to the growth of science; so the Loglan word for 'scientist' is sesmao. The word is made from the ses- of sensi ('science') plus -mao again. There are numerous -mao words in Loglan. For example, rojmao comes from rodja madzo or 'grow-make' and so means 'cultivate'. Agronomy can then be conceived as "cultivation-science"; and this metaphor can be rendered by the three-term complex rojmaosensi. Loglan for 'agronomist' then unfolds as the four-term word rojmaosesmao ("grow-make-science-maker"). In this way the pool of predicates can be indefinitely extended by the metaphorical elaboration of the fundamental ideas of the language. It is clear that the elaboration of new complex predicates by the invention of new metaphors will be the major source of future word growth in the language.
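The compounding pattern in the examples above can be sketched in a few lines of Python. This is only an illustration: the affix table holds just the six affixes that can be read off the words quoted in this section, the function name `make_complex` and its `final_in_full` flag are hypothetical, and the sketch ignores any pronounceability or word-resolution checks that actual Loglan word-making may require.

```python
# Affixes read off the examples in the text (faltaa, mormao, sesmao, rojmao).
# A real lexicon would list many more; this table is an illustrative assumption.
AFFIX = {
    "falji": "fal",   # 'false'
    "takna": "taa",   # 'talk'
    "morto": "mor",   # 'dead'
    "madzo": "mao",   # 'make'
    "sensi": "ses",   # 'science'
    "rodja": "roj",   # 'grow'
}


def make_complex(primitives, final_in_full=False):
    """Join the affixes of the given primitives, left to right.

    If final_in_full is True, the last primitive is kept whole instead of
    being shortened to its affix, as in rojmaosensi.
    """
    if not primitives:
        raise ValueError("at least one primitive is required")
    parts = [AFFIX[p] for p in primitives]
    if final_in_full:
        parts[-1] = primitives[-1]
    return "".join(parts)


liar = make_complex(["falji", "takna"])
agronomy = make_complex(["rodja", "madzo", "sensi"], final_in_full=True)
agronomist = make_complex(["rodja", "madzo", "sensi", "madzo"])
```

Read this way, each new metaphor is an ordered list of primitives, and the complex word is what remains after the primitives are shortened and joined.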
But it is not the only source. Our studies of these matters over the years have suggested that there are at least six further ways in which the vocabulary of Loglan will continue to grow. In addition to new complexes, there is a small pool of what might be called "international" words--words like 'telephone', 'football' and 'beer'--which has been only partly incorporated into Loglan. Most of these now-international words originally spread across the planet in the wake of commerce or conquest during the centuries of European expansion; so they are nearly all European words. But despite their local origins, they are now universally recognized. They are therefore ready to be added gratuitously, as it were, to the vocabulary of Loglan. Usually they are incorporated as primitive-form words: futbo, telfo, paspo. Others come in as longer borrowings: tcokolate. About a hundred of these words have already been brought in. Perhaps a few hundred more remain.
A morphological aside. When a Loglan word is primitive in form--i.e., when it looks and sounds like a primitive, as telfo and futbo do--but is derived as these words are from only one or a few languages, it is called a one-source primitive in order to distinguish it from the composite primitives, like sensi, morto and madzo, which are derived from as many languages as possible. Complex predicates like mormao and sesmao are largely made, of course, from composite primitives; for it is these that have been provided with the short affixes, like ses-, mor- and -mao, that are the principal building blocks of the language. But, morphologically, locally-derived primitives like futbo are also primitive...even if they are not semantically so. Words that are neither primitive nor complex, but are still predicates, are called borrowings. These are words like trombona, iglu and proteini which are obviously predicates (they have all the necessary features) but are neither primitive nor complex in form. Armed with these few principles, let us continue our survey of the possible sources of future word-growth in Loglan.
In addition to the international words--which come into Loglan either as one-source primitives or borrowings--there is a second, even larger pool of "native" words which, when their meanings are required by Loglan scholars and story-tellers, will probably also be adopted. These are the local words for languages, peoples, artifacts, articles of food and dress, or even local plants and animals. Such things are often unique to the places where they originated--or at least, like kayaks and parkas, they were originally--and so they should obviously be predicated in Loglan by words which resemble the local words for them as closely as possible. Most of these words, like Innuit 'igloo', are primitive notions to the original users. So they should probably not be built as complexes even though they could be. Thus nichaa [neesh-HAH-ah] ('snow-house') is really not an adequate translation of Innuit 'igloo'. As it happens, the Loglan borrowing can be an exact imitation of the original Innuit word, namely iglu.
By a similar route Swahili simba for 'lion' also entered our international language intact. Like parka, the natural word 'simba' already has the shape of a Loglan primitive. Since the once-widespread lion is now virtually confined to Africa, it seems only fair to use the most widely-used African word for the beast. The Innuit word 'kayak' also names a unique category of objects on this planet, and so also deserves to be preserved and not metaphorized. But as the word /KAiak/ is vowel-rich and consonant-final, it is more difficult to import into Loglan than iglu, parka and simba were. We will learn how to borrow it in Section 6.5.
Another source of locally-derived words is the set of language-nationality-culture triplets which we find in Loglan: words like spana, spani and spano. As we have already noted, all three of these were derived from Spanish 'Espan(i)ol' and mean in some sense 'Spanish'. Such words are brought into Loglan either as one-source primitives, as these three were, or as borrowings. So these, too, form a large pool of importable natural words only a small portion of which has yet been imported. In addition to the "Spanish" triplet above--spana for 'is an instance of the Spanish language', spani for 'is a Spaniard/Spanish person', and spano for 'is a Spanish custom/instance of Spanish culture'--we also have, with parallel meanings, dotca/-i/-o from German 'Deutsch' and 'deutsche', hinda/-i/-o from Hindi 'Hindi' and 'Hindu', ponja/-i/-o from Japanese 'Nippon' and 'nippon-ji', and junga/-i/-o from Chinese 'Zhung' and 'Zhungwo'. There are perhaps a dozen more of these primitive-form triplets in the current lexicon. But this just scratches the surface. There are at least 5,000 nameable languages on this planet; and a distinct people and culture--or at least a sub-culture--can usually be associated with each language.
There is still a fourth body of internationally-used concepts waiting to be incorporated into Loglan, and it is the largest yet. This is the international vocabulary of science. Of these, by a conservative estimate, there are now upwards of a quarter of a million terms. Science words were mainly derived from Latin or Greek by European scholars during the few centuries since the European Renaissance, and they are found similarly-spelled, or spelled with only minor and often quite regular variations, in nearly all European languages. Indeed, when suitably transliterated into other writing schemes, they are to be found in all the other world languages whose speakers now "do science".1 In incorporating these and other scholarly words into Loglan, we regularly use not the sound but the appearance of the scientific word as our model. We do this because, while scientific words are often pronounced very differently in different languages, their appearance on the printed page is usually quite uniform over many languages. The uniformity is often faithfully preserved even when the original Graeco-Latin word has been transliterated into some other alphabet, for example, into Cyrillic, the alphabet of Russian. So it is the scientific reader, not the listener, we must keep in mind when we are borrowing a scientific word for Loglan.
In making a scientific borrowing we try to retain those portions of the international word that are most consistently retained in other languages. This means avoiding the special endings that are often imposed by certain languages as well as language-specific spellings. At the same time, the resulting word must be a Loglan predicate, made with Loglan phonemes, and so display at least one consonant-pair as well as being vowel-final. It must also be regularly pronounceable as a Loglan word however it is spelled. Proteini [proh-TAY-nee] is an example of a word which meets all these requirements handsomely. It not only captures the look of what amounts to the "same word" in all European languages, but its final vowel is the standard /-i/ given to all Loglan scientific borrowings. So to a Loglan reader it has the unmistakable look--and sound, when da pronounces it--of a scientific borrowing. As many science words in Loglan do, it sounds a bit Italian. But that's fair; the scientific Renaissance commenced in Italy.
Not all ISV words ("International Scientific Vocabulary") fare so well. Take the word that is spelled 'atom' in English. Adding the conventional ending -i to the English word (which happens also to be the invariant international stem) gives ?atomi. (A leading question-mark, recall, is the mark of a trial word.) But heard in a Loglan utterance this trial word would break up as the little word phrase a to mi = 'Or two of me'. Thus ?atomi lacks the essential consonant-pair that is the distinctive feature of the Loglan predicate. But this lack is easily supplied. We insert a "gluing consonant"--conventionally the phoneme /h/--between the /t/ and /o/ of ?atomi, and this gives the word the required consonant-pair: ?athomi. Moreover, it puts the consonant-pair in a position that will prevent the trial-word from breaking up. The result is athomi [aht-HOH-mee], a word that now passes all tests. While not so immediately recognizable as proteini is, the Loglan word for 'atom' is still recognizable on second glance... especially by loglanists who know the gluing art and understand the vital need for it.2
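The two shape requirements named above (at least one consonant-pair, and a final vowel) and the gluing step can be sketched as follows. This is a rough check only: the helper names are hypothetical, treating y as a vowel is an assumption of the sketch, and passing the check is a necessary condition at best, since the text notes that where the consonant-pair falls also matters for keeping a word from breaking up.

```python
# Letters treated as vowels; including "y" is an assumption of this sketch.
VOWELS = set("aeiouy")


def is_vowel_final(word):
    """True if the last letter of the word is a vowel."""
    return word[-1] in VOWELS


def has_consonant_pair(word):
    """True if two consonant letters stand next to each other anywhere."""
    return any(
        a not in VOWELS and b not in VOWELS
        for a, b in zip(word, word[1:])
    )


def has_predicate_shape(word):
    """Necessary (not sufficient) shape test for a Loglan predicate."""
    return is_vowel_final(word) and has_consonant_pair(word)


def glue(word, index, consonant="h"):
    """Insert a gluing consonant before the letter at the given index."""
    return word[:index] + consonant + word[index:]


trial = "atomi"
glued = glue(trial, trial.index("o"))  # insert /h/ between the t and the o
```

With these definitions the trial word fails the shape test for want of a consonant-pair, while the glued form passes it.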
Probably the International Scientific Vocabulary is the largest single population of words that will ever be incorporated into Loglan. Now approaching uniformity in all the languages whose speakers follow science, the terminology of science has become a truly planetary linguistic phenomenon, its numbers probably now growing more swiftly than that of any other body of words on the planet. In a sense it is already an international language, one waiting only for its grammar. Loglan may supply that international grammar. Sections 6.5-9 will be devoted to the art of ushering this flood of international words into Loglan.
So far we have mentioned making complex predicates and borrowing three kinds of concepts: international but once-local words in common use, language-nationality-and-culture words, and the vocabulary of science. We have treated these as four sources of potential growth for Loglan. A fifth but much smaller source of future growth is that we may yet find that the present set of semantically basic predicates, the composite primitives, is not yet complete...that is, not yet entirely adequate to the growth needs of the language. It is these semantically basic words, of course, that are employed in building complex predicates. They represent concepts that are nearly universal in human experience, and are represented by primitive predicates in nearly every human language. We will know that the Loglan set of them is incomplete when we find that words that we wish to build as complexes in Loglan may not be built without certain constituent notions that are not yet in our language.
As I mentioned, there are at present about 860 of these composite primitives in Loglan. It is pleasant to report that their number has grown only very slowly over the years. They have been tested three times for constructive adequacy against lists of the most frequently-used concepts in four major European languages (English, Spanish, French and German). So far they have passed these tests with flying colors. That is to say, they have proved collectively capable, with a negligible addition-rate, of supplying metaphors for all the complex notions we have so far encountered in these languages.3 But they have not been tested against languages which are representative of other language-families than the Indo-European one. It will therefore not be surprising, on Whorf's hypothesis, to discover that in embracing the concepts of the non-European languages, as we are now on the verge of doing, a handful of new "fundamental notions" may yet be required. If so, a new handful of composite primitives will have to be added. These will then be used to express the complex notions of still other major branches of human thought.
As their name implies, composite primitives are constructed compositely from words drawn from many languages. This procedure is described in Section 6.5 below.
A sixth source of word-growth is obviously inherent in the privilege every loglanist has of adding new names to the language. These may be either imitations of natural names, or entirely new coinages. There is in principle no limit to the number of names that may be added to the language. The making of names is discussed in Section 6.13.
Seventh and finally, there are certain "open" classes of CVV-form little words to which new members may be added from time to time. Chief among these are the modal operators, such as lia = 'like', and the discourse operators, such as sui = 'also', which may be indefinitely augmented. In addition, compound attitude indicators, like aiui = 'Yes, gladly', may also be added at will. Changes in, or additions to, other classes of little words will be more difficult to accomplish; but it is probable that needs for new punctuation words will be discovered, and it is even conceivable that whole new systems of little words will have to be incorporated to accommodate grammatical features of natural tongues which have so far been overlooked in our analyses. The unused CVV-form words listed in Paradigm L of Loglan 4 & 5 are available for making these additions. The V, VV and CV word-spaces have long ago been exhausted.
In sum, there are seven sources of future word-growth for Loglan: (1) the addition of new complex predicates (like djasolsensi for 'sociology of knowledge'); (2) the incorporation of international words (like futbo) and (3) local food, tool, plant and animal words (like gorgonzole and atlatlu), (4) the coining of language, nationality and culture triplets (like inhuita/-i/-o = 'Innuit' or 'Eskimo'), (5) taking in yet more of the international vocabulary of science (like deoksiribonukli), (6) bringing in new person and place names (like Betcua'naland), and finally (7) inventing new compound little words (like pacenoinacefacenoifazu = 'once-and-future-but-not-now-and-not-for-long', an ad hoc invention for the present paragraph) and the assignment of meanings to little word forms presently unassigned. In the remainder of this chapter we will consider how these seven elastic chambers of the language may be utilized by the speaker who wishes to contribute to its growth.
The composite primitives of Loglan are the semantically universal predicates of human experience. Such words have all been made in Loglan by deriving them from the eight source languages in such a way that the probability of recognizing them, as measured over the population who speak one or more of these eight languages, has been maximized. In general this has been done by ensuring that each phoneme in the constructed word appears in as many natural words of similar or related meaning as possible. The result is that these Loglan primitives have been made of overlapping pieces of natural words. It is in this sense that they are composite words.
The decision to add a new composite primitive to the language is a major one. For one thing, the work involved is many times greater than that required to make a complex, a borrowing, or a local primitive. For another, the decision presupposes three things: that the machinery for making complex predicates either cannot or should not be applied to the needed notion; that the notion is not already conveyed by a word that has spread internationally, which would justify our borrowing it; and that it is not conveyed by a word of only local origin or significance, which, for a different reason, we would also be justified in borrowing. By this time, that presupposition is likely to be a questionable one. To fail to find a suitable metaphor does not mean one cannot be found; and the discovery of a new, truly basic primitive is now an exceedingly rare event.
One late arrival in the set of primitives will illustrate this. It was the word setci = 'set'. The word klesi for 'class' or 'category' had been around for a long time. But there are many collectivities which are not classes, in that their members do not share any taxonomically useful characteristic--as mammals do, for example, and divorced American mothers also do--and which are, therefore, only sets. For example, the 37 objects bigger than a dust-speck that are currently on my desk constitute a set, as does every tenth word in this book. The distinction between sets and classes is semantically fundamental...in a logical language. Moreover, there are certain complex notions--like 'genotype', 'character', 'kin', and so on--that can hardly be defined, or even thought about, without using the notion of a set. Klesi, to be sure, could have been made from the idea of a set; but not the other way round. Klesi was already deeply embedded in the semantic structure of the language; and anyway, until their differences from sets have been analyzed, classes and categories are more likely to be objects of direct perception than sets are. One sees the cardinals flitting through the woods as representative of their larger, unseen class: the species Pyrrhuloxia cardinalis. But one does not see the scattered objects on one's desk as members of any collectivity...until, that is, one begins to think metalinguistically about just such phrases as 'the objects scattered on my desk'. It is from such historically late thoughts about meaning that the concept of sets arises. But late or not it is fundamental to our hypermodern language. For all these far-flung reasons, then, setci was reluctantly added to the primitive notions of Loglan, and klesi was retained.
Still, the occasion will again arise--though one would hope, infrequently--when a concept of truly general human significance will be found to have no good translation into Loglan and for which the existing set of primitives provides no ready route by which to express it metaphorically. As mentioned in the previous section, just this occasion may arise when and if the productive adequacy of the existing primitive set is tested against the complex concepts of some non-Indo-European language, say Chinese; for when this is done it may turn out that a supplementary set of "fundamental notions" will have to be added to our international language. It is well, then, to record here the methods by which composite primitives have already been built, and by which new ones may still be added to the language.
In order to achieve a kind of cultural neutrality for Loglan, its primitive predicates were formed as composites of words of similar meaning in the natural languages. As I indicated while discussing Loglan design principles in Chapter 1, this should make cross-cultural experimentation with Loglan not only possible but fair.4 To this end, it became our technical objective to make at least this one very important category of Loglan words--arguably the most important, for among them are the elementary semantic notions of any language--maximally recognizable over the largest possible population base...or at least one large enough to afford a variety of both linguistic and cultural contrasts around which to design cross-cultural experiments of a Whorfian cast. This objective was easier to meet than it now sounds. The first eight languages with which we tried to meet it turned out to be entirely adequate--perhaps more than adequate--to the task. For the total amount of recognizability it was possible to build into Loglan words from a consideration of just these languages was surprisingly large.
The languages chosen were, in the order of the combined totals of their primary and secondary speakers in 1955: English, Mandarin Chinese, Hindi, Russian, Spanish, Japanese, French and German. These are the eight most widely spoken human tongues. The trial addition of a ninth language, Arabic, while it would have appreciably increased the size of the population base, as well as the linguistic and cultural variety of future experimental designs, turned out to make no significant contribution to the average recognizability of our composite words; so Arabic was not retained. Tests in the other direction--that is, of the effects of dropping German, French, Japanese, etc., in reverse order from our list--were not carried out. We were satisfied with what may well be the over-achievement of our aim.
Linguistically considered, this set of what we may now call the "source languages" of Loglan contains representatives of three language-families: Sino-Tibetan (Chinese); Japanese, which is sui generis; and Indo-European (all the rest of the source languages). Looked at geographically, the source set contains the three most important Asiatic languages--Chinese, Hindi and Japanese--as well as the five most important European ones. Among the source European languages, there is a Slavic language (Russian); two Germanic languages (German and English); and two Romance languages (French and Spanish). Some of the most interesting structural differences among human languages, from the Whorfian point of view, are spanned within this group of eight. As to culture, differences large enough to satisfy the most eager Whorfian also exist among the peoples represented in this group. The Chinese-Western difference, the Chinese-Japanese difference, the Indian-Western difference, the Romance-Germanic one--differences which express themselves in some of the widest philosophical, aesthetic, religious, political, metaphysical and ideological contrasts now partitioning the planet--all exist within the domain of Loglan's carefully contrived neutrality to be experimentally explored.
The population from which subjects for these experiments might be enlisted, and about which, therefore, conclusions might be drawn, is also satisfactorily large. The number of human beings who spoke one or more of the eight source languages as a native tongue, or, if none of them, then one or more of them as a second language, was, on the statistics available in 1955, roughly three-quarters of the population of the Earth.5 The average recognizability score of a Loglan composite word over this immense source population is 45%. This figure means that a person selected at random from this population would have the mathematical expectation of finding that about 45% of the sounds of some word of related meaning which da knows, appear in the same order in each new Loglan word with which da might be presented. Given the linguistic diversity of the source languages, this figure is surprisingly high. One would not have thought there would be so much commonality in the sounds of such historically diverse languages. But in part this high figure derives from the generosity of the Loglan five-letter form. For very often two very short natural words can be accommodated, as it were, side by side. This often happens, for example, with Chinese and English, the two dominant languages on the list, both of which abound in short words. Thus Loglan forli ('strong') contains three-fourths of English 'fort' followed by all of Chinese 'li'.
For the immense number of primary and secondary speakers of English--now reckoned at well over 30% of the world's five billion people--the probability of finding recognizable phonemes in a Loglan composite primitive is substantially more than .45; in fact it is about .70. English speakers are not only marginally more numerous than the second largest group, namely speakers of the Beijing dialect of Chinese, but unlike Chinese, English has lexical affinities with three other languages on the list. It shares some of its vocabulary with both Romance languages in the source set as well as other words with the other Germanic tongue. In other words, English has more linguistic allies in the source set than any other language in it. This, too, contributes substantially to the high average amount of English to be found in the typical composite word. Japanese, being sui generis, and so with the fewest allies, is also among the less widely spoken languages on the list. So it makes the smallest contribution to the Loglan composite word, with an average containment of Japanese phonemes of only 12%. The wide spread between these two figures is a consequence of our method of making--or, more accurately, discovering--high-scoring composite sequences, which is the method we will now describe.6
To maximize the probability of recognition of a given word over the source population, it was only necessary to weight the contributions made by each language to each trial word by a number which represented the proportion of the total source population who were speakers of that language.7 Then the trial word that demonstrably had the highest weighted-score was taken to be the best Loglan word available for the concept. The proportions, or scoring weights, which were used in this work are given below:
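| English | .28 |
| Chinese | .25 |
| Hindi | .11 |
| Russian | .10 |
| Spanish | .09 |
| French | .06 |
| Japanese | .06 |
| German | .05 |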
Let us now consider some examples of how these weights were used.
For the concept 'week' the trial word which received the highest R-score ("recognition score") was likta; and likta contains:
| 2/3 of | English 'week' | /uik/ yielding | 2/3 X .28 = | .19 |
| 3/5 | Chinese 'li bai' | /libai/ | 3/5 X .25 = | .15 |
| 3/7 | Japanese 'isshukan' | /iscukan/ | 3/7 X .06 = | .03 |
| 2/5 | Hindi 'saptah' | /sapta/ | 2/5 X .11 = | .04 |
| 3/9 | Russian 'niedielia' | /niedielia/ | 3/9 X .10 = | .03 |
| Total R-Score: | .44 |
The phonemes in the natural word which were counted as matching phonemes in the Loglan word are given in boldface.
Likta is a typical word in more ways than one. Not only is its score very near the average, but the fact that the only substantial contributions to its recognizability are made by English and Chinese is also typical of a very large group of composite primitives.
Here is the derivation of a fairly high-scoring word, djano = 'know', to which three languages made substantial contributions; for djano contains:
| 2/2 | English 'know' | /no/ | 2/2 X .28 = | .28 |
| 4/4 | Hindi 'jan-na' | /djan/ | 4/4 X .11 = | .11 |
| 4/5 | Chinese 'j dao' | /djdao/ | 4/5 X .25 = | .20 |
| Total R-Score: | .59 |
Note that the suffix '-na' is omitted in the phonemic transcription of the Hindi word 'jan-na' and that English 'know' is not transcribed as /nou/, as it would be if we were concerned with the exact phonetics of this word. This is because English speakers hear this diphthong as /o/.
As a third example, consider the derivation of a low-scoring word, dzoru = 'walk', which happens to be exclusively derived from the two Far Eastern languages; for dzoru contains:
| 4/4 | Chinese 'dzou' | /dzou/ | 4/4 X .25 = | .25 |
| 2/3 | Japanese 'aru-ku' | /aru/ | 2/3 X .06 = | .04 |
| Total R-Score: | .29 |
Again, a suffix, '-ku', is ignored in the calculation.
Here is another moderately high-scoring word, morto = 'dead'. This one comes exclusively from the six Indo-European languages; and it contains:
| 3/3 | Spanish 'mor-ir' | /mor/ | 3/3 X .09 = | .09 |
| 3/3 | French 'mort' | /mor/ | 3/3 X .06 = | .06 |
| 4/5 | English 'mortal' | /mortl/ | 4/5 X .28 = | .22 |
| 3/4 | Hindi 'mrit' | /mrit/ | 3/4 X .11 = | .08 |
| 2/3 | German 'tot' | /tot/ | 2/3 X .05 = | .03 |
| 3/5 | Russian 'smert' | /smert/ | 3/5 X .10 = | .06 |
| Total R-Score: | .54 |
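The arithmetic of these four derivations can be checked mechanically. The Python sketch below is only an illustration: the names `WEIGHTS`, `contribution` and `r_score` are invented here, the weights are simply those read off the derivations above, and rounding each language's contribution to two decimals before summing is an assumption that happens to reproduce the printed totals.

```python
from fractions import Fraction

# Scoring weights in hundredths, as they appear in the derivations above.
WEIGHTS = {"English": 28, "Chinese": 25, "Hindi": 11, "Russian": 10,
           "Spanish": 9, "French": 6, "Japanese": 6, "German": 5}

def contribution(language, matched, length):
    """One language's contribution in hundredths, rounded half up."""
    exact = Fraction(matched, length) * WEIGHTS[language]
    return int(exact + Fraction(1, 2))  # floor(x + 1/2) for positive x

def r_score(derivation):
    """Sum of the rounded contributions, in hundredths."""
    return sum(contribution(*row) for row in derivation)

# (language, matched phonemes, phonemes in the natural word)
LIKTA = [("English", 2, 3), ("Chinese", 3, 5), ("Japanese", 3, 7),
         ("Hindi", 2, 5), ("Russian", 3, 9)]
DJANO = [("English", 2, 2), ("Hindi", 4, 4), ("Chinese", 4, 5)]
DZORU = [("Chinese", 4, 4), ("Japanese", 2, 3)]
MORTO = [("Spanish", 3, 3), ("French", 3, 3), ("English", 4, 5),
         ("Hindi", 3, 4), ("German", 2, 3), ("Russian", 3, 5)]

for name, rows in [("likta", LIKTA), ("djano", DJANO),
                   ("dzoru", DZORU), ("morto", MORTO)]:
    print(name, r_score(rows) / 100)
```

Working in exact fractions and whole hundredths avoids floating-point surprises in the rounding step.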
Each word adopted was, of course, the survivor of a competition among as many as a dozen trial words. In nearly all cases the accepted word was demonstrably the best word that could be made from the lexical materials considered for each language. The exceptional cases were those in which the best word conflicted with an existing word, and so a slightly lower-scoring word had to be adopted.
As the above derivations show, a natural word was held to make a non-negligible contribution to a trial word only under fairly rigid circumstances. Any sequence said to be a "matching" one had, in fact, to meet the following three conditions:
In other words, we found that if only two phonemes were involved in a given match, they had to be reasonably close together and found in the same consonant-vowel pattern in both words. Thus the /aCo/ pattern in ?flako was said to match the /aCo/ pattern in 'tabol', but this sequence was not taken as matching either the /aVo/ in 'saio' or the /aCCo/ in 'tabro'. On the other hand, we found that if as many as three phonemes were matchable between the trial and the source word, then the similarity between them was evidently more robust. For it apparently didn't matter whether the contexts matched or not. We found that the order of even three phonemes continued to matter, however.
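One way to picture these two requirements is sketched below in Python. It is a rough illustration and not the procedure actually used: it assumes that one letter of a transcription stands for one phoneme, it counts same-order matches as a longest common subsequence, and the function names are invented for the purpose. The question of which natural sounds "match" which Loglan phonemes is taken as already settled by the transcription.

```python
VOWELS = set("aeiou")

def ordered_matches(trial, source):
    """Largest number of phonemes the two words share in the same order
    (a longest-common-subsequence count)."""
    prev = [0] * (len(source) + 1)
    for t in trial:
        curr = [0]
        for j, s in enumerate(source, start=1):
            curr.append(prev[j - 1] + 1 if t == s else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def cv_pattern(span):
    """Consonant-vowel pattern of a stretch of phonemes, e.g. 'ako' -> 'VCV'."""
    return "".join("V" if p in VOWELS else "C" for p in span)

def context(word, first, second):
    """Pattern from the first `first` to the next `second`, or None if absent."""
    start = word.find(first)
    if start < 0:
        return None
    end = word.find(second, start + 1)
    if end < 0:
        return None
    return cv_pattern(word[start:end + 1])

def two_phoneme_match(trial, source, first, second):
    """True if both words hold the two phonemes in the same CV context."""
    pattern = context(trial, first, second)
    return pattern is not None and pattern == context(source, first, second)

print(ordered_matches("likta", "libai"))               # phonemes shared in order
print(two_phoneme_match("flako", "tabol", "a", "o"))   # same /aCo/ context
print(two_phoneme_match("flako", "tabro", "a", "o"))   # /aCCo/ differs
```

Applied to the transcriptions in the derivations above, the subsequence count gives the same numerators as the tables, though that agreement should not be read as proof that the two methods always coincide.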
What sounds of what source languages were taken to "match" what Loglan phonemes was a fairly complex problem in local phonemics which was settled differently for each source language. In Spanish, for example, any 'a' is said to match Loglan /a/; in English, only the 'a' of 'father', the 'o' of 'not', and the 'a'-sound occurring in diphthongs like 'aye' and 'ow' were said to match Loglan /a/. But because it "sounds like" 'o' to English speakers, the English final diphthong /ou/, as in /nou/ = 'know', was said to match Loglan /o/. In French, nasal 'a' and 'o' were taken to match Loglan /a/ and /o/ respectively provided they were followed by /n/ or /m/ in the Loglan words, for then it seems so to French ears. In Chinese, the sounds represented by 'hs' and 's' in the Wade system of transcription were both taken to match Loglan /s/; the sound written 'j' in Wade but with 'r' in the Yale system and in Pinyin is matched with Loglan /r/ even though this sound is not recognizably 'r' to any European ear; and the Chinese sound written 'ssu' in Wade, 'sz' in Yale, and 'si' in Pinyin is unmatchable with Loglan because the sequence /sz/, which is its approximate phonemic value, is proscribed in Loglan predicates. Of course all eight source languages have sounds that do not match any Loglan phonemes at all. But the above remarks will give some idea of the problems encountered in comparing the sounds of Loglan trial words to the words of several phonologically quite different languages simultaneously, and some of the ways in which these problems have been solved. In general, our ruling principle has been to set up such identities between Loglan and each source language as would fairly predict which sound-pairs would seem "recognizably identical" to a listener from that language, and which would not.
The recommended procedure for contributing a new composite primitive to the language is as follows: First, make a list for each of the eight source languages of all possible words in that language which might serve as mnemonic cues to the concept to be defined. They do not have to be synonyms.8 Eight good pronouncing dictionaries will be required. Second, transcribe each such potential cue-word into Loglan phonemics. Use such matching principles for each language as you can devise, or which you can induce from the derivations given in the Loglan-English dictionary. Use asterisks (or some other non-phonemic character) to record non-Loglan sounds in these transcriptions; for although they can make no contribution to a trial word, they must be counted as contributing to the overall length of each cue-word.9 Third, make trial words. Each word must, of course, be of either CV'CCV-form or CCV'CV-form, and any initial or medial consonant-pairs must appear in Tables 2.1 and 2.2, respectively, of Chapter 2; that is, they must be permissible. In making trial-words, you may be guided by four hypotheses, given here in the diminishing order of their likelihood: H1: The best word will maximize the joint contribution of Chinese and English. H2: The best word will maximize the contribution of English. H3: The best word will maximize the contribution of Chinese. H4: The best word will not maximize either Chinese or English, or Chinese and English jointly, but will capitalize on some adventitious commonality among cue-words of other languages. H4-words are exceedingly rare. There may be, of course, several trial-words to be tested under each hypothesis, and words produced under H1, H2 and H3 may be identical. Fourth, prove that the highest-scoring trial word or words is in fact the best possible word on the lexical materials you have assembled. Fifth, ascertain whether the best word conflicts with some existing word, and if it does, adopt the best of the words that do not conflict.
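The form requirement in the third step lends itself to a mechanical check. The Python sketch below is an illustration only: the function names are invented, and since the contents of Tables 2.1 and 2.2 are not reproduced here, the sets of permissible pairs are left as optional arguments that the caller would have to fill in from those tables.

```python
VOWELS = set("aeiou")

def shape(word):
    """Consonant-vowel shape of a word, e.g. 'likta' -> 'CVCCV'."""
    return "".join("V" if ch in VOWELS else "C" for ch in word)

def is_primitive_form(word, initial_pairs=None, medial_pairs=None):
    """True if the word is CVCCV or CCVCV and its consonant pair is allowed.

    initial_pairs / medial_pairs stand in for Tables 2.1 and 2.2; when a
    set is None, permissibility of that kind of pair is not checked.
    """
    form = shape(word)
    if form == "CVCCV":
        return medial_pairs is None or word[2:4] in medial_pairs
    if form == "CCVCV":
        return initial_pairs is None or word[0:2] in initial_pairs
    return False

for candidate in ["likta", "djano", "dzoru", "morto", "saadja"]:
    print(candidate, shape(candidate), is_primitive_form(candidate))
```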
A trial-word is said to conflict with an existing word if (a) they are phonemically identical (no homonyms are allowed), (b) they differ only in their final vowels (likta and ?likti would conflict, for example), or (c) their only difference is a pair of consonants that occupy adjacent vertices on the "square of sibilants":
s------c
|      |
|      |
z------j
What this third kind of conflict means is that a pair of primitive-form words that differ only in the /s c/ difference, for example--as cimra and ?simra ([SHEEM-rah] vs. [SEEM-rah]) would--are very likely to be confused in conditions of moderate noise; so this may not be the only difference between them. (Cimra/?simro would have an acceptably larger difference between them, for instance.) /c j/, /j z/ and /z s/ as only differences would also lead to word-pairs that are likely to be confused. However, the "diagonal differences" /s j/ and /c z/ in the square of sibilants are quite acceptable. For example, monca = 'mountain' and monza = 'morning' function side-by-side in the language with no evident problems. This is probably because two phonetic features discriminate the diagonal pairs--specifically the presence or absence of voice and the front-back position of the tongue--whereas only one feature discriminates the pairs on the sides of the square. Our research has shown us that differences in two features are always sufficient to distinguish word-pairs like monca/monza and jurna/surna even in conditions of moderate noise. Notice, also, that words differing only in their stressed vowels, like kerti/kurti, are quite distinct; whereas the same phoneme-difference in an unstressed syllable, as in larte/?lartu, would have generated conflict had we allowed such phonologically close pairs to exist in the language.10
Members of spana/-i/-o-type triplets do, of course, conflict with one another by this last criterion. But this is allowed. In fact, it is a deliberate feature of these semantically closely-related sets of local primitives. The fact that they are similar in all but their final vowels signals the close semantic relationships between the members of these language-nationality-culture triplets.
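The three kinds of conflict, and the fifth step of the procedure, can be sketched in Python as follows. This is a minimal illustration with invented names and hypothetical scores, not the Institute's own checker. Note that it flags the members of a spana/spani/spano triplet as conflicting; exempting such deliberate triplets is left to the word-maker.

```python
VOWELS = set("aeiou")
# Sides of the square of sibilants; the diagonals s/j and c/z are left out
# on purpose, since those differences are acceptable.
ADJACENT_SIBILANTS = {frozenset("sc"), frozenset("cj"),
                      frozenset("jz"), frozenset("zs")}

def conflicts(trial, existing):
    """True if the two primitive-form words conflict under (a), (b) or (c)."""
    if trial == existing:                       # (a) homonyms
        return True
    if len(trial) != len(existing):
        return False
    diffs = [(a, b) for a, b in zip(trial, existing) if a != b]
    if len(diffs) != 1:
        return False
    a, b = diffs[0]
    if trial[-1] != existing[-1] and a in VOWELS and b in VOWELS:
        return True                             # (b) only final vowels differ
    return frozenset((a, b)) in ADJACENT_SIBILANTS   # (c) adjacent sibilants

def best_available(scored_trials, lexicon):
    """Highest-scoring trial word that conflicts with nothing in the lexicon."""
    for word, _score in sorted(scored_trials, key=lambda p: p[1], reverse=True):
        if not any(conflicts(word, old) for old in lexicon):
            return word
    return None

print(conflicts("simra", "cimra"))   # adjacent sibilants only
print(conflicts("monza", "monca"))   # diagonal pair
# Hypothetical scores, purely for illustration:
print(best_available([("likti", 50), ("lakto", 40)], ["likta"]))
```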
When you have made your new composite primitive, and assured yourself that it generates no conflicts in the existing lexicon, you may send your new word with its definition and derivation--arranged more or less as the dictionary entries for composite primitives are arranged in Loglan 4 & 5--to the address of The Institute given on the title page of this volume. Accompany it with a brief argument as to why you think we need it. For example, lists of useful complexes which might be made with your primitive would be germane. If, in the judgement of the Word-Makers Council, your word and the concept which it expresses are useful additions to the Loglan composite primitives, it will ultimately appear in the next edition of our dictionary, and, in the meantime, notice will be given of its acceptance, along with other current new words, in the various bulletins to that effect that appear from time to time in The Institute's other publications.
To add a new complex predicate to the language one must first coin a metaphor or metaphors capable of suggesting the meaning of the new predicate to future learners of the language. Remember that for the foreseeable future Loglan will be learned as a second language by adults, only later by children. It is in this adult context that the importance of good metaphors will be seen. For example, the metaphor "sign-know" is very likely to suggest to a newcomer to Loglan the meaning of a predicate that is well-translated by English 'understand', especially if it is to be used in the sense of knowing the meaning of some sign or message. Surely it will do so a little better than 'under' plus 'stand' have ever succeeded in doing for adult learners of English!
Second, after one has made a metaphor, one must make sure that this metaphor or these metaphors are in fact expressible in Loglan primitives or borrowings. The metaphor "sign-know" is so-expressible; for both sanpa ('sign') and djano ('know') are elemental words...in Loglan as well as English.
Third, one must make a trial word or words from the Loglan version of each metaphor, making sure that it or they are in morphologically permissible form. For example, sanpa has two short affixes, san- and -saa, while djano has but one: dja. So the only possible 6-letter renderings of this complex notion are ?sandja and ?saadja. The morphology tells us (Table 2.3) that the */n+dj/-joint is not permissible. In fact it is unintelligible. The /d/ becomes inaudible in this context; so the string that is intended to be */ndj/ will be heard as /nj/. Hence ?sandja will be heard as sanja, which is primitive in form and thus misinforming. So if we want to use san + dja, we must buffer the */n+dj/-joint with the hyphen /y/. The result is sanydja [SAHN-nuh-jah], a 7-letter word. The hyphen looks awkward for so common a word. So we decide we prefer the one surviving 6-letter form saadja [sah-AH-jah].11
Fourth and finally, one must discover which, if any, of the trial-words is in fact still available in the sense of falling within the remaining "free word-space" of the language. At the time it was made, the saadja-slot was still free, so saadja was adopted as the word for 'understand'. The metaphor was adopted in 1962, and the word itself was remade with the new morphology in 1982.
Let us take up these points one at a time from the perspective of the maker of a new complex.
Behind every complex predicate, old or new, stands a metaphor. One's insight into the meaning of that metaphor may be as sudden as a hammer-blow, like the immediate understanding that morto madzo ('dead-make') can mean nothing else than 'kill', or, like the French phrase 'savoir faire' ('to know to do'), one's understanding of its meaning may sink in only slowly over the first half dozen occasions of its use. (The same metaphor in Loglan, by the way, is durzo djano = 'do-know'; this is the thought that lies behind the Loglan complex predicate duodja [doo-OH-jah] = 'know how to do'.)
Sometimes one's understanding of a metaphor depends on the contrasts it makes with other, similarly constructed ideas. For example, the difference between the active kind of knowing conveyed by duodja and the more passive sort conveyed by saadja is a larger difference than the one between saadja and siodja [SYOHD-jah], which also means 'understand' but this time in the very different sense of comprehending the workings of some individual or system. Siodja, happily enough, is derived from the metaphor sisto djano, or 'system-know', which is an idea that seems very satisfactorily to convey the subtle difference between these two basic kinds of understanding ('I understand him' vs. 'I understand what he is saying').
A good metaphor must also avoid the emptiness of one term's being included in the sense of the other. If the meaning of one term in a metaphor is already implicitly contained in the meaning of any of its companions, then including it does not add much. Thus ?bersakli = berti sakli ('carry-sack') does not express the idea of a suitcase very effectively because (nearly) any sack or bag can be carried. ?Berbao = berti bakso ('carry-box') is an improvement since perhaps not all boxes can be; but ?berbao does not quite yet make the point about luggage. Racysakli [rah-shuh-SAHK-lee] made from traci sakli ('travel-sack') is a better metaphor since travelling is at least a surprising thing for a sack to do; but the word itself is awkwardly long. Note that the proscribed sequence */cs/ in ?racsakli required hyphenation. Moreover, there is no 3-letter affix for sakli. So racysakli is perhaps longer than we'd like the word for 'luggage' to be. So the metaphor we first chose to convey the idea of a piece of luggage in Loglan was traci bakso ('travel-box' or 'traveller's-box'). This yielded the complex racbao. We decided that this word was best for 'suitcase' because (i) all pieces of luggage travel, (ii) not all boxes travel, and (iii) nearly all luggage these days is rectilinear or boxlike in shape. Indeed, those that are not--like the duffle bags that sailors still use--could well be called racysakli ('travel-sack') in Loglan, a word with plain affinities to racbao. But having a pair of words for different kinds of luggage then suggested that the general term for luggage ought to be not racbao but racveo [rahsh-VEIGH-oh], the second part of this new complex being derived from veslo, a word which means 'vessel' and is the generic word for 'container' in Loglan. Thus racveo could mean 'travel-container'. So after a number of false starts it appeared that we had at last arrived at our destination. 
The lesson to be learned from this story is not to fall in love with one's early metaphors. Let them lie around unloved for a while. If you do, better ones are very likely to come along.
Some metaphors require a fairly close analysis of the concept to be expressed. The English metaphor to be 'on the verge of', in the sense of being about to do something, is an example of an idea that defied successful metaphoric capture for some time. We tried moidru from modvi durzo ('intend-do') and durmoi from durzo modvi ('do-intend'); but intention is not the essential feature of an actor on the verge of something. Da may have been intending da's action for some time. It is the fact that da is on the edge of acting now that distinguishes a person about to do something from one who only placidly intends. Once this is seen, a happy metaphor is immediately forthcoming; and the Loglan word for 'be about to do' became durbiesni [door-BYESS-nee], a contraction of the phrase durzo bidje snire, a phrase which in turn translates literally and happily as 'do-edge-near'. So in the end we went back to the figure lurking behind the English metaphor. A person who is "on the verge of" something is indeed "near its doing-edge".
One might conclude from these examples that metaphor-making is essentially a poetic act. This would not mean, however, that only poets can do it...or if it did, that would not be much of a restriction. For there is probably a poet lurking inside every human head.12
Occasionally one makes two or more related complexes at once. Thus groracbao meaning 'trunk', made from groda racbao ('big-suitcase')--actually, from groda traci bakso ('big-travel-box'); for the terms of metaphors within metaphors must also be meaningful when unravelled--was made at the same time as racbao was, and suggests that the essential difference between a suitcase and a trunk is size. And so it is; for both are boxlike and both are luggage. And both contrast in shape with racysakli, the duffle-bags. On the other hand, all are racveo, the travel-vessels. Thus by contriving this quartet of words all at once, we permit racveo to be used generically, whereby Kambei lomi racveo eo ('Bring my luggage, please') may be expected (in Loglandia) to fetch all one's luggage however varied.
Suppose, now, we have a satisfactory metaphor or set of metaphors for some new predicate idea we wish to express, and that we have been able to write these metaphors in Loglan using only the existing set of primitives. How then do we build trial words? The complex-making procedure is algorithmic; which means that a computer program can do it for you.13 This is how that program works:
Step 1. It assembles all possible affixes for each position in the metaphor. Thus for sanpa djano it assembles:
| 1st Term | 2nd Term |
|---|---|
| san | dja |
| saa | djano |
| sanpy | |
Note that a CVC-form affix need only be considered for non-final positions. CCV- and CVV-form affixes may be used in any position, but, as we'll see in Step 3, if one of the latter is initial in a 3-term or longer complex, it must be hyphenated to the body of the word to keep it from "falling off".
Step 2. It then generates all possible trial-words. In this case, there are six of them:
| ?sandja | ?sandjano |
| ?saadja | ?saadjano |
| ?sanpydja | ?sanpydjano |
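Steps 1 and 2 amount to taking one affix from each column and concatenating them in every possible way. A minimal Python sketch, with the affix lists copied from the table in Step 1 and the function name invented for the illustration:

```python
from itertools import product

def trial_words(affix_lists):
    """Every concatenation that takes one affix per term, in term order."""
    return ["".join(parts) for parts in product(*affix_lists)]

SANPA = ["san", "saa", "sanpy"]   # affixes usable in non-final position
DJANO = ["dja", "djano"]          # affixes usable in final position

for word in trial_words([SANPA, DJANO]):
    print("?" + word)
```

Because the lists are supplied per position, the restriction of CVC-form affixes to non-final positions is enforced simply by what the caller puts in each list.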
Step 3. The algorithm then examines the "joints" in these trial-words to discover whether any need hyphenating, using Tables 2.2-3 to do so. Three joints do. ?sandja and ?sandjano require hyphen /y/ because of the unintelligible */n+dj/, and ?saadjano requires hyphen /r/ because, without it, /saa/ would fall off.
So hyphenated, the new set of trial words is:
| ?sanydja | ?sanydjano |
| ?saadja | ?saardjano |
| ?sanpydja | ?sanpydjano |
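The hyphenating step can be sketched for two-term words as below. Several things here are assumptions rather than statements of the actual rules: the set of joints needing /y/ is passed in by the caller, because Tables 2.2-3 are not reproduced here, and the test for when an initial CVV-form affix needs /r/ (whenever a CVV-form affix or a five-letter primitive follows) is only a guess that fits the worked examples in this chapter. The choice of /n/ when the next phoneme is /r/ is likewise sketched without a worked example to check it against.

```python
VOWELS = set("aeiou")

def is_cvv(segment):
    """True for CVV-form affixes such as 'saa' or 'mao'."""
    return (len(segment) == 3 and segment[0] not in VOWELS
            and segment[1] in VOWELS and segment[2] in VOWELS)

def hyphenate(first, second, bad_joints):
    """Join two segments, inserting /y/ or /r/ (or /n/) where needed."""
    # /y/ buffers joints the caller lists as impermissible, e.g. ('n', 'dj').
    if any(first.endswith(a) and second.startswith(b) for a, b in bad_joints):
        return first + "y" + second
    # Assumed rule: an initial CVV affix is hyphenated before another CVV
    # affix or before a five-letter primitive.
    if is_cvv(first) and (is_cvv(second) or len(second) == 5):
        hyphen = "n" if second.startswith("r") else "r"
        return first + hyphen + second
    return first + second

BAD_JOINTS = {("n", "dj")}   # the one impermissible joint met in this example
for a in ["san", "saa", "sanpy"]:
    for b in ["dja", "djano"]:
        print("?" + hyphenate(a, b, BAD_JOINTS))
```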
Step 3 reminds us that there are two kinds of intraverbal hyphens in Loglan as was mentioned in Chapter 2. One is hyphen /y/, which is used to buffer otherwise unpronounceable or unintelligible consonant sequences and to attach 4-letter affixes like sanp- to their neighbors; the other is hyphen /r/ (with its allomorph /n/) which is used to join two CVV-form affixes together in two-term complexes and to prevent initial CVV-form affixes from falling off three- or higher-term complexes.
Step 4. The algorithm then looks for and eliminates trial-words that break either of two rules: (a) No word is allowed to have adjacent identical vowels unless one of them is stressed. (b) The trial-word must not resolve into a CV-word followed by some unintended complex.
Part (a) of Step 4 eliminates ?saardjano but allows ?saadja because the second /a/ in the shorter word is stressed: /saADja/. Part (b) is called the "Tosmabru Rule" because the trial-word ?tosmabru has this property. The trial word ?tosmabru was once intended to be heard as tos+mabru; but instead it resolved as the phrase to sma+bru. None of our trial-words fails the Tosmabru Test.
Five trial-words survive Step 4:
| ?sanydja | ?sanydjano | ?saadja | ?sanpydja | ?sanpydjano |
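Part (a) of Step 4 can be sketched in Python as follows. The sketch assumes that stress falls on the next-to-last vowel, that every vowel letter counts as a syllable of its own, and that the hyphen /y/ is not counted; the second assumption holds for the six trial-words here but not for words containing diphthongs such as mao. The Tosmabru test of part (b) is not attempted.

```python
VOWELS = "aeiou"   # the hyphen /y/ is deliberately not counted

def stressed_index(word):
    """Position of the next-to-last vowel, taken here as the stressed one."""
    positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    return positions[-2] if len(positions) >= 2 else None

def has_unstressed_double_vowel(word):
    """True if two identical adjacent vowels occur and neither is stressed."""
    stress = stressed_index(word)
    for i in range(len(word) - 1):
        if word[i] in VOWELS and word[i] == word[i + 1] and stress not in (i, i + 1):
            return True
    return False

HYPHENATED = ["sanydja", "sanydjano", "saadja",
              "saardjano", "sanpydja", "sanpydjano"]
survivors = [w for w in HYPHENATED if not has_unstressed_double_vowel(w)]
print(survivors)
```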
Step 5. Determine the best word among the survivors.
To do this, the algorithm scores them all. Unless the concept is one with a low expected frequency of use, in which case a longer word may be desirable, the shortest words score highest. So on the grounds of length alone the computer will pick saadja from this set. Among words of the same length, however, words with more vowels score higher than those with fewer vowels. This is because Institute policy at present is to keep the vowel-consonant ratio in complex words as high as possible. This not only makes them easier to pronounce--judged on a world-wide basis--but increasing vowel-ratio also reduces the amount of consonant-buffering required in the buffered dialects; and this will reduce the average length of buffered words.
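The two scoring criteria just described, length first and vowel-richness second, can be expressed as a sort key. In this Python sketch the function name is invented, only a, e, i, o and u are counted as vowels (whether /y/ should count is not settled here), and the exception for concepts of low expected frequency is not modelled.

```python
def best_word(candidates):
    """Shortest candidate; among equals, the one with the most vowels."""
    def key(word):
        vowels = sum(ch in "aeiou" for ch in word)
        return (len(word), -vowels)
    return min(candidates, key=key)

SURVIVORS = ["sanydja", "sanydjano", "saadja", "sanpydja", "sanpydjano"]
print(best_word(SURVIVORS))
```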
Human word-makers short-circuit the complex-making algorithm at many points. In fact, the human worker who looks up, or already knows, the short affixes of sanpa and djano (as found in Appendix C) will recognize immediately, on seeing san saa and dja, that ?saadja is the better of the two words that can be made from this metaphor with short affixes. For one thing, ?saadja will be seen to be richer in vowels than ?sandja is even before the possible difficulties of the sequence */n+dj/ are noticed and investigated. Therefore these difficulties do not need to be investigated. The best word is clearly saadja and that's the end of it.
With a little practice such decisions can be made in a few seconds. All one needs is a list of primitives and their affixes (such as Appendix B or C) and access to a few short rules (such as those embodied in Tables 2.2-3).
Technical words may be as long as the maker feels is necessary to get da's point across. Let's look at the word for 'agronomy' again. Its deriving metaphor is rodja madzo sensi = 'grow-make-science', or the science of cultivation. The algorithm makes rojmaosensi [rohzh-mough-SEN-see] from this metaphor, the long affix, -sensi, for the 'science'-term having been dictated by the fact that sensi has no V-final short affix. In fact, its only short affix is ses-. Building on rojmaosensi, the algorithm will then make the word for 'agronomist' by exploiting the one short affix that sensi does have. Indeed, ses- is involved in the general word for 'scientist', which is sesmao. Thus, as we observed earlier, the scientist, in Loglan, is seen as a "science-maker". So replacing -sensi in rojmaosensi with -sesmao we get 'agronomist' = rojmaosesmao [rohzh-mough-SESS-mough], the "cultivation-science maker". The fact that -mao is used in two quite different senses in this word, one figurative, the other literal--or both figurative, but to different degrees--is part of its poetry...the sort of thing that a poet but not a logician might be expected to do.
Most Loglan complexes are made up, as rojmaosesmao and saadja are, of 3-letter segments. As you may already have discovered, the separateness of these triplets virtually leaps off the page. Like the codons of the genetic code, their intelligibility derives from their regularity. Some irregularities do exist, of course...those caused by hyphenation and the occasional long affix, such as the long final segment in rojmaosensi, for example. But, as we'll see in a moment, even irregular segment boundaries are easily spotted by the resolver. Moreover, once resolved, each segment may always be assigned to exactly one primitive predicate. As a result, each Loglan complex will always be uniquely decipherable. So when it is new to a learner, it will be heard as a string of elemental meanings, the whole string to be understood as a metaphor. What is remarkable is that the decipherment and subsequent understanding of such metaphors can be done so easily by persons who have never seen or heard the word before.
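The regularity of these segments is what makes mechanical decipherment simple. The Python sketch below is a toy resolver, not the real one: its affix table holds only affixes and glosses mentioned in this chapter, the names are invented, and it handles short affixes, /y/-hyphens and long affixes but makes no attempt at the /r n/-hyphen.

```python
# Short affixes and the primitives they belong to, as given in this chapter.
AFFIXES = {"roj": "rodja", "mao": "madzo", "ses": "sensi", "san": "sanpa",
           "saa": "sanpa", "dja": "djano", "rac": "traci", "bao": "bakso",
           "veo": "veslo", "gro": "groda", "mek": "menki", "kiu": "kicmu"}
GLOSS = {"rodja": "grow", "madzo": "make", "sensi": "science",
         "sanpa": "sign", "djano": "know", "traci": "travel",
         "bakso": "box", "veslo": "vessel", "groda": "big",
         "menki": "eye", "kicmu": "doctor"}
# Long non-final affixes: final vowel replaced by /y/, e.g. 'sanpy' -> 'sanpa'.
LONG_NONFINAL = {p[:-1] + "y": p for p in GLOSS}

def resolve(word):
    """Split a complex into the primitives behind its segments."""
    terms, i = [], 0
    while i < len(word):
        if word[i:] in GLOSS:                    # long final affix, e.g. -sensi
            terms.append(word[i:])
            break
        if word[i:i + 5] in LONG_NONFINAL:       # long non-final affix
            terms.append(LONG_NONFINAL[word[i:i + 5]])
            i += 5
        elif word[i:i + 3] in AFFIXES:           # ordinary 3-letter segment
            terms.append(AFFIXES[word[i:i + 3]])
            i += 3
            if word[i:i + 1] == "y":             # skip a /y/-hyphen
                i += 1
        else:
            raise ValueError("cannot resolve " + word + " at position " + str(i))
    return terms

def metaphor(word):
    """The deriving metaphor as a string of English glosses."""
    return "-".join(GLOSS[p] for p in resolve(word))

for w in ["saadja", "rojmaosensi", "rojmaosesmao", "groracbao", "mekykiu"]:
    print(w, "=", " ".join(resolve(w)), "=", metaphor(w))
```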
To guess the intended sense of a new metaphor from a string of elemental meanings is of course an inductive leap, full of insight and daring. It is one of the most amazing performances of the human mind. Yet that we should be able to make this leap with the near-infallibility with which we do make it is perhaps only the other half of the poetic gift, the gift by which we understand the poetry of others.14 In fact, in what will often be their instant understanding of the metaphors behind the new complex predicates they encounter, human users of Loglan will be enjoying one of the few advantages they will ever have over their computers. The latter will, of course, have to look each word up: old or new, simple, borrowed or complex. (Fortunately computers can perform these incessant dictionary lookups at great speed.) In contrast, human users of Loglan will seldom have need for dictionaries. Once they have mastered the basic kit of Loglan primitives and their affixes, human auditors and readers will seldom have to look up even new words. When they do, it will usually be to confirm and extend an insight they will already have gained from the metaphorical combination of easily deciphered parts.
In making this kind of dictionary-free learning possible, Loglan is by no means unique. Germans, too, seldom use dictionaries. It is said that by the age of eight, each German-speaking child has mastered the basic building blocks of the entire adult German vocabulary...with the result that, from that point on, and except for foreign loan-words, almost nothing ever has to be looked up. We have tried to engineer this same desirable property into Loglan.15
One apparent irregularity in the segmentation of Loglan complexes is that occasionally they are hyphenated. But this is only apparent. As the hyphens--whether /y/ or /r n/--always occur at the boundaries between segments, they may hardly be regarded as hindering segmentation. In fact, except for two irregular element words, ytrio and yterbio ('yttrium' [Y] and 'ytterbium' [Yb]), which are obliged to be spelled with the irregular phoneme /y/ because of the 'Y' in their international symbols, /y/ never occurs anywhere else in Loglan predicates except at segment-boundaries. So if an instance of /y/ is found in a predicate word, and the word is not ytrio or yterbio, that /y/ is marking some segment boundary; and the predicate itself, of course, can immediately be known to be complex. So the presence of hyphens helps rather than hinders both the segmentation process and the swift classification of words to which it leads.
The /r n/-hyphen is of course more difficult to see than the /y/ one. The reason is plain. The sound /y/ has few functions in non-names. Except for the two /y/-bearing element words, the phoneme /y/ occurs only in its letter-words and in predicates. In the latter it occurs only as a hyphen or, in the buffered dialects, as a consonant buffer. So the appearance of a /y/ in a V-final, CC-bearing word in a non-buffered dialect is an instant signal to both eye and ear that the /y/-bearing word is a complex predicate.
Not so for the /r n/-hyphen. Both /r/ and /n/ occur in many words that are not complex, and in places even in complexes which are not the joints between its segments. Still, the /r n/-hyphen--/r/ being its primary or preferred allomorph, /n/ being used only when the next phoneme is /r/--is easy enough to spot. It is used for just two, easily recognized purposes, both of them quite rare. One is to fasten two CVV-form segments together to make a single two-term word, as bao and mao are joined in baormao [BOUGH-rr-mough] to make 'box-maker'; and the other, even rarer use is to tack a CVV-form segment onto the front of a 3-syllable, or longer, complex. Suppose one wanted a single short word that meant 'market-science'. 'Market' is marte; and looking it up in Appendix C we would find that it has exactly one short affix, mae. Since mae is CVV in form, if it is to be used at the beginning of a 3-syllable or longer word--and all -sensi words have at least three syllables--it must be /r n/-hyphenated. The result is maersensi [mah-ehr-SEN-see]. Maersensi is a pretty word; but perhaps a little obscure...as, in fact, /r n/-hyphenated words tend to be. So probably a better choice would be to settle, in this case, for the long affix marty-, with its incorporated /y/-hyphen, and build martysensi [mahr-tuh-SEN-see]. This new option, though certainly less mellifluous (at least to this word-maker's ear) than maersensi, is transparently the word for "market-science"; and so it is arguably a better word for so technical a concept.16
Even /y/-type hyphens are fairly rare in Loglan complexes. They occur most frequently in words like mekykiu ('eye-doctor' or 'ophthalmologist') in which hyphenation is unavoidable. The deriving metaphor for this word is menki kicmu, and each of its two words has exactly one short affix that will serve in these positions: mek- and -kiu. (Kicmu has another short affix, kic-; but kic- may not be word-final.) *Mekkiu and *mekkicmu are illegitimate constructions, threatening to be reduced in speech to the phrases me kiu and me kicmu immediately. So the four legitimate options are ?mekykiu, ?menkykiu, ?mekykicmu and ?menkykicmu. We see that hyphenation is literally unavoidable. Unless the concept of an eye-doctor is deemed to be infrequently enough used to deserve a longer form--like 'ophthalmologist'?--the algorithm will again pick the short word, mekykiu.
Long affixes are another source of harmless irregularity in the segmentation process. Five-letter affixes like -sensi--which are nothing more than the primitive itself treated as an affix--appear as final segments whenever (i) the final term in the metaphor has no V-final short affix, as is the case with sensi, or (ii) a long suffix is deemed to be deserved by a more formal word. Mresurva [mreh-SOOR-vah] and fumsurva [foom-SOOR-vah] for 'manservant'/'valet' and 'womanservant'/'maid', respectively, are instances of the first case. The long affix is required because surva has only one short affix, namely suv-, and it is not V-final. All science-words are also built on this pattern in that they all have -sensi as their final segment: fidsensi ('physics') from fizdi = 'physical', livsensi ('biology') from clivi = 'alive', tarsensi ('astronomy') from tarci = 'star', numsensi ('mathematics') from numcu = 'number', tetsensi ('meteorology') from tetri = 'weather', and so on.
Another kind of long affix is the one used in non-final positions, such as marty- in martysensi. Counting only their meaning-bearing portions, these affixes are really just four letters long; but they are extended to five by the necessity of being joined to the rest of the word with a hyphen. All these long non-final affixes are formed by replacing the final vowel of the chosen primitive with /y/, as marte ('market') is so-modified to produce marty-. As all Loglan primitives are open to this affix-making move, and as all pairs of them except those that happen to be members of the same language-nationality-culture triplets differ in more than their final vowels, such long affixes are always uniquely assignable to exactly one primitive or primitive-triplet. The vagueness of being derived from a triplet rather than a singlet, as is the case of the language-nationality-culture words, may seem a defect of the system. But when one considers that what is common to the meanings of the individual words in a triplet like spana/spani/spano is a very large and robust thing indeed--in this case, the very soul of "Spanishness"--it becomes apparent that it is precisely that large, robust, common thing--that very Spanishness--that is signified in a deriving metaphor by the long affix spany-.
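The affix-making move just described is mechanical enough to sketch in Python. The function name `long_affix` is hypothetical, and the sketch assumes its input is a five-letter, vowel-final primitive; it makes no attempt to verify that the word is actually in the primitive list.

```python
VOWELS = "aeiou"


def long_affix(primitive: str) -> str:
    """Form the long non-final affix by replacing the final vowel with /y/."""
    if len(primitive) != 5 or primitive[-1] not in VOWELS:
        raise ValueError("expected a five-letter, vowel-final primitive")
    return primitive[:-1] + "y"


print(long_affix("marte") + "sensi")  # martysensi
# All three members of a language-nationality-culture triplet share one affix:
print({long_affix(w) for w in ("spana", "spani", "spano")})  # {'spany'}
```

The second line of output shows the triplet case: spana, spani and spano differ only in their final vowels, so all three collapse onto the single affix spany-.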
Finally, there is the kind of complex predicate in which one or more of the terms in the deriving metaphor is not a primitive but a borrowing. Athomynukle [aht-hoh-muh-NOOK-leh] is such a complex; the deriving metaphor is athomi nukle ('atomic-nucleus') and the borrowed term, of course, is athomi. We will look at how such borrowings are made in the next section. At the moment all we wish to point out is that, by convention, whenever any term in the deriving metaphor of a complex predicate is a borrowing, then all the joints between the segments of that complex must be hyphenated. (The ambiguity that is cut off at the pass by this rule will be identified in a moment.) The hyphen that is always used to attach such borrowed segments to the rest of the word is /y/. Thus iglymao [EEG-luh-mough] means 'igloo-maker' and is derived from the metaphor iglu madzo, of which the first term is obviously a borrowing. To make a non-final affix from the loan-word, its final vowel (or vowel group, if it had one) has been replaced by /y/.
Take the complex predicate bakteryrodhopsini [bahk-tehr-ruh-rohd-hohp-SEE-nee]. We know it is a complex and a predicate by its internal /y/. The single hyphen breaks this long word into just two parts, baktery- and -rodhopsini. Each part is derived from a borrowing. Baktery- comes from bakteri for 'bacterium/-ia', and -rodhopsini for 'rhodopsin', being final, represents itself. So the deriving metaphor behind this word was evidently bakteri rodhopsini or 'bacterial rhodopsin'. Thus its proper translation (into English) is 'bacteriorhodopsin', a single scientific word which actually means 'bacterial rhodopsin'. That, at least, is the inside story of bakteryrodhopsini as known to its builder.
But what about the outside story? How can such a word be known by the reader or listener to be a 2-term complex made from just two borrowings? Why don't bak and ter also count as segments of this word, representing, as they normally do, bakto ('bucket') and te/teri/tera ('three'/'third'/'triad'), respectively? The answer is that because we know that one of the terms in the deriving metaphor was a borrowing, we may infer that both of them are. Which one do we know to be a borrowing, and how do we know it? We know that rodhopsini is a borrowing because it does not break into segments as a complex and therefore can be nothing else. We also know the rule just stated that if any term in a complex is a borrowing, then all of its joints must be hyphenated. Only one joint in this borrowing-containing complex is hyphenated. Therefore, if the word-maker was obeying this rule--and we must assume da was; that is what rules are for...to ensure the safety of just such assumptions--no other apparent joint, such as the one between bak and ter, is a real one. Therefore, bakter- is a single segment and must come from bakteri, and not from bakto tera, for example. If in fact the word-maker had wanted to make a word for 'bucket-triad-rhodopsin', then da would have hyphenated the bak+ter joint thusly: bakyteryrodhopsini. In so doing, da would be following the rule that requires us to hyphenate all joints of loan-bearing complexes.
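The rule that every joint of a loan-bearing complex must be hyphenated can be illustrated with a short Python sketch. Both function names are hypothetical. The sketch assumes that the word-maker has already chosen the segments, and that a non-final borrowing is first turned into an affix by replacing its final vowel or vowel group with /y/.

```python
import re


def loan_affix(loan: str) -> str:
    """Replace the final vowel (or vowel group) of a loan-word with /y/."""
    return re.sub(r"[aeiou]+$", "y", loan)


def join_loan_complex(non_final: list[str], final: str) -> str:
    """Hyphenate every joint with /y/.

    Segments that already end in /y/ (loan affixes) are left alone;
    any other non-final segment gets a /y/ appended.
    """
    parts = [seg if seg.endswith("y") else seg + "y" for seg in non_final]
    return "".join(parts) + final


# 'bacterial rhodopsin': one joint, one hyphen
print(join_loan_complex([loan_affix("bakteri")], "rodhopsini"))  # bakteryrodhopsini
# 'bucket-triad-rhodopsin': two joints, two hyphens
print(join_loan_complex(["bak", "ter"], "rodhopsini"))           # bakyteryrodhopsini
```

Because the second word has a /y/ at every joint while the first has only one, a reader who assumes the rule was obeyed can never confuse the two.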
It is of course quite possible to create text in which all these hyphenated words have actual hyphens instead of 'y's. In this textual style 'bakter-rodhopsini' would contrast even more visibly with the unintended 'bak-ter-rodhopsini'.
It is now time for a phonological observation. It is permissible to pause after any of the /y/s in one of these long loan-bearing complexes, and even to stress the /y/-preceding syllable, without spoiling its resolution as a single word. Thus the pronunciation [bahk-TEHR-ruh . rohd-hohp-SEE-nee] of the word bakteryrodhopsini is just as acceptable as its somewhat swifter delivery as [bahk-tehr-ruh-rohd-hohp-SEE-nee]. The production with the pause will sound superficially like a two-word phrase but will turn out not to be one. It is the anomalousness of the [uh . ] sequence that tells us that it isn't. The [uh] is a hyphen; and as a word may not end with a hyphen, the pause [ . ] that follows it must be inconsequential, that is, ignorable by the resolver. The resolver is, in a sense, advised by the appearance of each hyphen that it is still in the midst of a predicate word. Knowing this it can, in effect, start its resolution over again. So the resolution of this string as a single predicate word is not in the least troubled by the practice of pausing after /y/s.
Close relatives of scientific complexes made with borrowings are the complexes which have letter-words or number-words among their parts. Usually, these non-predicate elements in the metaphor appear in its early portions; they are in any case never final. Little words, too, are joined to the predicate stem with hyphens. Thus Xaiykre [KHIGH-uh-kreh] is 'X-ray', the Xai being the letter-word for upper-case Latin 'X', and the kre being from kreni = 'is a ray from source...' Xaiykre may be variously written, e.g., as 'X-kre', 'Xykre', 'Xai-kre' or, indeed, as 'Xaiykre'. But all such expressions are read aloud as [KHIGH-uh-kreh]. The sound [kh], the reader may recall, is the guttural "k" or "rough breath" of Russian or Greek, and an irregular sound in Loglan. The contrasting word Kaiykre [KIGH-uh-kreh], by the way, would be 'K-ray'; so one must, in this case, be careful of one's pronunciation...cultivating, for the purpose of talking about X-rays, one's Russian [kh].
Incidentally, the above example reveals how another potential ambiguity is avoided. Kaiykre, meaning 'K-ray', is hyphenated with /y/...as it should be. Suppose we mistakenly used the /r/-hyphen, writing kairkre and saying [KIGH-rr-kreh]. Unintentionally we would have invoked an entirely different predicate, one that means 'property-of-a-ray' or perhaps 'having-a-ray'; for in this word kai is not the letter-word for upper-case 'K' but the short affix of katli ('has property/quality/feature..'). The difference in meaning between the two words Kaiykre and kairkre is entirely conveyed in speech by the two hyphens. The rule is that letter- and number-word segments are always hyphenated to a predicate stem with /y/; while normal CVV-form affixes, when initial segments of their complexes, are always hyphenated with /r n/. Remember baormao, the "box maker".
Let us close this section on building new complexes by observing that if a Loglan speaker finds a complex predicate in the Loglan dictionary whose deriving metaphor strikes him as less effective, less beautiful, or less apt than one that has occurred to da, let da construct a new one. By submitting da's alternative word to The Loglan Institute da will ensure that it will be considered for inclusion in the next edition of our dictionary. If the Loglan Council of Word-Makers does consider it worthy of inclusion, da's coinage will, for a time, live side by side with the old one...whence, in competing for the attention of other users of the language, only one, perhaps, will survive.17
Borrowings are not so much made as seen. Once one has mastered the borrowing art, new borrowings may be written into Loglan almost as fast as one can write them down. So the most troublesome question is not how to borrow a given concept but whether to borrow it.
The borrowing process itself is virtually algorithmic. This is especially true when the word in question has a widely-borrowed root that appears in slightly different forms in many different languages already, as is true of nearly all science words. So recasting such a word one more time, this time in a form acceptable to Loglan morphology, is usually a trivial matter that will add only seconds to the borrowing process. Thus, 'protein', 'hormone' and 'interferon' come into Loglan as proteini, hormoni and interferoni almost as fast as you can type them; and writing 'atom' as athomi--once the borrowing rules are known--requires only a few more centiseconds. So typically the most difficult part of adding a new borrowed word to Loglan is making the decision to borrow it in the first place.
The temptation to borrow a word usually arises because its concept is a necessary one for some literary, scholarly or scientific enterprise, and one suspects that it should not be made as a Loglan complex. The other possibility, namely making it as a composite primitive, is seldom even considered. By this time it is an extremely rare event that a new concept is best rendered as a composite primitive. But choosing between borrowing an existing international word for a useful concept and making it as a Loglan complex sometimes requires some thought, and sometimes even preparatory work. Institute policy, while increasingly clear in the case of science words, may not yet provide guidelines that cover the case you are considering. So the most workable policy for individual word-makers is a tentative but optimistic one. Study this chapter; borrow the words you think we need; then let your fellows on the Loglan Word-Makers Council review your borrowings. They will decide whether your new words should be permanent features of the language or not. By considering everybody's work from their community-wide perspective, the Council is likely to develop the same kind of "nearly algorithmic" borrowing policies in other areas as have already emerged for science.
Borrowing policy as it affects science words may be simply stated. If a word has already been borrowed with minor local adjustments in the majority of European languages--in a sense, Europeans were the founding members of the now intercontinental community of science and still deserve to be consulted--and its concept is an exclusively scientific one, then borrow it once more for Loglan. Borrowing it will be to follow a path well-trodden.
But how does one know that the scientific word one is looking at is one of those much-borrowed words that is already part of what the Merriam-Webster people call "ISV", the "International Scientific Vocabulary"? Without consulting a stack of foreign language dictionaries? The easiest answer is to shift the inquiry to another, more easily answered question: Are its roots Graeco-Latin? Did the maker of the prototype--the neologism that is the ultimate source of all this borrowing--go back to those classic languages of European antiquity to get the semantic elements with which to coin it? Well, if you know the most commonly used Greek and Latin roots that appear as "combining forms" (affixes) in these modern scientific coinages, you will probably recognize immediately that 'protein', 'carbohydrate', 'rhododendron', 'horizon' and 'oxygen', but not 'light-year', show indubitable signs of having been made of just such roots. 'Light-year', in contrast, though equally "scientific", has been made of good old English roots and is therefore almost certainly not ISV. (Neither 'light' nor 'year' is either Greek or Latin.) An even simpler test, and probably equally decisive, is to look up the word or words in question in a good-sized dictionary of some Romance language. Any Romance language will do...for example, French, Spanish or Italian. They all have almost identical borrowing habits as far as science is concerned. If a scientific word is substantially the same word--or at least similar enough to attest to having been borrowed from some common source--in both one Germanic language (English) and any one of these Romance tongues, then it is almost certainly ISV and should probably be borrowed.
Applying this test to English 'protein', 'carbohydrate', 'rhododendron', 'oxygen', 'horizon' and 'light-year' we would get 'proteina', 'idrato di carbonio', 'rododendro', 'ossigeno', 'orizzonte' and 'anno luce' if our test language were Italian; 'proteina', 'carbohidrato', 'rododendro', 'oxigeno', 'horizonte' and 'año de luz' if our test language were Spanish; and 'proteine', 'hydrate de carbone', 'rhododendron', 'oxygene', 'horizon' and 'année-lumière' if it were French. Clearly all of these words but the 'light-year' set are ISV. The point is that consulting any of the three Romance languages would have been sufficient to find that out.
Indeed, going back to our earlier point about the origins of these particular words, simply knowing their etymology (i.e., the history of their derivation, as given in almost any large dictionary) would have been sufficient to find out that all but 'light-year' are ISV. For under etymological inspection, 'light-year', being composed of two Germanic words (the German for it is 'Lichtjahr') stands out like a sore thumb. Even with a good etymological dictionary, however, other cases will not be so clear. All things considered, the Romance language test is probably more reliable.
So let us return to that test and its implications for our examples. Since English departs from the Romance languages in using 'light-year' for that important astronomical measure (instead of some anglicized Latin compound, say, like *'annolumen'), we may conclude that neither the Germanic nor the Romance rendering of it should be borrowed for Loglan. Instead the Loglan term for 'light-year' should be made as a Loglan complex...as it easily can be. In fact, in making our complex predicate we might as well use the same metaphor as the one that both the Germanic and the Romance languages use: "light-year" or "year of light", it amounts to the same thing. Thus the Loglan word for this concept was built as litnirne [leet-NEER-neh]. It comes from litla nirne, which is of course nothing more than the literal translation of the English phrase 'light year' into Loglan. Note that we use the Germanic word-order, not the Romance one. Loglan, too, is an Adjective-Noun language as neither Latin nor any of its descendants is.
For a very different reason, the concept of "horizon" should probably also not be borrowed. While the word is apparently ISV in all the European languages, the concept itself is part of everyday human experience. Every pair of human eyes has seen horizons. Unlike protons, oxygen and carbohydrates, horizons are directly perceived by human sense organs and noted--and probably named--in all human cultures. Therefore the Loglan word for 'horizon' should be reduced by metaphor to the common ingredients of human experience, which is what we do when we make it as a Loglan complex predicate. Following this decision, 'horizon' was easily made as telbie [TEL-byeh] from the metaphor 'Earth-edge'. An horizon, I reasoned, marks the edge of the planet...as everyone who has sailed, or flown, or stood in high places, knows. The Loglan metaphor that conveys this high-flying image is terla bidje.
All the other words on our test list may be safely borrowed...and from either the Romance or the English version of the word; the borrowing procedure has been contrived in such a way that it won't matter greatly. I personally prefer using the Spanish version as the source word. I use the Spanish language in my "Romance language test". So while I'm about it, I keep the Spanish word I've just looked up to see if it is like the English, and use it as the source word for the borrowing I will make only if it is. For example, Spanish 'sicopatico' is more useful as a starting point for a Loglan borrowing than English 'psychopathic' is. Both are "local versions" of exactly the same international scientific word; but in the Spanish language most of the necessary rewriting has already been done. The Spanish dictionary one uses to perform this test must be a fairly large one, of course, or specialized in the scientific direction. For it must in any case contain a usably large proportion of the ISV.18
Once the decision to borrow a scientific word has been made, then transforming, let us say, the Spanish version of it into a predicate-form Loglan word is accomplished in four steps, two of them tests which most words pass immediately. In the rest of this section I shall describe the four steps in a summary fashion in order to give the reader an overview of the borrowing procedure as a whole. Then, in Sections 6.6-9, the full range of moves under each step will be discussed in greater detail.
Before leaving our example, let me report that the four words we have been talking about, all of which we found to be safely ISV, were borrowed as proteini, carbohidrati, rodhodendroni and okso. The latter could have been borrowed as oksigeni, but as it is a very commonly used element word, the shorter okso seemed justified.
Now let us walk through the steps of the borrowing process. Suppose we want to import the word 'insulin' into Loglan. The Spanish word is 'insulina'; so we are assured that it is ISV. Clearly insulin is not, like horizons and blue-birds, directly perceived in human experience. Hence it is borrowable.
These preliminary questions out of the way, the first step is to rewrite the source word in Loglan phonemes if it needs to be. 'Insulina' is already in Loglan phonemes. So no respelling is necessary. Our first trial-word is therefore ?insulina itself.
The second step is to equip it with an appropriate ending. As you may have noticed, /-i/ is the ending conventionally assigned to Loglan science words. So we replace the final /-a/ of ?insulina with /i/. Again it is no surprise that ?insulini sounds Italian. Most Loglan science words do.
The third step is to test our trial-word for breakup. ?insulini doesn't break up into a phrase the way ?atomi did (a to mi). The initial /i/ is stuck fast to /ns/ because /ns/ is an impermissible initial sequence. After that, nothing breaks.
The fourth step is to inspect our trial-word for segmentation problems. (a) Does ?insulini segment like a complex? Clearly it doesn't. (b) Does it commence with a consonant-pair? It doesn't; but if it did, we would have to ask, Does the sequel to its first consonant segment as a complex? ?insulini is not CC-initial, so the second question doesn't apply. So ?insulini passes all tests. Insulini may now be used in the language as a predicate meaning 'is insulin/a quantity of insulin from animal/source...'
Reviewing what we have done,
| Step 1. | We rewrite the source word in Loglan phonemes if necessary. (It wasn't necessary.) |
| Step 2. | We supply it with an appropriate ending. (As insulin is a scientific concept, we replace the single-V ending of the Spanish source word with /i/.) |
| Step 3. | If it breaks, we glue it. (It didn't break. We didn't have to glue it.) |
| Step 4. | If it, or the sequel of an initial consonant, segments like a complex, we spoil the segmentation pattern. (We didn't have to.) |
'Insulin' is a fairly typical word. About half of the science words we've borrowed have been like it in being consonant-final and requiring no gluing: 'interferon', 'protein', 'proton', 'interleukin', etc., are all of this pattern. They've all gone into Loglan with (a) little or no respelling, and (b) the addition of final /i/ to the English form of the word. This usually amounts to the replacement with /i/ of the single final vowel of the Spanish source word. The results are words like insulini, interferoni, proteini, protoni and interleukini that sound like Italian plurals and yet are distinctly loglandical. Because they neither break as phrases nor segment as complexes, they have the properties of a simple (i.e., non-complex) Loglan predicate quite naturally. Such words can be borrowed almost as fast as they can be written. The source of the borrowing is always plain.
Now let's look at several additional difficult-to-borrow words. This time we'll consider a word that does require some respelling: 'cercopithecine'. It's based on a Linnaean genus word, so we'll probably fail to find it even in a large Spanish-English dictionary; but looking it up in an unabridged English one we find that it is derived from 'Cercopithecus', the name of a genus of long-tailed African monkeys that includes the guenons. Our convention is to use a Linnaean word as the source of a borrowing whenever one is available. The reason we do this is that endings like '-cine' tend to be language-specific; and so it would be a mistake to imitate any one of them in Loglan. In contrast, the Linnaean terminology is universal. So as we want our word to be as international in flavor as possible, we use 'Cercopithecus' as our source for the Loglan translation of the English word 'cercopithecine'.
In Step 1 we ask, Does 'Cercopithecus' require respelling? The answer is yes. Two of the 'c's in this word precede "strong vowels" (/a o u/) and so are turned into /k/s. Thus ?cerkopithekus. Note that the first 'c', which precedes a "weak vowel" (/e i y/), is unchanged. This follows a general custom in the Romance languages...and many others. However, instead of pronouncing this unaltered 'c' as /s/, as it is the Romance custom to do, we will of course continue to pronounce it as Loglan /c/, i.e., as [sh]. What about the 'th'? Our rule here is to rewrite this digraph as /t/ whenever it does not precede a stressed, i.e. penultimate, vowel. When it does precede such a vowel--as it does, for example in ?ethili--we keep the /h/ and pronounce it: for example, [et-HEE-lee]. The 'e' in '-thecus' is also going to be stressed. So we keep this /h/ and pronounce it: [shehr-koh-peet-HEH-koos]. Thus our trial-word is still ?cerkopithekus.
Step 2 asks us to fit a new ending if our trial-word requires one. ?cerkopithekus does. It is not only a science word but its source is Linnaean. So special rules apply. One of those rules requires that we replace the /us/ on words derived from Linnaean sources with /ui/; we'll see why this special treatment is required in Section 6.6. So ?cerkopithekui (pronounced [shehr-koh-peet-HEK-wee]) is the new shape of the trial-word as it emerges from Step 2.
Step 3 asks us to test our trial-word for breakup. Does it break up as a Loglan phrase? No; again the initial /ce-/ is prevented from being heard as a CV-word by the fact that the /rk/ that follows it in ?cerkopithekui is impermissible at the head of a word. So we pass on.
Step 4 asks us to look for segmentation problems. Can ?cerkopithekui be heard as a Loglan complex? We can start to segment it as cer+kop+..., but then the sequence +ithekui comes along and we cannot continue. If a word cannot be completely segmented, it is not a complex. Finally, ?cerkopithekui is not CC-initial so the second segmentation test does not apply. Apparently there are no segmentation problems; so there is nothing to be done in Step 4. Cerkopithekui is evidently the Loglan word for 'is a cercopithecine, a member of the genus Cercopithecus'.
Let's now borrow a word that illustrates why Spanish makes a better source language than English does. Let's borrow the medical term 'psychopathic'. The words 'psychopath' and 'psychopathy' will of course come along with it. The corresponding Spanish cluster is 'sicopatico/-ta/-tia'...attesting to the ISV-ness of the psychopathy concept. In Step 1 we rewrite the two 'c's as /k/s; for both precede strong vowels. In Step 2 we replace the ending /-iko/ with /i/ getting ?sikopati. (It wouldn't have mattered which member of the source cluster we had used, because /-a/ and /-ia/ are also replaced with /i/.) In Step 3 we find that ?sikopati breaks up as si ko pa ti; so we glue it together by inserting /h/ after the consonant (or consonant group) that follows the first word-break, in this case, the one between si and ko. This gives us ?sikhopati. In Step 4 we discover that there are no segmentation problems. So sikhopati is evidently the Loglan predicate for 'is psychopathic/a psychopath, someone suffering from psychopathia'. Psychopathia itself, of course, will be designated by lopu sikhopati, the mass of psychopathic properties, or psychopathies. The mass of psychopathic states can then be designated by lopo sikhopati.
Let's borrow one more medical word and then go on to consider the four borrowing steps one at a time. Let's take another illness word, this time 'tubercular' in the sense of 'suffering from tuberculosis'. In Loglan, we usually use the disease noun, the '-osis' or '-pathy' word, as the source of these medical borrowings. In Step 1 we rewrite 'c' as /k/ because /u/ is a strong vowel. In Step 2 we drop the 's' from '-osis' getting /osi/. In Step 3 ?tuberkulosi breaks up into the phrase Tu berkulosi ('You are "bercular", whatever that means'); so we have to glue it. We insert /h/ after the consonant that follows the word-break; and this gives ?tubherkulosi. In Step 4 we notice that the sequence /-kulosi/ cannot end a Loglan complex unless it is part of a borrowed segment. There are no hyphens; so this settles the matter without further testing. The Loglan word is tubherkulosi and it means 'suffers from tuberculosis/is tubercular'. Again, the disease itself will be designated by lopu tubherkulosi, and the mass of all disease states, by lopo tubherkulosi. (If you want to develop some skill in pronouncing these long words, practice saying [loh-poh-toob-hehr-koo-LOH-see] a few times. You might alternate it with the distinctly easier production [loh-poh-seek-hoh-PAH-tee]. Soon you will be talking like a loglandian physician.)
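The gluing move used in Step 3 of the last two examples can be sketched in Python. The function name `glue` and its `break_at` parameter are hypothetical. Detecting where a trial-word breaks requires the full Loglan morphology, so this sketch assumes the caller already knows the position of the first word-break and only performs the /h/-insertion itself.

```python
VOWELS = set("aeiouy")


def glue(trial: str, break_at: int) -> str:
    """Insert /h/ after the consonant (or consonant group) that follows
    the word-break located at index break_at."""
    i = break_at
    while i < len(trial) and trial[i] not in VOWELS:
        i += 1
    if i == break_at:
        raise ValueError("no consonant follows the break")
    return trial[:i] + "h" + trial[i:]


print(glue("sikopati", 2))     # si|kopati    -> sikhopati
print(glue("tuberkulosi", 2))  # tu|berkulosi -> tubherkulosi
```

The result still has to pass the segmentation tests of Step 4; the sketch makes no claim about that.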
Borrowing will never be completely algorithmic, of course. Every now and then a first-cut borrowing will imitate a complex, and when it does, a repair appropriate to the local circumstances of the problem will have to be devised. Section 6.9 describes the several strategies that have been devised to deal with this non-algorithmic side of the borrowing problem.
In the above examples we have concentrated on problems. So let me conclude this section by showing the reader how swiftly the borrowing procedure works with non-problematic source words. Here is a group of Italian musical words, also widely borrowed internationally and so borrowable in Loglan as well. The procedure works like an algorithm here; it brings all six words through Step 4 without appealing to human judgement:
| Italian | Step 1 | Step 2 | Step 3 | Step 4 |
|---|---|---|---|---|
| viola | - | - | violha | - |
| violino | - | violina | violhina | - |
| violone | - | violona | violhona | - |
| violoncello | violoncelo | violoncela | violhoncela | - |
| tromba | - | - | - | - |
| trombone | - | trombona | - | - |
This is a typical set of borrowed words. As they are Italian, we see that almost no rewriting is necessary in Step 1. In fact, only the double 'll' of 'violoncello' gets rewritten as 'l'. In Step 2, however, four of the six trial-words exchange their Italian '-o' and '-e' endings for the Loglan music-word ending '-a'. (The other conventional word-endings are given in Section 6.7.) In Step 3, another four require /h/-insertion to prevent break-up (into the phrases vio la, vio li na, vio lo na, and vio loncela). In Step 4, none of the six words prove to have segmentation problems. Finally, we may note that honcela [hohn-SHEH-lah] will make a nice abbreviation of Loglan's new violhoncela, the also overlong name for the mellow instrument whose name we abbreviate as 'cello' in English.
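The first two columns of this table are regular enough to reproduce with a short Python sketch. The function names `step1` and `step2` are hypothetical, and the sketch covers only what these six words need: collapsing a doubled consonant other than 'c' in Step 1, and exchanging a single final vowel for the music-word ending in Step 2. Step 3 is left out because finding the break requires the full morphology.

```python
import re


def step1(word: str) -> str:
    """Collapse any doubled consonant other than 'c' to a single one."""
    return re.sub(r"([bdfghjklmnpqrstvwxz])\1", r"\1", word)


def step2(word: str, ending: str = "a") -> str:
    """Replace a single final vowel with the characteristic ending."""
    return word[:-1] + ending if word[-1] in "aeiou" else word


for italian in ["viola", "violino", "violone", "violoncello", "tromba", "trombone"]:
    print(italian, "->", step2(step1(italian)))
# viola -> viola
# violino -> violina
# violone -> violona
# violoncello -> violoncela
# tromba -> tromba
# trombone -> trombona
```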
In the next four sections we consider the four steps of the borrowing procedure in more detail.
Once a word is chosen to be the source of a borrowing, one should rewrite whatever characters in it match the left halves of the following rules. Rewriting rules will not be very different for the various types of borrowings--for music as opposed to food words, for example--but certain problems arise in borrowing science words that require special rules. Table 6.1 includes the solutions to all known instances of such problems for all categories of borrowings. The trial-words given as examples in the table are the products only of rewriting; no other borrowing steps have yet been performed on them.
| aa | aa => a. 'aardvark' => ?ardvark. |
| ae | ai-or-ae- => e. 'Aegyptopithecus' => ?egiptopithekus. |
| c | c before C/a/o/u => k; c followed by e/i/y, or matched in the symbol of an element word, unchanged. 'canid' => ?kanid; but 'violoncello' => ?violoncelo and 'californium' => ?californium. |
| cc | cc followed by e/i/y => kc; otherwise cc => k. 'succinta' => ?sukcinta; but 'Echinococcus' => ?ekinokokus. |
| CC | Any double instance of a C not c is replaced by a single instance of that C. 'Bettongia' => ?betongia. |
| ch | ch => k. 'Escherichia' => ?eskerikia. |
| ee | ee => i. 'sakeen' => ?sakin. |
| eigh | eigh => ei. 'leightonii' => ?leitoni. |
| ew | ew => u. 'Andrewsarchus' => ?andrusarkus. |
| h | h in VhC => VCh to make it pronounceable. 'ahli' => ?alhi. |
| ie | -ie => i; non-final ie unchanged. 'calorie' => ?kalori. |
| igh | igh => ai. 'lightfooti' => ?laitfuti. |
| ii | -ii => i; non-final ii unchanged. 'livingstonii' => ?livinstoni. |
| lh | lh- => elh; non-initial lhC => lC; non-initial lhV unchanged. 'lhoesti' => ?elhoesti. |
| ng | ngC => nC; ngV unchanged. 'livingstonii' => ?livinstoni, but 'Bettongia' => ?betongia. If ngC => nC is later found to cause segmenting, then repair with nC => ngiC. E.g., ?livinstoni breaks as 'li vinstoni' and /h/-insertion then gives ?livhinstoni which segments as liv+hin+stoni. This triggers the rule nC => ngiC which inserts /gi/ into the trial word, giving ?livhingistoni, which no longer segments. |
| oo | oo => u; except in 'zoo-' and other disyllables in which oo remains unchanged. 'lightfooti' => ?laitfuti, but 'Zoothera' => ?zoothera. |
| ou | ou => u when within a natural form: 'youngi' => ?iungi; when split between two forms, unchanged: 'thiourea' = 'thio-' + '-urea' => ?tiourea. |
| ow | ow => ao. 'owstoni' => ?aostoni. |
| ph | ph => f; ph- => eph in very short words; non-initial ph unchanged when it will glue an otherwise breaking word. 'phenyl' => ?fenil; but 'phyla' => ?ephila; and 'ophiocomina' => ?ophiokomina. |
| pt | pt- => ept; non-initial pt remains unchanged. 'Pterocera' => ?epterocera. |
| pn | pn- => n; non-initial pn remains unchanged. 'pneumonia' => ?neumonia. |
| q | q => k. 'antiquitas' => ?antikuitas. |
| qu | in Sp. source words quV => kV; but in a Linnaean, quV => kuV. 'braquiopodo' => ?brakiopodo; but 'Madoqua' => ?madokua. |
| rh | rh- => r; rhC => rC; non-initial rhV unchanged. 'rhodopsin' => ?rodopsin. |
| sc | sc => sk. 'sclerosis' => ?sklerosis. |
| th | when not followed by a stressed V, th => t; when followed by V', h is retained. Also, if the th-containing word is related to one in which h precedes V', keep the h. Thus 'ethylene' => ?ethileni [et-hee-LEH-nee] is related to 'ethyl' => ?ethili [et-HEE-lee] and so keeps its h. |
| y | y => i. 'Amphyosemian' => ?amfiosemian. |
| yn | yn => un whenever the n is the last C in the word. 'butenyne' => ?butenune; but 'dynamo' => ?dinamo. |
| x | x- => z unless matched in the symbol of an element word, when unchanged; when non-initial, => ks. 'xenyl' => ?zenil; but 'Atilax' => ?atilaks; 'xenon' unchanged. |
| w | w => u unless matched in the symbol of an element word, e.g., ?lawrencium. |
Note that 'j' is not rewritten; it is retained and pronounced as Loglan [zh].
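A few of the rows above can be sketched as an ordered list of substitutions. This is only an illustration under stated assumptions: the function name `rewrite` and the rule ordering are hypothetical, only a handful of rows are covered (pn-, rh-, x, ow, sc, q, yn, y), and the element-word exceptions, the rhC rule and the Spanish/Linnaean distinction for 'qu' are ignored.

```python
import re

# Ordered (pattern, replacement) pairs for a small subset of the table.
# The order is an assumption of this sketch: positional rules run first,
# and the conditional 'yn' rule runs before the general 'y' rule.
RULES = [
    (r"^pn", "n"),                # pn- => n (initial only)
    (r"^rh", "r"),                # rh- => r (initial only; rhC is not handled)
    (r"^x", "z"),                 # x- => z (element-word exception ignored)
    (r"x", "ks"),                 # non-initial x => ks
    (r"ow", "ao"),                # ow => ao
    (r"sc", "sk"),                # sc => sk
    (r"q", "k"),                  # q => k
    (r"yn(?=[aeiou]*$)", "un"),   # yn => un when n is the last C in the word
    (r"y", "i"),                  # y => i
]


def rewrite(source):
    """Return a trial word (without the '?' mark) for a source word."""
    word = source.lower()
    for pattern, replacement in RULES:
        word = re.sub(pattern, replacement, word)
    return word


if __name__ == "__main__":
    for src in ["owstoni", "antiquitas", "sclerosis", "pneumonia",
                "rhodopsin", "xenyl", "Atilax", "butenyne", "dynamo"]:
        print(src, "=>", "?" + rewrite(src))
```

Because the rules are applied in sequence, a later rule sees the output of an earlier one; that is why 'yn' is tested before the general y => i rewrite.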
Perhaps the most awkward rewriting problems are presented by the Linnaean vocabulary of biology. This is true principally because species and genus names often incorporate the name of their discoverer. This brings into science a riotous variety of names and spelling styles: 'Escher' in 'Escherichia', 'Livingston' in 'livingstonii', 'Lightfoot' in 'lightfooti', and so on. Many rewriting rules are necessary just to make such words pronounceable in Loglan phonemes. In general our goal has been to develop words from these Linnaean sources that will be regularly pronounced in Loglan much as Spanish or Italian scholars would pronounce the originals.
By convention, certain characteristic vowels are used for ending loan words of given types when they are six letters long or longer. Shorter words, like iglu, may keep their natural endings. In fact, the endings of 4-letter loans and single-source primitives, like simba, should follow the natural sources as closely as possible without regard to these conventions except that all element words, short or long, should end in /o/.
These are the five characteristic vowel endings:
| -a | for musical forms and instruments. |
| -e | for local foods, dishes, plants, and animals. |
| -i | for scientific, technological and medical words. |
| -o and -io | for element words. |
| -u | for all the rest, e.g., tools, clothing, dwellings. |
Most of the rules for adjusting the ending of a borrowing are obvious. For example, if after being rewritten, a trial loan word already ends in its characteristic vowel, e.g., ?livinstoni from the Linnaean 'livingstonii' and ?tromba from Italian 'tromba' ('trumpet'), then we of course leave the ending unchanged. If it ends in a single V which is not characteristic of its type, then we change it to the proper one. E.g., the science word ?amfidela becomes ?amfideli, and the musical word ?trombone becomes ?trombona. There is one exception to these obvious rules: the Spanish endings '-ico/-ica', which will have been rewritten as /-iko -ika/ during Step 1, are entirely replaced by whatever vowel is appropriate. Thus, as we have already seen, the medical term ?sikopatiko is reduced to ?sikopati.
Genuine problems arise only with trial words which end in vowel-groups or consonants. Then the following rules apply. Remember that adjusting the end of a trial-word takes place after it has been rewritten, if necessary, in Loglan phonemes:
1. All word-final vowel-groups are reduced to the characteristic vowel except the endings /-ae -ea -oa -ua/ on Linnaean-derived science words. In the case of /-ae/, the whole ending is replaced by /ei/ if the source is Linnaean. In the case of the /-Va/ endings, the vowel /i/ is added. Thus the Linnaean-derived trial-words ?kimera (from 'chimaera'), ?sturio and ?artemia are changed or reduced to ?kimeri, ?sturi and ?artemi, respectively, but the rewritten Linnaeans ?kimerae (from 'chimaerae'), ?cinerea and ?madokua are changed or expanded to ?kimerei, ?cinereai and ?madokuai. This is to preserve distinctions which are finer-grained among Linnaeans than among other categories of possible source words.
2. The characteristic vowel is added to all consonant-final trial words except those which end with /-(i)es -is -(i)us -(i)um -ik/. In these exceptional cases, the word-maker should replace the entire ending with the characteristic vowel unless it is an /-(i)um/ ending on a Linnaean, in which case add /i/ to it, or an /-(i)us/ ending on a Linnaean, in which case replace it with /ui/. Thus the trial scientific words ?andrias, ?acipenser and ?dalton issuing from Step 1 become ?andriasi, ?acipenseri and ?daltoni, while ?intestinalis, ?humerus and ?organic become ?intestinali, ?humeri, and ?organi after end-adjustment. But the Linnaean 'Bittium' keeps its ending by becoming ?bitiumi, and the /-us/ ending of the Linnaean 'Gymnorhinus' is changed to /ui/ in order to preserve its distinction from 'Gymnorhina'. Thus ?gimnorhinui and ?gimnorhini are the ending-adjusted words, and the gender difference is preserved.
Thus as far as possible the special requirements of each body of words to be borrowed--for example, the occasional importance of a gender distinction among the biological Linnaeans--are taken into account.
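The end-adjustment rules just described can be sketched in a few lines. This is a minimal sketch, not a definitive procedure: the function name `adjust_ending`, the category labels and the `linnaean` flag are hypothetical, element words (with their -o and -io endings) are left out, trial words are assumed to be at least two letters long and already rewritten in Loglan phonemes, and a Linnaean /-ius/ is assumed, on a literal reading of the rule, to be replaced whole by /ui/.

```python
VOWELS = set("aeiou")

# Characteristic vowels by word type (element words are omitted here).
CHARACTERISTIC_VOWEL = {"music": "a", "food": "e", "science": "i", "other": "u"}

# Consonant-final endings that are replaced rather than extended,
# listed longest first so that '-ius' is tried before '-us'.
REPLACED_ENDINGS = ("ies", "ius", "ium", "es", "is", "us", "um", "ik")


def adjust_ending(trial, category, linnaean=False):
    """Give a rewritten trial word the ending proper to its category."""
    v = CHARACTERISTIC_VOWEL[category]

    # Spanish -ico/-ica, already rewritten as -iko/-ika, is replaced whole.
    if trial.endswith(("iko", "ika")):
        return trial[:-3] + v

    if trial[-1] in VOWELS:
        if trial[-2] not in VOWELS:
            # A single final vowel is swapped for the characteristic one.
            return trial[:-1] + v
        if linnaean and category == "science":
            if trial.endswith("ae"):
                return trial[:-2] + "ei"
            if trial.endswith(("ea", "oa", "ua")):
                return trial + "i"
        # Any other final vowel-group is reduced to the characteristic vowel.
        return trial.rstrip("aeiou") + v

    for ending in REPLACED_ENDINGS:
        if trial.endswith(ending):
            if linnaean and ending.endswith("um"):
                return trial + "i"                      # ?bitium => ?bitiumi
            if linnaean and ending.endswith("us"):
                return trial[:-len(ending)] + "ui"      # assumption for -ius
            return trial[:-len(ending)] + v

    # Every other consonant-final word simply gains the characteristic vowel.
    return trial + v
```

For instance, `adjust_ending("gimnorhinus", "science", linnaean=True)` and `adjust_ending("gimnorhina", "science", linnaean=True)` yield ?gimnorhinui and ?gimnorhini, which keeps the two Linnaeans distinct.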
There is more about the importation of Linnaean words in Section 6.14.
That a trial-word should break into a phrase of two or more shorter words--as ?atomi breaks into a to mi--is perhaps the most common problem that the Loglan word-borrower will encounter. The frequency of the problem is understandable. Science words, especially, are mainly derived from Greek or Latin roots, and in these languages, simple consonant-vowel alternation is a very common word pattern. This is a common pattern in Loglan, too, of course. But in Loglan consonant-vowel alternation means that one is listening to a string of compound or simple little words. So if we were to borrow words like 'leukemia', 'modularis' and 'molecule' intact, they too would be heard, after end-adjustment, as strings of Loglan little words: leu ke mi, mo du la ri, mo le ku li, and so on.
Even when there is a consonant-pair in a Greek or Latin source word, it often comes too late to hold the whole Loglan trial-word together. Thus after end-adjustment 'rhododendron' and 'kilocycle' will be heard as ro do dendroni and ki lo cikli. Besides, not all consonant-pairs save trial-words from breakage. In ?isopropili, for example, the /pr/ pair is itself a permissible initial. So despite its having a consonant-pair, the trial-word breaks up into the quite reasonable phrase I so propili = 'And six propyls'. Similarly, 'retrovirus/-ri', after end-adjustment, comes out ?retroviri; and this also breaks up, in this case as the phrase re troviri. Obviously, such fragile words may not be permitted in Loglan, where speech and writing are isomorphic. After all, we speak a language in which one can infallibly set down in writing what one hears...provided the word-borrowers do their job right.
Despite the frequency of the breakage problem, however, there is always a simple algorithmic solution to it once it has been identified. There are two principal gluing moves, each called for by its own particular set of circumstances. Inserting "gluing /h/" after the first word-break is, as we have seen, by far the most common of these repair moves. It fixes all the breakages except the last one, re troviri, that were listed in the last paragraph. In all other cases the phoneme /h/ is simply inserted after the first consonant or consonant-group that comes after the first word-break: after the /k/ in leu ke mi, the /d/ in mo du la ri, and so on. This gives us leukhemi [leigh-ook-HEH-mee] for 'leukemia', modhulari [mohd-hoo-LAH-ree] for the Linnaean species name 'modularis', molhekuli [mohl-heh-KOO-lee] for 'molecule', and rodhodendroni [rohd-hoh-den-DROH-nee], kilhocikli [keel-hoh-SHEEK-lee], and ishopropili [ees-hoh-proh-PEE-lee] for 'rhododendron', 'kilocycle' and 'isopropyl'. Once /h/s have been installed, these are all perfectly good words which have no segmentation problems. (Kilhocikli will, however, have to compete with the complex kilcikli ('thousand-cycles') for our vote as the best word for this concept; and it will probably lose.) It is not until we get to the breakup of ?retroviri as re troviri that the second, and far less common, gluing move is called into play.
We will call the second repair move "continuant doubling". It is applied only when one of the four continuants /r l m n/ is the last consonant in a group of two or more consonants that immediately follows the first word-break in a pseudo-phrase. This is the case in the phrase re troviri, for example. The cure is to double the continuant /r/. This produces the new trial word ?retrroviri [reh-trr-oh-VEE-ree], which does not break up.
But how does continuant-doubling arise? In particular, how is it related to /h/-insertion? Well, if we followed the more common practice in the above case and tried to insert an /h/ after the consonant-group in re troviri-- not noticing, perhaps, that the last element in the group was the continuant /r/--we would get ?retrhoviri as our new trial-word. Certainly this looks more difficult to pronounce; and if we do try to pronounce it in a Loglan way, we are very likely to produce a vocalic rendering of the continuant anyway: [reh-trr-hoh-VEE-ree]. Indeed, this is probably the only way one could pronounce the medial consonant sequence /trh/ in Loglan, that is, by vocalizing the /r/ in this inter-consonantal context. So, since vocalicization is an all but inevitable consequence of /h/-insertion in such contexts, and the vocalicized consonant is in many ways simpler to both produce and hear than the inserted /h/, we decided to simplify matters and not insert the /h/ at all in these cases, but just double the continuant. Thus retrroviri is now the recommended respelling of 'retrovirus/-ri' as imported into Loglan. Note that its pronunciation is surprisingly easy: [reh-trr-oh-VEE-ree]. Indeed, for most people, this will be an easier word to pronounce than words that have been fixed by normal /h/-insertion. 'Retrroviri' is also visually simpler than 'retrhoviri', as well as closer to their common source in 'retroviri'. So except for the necessity for introducing an ad hoc resolution rule that will keep such words from being heard as phrases, continuant doubling is a simplification of the /h/-insertion move in a context in which the latter encounters phonological difficulties.
The required ad hoc rule is, of course, that doubled continuants be confined to loan words and to the borrowed parts of complexes, and that even in these contexts they not be allowed in word-initial syllables. It is this rule, of course, that makes the sequence /trr/ an impermissible one initially, and so saves /retrroVIri/ from being heard as the two-word phrase re *trroviri.
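The two gluing moves can be sketched as a single function that takes the pseudo-phrase into which a trial word breaks and returns the repaired word. This is a sketch only: the name `glue` is hypothetical, and finding the breaks in the first place is assumed to have been done already, so the phrase is supplied with its spaces in place.

```python
VOWELS = set("aeiou")
CONTINUANTS = set("rlmn")


def glue(pseudo_phrase):
    """Repair a trial word, given the phrase it wrongly breaks into."""
    first, *rest = pseudo_phrase.lower().split()
    tail = "".join(rest)

    # Measure the consonant or consonant-group that follows the first break.
    n = 0
    while n < len(tail) and tail[n] not in VOWELS:
        n += 1
    group = tail[:n]
    if not group:
        raise ValueError("no consonant follows the first word-break")

    if len(group) >= 2 and group[-1] in CONTINUANTS:
        # Continuant doubling: the group ends in one of /r l m n/.
        repaired = group + group[-1]
    else:
        # Ordinary /h/-insertion after the consonant or consonant-group.
        repaired = group + "h"
    return first + repaired + tail[n:]


if __name__ == "__main__":
    for phrase in ["leu ke mi", "mo du la ri", "mo le ku li", "ro do dendroni",
                   "ki lo cikli", "I so propili", "re troviri"]:
        print(phrase, "=>", glue(phrase))
```

Note that a lone continuant, as in mo le ku li, still takes /h/; doubling applies only when the continuant closes a group of two or more consonants.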
Both /h/-insertion and continuant-doubling require a certain care in pronunciation. All such repaired words are meant to contrast successfully with the phrases into which they were not allowed to break: thus leukhemi [leigh-ook-HEH-mee] should contrast intelligibly with leu ke mi [leigh-oo-KEH-mee], and retrroviri [reh-trr-oh-VEE-ree] must successfully contrast with re troviri [reh-troh-VEE-ree], which is the phrase it was not allowed to be. Notice that the second contrast is far easier to produce and hear than the first one. The vocalicized continuant is in fact quite audible and gets an extra syllable of its own. Indeed, even syllable count is different in the word compared to the phrase (5 syllables vs. 4). In producing leukhemi intelligibly, however, one must deliberately separate the [k] from the vowel [eh] with a strongly aspirated [h]. A little practice listening to one's /h/s makes one reasonably good at this. However, you needn't be concerned about perfecting this skill. It is mainly your computer that will care if you fail to "emphasize your /h/s" in a way that can always be heard. Your human interlocutors will probably understand you even if you mumble them.
It is true that after all the breakups have been repaired--and about half your scientific borrowings will require gluing repairs--a Loglan text that is largely made up of such borrowings--a scientific article, say--will at first fairly bristle with its 'h's. But experience has shown that the 'h's soon drop into the visual background in such documents. They become virtually invisible, in fact. What the English-trained eye comes to see in athomi, molhekuli, ishopropili, rodhodendroni and leukhemi are strings of letters in which the 'h's and 'i's have been blanked out: 'at-om-', 'mol-ekul-', 'is-opropil-', 'rod-odendron-' and 'leuk-em-'. A possible cause of this phenomenon is that the 'h's bear almost no semantic burden. They seldom occur in such positions in the original Greek and Latin constructions. 'Methane', 'atherosclerosis' and 'ethyl' and kin are the chief counter-examples I've run into. Besides, in such words the "real" 'h's have a way of coming back to life: methani, atherosklerosi, ethili, ethileni, and so on. As for the inserted 'h's, all they have to say to the reader is 'This is probably a Loglan borrowing.' The often-added final /i/s bear a similarly slender metalinguistic message: 'This is probably a Loglan scientific borrowing.'
So in the end what the reading eye does is look past the gluing 'h's and the characteristic vowels to the phonemes that actually identify each borrowing...in a sense the load-bearing ones. Among these the reader will almost always find the main ingredients of an often-borrowed international word laid out in seemly order. Beyond atlatlu is the Nahuatl word 'atlatl'; beyond ionhi is the Greek word 'ion'.
Loglanists, we have seen, expect to be able to spot and decipher new complexes as they read and listen. So a loan-word that ends up looking and sounding like a complex is no good to us. It throws the attentive reader/listener off the track. And loglanists are likely to be fairly attentive readers and listeners. In fact, we are likely to be constantly on the lookout for segmentation patterns like mek+y+kiu, roj+mao+ses+mao, num+sensi and marty+sensi in what we see and hear. These are the hallmarks of the Loglan complex predicate. Complexes are far more important in our language than borrowings are. So any borrowing that can be seen or heard as a complex probably will be.
In a world of well-constructed borrowings, attention to segmentation patterns will be amply repaid in Loglan. For one thing, new complexes are likely to be much easier to understand than novel scientific words are. The reason is that all regular complexes, i.e., those made of non-borrowed parts, are combinations of about 600 plainly derived affixes as well as the 860 Loglan primitives themselves. In Loglan, these basic semantic ingredients will be routinely learned by everyone. Once they are learned, any new complex--no matter how strange or long--can be easily deciphered. So deciphering a strange new complex is likely to be a much easier task for the ordinary loglanist than fathoming an unfamiliar scientific borrowing...like albocinerei, for example.
The word I've chosen to make this point is obviously a Loglan borrowing; its form is not primitive and it doesn't segment. But what it borrows is a natural complex, originally made from several parts. Its source was 'albocinereous', an English ISV word that was once derived from Latin 'albus' = 'white', 'cinis' = 'ashes' and '-eus' = 'composed of'. So 'albocinereous' is a technical word meaning 'composed of white and grey material'. It is used mainly by anatomists.
To know this one has to know a lot about how Greek and Latin roots have been used in making scientific words. At least one must know how to use the etymological information given about them in large dictionaries. All scientists must learn at least a portion of this immense body of Western scholarship; but usually they master only enough of it to understand the derivations of the words used in their own and related disciplines. It will probably not be different for the Loglan-reading scientist. So da will probably be unwilling to give up the clarity of the words da can easily decipher--the regular Loglan complexes--just to make the borrowing of the natural complexes a little more automatic.
For it is the automaticity of the borrowing process--and very little else--it turns out, that is at stake here. That is what we give up when we require that every word-borrower check da's products for segmentation, and then correct them before da publishes them. For, unlike breakage, there is no automatic way of curing a segmentation problem once one has been found. Suppose we have made ?palheozoi from the English source word 'Paleozoic' (Spanish 'Paleozoico' gives the same result). If we fail to notice that our trial word could be segmented as pal+heo+zoi, then at least some of our unfortunate readers would be sent on a wild goose chase when they encounter our new word. The most attentive of them would attempt to decipher pal+heo+zoi [pahl-heigh-OH-zoy] as a three-term complex. If they do, they will soon be frustrated. To be sure, pal- has an assigned meaning as an affix of spali = 'side'; but both -heo and -zoi are still unassigned. That doesn't mean that they couldn't have meanings, however, or that the day will not come when they do. Were such a day to come, borrowings like ?palheozoi, if we permitted them, would be un-untiable knots in the fabric of the language. They would be words that could only be legitimately interpreted as complexes but which context demanded be seen as loans.
So we avoid such words with their mixed messages. We carefully examine each loan-word we make for the two ways in which it might imitate a complex: (1) all by itself, as ?palheozoi imitates pal+heo+zoi; and (2) as part of a CV-word initiated phrase, as ?spektri in the phrase to spektri imitates tos+pek+tri. For this reason words like ?spektri are also disallowed.19
The tests themselves are easy to conduct. Test 1 is simply to attempt to resolve one's trial-word as a complex. If one succeeds, as one does in the case of ?palheozoi, the word must be repaired; its segmentation pattern must be spoiled. Test 2 applies only to CC-initial words. Of these one asks, Will the sequel of the initial consonant segment as a complex? If it does--if ?spektri segments as s+pek+tri--it is a Spektri-type word; and it, too, has a segmentation pattern that must be spoiled. The tests require no human judgement. So performing them could well be part of a borrowing algorithm. It is the repair of a segmentation problem once identified that cannot yet be done algorithmically.
As mentio