On the Existence of Powerful Natural Languages

A common dream in philosophy, politics, and religion is the idea of languages superior to evolved demotics, whether Latin or Lojban, which grant speakers greater insight into reality and rationality, analogous to the well-known efficacy of mathematical sub-languages in solving problems. This dream fails because such languages gain power inherently from specialization.
computer-science, philosophy, psychology, politics, transhumanism
2016-12-18–2019-05-05 · finished · certainty: possible · importance: 6


Designed for­mal nota­tions & dis­tinct vocab­u­lar­ies are often employed in STEM fields, and these spe­cial­ized lan­guages are cred­ited with greatly enhanc­ing research & com­mu­ni­ca­tion. Many philoso­phers and other thinkers have attempted to cre­ate more gen­er­al­ly-ap­plic­a­ble designed lan­guages for use out­side of spe­cific tech­ni­cal fields to enhance human think­ing, but the empir­i­cal track record is poor and no such designed lan­guage has demon­strated sub­stan­tial improve­ments to human cog­ni­tion such as resist­ing cog­ni­tive biases or log­i­cal fal­lac­i­es. I sug­gest that the suc­cess of spe­cial­ized lan­guages in fields is inher­ently due to encod­ing large amounts of pre­vi­ous­ly-dis­cov­ered infor­ma­tion spe­cific to those fields, and this explains their inabil­ity to boost human cog­ni­tion across a wide vari­ety of domains.

Alongside the axis of efficiency in terms of time or words or characters to convey a certain amount of information, one might think about natural languages in terms of some sort of ‘power’ metric, akin to the , in which some languages are better at allowing one to think or express important thoughts and thus a is necessary. For example, Chinese is sometimes said to be too ‘concrete’ and to make scientific thought difficult compared to some other languages; ‘fixing’ language and terminology has been a perennial focus of Western politicians in the 20th & 21st centuries, who ascribe to names and pronouns great power over society; less controversially, mathematicians & physicists unanimously agree that notation is extremely important and that a good notation for a topic can make routine results easier to create & understand and also can enable previously unthinkable thoughts and suggest important research directions: Arabic numerals vs Roman numerals, Newton’s calculus notation versus Leibniz’s, etc.

This has been taken to extremes: if good notation can help, surely there is some designable ideal language in which all thoughts can be expressed most perfectly, in a way which quickly & logically guides thoughts to their correct conclusions without falling prey to fallacies or cognitive biases, and which makes its users far more intelligent & effective than those forced to use natural languages. The usual example here is Leibniz’s proposed logical language in which philosophical disputes could be resolved simply by calculation, or . The hope that better languages can lead to better human thought has also been a motivation for developing like , with rationalized vocabulary & grammars. Conlangs, and less radical linguistic innovations like (and coining neologisms for new concepts) show up occasionally in movements like and quite often in science fiction & technology (eg Engelbart 1962); and of course many political or social or philosophical movements have explicitly wanted to change languages to affect thought (from Communism, memorably dramatized in , to feminism, post-modernisms, , or Ukrainian nationalists; I’ve noticed that this trend seems particularly pronounced in the mid-20th century).

This is sensible, since it seems clear that our language influences our thoughts (for better or worse), and that there’s no particular reason to expect evolved languages like English to be all that great (they clearly are brutally irregular and inefficient in ways that have no possible utility, and comparisons of writing systems are equally clear that some systems like are just plain better). And if better notation and vocabulary can be so helpful in STEM areas, can constructed languages be helpful in general?

I would have to say that the answer appears to be no. First, the Sapir-Whorf hypothesis has largely panned out as trivialities: there are few or no important cross-cultural differences ascribable to languages’ grammars or vocabularies, only slight differences like emphasis in color perception or possibly discount rates.[1] Nor do we see huge differences between populations speaking radically different languages when they learn different natural languages: Chinese people might find it somewhat easier to think scientifically in English than in Mandarin, but it hardly seems to make a huge difference. Secondly, looking over the history of such attempts, there does not appear to be any noticeable gain from switching to E-Prime or Lojban etc. Movements like General Semantics have not demonstrated notable additional real-world success in their adherents (although I do think that such practices as E-Prime do improve philosophical writing). Attempts at general-purpose efficient languages arguably have been able to decrease the learning time for conlangs and provide modest educational advantages; domain-specific languages for particular fields like chemistry & physics & computer science (eg , , ) have continued to demonstrate their tremendous value; but constructed general-purpose languages have not made people either more rational or more intelligent.

“For ‘Tragedy’ [τραγωδία] and ‘Com­edy’ [τρυγωδία] come to be out of the same let­ters.”

, quoted/paraphrased by Aristotle[2]

Why not? Think­ing about it from an sense, all Tur­ing-com­plete com­pu­ta­tional lan­guages are equiv­a­lently expres­sive, because any pro­gram that can be writ­ten in one lan­guage can be writ­ten in another by first writ­ing an inter­preter for the other and then run­ning the pro­gram; so with a con­stant length penalty (the inter­preter), all lan­guages are about equiv­a­lent. The con­stant penalty might be large & painful, though, as a lan­guage might be a , where every­thing is diffi­cult to say. Or from another infor­ma­tion the­ory per­spec­tive, most bit­strings are uncom­press­ible, most the­o­rems are unprov­able, most pro­grams are unde­cid­able etc; a com­pres­sion or pre­dic­tion algo­rithm only works effec­tively on a small sub­set of pos­si­ble inputs (while work­ing poorly or not at all on ran­dom input­s). So from that per­spec­tive, any lan­guage is on aver­age just as good as any oth­er—­giv­ing a . The goal of research, then, is to find algo­rithms which trade off per­for­mance on data that does not occur in the real world in exchange for per­for­mance on the kinds of reg­u­lar­i­ties which do turn up in real-world data. From “John Wilkin­s’s Ana­lyt­i­cal Lan­guage” (Borges 1942):

In the uni­ver­sal lan­guage con­ceived by in the mid­dle of the sev­en­teenth cen­tu­ry, each word defines itself. Descartes, in a let­ter dated Novem­ber 1619, had already noted that, by using the dec­i­mal sys­tem of numer­a­tion, we could learn in a sin­gle day to name all quan­ti­ties to infin­i­ty, and to write them in a new lan­guage, the lan­guage of num­bers; he also pro­posed the cre­ation of a sim­i­lar, gen­eral lan­guage that would orga­nize and con­tain all human thought. Around 1664, John Wilkins under­took that task.

He divided the uni­verse into forty cat­e­gories or class­es, which were then sub­di­vided into differ­ences, and sub­di­vided in turn into species. To each class he assigned a mono­syl­la­ble of two let­ters; to each differ­ence, a con­so­nant; to each species, a vow­el. For exam­ple, de means ele­ment; deb, the first of the ele­ments, fire; deba, a por­tion of the ele­ment of fire, a flame.

Hav­ing defined Wilkins’ pro­ce­dure, we must exam­ine a prob­lem that is impos­si­ble or diffi­cult to post­pone: the merit of the forty-part table on which the lan­guage is based. Let us con­sider the eighth cat­e­go­ry: stones. Wilkins divides them into com­mon (flint, grav­el, slate); mod­er­ate (mar­ble, amber, coral); pre­cious (pearl, opal); trans­par­ent (amethyst, sap­phire); and insol­u­ble (coal, , and arsenic). The ninth cat­e­gory is almost as alarm­ing as the eighth. It reveals that met­als can be imper­fect (, quick­sil­ver); arti­fi­cial (bronze, brass); recre­men­tal (fil­ings, rust); and nat­ural (gold, tin, cop­per). The whale appears in the six­teenth cat­e­go­ry: it is a vivip­a­rous, oblong fish. These ambi­gu­i­ties, redun­dan­cies, and defi­cien­cies recall those attrib­uted by Dr. Franz Kuhn to a cer­tain Chi­nese ency­clo­pe­dia called the Heav­enly Empo­rium of Benev­o­lent Knowl­edge. In its dis­tant pages it is writ­ten that ani­mals are divided into (a) those that belong to the emper­or; (b) embalmed ones; (c) those that are trained; (d) suck­ling pigs; (e) mer­maids; (f) fab­u­lous ones; (g) stray dogs; (h) those that are included in this clas­si­fi­ca­tion; (i) those that trem­ble as if they were mad; (j) innu­mer­able ones; (k) those drawn with a very fine camel’s-hair brush; (1) et-cetera; (m) those that have just bro­ken the flower vase; (n) those that at a dis­tance resem­ble flies.

The also exor­cises chaos: it has parceled the uni­verse into 1,000 sub­di­vi­sions, of which num­ber 262 cor­re­sponds to the Pope, num­ber 282 to the Roman Catholic Church, num­ber 263 to the Lord’s Day, num­ber 268 to Sun­day schools, num­ber 298 to Mor­monism, and num­ber 294 to Brah­man­ism, Bud­dhism, Shin­to­ism, and Tao­ism. Nor does it dis­dain the employ­ment of het­ero­ge­neous sub­di­vi­sions, for exam­ple, num­ber 179: “Cru­elty to ani­mals. Pro­tec­tion of ani­mals. Duel­ing and sui­cide from a moral point of view. Var­i­ous vices and defects. Var­i­ous virtues and qual­i­ties.”

I have noted the arbi­trari­ness of Wilkins, the unknown (or apoc­ryphal) Chi­nese ency­clo­pe­dist, and the Bib­li­o­graph­i­cal Insti­tute of Brus­sels; obvi­ously there is no clas­si­fi­ca­tion of the uni­verse that is not arbi­trary and spec­u­la­tive. The rea­son is quite sim­ple: we do not know what the uni­verse is. “This world,” wrote David Hume, “was only the first rude essay of some infant deity who after­wards aban­doned it, ashamed of his lame per­for­mance; it is the work only of some depen­dent, infe­rior deity, and is the object of deri­sion to his supe­ri­ors; it is the pro­duc­tion of old age and dotage in some super­an­nu­ated deity, and ever since his death has run on . . .” ( V [1779]). We must go even fur­ther, and sus­pect that there is no uni­verse in the organ­ic, uni­fy­ing sense of that ambi­tious word.

Attempts to derive knowl­edge from analy­sis of San­skrit or gema­tria on Bib­li­cal pas­sages fail because they have no rel­e­vant infor­ma­tion con­tent—who would put them there? An omni­scient god? To assume they have infor­ma­tional con­tent and that analy­sis of such lin­guis­tic cat­e­gories helps may let us spin our intel­lec­tual wheels indus­tri­ously and pro­duce all man­ner of ‘results’, but we will get nowhere in real­i­ty. (“The method of ‘pos­tu­lat­ing’ what we want has many advan­tages; they are the same as the advan­tages of theft over hon­est toil. Let us leave them to oth­ers and pro­ceed with our hon­est toil.”)
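The earlier information-theoretic claim, that a compression algorithm works only on the small subset of inputs that contain regularities and poorly or not at all on random inputs, can be checked in a few lines. This is a minimal sketch, assuming Python's standard `zlib` module as a stand-in for 'a compression algorithm'; the helper name `compression_ratio` and the sample sentence are illustrative assumptions, not anything taken from the sources quoted here.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compression)."""
    return len(zlib.compress(data, 9)) / len(data)

# Highly regular input: the kind of structure a compressor is built to exploit.
structured = b"the quick brown fox jumps over the lazy dog. " * 200

# Random input of the same length: with overwhelming probability
# there is no regularity for the compressor to find.
random_bytes = os.urandom(len(structured))

print(f"structured: {compression_ratio(structured):.3f}")
print(f"random:     {compression_ratio(random_bytes):.3f}")
```

On a typical run the repetitive text shrinks to a small fraction of its size, while the random bytes essentially do not shrink (the output tends to come out slightly larger than the input, because of format overhead). The compressor buys its brevity on regular inputs by giving up brevity everywhere else.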

So if all lan­guages are equiv­a­lent, why do domain-spe­cific lan­guages work? Well, they work because they are inher­ently not gen­eral: they encode domain knowl­edge. John Stu­art Mill, “Ben­tham”:

It is a sound max­im, and one which all close thinkers have felt, but which no one before Ben­tham ever so con­sis­tently applied, that error lurks in gen­er­al­i­ties: that the human mind is not capa­ble of embrac­ing a com­plex whole, until it has sur­veyed and cat­a­logued the parts of which that whole is made up; that abstrac­tions are not real­i­ties per se, but an abridged mode of express­ing facts, and that the only prac­ti­cal mode of deal­ing with them is to trace them back to the facts (whether of expe­ri­ence or of con­scious­ness) of which they are the expres­sion.

Proceeding on this principle, Bentham makes short work with the ordinary modes of moral and political reasoning. These, it appeared to him, when hunted to their source, for the most part terminated in phrases. In politics, liberty, social order, constitution, law of nature, social compact, &c., were the catch-words: ethics had its analogous ones. Such were the arguments on which the gravest questions of morality and policy were made to turn; not reasons, but allusions to reasons; sacramental expressions, by which a summary appeal was made to some general sentiment of mankind, or to some maxim in familiar use, which might be true or not, but the limitations of which no one had ever critically examined.

One could write the equivalent of a regular expression in Brainfuck, but it would take a lot longer than writing a normal regular expression, because regular expressions privilege a small subset of possible kinds of text matches & transformations and are not nearly as general as a Brainfuck program could be. Similarly, any mathematical language privileges certain approaches and theorems, and this is why it is helpful: it assigns short symbol sequences to common operations, while uncommon things become long & hard. As Whitehead put it (in a quote which is often cited as an exemplar of the idea that powerful general languages exist but which, when quoted more fully, makes clear that Whitehead is discussing this idea of specialization as the advantage of better notation/languages):

By reliev­ing the brain of all unnec­es­sary work, a good nota­tion sets it free to con­cen­trate on more advanced prob­lems, and, in effect, increases the men­tal power of the race. Before the intro­duc­tion of the Ara­bic nota­tion, mul­ti­pli­ca­tion was diffi­cult, and the divi­sion even of inte­gers called into play the high­est math­e­mat­i­cal fac­ul­ties. Prob­a­bly noth­ing in the mod­ern world would have more aston­ished a Greek math­e­mati­cian than to learn that … a large pro­por­tion of the pop­u­la­tion of West­ern Europe could per­form the oper­a­tion of divi­sion for the largest num­bers. This fact would have seemed to him a sheer impos­si­bil­ity … Our mod­ern power of easy reck­on­ing with dec­i­mal frac­tions is the almost mirac­u­lous result of the grad­ual dis­cov­ery of a per­fect nota­tion. […] By the aid of sym­bol­ism, we can make tran­si­tions in rea­son­ing almost mechan­i­cal­ly, by the eye, which oth­er­wise would call into play the higher fac­ul­ties of the brain. […] It is a pro­foundly erro­neous tru­ism, repeated by all copy­-books and by emi­nent peo­ple when they are mak­ing speech­es, that we should cul­ti­vate the habit of think­ing of what we are doing. The pre­cise oppo­site is the case. Civil­i­sa­tion advances by extend­ing the num­ber of impor­tant oper­a­tions which we can per­form with­out think­ing about them. Oper­a­tions of thought are like cav­alry charges in a bat­tle—they are strictly lim­ited in num­ber, they require fresh hors­es, and must only be made at deci­sive moments.
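The regular-expression point made just before the Whitehead quote can be made concrete. A minimal sketch in Python (chosen purely for readability): the same search for `YYYY-MM-DD` substrings written once in regex notation and once by hand in general-purpose code. The function names and the sample string are illustrative assumptions.

```python
import re

# The domain-specific notation: the whole matcher is one short pattern.
DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")

def find_dates_regex(text):
    """Return all non-overlapping YYYY-MM-DD substrings, left to right."""
    return DATE_RE.findall(text)

def find_dates_manual(text):
    """The same search spelled out without the regex notation."""
    shape = "dddd-dd-dd"  # 'd' stands for an ASCII digit, '-' for a literal hyphen
    found = []
    i = 0
    while i + len(shape) <= len(text):
        window = text[i:i + len(shape)]
        matches = all(
            (ch in "0123456789") if kind == "d" else (ch == kind)
            for ch, kind in zip(window, shape)
        )
        if matches:
            found.append(window)
            i += len(shape)  # skip past the match, as findall does
        else:
            i += 1
    return found

sample = "Begun 2016-12-18, finished 2019-05-05; 20-1-1 is not a date."
print(find_dates_regex(sample))
print(find_dates_manual(sample))
```

Both functions return the same list. The regex version is shorter only because the notation has already decided that 'a run of digits, then a literal character' is a common thing to want; a check outside that privileged subset (say, that the day is valid for the given month) is awkward to express in the pattern and straightforward in the general-purpose code.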

This makes sense historically, as few notations are dropped from the sky in finished form and only then assist major discoveries; rather, notations evolve in tandem with new ideas and discoveries to better express what is later recognized as essential and drop what is less important. Early writing or numeral systems (like Sumerian ) were often peculiar hybrid systems mixing aspects of alphabets, rebuses, syllabaries, and ideograms, sometimes simultaneously within the same system or in oddly broken ways. (A good guide to the wonderful and weird diversity of numeral/counting/arithmetic systems is Ifrah 2000’s The Universal History of Numbers: From Prehistory to the Invention of the Computer.) Could a syllabary as elegant as have been invented as the first written script? (Probably not.) Leibniz’s calculus notation evolved from its origins; syntax for logics developed considerably (the proofs of are much harder to read than contemporary notation, and presentations of logic are considerably aided by innovations such as ); the is one of constant simplification & recasting into new notation/mathematics, where they were first published in a far more baroque form of 20 equations involving and slowly evolved by & others into the familiar 4 differential equations written using hardly 23 characters (“Simplicity does not precede complexity, but follows it.”); programming languages have also evolved from their often-bizarre beginnings like to far more readable programming languages like Python & Haskell.[3] Even fiction genres are ‘technologies’ or ‘languages’ of tropes which must be developed: in discussing the early development of detective/mystery fiction, Moretti notes the apparent incomprehension of authors of what a “clue” is, citing the example of how early authors “used them wrong: thus one detective, having deduced that ‘the drug is in the third cup of coffee’, proceeds to drink the coffee.” (emphasis in original; see //Batuman 2005)

“Sim­plic­ity does not pre­cede com­plex­i­ty, but fol­lows it.”

Alan Perlis, 1982

“Math­e­mat­ics is an exper­i­men­tal sci­ence, and defi­n­i­tions do not come first, but later on. They make them­selves, when the nature of the sub­ject has devel­oped itself.”

Oliver Heaviside, 1893[4]

“Cer­tainly ordi­nary lan­guage has no claim to be the last word, if there is such a thing. It embod­ies, indeed, some­thing bet­ter than the meta­physics of the Stone Age, name­ly, as was said, the inher­ited expe­ri­ence and acu­men of many gen­er­a­tions of men. But then, that acu­men has been con­cen­trated pri­mar­ily upon the prac­ti­cal busi­ness of life. If a dis­tinc­tion works well for prac­ti­cal pur­poses in ordi­nary life (no mean feat, for even ordi­nary life is full of hard cas­es), then there is sure to be some­thing in it, it will not mark noth­ing: yet this is likely enough to be not the best way of arrang­ing things if our inter­ests are more exten­sive or intel­lec­tual than the ordi­nary. And again, that expe­ri­ence has been derived only from the sources avail­able to ordi­nary men through­out most of civilised his­to­ry: it has not been fed from the resources of the micro­scope and its suc­ces­sors. And it must be added too, that super­sti­tion and error and fan­tasy of all kinds do become incor­po­rated in ordi­nary lan­guage and even some­times stand up to the sur­vival test (on­ly, when they do, why should we not detect it?). Cer­tain­ly, then, ordi­nary lan­guage is not the last word: in prin­ci­ple it can every­where be sup­ple­mented and improved upon and super­seded. Only remem­ber, it is the first word.”

, 1956

“The Platonists sense intuitively that ; the Aristotelians, that they are generalizations; for the former, language is nothing but a system of arbitrary symbols; for the latter, it is the map of the universe…Maurice de Wulf writes: “Ultra-realism garnered the first adherents. The chronicler (eleventh century) gives the name ‘antiqui doctores’ to those who teach dialectics in re [of ]; speaks of it as an ‘antique doctrine,’ and until the end of the twelfth century, the name moderni is applied to its adversaries.” A hypothesis that is now inconceivable seemed obvious in the ninth century, and lasted in some form into the fourteenth. Nominalism, once the novelty of a few, today encompasses everyone; its victory is so vast and fundamental that its name is useless. No one declares himself a nominalist because no one is anything else. Let us try to understand, nevertheless, that for the men of the Middle Ages the fundamental thing was not men but humanity, not individuals but the species, not the species but the genus, not the genera but God.”

Jorge Luis Borges, “From Alle­gories to Nov­els” (pg338–339, Selected Non-Fic­tions)

So the infor­ma­tion embed­ded in the lan­guages did not come from nowhere—the cre­ation & evo­lu­tion of lan­guages reflects the hard-earned knowl­edge of prac­ti­tion­ers & researchers about what com­mon uses are and what should be made easy to say, and what abstract pat­terns can be gen­er­al­ized and named and made cit­i­zens in a lan­guage. The effec­tive­ness of these lan­guages, in usual prac­tice and in assist­ing major dis­cov­er­ies, then comes from the nota­tion reduc­ing fric­tion in exe­cut­ing nor­mal research (eg one could mul­ti­ply any Roman numer­als, but it is much eas­ier to mul­ti­ply Ara­bic numer­al­s), and from sug­gest­ing areas which log­i­cally fol­low from exist­ing results by sym­me­try or com­bi­na­tions but which are cur­rently gaps.
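The Roman-versus-Arabic example can be sketched in code. This is only an illustration under stated assumptions: `long_multiply` is the schoolbook positional algorithm working directly on decimal digit strings, using nothing but single-digit products and carries, while for Roman numerals the sketch simply converts to an integer and back, because the notation offers no comparably mechanical digit-by-digit procedure. All function names are hypothetical, and nothing here is a claim about how Romans actually calculated.

```python
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]

def to_roman(n: int) -> str:
    """Write a positive integer as a standard Roman numeral."""
    out = []
    for value, symbol in ROMAN:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)

def from_roman(s: str) -> int:
    """Read a well-formed standard Roman numeral (no validation is attempted)."""
    total, i = 0, 0
    for value, symbol in ROMAN:
        while s.startswith(symbol, i):
            total += value
            i += len(symbol)
    return total

def long_multiply(a: str, b: str) -> str:
    """Schoolbook multiplication on decimal digit strings."""
    result = [0] * (len(a) + len(b))  # least-significant digit first
    for i, da in enumerate(reversed(a)):
        for j, db in enumerate(reversed(b)):
            result[i + j] += int(da) * int(db)  # position encodes the power of ten
    for k in range(len(result) - 1):  # propagate carries upward
        result[k + 1] += result[k] // 10
        result[k] %= 10
    return "".join(map(str, reversed(result))).lstrip("0") or "0"

print(long_multiply("14", "12"))
print(to_roman(from_roman("XIV") * from_roman("XII")))
```

In the positional version the place of each digit does the bookkeeping, so the inner loop never needs to know what the numbers mean; the Roman version has to leave the notation entirely in order to compute.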

To be effective, a general-purpose language would have to encode some knowledge or algorithm which offers considerable gains across many human domains but which does not otherwise affect the computational power of a human brain or do any external computation of its own. (Of course, individual humans differ greatly in their own intelligence and rationality due in large part to individual neurological & biological differences, and support apparatus such as computers can greatly extend the capabilities of a human, but those aren’t languages.) There doubtless are such pieces of knowledge, like the scientific method itself, but it would appear that humans learn them adequately without a language encoding them; and on the other hand, there are countless extremely important pieces of knowledge for individual domains, which would be worth encoding into a language, except most people have no need to work in those small domains, so… we get jargon and domain-specific languages, but not General Semantics superheroes speaking a powerful conlang.

General-purpose languages would only encode general weak knowledge, such as (taking an AI perspective) properties like object-ness and a causally sparse world in which objects and agents can be meaningfully described with short descriptions & the intentional stance, and vocabulary encodes a humanly-relevant description of human life (eg the in which regularities in human personality are diffusely encoded into thousands of words; if personality became a major interest, people would likely start using a much smaller & more refined set of personality words like the ). As a specific domain develops domain-specific knowledge which would be valuable to have encoded into a language, it gradually drifts from the general-purpose language and, when the knowledge is encoded as words, becomes jargon-dense, and when it can be encoded in the syntax & grammar of symbols, a formal notation. (“Everything should be built top-down, except the first time.”) This is much like how scientific fields fission as they develop.

The universal natural language serves as a ‘glue’ or ‘host’ language for communicating things not covered by specific fields and for combining results from domain-specific languages, and is a jack-of-all-trades: not good at anything in particular, but, shorn of most accidental complexity like grammatical gender or totally randomized spelling (“Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it.”), about as good as any competitor at everything. (“Optimization hinders evolution.”) And to the extent that the domain-specific languages encode anything generally important into vocabulary, the host general-purpose language can try to steal the idea (shorn of its framework) as isolated vocabulary words.

So the general-purpose languages generally remain equivalently powerful, and are efficient enough that switching to conlangs does not offer a sufficient advantage.

This per­spec­tive explains why we see pow­er­ful spe­cial-pur­pose lan­guages and weak gen­er­al-pur­pose lan­guages: the require­ment of encod­ing impor­tant infor­ma­tion forces gen­eral lan­guages into becom­ing ever nar­rower if they want to be pow­er­ful.

It also is an inter­est­ing per­spec­tive to take on intel­lec­tual trends, ide­o­log­i­cal move­ments, and magical/religious think­ing. Through his­to­ry, many major philosophical/religious/scientific thinkers have been deeply fas­ci­nated by ety­mol­o­gy, philol­o­gy, and lin­guis­tics, indeed, to the point of bas­ing major parts of their philoso­phies & ideas on their analy­sis of words. (It would be invid­i­ous to name spe­cific exam­ples.) Who has not seen an intel­lec­tual who in dis­cussing a topic spends more time recount­ing the ‘intel­lec­tual his­tory’ of it rather than the actual sub­stance of the top­ic, and whose idea of ‘intel­lec­tual his­tory’ appears to con­sist solely of recount­ing dic­tio­nary defi­n­i­tions, selec­tive quo­ta­tion, ety­molo­gies, nuances of long-dis­carded the­o­ries, end­less pro­pos­als of unnec­es­sary new neol­o­gisms, and pro­vid­ing a ‘his­tory’ or ‘evo­lu­tion’ of ideas which con­sist solely of post hoc ergo propter hoc ele­vated to a fine art? One would think that that much time spent in philol­o­gy, show­ing the sheer flex­i­bil­ity of most words (often mutat­ing into their oppo­sites) and arbi­trari­ness of their mean­ings and evo­lu­tions, would also show how lit­tle any ety­mol­ogy or prior use mat­ters. What does a philoso­pher like Niet­zsche think he is doing when—in­ter­ested in con­tem­po­rary moral phi­los­o­phy and not his­tory for the sake of his­to­ry—he nev­er­the­less spends many pages lay­ing out spec­u­la­tions on Ger­man word ety­molo­gies and link­ing them to Chris­tian­ity as a ‘geneal­ogy of morals’, when Chris­tian­ity had not the slight­est thing to do with Ger­many for many cen­turies after its found­ing, all the peo­ple involved knew far less than we do (like a doc­tor con­sult­ing Galen for advice), and this is all extremely spec­u­la­tive even as a mat­ter of his­to­ry, much less actual moral phi­los­o­phy?

These linguistically-based claims are frequently among the most bizarre & deeply wrong parts of their beliefs, and it would be reasonable to say that they have been blinded by a love of words, forgetting that a word is merely a word, and the map is not the territory. What are the problems there? At least one problem is the implicit belief that linguistic analysis can tell us anything more than some diffuse population beliefs about the world or the relatively narrow questions of formal linguistics (eg about history or language families), the belief that linguistic analysis can reveal deep truths about the meaning of concepts, about the nature of reality, the gods, moral conduct, politics, etc: that we can get out of words far more than anyone ever put in. Since words & languages have evolved through a highly random natural process filled with arbitrariness, contingency, and constant mutation/decay, there is not much information in them, there is no one who has been putting information in them, and to the extent that anyone has, their deep thoughts & empirical evidence are better accessed through their explicit writings, and in any case, are likely superseded. So, any information a natural language encodes is impoverished, weak, and outdated; and it is unhelpful, if not actively harmful, to attempt to base any kind of reasoning on it.

In the case of religious thinkers who, starting from that incorrect premise, believe in the divine authorship of the Bible/Koran/scriptures or that Hebrew or Sanskrit encode deep correspondences and govern the nature of the cosmos in every detail, this belief is defensible: the text being spoken/written by an omniscient being, it is reasonable that the god might have encoded arbitrarily much information in it and that careful study might unearth it. In its crudest form, it is sympathetic magic, hiding of ‘true names’, judging on word-sounds, and schizophrenia-like word-salad free-associative thinking (eg or the ). Secular thinkers have no such defense for being blinded by words, but the same mystical belief in the deep causal powers of spelling and word use and word choice is pervasive.

  1. All of which is dubious on causal grounds: does the language actually cause such cross-cultural differences or are such correlates merely again?

  2. Book 1, ; for defense of the interpretation that this is wordplay & not merely a generic observation about alphabetic writing, see .

  3. An interesting example of concise programming languages (also inspired by Maxwell’s equations) comes from the STEPS research project, an effort to write a full operating system with a GUI desktop environment, sound, text editor, Internet-capable networking & web browser etc in ~20,000 lines of code. This is accomplished, in part, by careful specification of a core, and then building on it with layers of domain-specific languages. For example, to define TCP/IP networking code, which involves a lot of low-level binary bit-twiddling and rigidly-defined formats and typically takes at least 20,000 lines of code all on its own, STEPS uses the ASCII art diagrams from the RFC specifications as a formal specification of how to do network parsing. It also makes tradeoffs of computing power versus lines of code: a normal computer operating system or library might spend many lines of code dealing with the fiddly details of how precisely to lay out individual characters of text on the screen in the fastest & most correct possible way, whereas STEPS defines some rules and uses a generic optimization algorithm to iteratively position text (or anything else) correctly.

  4. Heaviside 1893, “On Operators in Physical Mathematics, Part II” (pg121)