On the Existence of Powerful Natural Languages

A common dream in philosophy, politics, and religion is the idea of languages superior to evolved demotics, whether Latin or Lojban, which grant speakers greater insight into reality and rationality, analogous to the well-known efficacy of mathematical sub-languages in solving problems. This dream fails because such languages gain their power inherently from specialization.
computer-science, philosophy, psychology, politics, transhumanism
2016-12-18–2019-05-05 · finished · certainty: possible · importance: 6


Designed formal notations & distinct vocabularies are often employed in STEM fields, and these specialized languages are credited with greatly enhancing research & communication. Many philosophers and other thinkers have attempted to create more generally-applicable designed languages for use outside of specific technical fields to enhance human thinking, but the empirical track record is poor and no such designed language has demonstrated substantial improvements to human cognition such as resisting cognitive biases or logical fallacies. I suggest that the success of specialized languages in fields is inherently due to encoding large amounts of previously-discovered information specific to those fields, and this explains their inability to boost human cognition across a wide variety of domains.

Alongside the axis of efficiency in terms of time or words or characters to convey a certain amount of information, one might think about natural languages in terms of some sort of ‘power’ metric, akin to the , in which some languages are better at allowing one to think or express important thoughts and thus a is necessary. For example, Chinese is sometimes said to be too ‘concrete’ and to make scientific thought difficult compared to some other languages; ‘fixing’ language and terminology has been a perennial focus of Western politicians in the 20th & 21st century, with names and pronouns ascribed great power over society; less controversially, mathematicians & physicists unanimously agree that notation is extremely important, and that a good notation for a topic can make routine results easier to create & understand, can enable previously unthinkable thoughts, and can suggest important research directions—Arabic numerals vs Roman numerals, Newton’s calculus notation versus Leibniz’s, etc.

This has been taken to extremes: if good notation can help, surely there is some designable ideal language in which all thoughts can be expressed most perfectly, in a way which quickly & logically guides thoughts to their correct conclusions without falling prey to fallacies or cognitive biases, and which makes its users far more intelligent & effective than those forced to use natural languages. The usual example here is Leibniz’s proposed logical language in which philosophical disputes could be resolved simply by calculation, or . The hope that better languages can lead to better human thought has also been a motivation for developing like , with rationalized vocabulary & grammars. Conlangs, and less radical linguistic innovations like (and coining neologisms for new concepts), show up occasionally in movements like and quite often in science fiction & technology (eg Engelbart 1962); and of course many political or social or philosophical movements have explicitly wanted to change languages to affect thought (from Communism, memorably dramatized in , to feminism, post-modernisms, , or Ukrainian nationalists; I’ve noticed that this trend seems particularly pronounced in the mid-20th century).

This is sensible, since it seems clear that our language influences our thoughts (for better or worse), that there’s no particular reason to expect evolved languages like English to be all that great (they clearly are brutally irregular and inefficient in ways that have no possible utility, and comparisons of writing systems are equally clear that some systems like are just plain better), and if better notation and vocabulary can be so helpful in STEM areas, can constructed languages be helpful in general?

I would have to say that the answer appears to be no. First, the Sapir-Whorf hypothesis has largely panned out as trivialities: there are few or no important cross-cultural differences ascribable to languages’ grammars or vocabularies, only slight differences like emphasis in color perception or possibly discount rates[1]. Nor do we see huge differences between populations speaking radically different languages, or when they learn different natural languages—Chinese people might find it somewhat easier to think scientifically in English than in Mandarin, but it hardly seems to make a huge difference. Secondly, looking over the history of such attempts, there does not appear to be any noticeable gain from switching to E-Prime or Lojban etc. Movements like General Semantics have not demonstrated notable additional real-world success in their adherents (although I do think that practices such as E-Prime do improve philosophical writing). Attempts at general-purpose efficient languages have arguably been able to decrease the learning time for conlangs and provide modest educational advantages; specialized languages for particular fields like chemistry & physics & computer science (eg , , ) have continued to demonstrate their tremendous value; but constructed general-purpose languages have not made people either more rational or more intelligent.

“For ‘Tragedy’ [τραγωδία] and ‘Comedy’ [τρυγωδία] come to be out of the same letters.”

, quoted/paraphrased by Aristotle[2]

Why not? Thinking about it in an information-theoretic sense, all Turing-complete computational languages are equivalently expressive, because any program that can be written in one language can be written in another by first writing an interpreter for the other and then running the program; so with a constant length penalty (the interpreter), all languages are about equivalent. The constant penalty might be large & painful, though, as a language might be a Turing tarpit, where everything is difficult to say. Or from another information theory perspective, most bitstrings are incompressible, most theorems are unprovable, most programs are undecidable, etc; a compression or prediction algorithm only works effectively on a small subset of possible inputs (while working poorly or not at all on random inputs). So from that perspective, any language is on average just as good as any other—giving a no-free-lunch theorem. The goal of research, then, is to find algorithms which trade off performance on data that does not occur in the real world in exchange for performance on the kinds of regularities which do turn up in real-world data. From “John Wilkins’s Analytical Language” (Borges 1942):
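The incompressibility point is easy to demonstrate with an off-the-shelf compressor: a general-purpose algorithm like DEFLATE shrinks only inputs containing the regularities it was built for, and leaves random data essentially untouched. A minimal sketch in Python:

```python
import os
import zlib

# Most bitstrings are incompressible: a compressor can only shrink inputs
# containing the kinds of regularities it was designed to exploit.
random_data = os.urandom(100_000)               # high-entropy input
english_like = b"the quick brown fox " * 5_000  # highly regular input

compressed_random = zlib.compress(random_data, 9)
compressed_text = zlib.compress(english_like, 9)

# Random data does not compress (output slightly *larger*, due to framing
# overhead), while the repetitive text collapses dramatically.
print(len(compressed_random) / len(random_data))  # ~1.0
print(len(compressed_text) / len(english_like))   # well under 0.01
```

Averaged over all possible inputs, the compressor gains nothing; it wins only by betting on real-world structure, which is exactly the trade-off described above.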

In the universal language conceived by John Wilkins in the middle of the seventeenth century, each word defines itself. Descartes, in a letter dated November 1619, had already noted that, by using the decimal system of numeration, we could learn in a single day to name all quantities to infinity, and to write them in a new language, the language of numbers; he also proposed the creation of a similar, general language that would organize and contain all human thought. Around 1664, John Wilkins undertook that task.

He divided the universe into forty categories or classes, which were then subdivided into differences, and subdivided in turn into species. To each class he assigned a monosyllable of two letters; to each difference, a consonant; to each species, a vowel. For example, de means element; deb, the first of the elements, fire; deba, a portion of the element of fire, a flame.

Having defined Wilkins’ procedure, we must examine a problem that is impossible or difficult to postpone: the merit of the forty-part table on which the language is based. Let us consider the eighth category: stones. Wilkins divides them into common (flint, gravel, slate); moderate (marble, amber, coral); precious (pearl, opal); transparent (amethyst, sapphire); and insoluble (coal, fuller’s earth, and arsenic). The ninth category is almost as alarming as the eighth. It reveals that metals can be imperfect (, quicksilver); artificial (bronze, brass); recremental (filings, rust); and natural (gold, tin, copper). The whale appears in the sixteenth category: it is a viviparous, oblong fish. These ambiguities, redundancies, and deficiencies recall those attributed by Dr. Franz Kuhn to a certain Chinese encyclopedia called the Heavenly Emporium of Benevolent Knowledge. In its distant pages it is written that animals are divided into (a) those that belong to the emperor; (b) embalmed ones; (c) those that are trained; (d) suckling pigs; (e) mermaids; (f) fabulous ones; (g) stray dogs; (h) those that are included in this classification; (i) those that tremble as if they were mad; (j) innumerable ones; (k) those drawn with a very fine camel’s-hair brush; (l) et cetera; (m) those that have just broken the flower vase; (n) those that at a distance resemble flies.

The Bibliographical Institute of Brussels also exorcises chaos: it has parceled the universe into 1,000 subdivisions, of which number 262 corresponds to the Pope, number 282 to the Roman Catholic Church, number 263 to the Lord’s Day, number 268 to Sunday schools, number 298 to Mormonism, and number 294 to Brahmanism, Buddhism, Shintoism, and Taoism. Nor does it disdain the employment of heterogeneous subdivisions, for example, number 179: “Cruelty to animals. Protection of animals. Dueling and suicide from a moral point of view. Various vices and defects. Various virtues and qualities.”

I have noted the arbitrariness of Wilkins, the unknown (or apocryphal) Chinese encyclopedist, and the Bibliographical Institute of Brussels; obviously there is no classification of the universe that is not arbitrary and speculative. The reason is quite simple: we do not know what the universe is. “This world,” wrote David Hume, “was only the first rude essay of some infant deity who afterwards abandoned it, ashamed of his lame performance; it is the work only of some dependent, inferior deity, and is the object of derision to his superiors; it is the production of old age and dotage in some superannuated deity, and ever since his death has run on . . .” (Dialogues Concerning Natural Religion, V [1779]). We must go even further, and suspect that there is no universe in the organic, unifying sense of that ambitious word.

Attempts to derive knowledge from analysis of Sanskrit or gematria on Biblical passages fail because they have no relevant information content—who would put it there? An omniscient god? To assume they have information content and that analysis of such linguistic categories helps may let us spin our intellectual wheels industriously and produce all manner of ‘results’, but we will get nowhere in reality. (“The method of ‘postulating’ what we want has many advantages; they are the same as the advantages of theft over honest toil. Let us leave them to others and proceed with our honest toil.”)

So if all languages are equivalent, why do domain-specific languages work? Well, they work because they are inherently not general: they encode domain knowledge. John Stuart Mill, “Bentham”:

It is a sound maxim, and one which all close thinkers have felt, but which no one before Bentham ever so consistently applied, that error lurks in generalities: that the human mind is not capable of embracing a complex whole, until it has surveyed and catalogued the parts of which that whole is made up; that abstractions are not realities per se, but an abridged mode of expressing facts, and that the only practical mode of dealing with them is to trace them back to the facts (whether of experience or of consciousness) of which they are the expression.

Proceeding on this principle, Bentham makes short work with the ordinary modes of moral and political reasoning. These, it appeared to him, when hunted to their source, for the most part terminated in phrases. In politics, liberty, social order, constitution, law of nature, social compact, &c., were the catch-words: ethics had its analogous ones. Such were the arguments on which the gravest questions of morality and policy were made to turn; not reasons, but allusions to reasons; sacramental expressions, by which a summary appeal was made to some general sentiment of mankind, or to some maxim in familiar use, which might be true or not, but the limitations of which no one had ever critically examined.

One could write the equivalent of a regular expression in Brainfuck, but it would take a lot longer than writing a normal regular expression, because regular expressions privilege a small subset of possible kinds of text matches & transformations and are not nearly as general as a Brainfuck program could be. Similarly, any mathematical language privileges certain approaches and theorems, and this is why they are helpful: they assign short symbol sequences to common operations, while uncommon things become long & hard. As Whitehead put it (in a quote which is often cited as an exemplar of the idea that powerful general languages exist, but which, when quoted more fully, makes clear that Whitehead is discussing this idea of specialization as the advantage of better notation/languages):

By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and, in effect, increases the mental power of the race. Before the introduction of the Arabic notation, multiplication was difficult, and the division even of integers called into play the highest mathematical faculties. Probably nothing in the modern world would have more astonished a Greek mathematician than to learn that … a large proportion of the population of Western Europe could perform the operation of division for the largest numbers. This fact would have seemed to him a sheer impossibility … Our modern power of easy reckoning with decimal fractions is the almost miraculous result of the gradual discovery of a perfect notation. […] By the aid of symbolism, we can make transitions in reasoning almost mechanically, by the eye, which otherwise would call into play the higher faculties of the brain. […] It is a profoundly erroneous truism, repeated by all copy-books and by eminent people when they are making speeches, that we should cultivate the habit of thinking of what we are doing. The precise opposite is the case. Civilisation advances by extending the number of important operations which we can perform without thinking about them. Operations of thought are like cavalry charges in a battle—they are strictly limited in number, they require fresh horses, and must only be made at decisive moments.
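The earlier regular-expression comparison can be made concrete in a few lines of Python: the specialized language does in one line what the general language makes you build from scratch. (The Brainfuck interpreter below is a minimal sketch for illustration, not an optimized one.)

```python
import re

# The specialized language: one line to extract every run of digits.
print(re.findall(r"\d+", "a1b22c333"))  # ['1', '22', '333']

# The general language: even trivial text processing in Brainfuck requires
# programming the machinery (tape, pointer, loops) from scratch. A minimal
# interpreter shows how little the language gives you for free.
def brainfuck(code: str, stdin: bytes = b"") -> bytes:
    tape, ptr, pc, out, inp = [0] * 30_000, 0, 0, [], list(stdin)
    jumps, stack = {}, []
    for i, c in enumerate(code):        # pre-match the loop brackets
        if c == "[": stack.append(i)
        elif c == "]": j = stack.pop(); jumps[i], jumps[j] = j, i
    while pc < len(code):
        c = code[pc]
        if c == ">": ptr += 1
        elif c == "<": ptr -= 1
        elif c == "+": tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ".": out.append(tape[ptr])
        elif c == ",": tape[ptr] = inp.pop(0) if inp else 0
        elif c == "[" and tape[ptr] == 0: pc = jumps[pc]
        elif c == "]" and tape[ptr] != 0: pc = jumps[pc]
        pc += 1
    return bytes(out)

# Even 'echo the input' takes a loop: read a byte, print it, repeat.
print(brainfuck(",[.,]", b"hi"))  # b'hi'
```

The interpreter itself is the "constant penalty" from the Turing-equivalence argument: ~25 lines buy back all of Python's generality, but nothing about text matching is made short or easy.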

This makes sense historically, as few notations are dropped from the sky in finished form and only then assist major discoveries; rather, notations evolve in tandem with new ideas and discoveries, to better express what is later recognized as essential and drop what is less important. Early writing or numeral systems (like Sumerian ) were often peculiar hybrid systems mixing aspects of alphabets, syllabaries, and ideograms, sometimes simultaneously within the same system or in oddly broken ways. (A good guide to the wonderful and weird diversity of numeral/counting/arithmetic systems is Ifrah 2000’s The Universal History of Numbers: From Prehistory to the Invention of the Computer.) Could a syllabary as elegant as have been invented as the first written script? (Probably not.) Leibniz’s calculus notation evolved from its origins; syntax for logics developed considerably (the proofs of are much harder to read than contemporary notation, and presentations of logic are considerably aided by innovations such as ); the history of Maxwell’s equations is one of constant simplification & recasting into new notation/mathematics: they were first published in a far more baroque form of 20 equations involving quaternions, and slowly evolved, by Heaviside & others, into the familiar 4 differential equations written using hardly 23 characters (“Simplicity does not precede complexity, but follows it.”); programming languages have also evolved from their often-bizarre beginnings like to far more readable programming languages like Python & Haskell.[3] Even fiction genres are ‘technologies’ or ‘languages’ of tropes which must be developed: in discussing the early development of detective/mystery fiction, Moretti notes the apparent incomprehension of authors as to what a “clue” is, citing the example of how early authors “used them wrong: thus one detective, having deduced that ‘the drug is in the third cup of coffee’, proceeds to drink the coffee.” (emphasis in original; see Batuman 2005)

“Simplicity does not precede complexity, but follows it.”

Alan Perlis, 1982

“Mathematics is an experimental science, and definitions do not come first, but later on. They make themselves, when the nature of the subject has developed itself.”

Oliver Heaviside, 1893[4]

“Certainly ordinary language has no claim to be the last word, if there is such a thing. It embodies, indeed, something better than the metaphysics of the Stone Age, namely, as was said, the inherited experience and acumen of many generations of men. But then, that acumen has been concentrated primarily upon the practical business of life. If a distinction works well for practical purposes in ordinary life (no mean feat, for even ordinary life is full of hard cases), then there is sure to be something in it, it will not mark nothing: yet this is likely enough to be not the best way of arranging things if our interests are more extensive or intellectual than the ordinary. And again, that experience has been derived only from the sources available to ordinary men throughout most of civilised history: it has not been fed from the resources of the microscope and its successors. And it must be added too, that superstition and error and fantasy of all kinds do become incorporated in ordinary language and even sometimes stand up to the survival test (only, when they do, why should we not detect it?). Certainly, then, ordinary language is not the last word: in principle it can everywhere be supplemented and improved upon and superseded. Only remember, it is the first word.”

J. L. Austin, 1956

“The Platonists sense intuitively that ; the Aristotelians, that they are generalizations; for the former, language is nothing but a system of arbitrary symbols; for the latter, it is the map of the universe…Maurice de Wulf writes: ‘Ultra-realism garnered the first adherents. The chronicler (eleventh century) gives the name antiqui doctores to those who teach dialectics in re [of ]; speaks of it as an ‘antique doctrine,’ and until the end of the twelfth century, the name moderni is applied to its adversaries.’ A hypothesis that is now inconceivable seemed obvious in the ninth century, and lasted in some form into the fourteenth. Nominalism, once the novelty of a few, today encompasses everyone; its victory is so vast and fundamental that its name is useless. No one declares himself a nominalist because no one is anything else. Let us try to understand, nevertheless, that for the men of the Middle Ages the fundamental thing was not men but humanity, not individuals but the species, not the species but the genus, not the genera but God.”

Jorge Luis Borges, “From Allegories to Novels” (pg338–339, Selected Non-Fictions)

So the information embedded in the languages did not come from nowhere—the creation & evolution of languages reflects the hard-earned knowledge of practitioners & researchers about what common uses are and what should be made easy to say, and what abstract patterns can be generalized and named and made citizens in a language. The effectiveness of these languages, in usual practice and in assisting major discoveries, then comes from the notation reducing friction in executing normal research (eg one could multiply any Roman numerals, but it is much easier to multiply Arabic numerals), and from suggesting areas which logically follow from existing results by symmetry or combinations but which are currently gaps.
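The Roman-numeral example can be sketched in Python: because the notation encodes no place value, the practical route to a product is to convert into a positional representation, multiply there, and convert back. (A minimal sketch; the greedy parser assumes well-formed numerals.)

```python
# Positional (Arabic) notation makes multiplication a mechanical digit
# algorithm; Roman numerals encode no place value, so in practice one
# converts to a positional form, computes, and converts back.
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]

def to_roman(n: int) -> str:
    out = []
    for value, glyph in ROMAN:
        while n >= value:
            out.append(glyph)
            n -= value
    return "".join(out)

def from_roman(s: str) -> int:
    n, i = 0, 0
    for value, glyph in ROMAN:          # greedy match, largest value first
        while s.startswith(glyph, i):
            n += value
            i += len(glyph)
    return n

# XLVII × XXXIX: the notation itself offers no multiplication shortcut,
# so we route through positional arithmetic (47 × 39 = 1833).
product = from_roman("XLVII") * from_roman("XXXIX")
print(to_roman(product))  # MDCCCXXXIII
```

The asymmetry of the two converters is the point: the notation that encodes place value gets arithmetic nearly for free, while the one that does not must borrow it.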

To be effective, a general-purpose language would have to encode some knowledge or algorithm which offers considerable gains across many human domains but which does not otherwise affect the computational power of a human brain or do any external computation of its own. (Of course, individual humans differ greatly in their own intelligence and rationality due in large part to individual neurological & biological differences, and support apparatus such as computers can greatly extend the capabilities of a human—but those aren’t languages.) There doubtless are such pieces of knowledge, like the scientific method itself, but it would appear that humans learn them adequately without a language encoding them; on the other hand, there are countless extremely important pieces of knowledge for individual domains which would be worth encoding into a language, except that most people have no need to work in those small domains, so… we get jargon and domain-specific languages, but not General Semantics superheroes speaking a powerful conlang.

General-purpose languages would only encode general weak knowledge, such as (taking an AI perspective) properties like object-ness and a causally sparse world in which objects and agents can be meaningfully described with short descriptions & the intentional stance, and vocabulary encodes a humanly-relevant description of human life (eg the lexical hypothesis, in which regularities in human personality are diffusely encoded into thousands of words—if personality became a major interest, people would likely start using a much smaller & more refined set of personality words like the Big Five). As a specific domain develops domain-specific knowledge which would be valuable to have encoded into a language, it gradually drifts from the general-purpose language and, when the knowledge is encoded as words, becomes jargon-dense, and when it can be encoded in the syntax & grammar of symbols, a formal notation. (“Everything should be built top-down, except the first time.”) Much like how scientific fields fission as they develop.

The universal natural language serves as a ‘glue’ or ‘host’ language for communicating things not covered by specific fields and for combining results from domain-specific languages; it is a jack-of-all-trades: not good at anything in particular but, shorn of most accidental complexity like grammatical gender or totally randomized spelling (“Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it.”), about as good as any competitor at everything. (“Optimization hinders evolution.”) And to the extent that the domain-specific languages encode anything generally important into vocabulary, the host general-purpose language can try to steal the idea (shorn of its framework) as isolated vocabulary words.

So the general-purpose languages remain roughly equivalent in power, and are reasonably efficient, such that switching to conlangs does not offer a sufficient advantage.

This perspective explains why we see powerful special-purpose languages and weak general-purpose languages: the requirement of encoding important information forces general languages into becoming ever narrower if they want to be powerful.

It also is an interesting perspective to take on intellectual trends, ideological movements, and magical/religious thinking. Through history, many major philosophical/religious/scientific thinkers have been deeply fascinated by etymology, philology, and linguistics—indeed, to the point of basing major parts of their philosophies & ideas on their analysis of words. (It would be invidious to name specific examples.) Who has not seen an intellectual who, in discussing a topic, spends more time recounting the ‘intellectual history’ of it than the actual substance of the topic, and whose idea of ‘intellectual history’ appears to consist solely of recounting dictionary definitions, selective quotation, etymologies, nuances of long-discarded theories, endless proposals of unnecessary neologisms, and a ‘history’ or ‘evolution’ of ideas consisting solely of post hoc ergo propter hoc elevated to a fine art? One would think that so much time spent in philology, showing the sheer flexibility of most words (often mutating into their opposites) and the arbitrariness of their meanings and evolutions, would also show how little any etymology or prior use matters. What does a philosopher like Nietzsche think he is doing when—interested in contemporary moral philosophy and not history for the sake of history—he nevertheless spends many pages laying out speculations on German word etymologies and linking them to Christianity as a ‘genealogy of morals’, when Christianity had not the slightest thing to do with Germany for many centuries after its founding, all the people involved knew far less than we do (like a doctor consulting Galen for advice), and this is all extremely speculative even as a matter of history, much less actual moral philosophy?

These linguistically-based claims are frequently among the most bizarre & deeply wrong parts of their beliefs, and it would be reasonable to say that they have been blinded by a love of words—forgetting that a word is merely a word, and the map is not the territory. What are the problems there? At least one problem is the implicit belief that linguistic analysis can tell us anything more than some diffuse population beliefs about the world or the relatively narrow questions of formal linguistics (eg about history or language families)—the belief that linguistic analysis can reveal deep truths about the meaning of concepts, about the nature of reality, the gods, moral conduct, politics, etc; that we can get out of words far more than anyone ever put in. Since words & languages have evolved through a highly random natural process filled with arbitrariness, contingency, and constant mutation/decay, there is not much information in them; there is no one who has been putting information into them, and to the extent that anyone has, their deep thoughts & empirical evidence are better accessed through their explicit writings, and in any case are likely superseded. So any information a natural language encodes is impoverished, weak, and outdated—and unhelpful, if not actively harmful, to attempt to base any kind of reasoning on.

In the case of religious thinkers who, starting from that incorrect premise, believe in the divine authorship of the Bible/Koran/scriptures, or that Hebrew or Sanskrit encodes deep correspondences and governs the nature of the cosmos in every detail, the belief is at least internally defensible: if spoken/written by an omniscient being, it is reasonable that the god might have encoded arbitrarily much information in it, which careful study could unearth. In its crudest form, it is sympathetic magic, the hiding of ‘true names’, judging on word-sounds, and schizophrenia-like word-salad free-associative thinking (eg or the ). Secular thinkers have no such defense for being blinded by words, but the same mystical belief in the deep causal powers of spelling and word use and word choice is pervasive.

See Also


  1. All of which is dubious on causal grounds: does the language actually cause such cross-cultural differences, or are such correlates merely again?↩︎

  2. Book 1, ; for defense of the interpretation that this is wordplay & not merely a generic observation about alphabetic writing, see .↩︎

  3. An interesting example of concise programming languages—and one also inspired by Maxwell’s equations—comes from the STEPS research project (), an effort to write a full operating system with a GUI desktop environment, sound, text editor, Internet-capable networking & web browser etc in ~20,000 lines of code. This is accomplished, in part, by careful specification of a core, and then building on it with layers of domain-specific languages. For example, to define TCP/IP networking code, which involves a lot of low-level binary bit-twiddling and rigidly-defined formats and typically takes at least 20,000 lines of code all on its own, STEPS uses the ASCII art diagrams from the RFC specifications as a formal specification of how to do network parsing. It also makes tradeoffs of computing power versus lines of code: a normal computer operating system or library might spend many lines of code dealing with the fiddly details of how precisely to lay out individual characters of text on the screen in the fastest & most correct possible way, whereas STEPS defines some rules and uses a generic optimization algorithm to iteratively position text (or anything else) correctly.↩︎
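  The same move appears in miniature in ordinary programming: a declarative format string (here Python’s struct module; the field layout below is a simplified, hypothetical packet header, not the real TCP layout) replaces hand-written byte-shuffling with a dozen characters of domain-specific notation:

  ```python
  import struct

  # A domain-specific notation for binary layouts: the format string declares
  # endianness and field widths that would otherwise be hand-coded shifts and
  # masks. (Hypothetical header for illustration: two 16-bit ports, two 32-bit
  # sequence numbers, two 8-bit fields, one 16-bit window.)
  FMT = "!HHIIBBH"

  packet = struct.pack(FMT, 80, 443, 100, 200, 5, 0x02, 65_535)
  fields = struct.unpack(FMT, packet)
  print(struct.calcsize(FMT), fields)  # 16 (80, 443, 100, 200, 5, 2, 65535)
  ```

  As with the STEPS RFC diagrams, the format string is only powerful because it is narrow: it knows nothing except fixed-width binary records, and encodes that domain knowledge compactly.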

  4. Heaviside 1893, “On Operators in Physical Mathematics, Part II” (pg121)↩︎