Why Tool AIs Want to Be Agent AIs

AIs limited to pure computation (Tool AIs) supporting humans, will be less intelligent, efficient, and economically valuable than more autonomous reinforcement-learning AIs (Agent AIs) who act on their own and meta-learn, because all problems are reinforcement-learning problems.
decision-theory, statistics, NN, computer-science, transhumanism, AI, Bayes, insight-porn
2016-09-07–2018-08-28 finished certainty: likely importance: 9

Autonomous AI systems (Agent AIs) trained using reinforcement learning can do harm when they take wrong actions, especially superintelligent Agent AIs. One solution would be to eliminate their agency by not giving AIs the ability to take actions, confining them to purely informational or inferential tasks such as classification or prediction (Tool AIs), and have all actions be approved & executed by humans, giving equivalently superintelligent results without the risk.

I argue that this is not an effective solution for two major reasons. First, because Agent AIs will by definition be better at actions than Tool AIs, giving an economic advantage. Secondly, because Agent AIs will be better at inference & learning than Tool AIs, and this is inherently due to their greater agency: the same algorithms which learn how to perform actions can be used to select important datapoints to learn inference over, how long to learn, how to more efficiently execute inference, how to design themselves, how to optimize hyperparameters, how to make use of external resources such as long-term memories or external software or large databases or the Internet, and how best to acquire new data. All of these actions will result in Agent AIs more intelligent than Tool AIs, in addition to their greater economic competitiveness. Thus, Tool AIs will be inferior to Agent AIs in both actions and intelligence, implying use of Tool AIs is an even more highly unstable equilibrium than previously argued, as users of Agent AIs will be able to outcompete them on two dimensions (and not just one).

One proposed solution to AI risk is to suggest that AIs could be limited purely to supervised/unsupervised learning, and not given access to any sort of capability that can directly affect the outside world such as robotic arms. In this framework, AIs are treated purely as mathematical functions mapping data to an output such as a classification probability, similar to a logistic or linear model but far more complex; most deep learning neural networks like ImageNet image classification convolutional neural networks (CNNs) would qualify. The gains from AI then come from training the AI and then asking it many questions which humans then review & implement in the real world as desired. So an AI might be trained on a large dataset of chemical structures labeled by whether they turned out to be a useful drug in humans and asked to classify new chemical structures as useful or non-useful; then doctors would run the actual medical trials on the drug candidates and decide whether to use them in patients etc. Or an AI might look like Google Maps/Waze: it answers your questions about how best to drive places better than any human could, but it does not control any traffic lights country-wide to optimize traffic flows nor will it run a self-driving car to get you there. This theoretically avoids any possible runaway of AIs into malignant or uncaring actors who harm humanity by satisfying dangerous utility functions and developing instrumental drives. After all, if they can’t take any actions, how can they do anything that humans do not approve of?

Two vari­a­tions on this lim­it­ing or box­ing theme are

  1. Oracle AI: Nick Bostrom, in Superintelligence (pg145–158), notes that while they can be easily ‘boxed’ and in some cases like P/NP problems the answers can be cheaply checked or random subsets expensively verified, there are several issues with oracle AIs:

    • the AI’s de­fi­n­i­tion of ‘re­sources’ or ‘stay­ing in­side the box’ can change as it learns more about the world (on­to­log­i­cal crises)
    • responses might manipulate users into asking easy (and useless) problems
    • mak­ing changes in the world can make it eas­ier to an­swer ques­tions about, by sim­pli­fy­ing or con­trol­ling it (“All processes that are sta­ble we shall pre­dict. All processes that are un­sta­ble we shall con­trol.”)
    • even a successfully boxed and safe oracle or tool AI can be misused[1]
  2. Tool AI (the term, as “tool mode” or “tool AGI”, was coined by Holden Karnof­sky in a July 2011 dis­cus­sion & elab­o­rated on in a May 2013 es­say, but the idea has prob­a­bly been pro­posed be­fore). To quote Karnof­sky:

    Google Map­s—by which I mean the com­plete soft­ware pack­age in­clud­ing the dis­play of the map it­self—­does not have a “util­ity” that it seeks to max­i­mize. (One could fit a util­ity func­tion to its ac­tions, as to any set of ac­tions, but there is no sin­gle “pa­ra­me­ter to be max­i­mized” dri­ving its op­er­a­tions.)

    Google Maps (as I un­der­stand it) con­sid­ers mul­ti­ple pos­si­ble routes, gives each a score based on fac­tors such as dis­tance and likely traffic, and then dis­plays the best-s­cor­ing route in a way that makes it eas­ily un­der­stood by the user. If I don’t like the route, for what­ever rea­son, I can change some pa­ra­me­ters and con­sider a differ­ent route. If I like the route, I can print it out or email it to a friend or send it to my phone’s nav­i­ga­tion ap­pli­ca­tion. Google Maps has no sin­gle pa­ra­me­ter it is try­ing to max­i­mize; it has no rea­son to try to “trick” me in or­der to in­crease its util­i­ty. In short, Google Maps is not an agent, tak­ing ac­tions in or­der to max­i­mize a util­ity pa­ra­me­ter. It is a tool, gen­er­at­ing in­for­ma­tion and then dis­play­ing it in a user-friendly man­ner for me to con­sid­er, use and ex­port or dis­card as I wish.

    Every soft­ware ap­pli­ca­tion I know of seems to work es­sen­tially the same way, in­clud­ing those that in­volve (spe­cial­ized) ar­ti­fi­cial in­tel­li­gence such as Google Search, Siri, Wat­son, Ry­bka, etc. Some can be put into an “agent mode” (as Wat­son was on Jeop­ardy) but all can eas­ily be set up to be used as “tools” (for ex­am­ple, Wat­son can sim­ply dis­play its top can­di­date an­swers to a ques­tion, with the score for each, with­out speak­ing any of them.)…Tool-AGI is not “trapped” and it is not Un­friendly or Friend­ly; it has no mo­ti­va­tions and no dri­ving util­ity func­tion of any kind, just like Google Maps. It scores differ­ent pos­si­bil­i­ties and dis­plays its con­clu­sions in a trans­par­ent and user-friendly man­ner, as its in­struc­tions say to do; it does not have an over­ar­ch­ing “want,” and so, as with the spe­cial­ized AIs de­scribed above, while it may some­times “mis­in­ter­pret” a ques­tion (thereby scor­ing op­tions poorly and rank­ing the wrong one #1) there is no rea­son to ex­pect in­ten­tional trick­ery or ma­nip­u­la­tion when it comes to dis­play­ing its re­sults.

    …An­other way of putting this is that a “tool” has an un­der­ly­ing in­struc­tion set that con­cep­tu­ally looks like: “(1) Cal­cu­late which ac­tion A would max­i­mize pa­ra­me­ter P, based on ex­ist­ing data set D. (2) Sum­ma­rize this cal­cu­la­tion in a user-friendly man­ner, in­clud­ing what Ac­tion A is, what likely in­ter­me­di­ate out­comes it would cause, what other ac­tions would re­sult in high val­ues of P, etc.” An “agent,” by con­trast, has an un­der­ly­ing in­struc­tion set that con­cep­tu­ally looks like: “(1) Cal­cu­late which ac­tion, A, would max­i­mize pa­ra­me­ter P, based on ex­ist­ing data set D. (2) Ex­e­cute Ac­tion A.” In any AI where (1) is sep­a­ra­ble (by the pro­gram­mers) as a dis­tinct step, (2) can be set to the “tool” ver­sion rather than the “agent” ver­sion, and this sep­a­ra­bil­ity is in fact present with most/all mod­ern soft­ware. Note that in the “tool” ver­sion, nei­ther step (1) nor step (2) (nor the com­bi­na­tion) con­sti­tutes an in­struc­tion to max­i­mize a pa­ra­me­ter—to de­scribe a pro­gram of this kind as “want­ing” some­thing is a cat­e­gory er­ror, and there is no rea­son to ex­pect its step (2) to be de­cep­tive…This is im­por­tant be­cause an AGI run­ning in tool mode could be ex­tra­or­di­nar­ily use­ful but far more safe than an AGI run­ning in agent mode. In fact, if de­vel­op­ing “Friendly AI” is what we seek, a tool-AGI could likely be help­ful enough in think­ing through this prob­lem as to ren­der any pre­vi­ous work on “Friend­li­ness the­ory” moot.

    …Is a tool-AGI pos­si­ble? I be­lieve that it is, and fur­ther­more that it ought to be our de­fault pic­ture of how AGI will work

    There are sim­i­lar gen­eral is­sues with Tool AIs as with Or­a­cle AIs:

    • a human checking each result is no guarantee of safety; even Homer nods. An extremely dangerous or subtly dangerous answer might slip through; Stuart Armstrong notes that the summary may simply not mention the important (to humans) downside to a suggestion, or frame it in the most attractive light possible. The more a Tool AI is used, or trusted by users, the less checking will be done of its answers before the user mindlessly implements them.
    • an intelligent, never mind superintelligent, Tool AI will have built-in search processes and planners which may be quite intelligent themselves, and in ‘planning how to plan’, may discover dangerous instrumental drives which the sub-planning process then executes.[2] (This struck me as mostly theoretical until I saw how well language models could roleplay & imitate agents purely by offline self-supervised prediction on large text databases; imitation learning is (batch) reinforcement learning too!)
    • de­vel­op­ing a Tool AI in the first place might re­quire an­other AI, which it­self is dan­ger­ous
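Karnofsky's two instruction sets quoted above differ only in step (2), and a minimal Python sketch can make that concrete. Everything named here (`rank_actions`, `tool_mode`, `agent_mode`, the routes, and the travel times) is a hypothetical illustration, not a description of how Google Maps or any real system works.

```python
from typing import Callable, List, Tuple

def rank_actions(actions: List[str], score: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Step (1): score every candidate action and sort best-first."""
    return sorted(((a, score(a)) for a in actions), key=lambda pair: pair[1], reverse=True)

def tool_mode(actions: List[str], score: Callable[[str], float]) -> List[str]:
    """Step (2), tool version: only report the ranking; a human decides what to do."""
    return [f"{a}: {s:.1f}" for a, s in rank_actions(actions, score)]

def agent_mode(actions: List[str], score: Callable[[str], float],
               execute: Callable[[str], None]) -> str:
    """Step (2), agent version: execute the top-ranked action directly."""
    best, _ = rank_actions(actions, score)[0]
    execute(best)
    return best

# Hypothetical routes, scored by negative travel time in minutes.
travel_minutes = {"highway": 25.0, "side streets": 40.0, "toll road": 20.0}
score = lambda route: -travel_minutes[route]
routes = list(travel_minutes)

report = tool_mode(routes, score)                   # a list of strings for a human to read
driven: List[str] = []                              # stands in for 'the outside world'
chosen = agent_mode(routes, score, driven.append)   # the same ranking, but acted upon
```

The ranking step is shared; the only difference is whether its result is displayed or passed to `execute`, which is why the switch from tool to agent is so cheap to make.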

Or­a­cle AIs re­main mostly hy­po­thet­i­cal be­cause it’s un­clear how to write such util­ity func­tions. The sec­ond ap­proach, Tool AI, is just an ex­trap­o­la­tion of cur­rent sys­tems but has two ma­jor prob­lems aside from the al­ready iden­ti­fied ones which cast doubt on Karnof­sky’s claims that Tool AIs would be “ex­tra­or­di­nar­ily use­ful” & that we should ex­pect fu­ture AGIs to re­sem­ble Tool AIs rather than Agent AIs.


First and most commonly pointed out, agent AIs are more economically competitive as they can replace tool AIs (as in the case of YouTube’s upgrade[3]) or ‘humans in the loop’.[4] In any sort of process, Amdahl’s law notes that as steps get optimized, the optimization does less and less as the output becomes dominated by the slowest step: if a step only takes 10% of the time or resources, then even infinite optimization of that step down to zero time/resources means that the total time/resources will fall by no more than 10%. So if a human overseeing a, say, high-frequency trading (HFT) algorithm accounts for 50% of the latency in decisions, then the HFT algorithm will never run more than twice as fast as it does now, which is a crippling disadvantage. (Hence, the debacle is not too surprising: no profitable HFT firm could afford to put too many humans into its loops, so when something does go wrong, it can be difficult for humans to figure out the problem & intervene before the losses mount.) As the AI gets better, the gain from replacing the human increases greatly, and may well justify replacing them with an AI inferior in many other respects but superior in some key aspect like cost or speed.
This could also apply to error rates: in airline accidents, human error now causes the overwhelming majority of accidents due to their presence as overseers of the autopilot, and it’s unclear that a human pilot represents a net safety gain. In ‘advanced chess’, grandmasters initially chose most moves and used the chess AI for checking for tactical errors and blunders; this transitioned through the late ’90s and early ’00s to human players (not even grandmasters) turning over most playing to the chess AI but contributing a great deal of win performance by picking & choosing which of several AI-suggested moves to use. As the chess AIs improved, at some point around 2007 victories increasingly came from the humans making mistakes which the opposing chess AI could exploit, even mistakes as trivial as ‘misclicks’ (on the computer screen), and now in advanced chess, human contribution has decreased to largely preparing the chess AIs’ opening books & looking for novel opening moves which their chess AI can be better prepared for.
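The bottleneck arithmetic in the paragraph above (usually stated as Amdahl's law) can be checked with a short sketch. The function name `amdahl_speedup` and its parameters are illustrative choices, and the 10% and 50% figures are simply the ones used in the text.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when a step taking `fraction` of total time is made `factor` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)

# A step that is 10% of the process: even an infinitely fast version of that step
# removes at most 10% of the total time (overall speedup is capped at 1/0.9).
small_step = amdahl_speedup(0.10, float("inf"))

# The algorithm is 50% of decision latency and the human overseer the other 50%:
# however fast the algorithm gets, the whole loop is at most twice as fast.
with_human = amdahl_speedup(0.50, float("inf"))
```

Read the other way around, the cap only lifts when the human's share of the latency is itself removed, which is the economic pressure the paragraph describes.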

At some point, there is not much point to keep­ing the hu­man in the loop at all since they have lit­tle abil­ity to check the AI choices and be­come ‘deskilled’ (think GPS di­rec­tions), cor­rect­ing less than they screw up and demon­strat­ing that tool­ness is no guar­an­tee of safety nor re­spon­si­ble use. (Hence the old joke: “the fac­tory of the fu­ture will be run by a man and a dog; the dog will be there to keep the man away from the fac­tory con­trols.”) For a suc­cess­ful au­tonomous pro­gram, just keep­ing up with growth alone makes it diffi­cult to keep hu­mans in the loop; the US drone war­fare pro­gram has be­come such a cen­tral tool of US war­fare that the US Air Force finds it ex­tremely diffi­cult to hire & re­tain enough hu­man pi­lots over­see­ing its drones, and there are in­di­ca­tions that op­er­a­tional pres­sures are slowly erod­ing the hu­man con­trol & turn­ing them into rub­ber­stamps, and for all its protes­ta­tions that it would al­ways keep a hu­man in the de­ci­sion-mak­ing loop, the Pen­ta­gon is, un­sur­pris­ing­ly, in­evitably, slid­ing to­wards fully au­tonomous drone war­fare as the next tech­no­log­i­cal step to main­tain mil­i­tary su­pe­ri­or­ity over Rus­sia & Chi­na. (See “Meet The New Mav­er­icks: An In­side Look At Amer­i­ca’s Drone Train­ing Pro­gram”; “Fu­ture is as­sured for death-deal­ing, life-sav­ing drones”; “Sam Alt­man’s Man­i­fest Des­tiny”; “The Pen­tagon’s ‘Ter­mi­na­tor Co­nun­drum’: Ro­bots That Could Kill on Their Own”; “At­tack of the Killer Ro­bots”)

Fun­da­men­tal­ly, au­tonomous agent AIs are what we and the free mar­ket want; every­thing else is a sur­ro­gate or ir­rel­e­vant loss func­tion. We don’t want low log-loss er­ror on Im­a­geNet, we want to re­find a par­tic­u­lar per­sonal pho­to; we don’t want ex­cel­lent ad­vice on which stock to buy for a few mi­crosec­onds, we want a money pump spit­ting cash at us; we don’t want a drone to tell us where Osama bin Laden was an hour ago (but not now), we want to have killed him on sight; we don’t want good ad­vice from Google Maps about what route to drive to our des­ti­na­tion, we want to be at our des­ti­na­tion with­out do­ing any dri­ving etc. Idio­syn­cratic sit­u­a­tions, le­gal reg­u­la­tion, fears of tail risks from very bad sit­u­a­tions, wor­ries about cor­re­lated or sys­tem­atic fail­ures (like hack­ing a drone fleet), and so on may slow or stop the adop­tion of Agent AIs—but the pres­sure will al­ways be there.

So for this reason alone, we expect Agent AIs to be systematically preferred over Tool AIs unless they’re considerably worse.


Agent AIs will be chosen over Tool AIs for the reasons above. But Tool AIs, aside from not being what anyone wants and being something that will be severely penalized by free markets or by there simply being multiple agents choosing whether to use a Tool AI or an Agent AI in any kind of competitive scenario, also suffer from the problem that the best Tool AI’s performance/intelligence will be equal to or worse than the best Agent AI’s, probably worse, and possibly much worse. Bostrom notes that “Such ‘creative’ [dangerous] plans come into view when the [Tool AI] software’s cognitive abilities reach a sufficiently high level.”; we might reverse this to say that to make the Tool AI reach a sufficiently high level, we must put such creativity in view. (A linear model may be extremely safe & predictable, but it would be hopeless to expect everyone to use them instead of neural networks.)

An Agent AI clearly ben­e­fits from be­ing a bet­ter Tool AI, so it can bet­ter un­der­stand its en­vi­ron­ment & in­puts; but less in­tu­itive­ly, any Tool AI ben­e­fits from agen­ti­ness. An Agent AI has the po­ten­tial, often re­al­ized in prac­tice, to out­per­form any Tool AI: it can get bet­ter re­sults with less com­pu­ta­tion, less data, less man­ual de­sign, less post-pro­cess­ing of its out­puts, on harder do­mains.

(Triv­ial proof: Agent AIs are su­per­sets of Tool AIs—an Agent AI, by not tak­ing any ac­tions be­sides com­mu­ni­ca­tion or ran­dom choice, can re­duce it­self to a Tool AI; so in cases where ac­tions are un­help­ful, it per­forms the same as the Tool AI, and when ac­tions can help, it can per­form bet­ter; hence, an Agent AI can al­ways match or ex­ceed a Tool AI. At least, as­sum­ing suffi­cient data that in the en­vi­ron­ments where ac­tions are not help­ful, it can learn to stop act­ing, and in the ones where they are, it has a dis­tant enough hori­zon to pay for the ex­plo­ration. Of course, you might agree with this but sim­ply be­lieve that in­tel­li­gence-wise, Agent AIs == Tool AIs.)

Every suffi­ciently hard prob­lem is a re­in­force­ment learn­ing prob­lem.

More se­ri­ous­ly, not all data is cre­ated equal. Not all data points are equally valu­able to learn from, re­quire equal amounts of com­pu­ta­tion, should be treated iden­ti­cal­ly, should in­spire iden­ti­cal fol­lowup data sam­pling, or ac­tions. In­fer­ence and learn­ing can be much more effi­cient if the al­go­rithm can choose how to com­pute on what data with which ac­tions.

There is no hard Cartesian boundary such that control of the environment is irrelevant to the algorithm and vice-versa and its computation can be carried out without regard to the environment: there are simply many layers between the core of the algorithm and the furthest part of the environment, and the more layers that the algorithm can model & control, the more it can do. Consider Google Maps/Waze.[5] On the surface they are ‘merely’ Tool AIs which produce lists of possible routes which would optimize certain requirements; but the entire point of such Tool AIs (and all large-scale Tool AIs) is that countless drivers will act on them (what’s the point of getting driving directions if you don’t then drive?), and this will greatly change traffic patterns as drivers become appendages of the ‘Tool’ AI, potentially making driving in an area much worse by their errors or myopic per-driver optimization (and far from being a theoretical curiosity, GPS, Google Maps, and Waze are regularly accused of that in many places, especially Los Angeles).

This is a highly general point which can be applied on many levels. This point often arises in classical statistics/decision theory, where adaptive techniques can greatly outperform fixed-sample techniques for both inference and actions/losses: numerical integration can be done adaptively; a sequential trial testing a hypothesis can often terminate after a fraction of the equivalent fixed-sample trial’s sample size (and/or loss); an adaptive multi-armed bandit will have much lower regret than any non-adaptive solution, but it will also be inferentially better at estimating which arm is best and what the performance of that arm is (see the ‘best-arm problem’: Audibert et al 2010, Gabillon et al 2011, Mellor 2014, Jamieson & Nowak 2014); and an adaptive experiment design can reduce total variance by a constant factor (gains of 50% or more are possible compared to naive designs like even allocation; McClelland 1997) by focusing on unexpectedly difficult-to-estimate arms (while a fixed-sample trial can be seen as ideal for when one values precise estimates of all arms equally and they have equal variance, which is usually not the case); even a blocking design rather than simple randomization can be seen as reflecting this benefit (avoiding the potential for imbalance in allocation across arms by deciding in advance the sequence of ‘actions’ taken in collecting samples). Another example comes from ‘the power of two choices’ in queueing and load balancing, where selecting the best of 2 possible queues to wait in rather than selecting 1 queue at random improves the expected maximum delay from O(log n / log log n) to O(log log n) instead (and interestingly, almost all the gain comes from being able to make any choice at all, going from 1 to 2; choosing from 3 or more queues adds only some constant-factor gains).
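The queue example above is easy to simulate, and doing so shows how much a single bit of choice buys. This is a minimal sketch under assumed conditions: n customers arrive one at a time at n initially empty queues, nobody is served during the simulation, and the function name `max_load`, the seed, and n are arbitrary illustrative choices.

```python
import random

def max_load(n: int, choices: int, rng: random.Random) -> int:
    """Send n customers to n queues; each joins the shortest of `choices` randomly sampled queues."""
    queues = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(choices)]
        shortest = min(candidates, key=lambda i: queues[i])
        queues[shortest] += 1
    return max(queues)   # length of the longest queue at the end

rng = random.Random(0)
n = 20_000
one_choice = max_load(n, 1, rng)     # no agency: join a random queue
two_choices = max_load(n, 2, rng)    # minimal agency: look at 2 queues, pick the shorter
three_choices = max_load(n, 3, rng)  # more options, but little further improvement
```

Printing the three values should show the large drop going from 1 to 2 choices and only a small change, if any, from 2 to 3.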

The wide variety of uses of action is a major theme in recent work in AI (specifically, deep learning/neural networks) research and increasingly key to achieving the best performance on inferential tasks as well as reinforcement learning/optimization/agent-y tasks. Although these advantages apply to most AI paradigms, because of the power and wide variety of tasks NNs get applied to, and sophisticated architectures, we can see the pervasive advantage of agentiness much more clearly than in narrower contexts like biostatistics.

Actions for intelligence

Rough­ly, we can try to cat­e­go­rize the differ­ent kinds of agen­ti­ness by the ‘level’ of the NN they work on. There are:

  1. ac­tions in­ter­nal to a com­pu­ta­tion:

    • in­puts
    • in­ter­me­di­ate states
    • ac­cess­ing the ex­ter­nal ‘en­vi­ron­ment’
    • amount of com­pu­ta­tion
    • en­forc­ing con­straints/fine­tun­ing qual­ity of out­put
    • chang­ing the loss func­tion ap­plied to out­put
  2. ac­tions in­ter­nal to train­ing the NN:

    • the gra­di­ent it­self
    • size & di­rec­tion of gra­di­ent de­scent steps on each pa­ra­me­ter
    • over­all gra­di­ent de­scent learn­ing rate and learn­ing rate sched­ule
    • choice of data sam­ples to train on
  3. in­ter­nal to the dataset

    • ac­tive learn­ing
    • op­ti­mal ex­per­i­ment de­sign
  4. in­ter­nal to the NN de­sign step

    • hy­per­pa­ra­me­ter op­ti­miza­tion
    • NN ar­chi­tec­ture
  5. in­ter­nal to in­ter­ac­tion with en­vi­ron­ment

    • adap­tive ex­per­i­ment / mul­ti­-armed ban­dit / ex­plo­ration for re­in­force­ment learn­ing

Actions internal to a computation

In­side a spe­cific NN, while com­put­ing the out­put for an in­put ques­tion, a NN can make choices about how to han­dle it.

It can choose what parts of the input to run most of its computations on, while throwing away or computing less on other parts of the input, which are less relevant to the output, using “attention mechanisms” (eg Olah & Carter 2016, Bellver et al 2016, Xu 2015, Larochelle & Hinton 2010, Mnih et al 2014, Kaiser & Bengio 2016). Attention mechanisms are responsible for many increases in performance, but especially improvements in RNNs’ ability to do sequence-to-sequence translation by revisiting important parts of the sequence, image generation and captioning, and in CNNs’ ability to recognize images by focusing on ambiguous or small parts of the image, even for adversarial examples. They are a major trend in deep learning, as it is often the case that some parts of the input are more important than others, and they enable both global & local operations to be learned, with increasingly too many examples of attention to list (with a trend as of 2018 towards using attention as the major or only construct).
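To make ‘choosing what parts of the input to compute on’ concrete, here is a minimal sketch of soft attention in plain Python, using the scaled dot-product form popularized by the Transformer. The toy query, keys, and values are invented for illustration; real attention layers learn these vectors and operate on batched tensors.

```python
import math
from typing import List, Tuple

def softmax(scores: List[float]) -> List[float]:
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)                                  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query: List[float], keys: List[List[float]],
           values: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Weight each value by how well its key matches the query; return (output, weights)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights

# Three input positions; the query matches the first and third keys, not the second.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]
output, weights = attend([4.0, 0.0], keys, values)
```

The second position receives only a small weight, so most of the output is computed from the two positions the query ‘looked at’.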

Many designs can be interpreted as using attention. The bidirectional RNN also often used in natural language translation doesn’t explicitly use attention mechanisms but is believed to help by giving the RNN a second look at the sequence. Indeed, so universal that it often goes without mention is that the LSTM/GRU mechanism which improves almost all RNNs is itself a kind of attention mechanism: the LSTM cells learn which parts of the hidden state/history are important and should be kept, and whether and when the memories should be forgotten and fresh memories loaded into the LSTM cells. While LSTM RNNs are the default for sequence tasks, they have occasionally been beaten by feedforward neural networks using internal attention or “self-attention”, like the Transformer architecture (eg Vaswani et al 2017).

Extending attention, a NN can choose not just which parts of an input to look at multiple times, but also how long to keep computing on it, “adaptive computation” (eg Silver et al 2016b, Teerapittayanon et al 2017): so it iteratively spends more computation on hard parts of a problem within a given computational budget.[6] Neural ODEs are an interesting example of models which are sort of like adaptive RNNs in that they can be run repeatedly by the ODE solver, adaptively, to refine their output to a target accuracy, and the ODE solver can be considered a kind of agent as well.
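One simple form of adaptive computation is an ‘early exit’: run a cheap stage first and only pay for a more expensive stage when the cheap one is unsure. The sketch below is a toy illustration under that assumption; `cheap_stage` and `costly_stage` are hypothetical stand-ins for shallow and deep parts of a network, and the 0.9 confidence threshold is arbitrary.

```python
from typing import Callable, List, Tuple

Stage = Callable[[float], Tuple[str, float]]   # input -> (label, confidence)

def cheap_stage(x: float) -> Tuple[str, float]:
    """Hypothetical fast stage: confident only when x is far from the decision boundary at 0."""
    return ("positive" if x > 0 else "negative", min(1.0, abs(x)))

def costly_stage(x: float) -> Tuple[str, float]:
    """Hypothetical slow stage: stands in for a deeper network that is always confident."""
    return ("positive" if x > 0 else "negative", 1.0)

def early_exit(x: float, stages: List[Stage], threshold: float = 0.9) -> Tuple[str, int]:
    """Run stages in order, stopping at the first whose confidence clears the threshold."""
    label, used = "", 0
    for stage in stages:
        used += 1
        label, confidence = stage(x)
        if confidence >= threshold:
            break
    return label, used   # `used` is the amount of computation actually spent

stages = [cheap_stage, costly_stage]
easy = early_exit(5.0, stages)   # far from the boundary: the cheap stage suffices
hard = early_exit(0.1, stages)   # near the boundary: falls through to the costly stage
```

The decision of when to stop is itself an action taken by the model, which is the sense in which even a pure classifier benefits from a little agency.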

Attention generally doesn’t change the nature of the computation aside from the necessity of actions over the input, but actions can be used to bring in different computing paradigms. For example, the entire field of “differentiable neural computer”/“neural Turing machines” or “neural stack machines” or “neural GPUs” or most designs with some sort of scalable external memory mechanism larger than LSTMs depends on figuring out a clever way to backpropagate through the action of memory accesses or using reinforcement learning techniques like REINFORCE for training the non-differentiable actions. And such a memory is like a database which is constructed on the fly per-problem, so it’ll help with database queries & information retrieval & knowledge graphs. An intriguing variant on this idea of ‘querying’ resources is mixture-of-experts NN architectures (Shazeer et al 2016). Jeff Dean (Google Brain) asks where we should use RL techniques in our OSes, networks, and computations these days, and answers that RL should be used for: program placement on servers (Mirhoseini et al 2018), Bloom filters for databases, search query candidates, compiler settings, quantum computer control, datacenter & server cooling controllers… Dean asks “Where Else Could We Use Learning?”, and replies:

Any­where We’re Us­ing Heuris­tics To Make a De­ci­sion!

  • Com­pil­ers: in­struc­tion sched­ul­ing, reg­is­ter al­lo­ca­tion, loop nest par­al­leliza­tion strate­gies, …
  • Net­work­ing: TCP win­dow size de­ci­sions, back­off for re­trans­mits, data com­pres­sion, …
  • Operating systems: process scheduling, buffer cache insertion/replacement [eg Lagar-Cavilla et al 2019], file system prefetching [or memory allocation], …
  • Job scheduling systems: which tasks/VMs to co-locate on same machine, which tasks to pre-empt, …
  • ASIC design: test case selection, …

Any­where We’ve Punted to a User-Tun­able Per­for­mance Op­tion! Many pro­grams have huge num­bers of tun­able com­mand-line flags, usu­ally not changed from their de­faults (--eventmanager_threads=16 --bigtable_scheduler_batch_size=8 --mapreduce_merge_memory=134217728 --lexicon_cache_size=1048576 --storage_server_rpc_freelist_size=128 …)

Meta-learn every­thing. ML:

  • learn­ing place­ment de­ci­sions
  • learn­ing fast ker­nel im­ple­men­ta­tions
  • learn­ing op­ti­miza­tion up­date rules
  • learn­ing in­put pre­pro­cess­ing pipeline steps
  • learn­ing ac­ti­va­tion func­tions
  • learn­ing model ar­chi­tec­tures for spe­cific de­vice types, or that are fast for in­fer­ence on mo­bile de­vice X, learn­ing which pre-trained com­po­nents to reuse, …

Com­puter ar­chi­tec­ture/­dat­a­cen­ter net­work­ing de­sign:

  • learning best design properties by exploring design space automatically (via simulator)

Finally, one interesting variant on this theme is treating an inferential or generative problem as a reinforcement learning problem in a sort of environment with global rewards. Many times the standard loss function is inapplicable, or the important things are global, or the task is not really well-defined enough (in an “I know it when I see it” sense for the human) to nail down as a simple differentiable loss with predefined labels such as in an image classification problem; in these cases, one cannot do standard supervised training to minimize the loss but must start using reinforcement learning to directly optimize a reward, treating outputs such as classification labels as ‘actions’ which may eventually result in a reward. For example, in a char-RNN generative text model trained by predicting a character conditional on the previous, one can generate reasonable text samples by picking the most likely next character and occasionally a less likely character for diversity, but one can generate higher quality samples by exploring longer sequences with beam search or nucleus sampling, and one can improve generation further by adding utility functions for global properties & applying RL algorithms such as Monte Carlo tree search (MCTS) for training or runtime maximization of an overall trait like translation/summarization quality (sequence-to-sequence problems in general) or winning or program writing (eg Jaques et al 2016, Norouzi et al 2016, He et al 2016, Bello et al 2017, Lewis et al 2017). Most exotically, the loss function can itself be a sort of action/RL setting: consider the close connections between actor-critic reinforcement learning, synthetic gradients, and game-theory-based generative adversarial networks (GANs; Zhu et al 2017).
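One way to explore longer sequences rather than committing to the most likely next character is beam search, and a toy model shows why it can beat greedy decoding. The probability table `MODEL` below is entirely made up so that the locally best first symbol leads to a globally worse sequence; it stands in for a trained char-RNN's predictions.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical next-symbol probabilities, keyed by the sequence generated so far.
MODEL: Dict[str, Dict[str, float]] = {
    "": {"A": 0.6, "B": 0.4},
    "A": {"x": 0.5, "y": 0.5},
    "B": {"z": 0.9, "w": 0.1},
}

def beam_search(width: int, length: int = 2) -> Tuple[str, float]:
    """Keep the `width` most probable partial sequences per step; width=1 is greedy decoding."""
    beams: List[Tuple[str, float]] = [("", 0.0)]   # (sequence, log-probability)
    for _ in range(length):
        expanded = [
            (seq + sym, logp + math.log(p))
            for seq, logp in beams
            for sym, p in MODEL[seq].items()
        ]
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:width]
    best_seq, best_logp = beams[0]
    return best_seq, math.exp(best_logp)

greedy = beam_search(width=1)   # commits to "A" first, since 0.6 > 0.4
beam = beam_search(width=2)     # also keeps "B" alive and finds the likelier sequence
```

Choosing which partial sequences to keep expanding is a small planning problem, and MCTS-style methods replace the fixed beam rule with a learned or reward-driven one.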

Actions internal to training

The training of a NN by stochastic gradient descent might seem to be independent of any considerations of ‘actions’, but it turns out to be another domain where you can go “what if we treated this as a reinforcement learning problem?” and it’s actually useful. Specifically, gradient descent requires selection of which data to put into a minibatch, how large a change to make to parameters in general based on the error in the current minibatch (the learning rate hyperparameter), or how much to update each individual parameter each minibatch (perhaps having some neurons which get tweaked much less than others). Actions are things like selecting 1 out of n possible minibatches to do gradient descent on, or selecting 1 out of n possible learning rates with the learning rate increasing/decreasing over time (Bello et al 2017, Fu et al 2016, Xu et al 2016, Jaderberg et al 2016; prioritized traces, prioritized experience replay, boosting, hard-negative mining, prioritizing hard samples, Fan et al 2016, learning internal normalizations).
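As an illustration of treating minibatch selection as an action, the following sketch samples training examples in proportion to their current loss, in the spirit of prioritizing hard samples. It is a simplified assumption-laden toy: the per-example losses are invented, sampling is with replacement, and the exponent `alpha` is a hypothetical knob (0 recovers uniform sampling).

```python
import random
from typing import List

def sample_minibatch(losses: List[float], batch_size: int,
                     rng: random.Random, alpha: float = 1.0) -> List[int]:
    """Pick example indices with probability proportional to loss**alpha (with replacement)."""
    weights = [(loss + 1e-8) ** alpha for loss in losses]   # small constant avoids zero weights
    return rng.choices(range(len(losses)), weights=weights, k=batch_size)

rng = random.Random(0)
losses = [0.01] * 90 + [2.0] * 10        # hypothetical: 90 easy examples, 10 hard ones
batch = sample_minibatch(losses, batch_size=1000, rng=rng)
hard_fraction = sum(i >= 90 for i in batch) / len(batch)
```

Although the hard examples are only 10% of the data, they make up the large majority of what gets trained on, so gradient steps are spent where the model is still wrong.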

Actions internal to data selection

We have previously looked at sampling from existing datasets: training on hard samples, and so on. One problem with existing datasets is that they can be inefficient: perhaps they have class imbalance problems where some kinds of data are overrepresented and what is really needed for improved performance is more of the other kinds of data. An image classification CNN doesn’t need 99 dog photos & 1 cat photo, it wants 50 dog photos & 50 cat photos. (Quite aside from the fact that there’s not enough information to classify other cat photos based on just 1 exemplar, the CNN will simply learn to always classify photos as ‘dog’.) One can try to fix this by resampling to rebalance the classes, or by changing the loss function to make classifying the minority class correctly much more valuable than classifying the majority class.
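The loss-function fix can be sketched with inverse-frequency class weights applied to the 99-dog/1-cat example. The weighting formula used here (total / (number of classes × class count)) is one common heuristic and an assumption of this sketch, as are the function names and the ‘always dog’ classifier.

```python
import math
from collections import Counter
from typing import Dict, List

def inverse_frequency_weights(labels: List[str]) -> Dict[str, float]:
    """Weight each class by total / (num_classes * class_count), so rare classes count more."""
    counts = Counter(labels)
    total, k = len(labels), len(counts)
    return {cls: total / (k * n) for cls, n in counts.items()}

def weighted_log_loss(labels: List[str], prob_of_true: List[float],
                      weights: Dict[str, float]) -> float:
    """Weighted average of -log p(true class) over the examples."""
    num = sum(weights[y] * -math.log(p) for y, p in zip(labels, prob_of_true))
    den = sum(weights[y] for y in labels)
    return num / den

labels = ["dog"] * 99 + ["cat"]
weights = inverse_frequency_weights(labels)

# A degenerate classifier that always answers 'dog' with 99% confidence.
always_dog = [0.99] * 99 + [0.01]
plain = weighted_log_loss(labels, always_dog, {"dog": 1.0, "cat": 1.0})
balanced = weighted_log_loss(labels, always_dog, weights)
```

Under the unweighted loss the ‘always dog’ classifier looks nearly perfect; under the balanced loss its single cat mistake dominates, which is the signal needed to push training away from that degenerate solution.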

Even better is if the NN can somehow ask for new data, be given additional/corrected data when it makes a mistake, or even create new data (possibly based on old data, as in data augmentation). This leads us to active learning: given possible additional datapoints (such as a large pool of unlabeled datapoints), the NN can ask for the datapoint which it will learn the most from (Islam 2016, Gal 2016). One could, for example, train a RL agent to query a search engine and select the most useful images/videos for learning a classification task (eg from YouTube). We can think of it as a little analogous to how kids7 ask parents not random questions, but ones they’re most unsure about, with the most implications one way or another. Settles 2010 discusses the practical advantages to machine learning algorithms of careful choice of data points to learn from or ‘label’, and gives some of the known theoretical results on how large the benefits can be: on a toy problem, the number of samples needed to reach an error rate ε decreases from O(1/ε) to O(log(1/ε)), or in a Bayesian setting, from O(d/ε) to O(d · log(1/ε)), where d is the VC dimension. Active learning also connects back, from a machine learning perspective, to some of the statistical areas covering the benefits of adaptive/sequential trials: optimal experiments query the most uncertain aspects, which the most can be learned from.
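
The gap between asking for labels and passively receiving them can be seen on a 1-D toy problem: learning an unknown threshold on the interval [0, 1]. In the sketch below, the passive learner gets labels for random points, while the active learner always queries the midpoint of its current uncertainty interval (binary search). The threshold value, the label budget, and the function names are assumptions chosen for illustration.

```python
import random

def passive_estimate(threshold, n_labels, rng):
    """Label n random points; keep the tightest interval that brackets the threshold."""
    lo, hi = 0.0, 1.0
    for _ in range(n_labels):
        x = rng.random()
        if x < threshold:        # oracle says: "below the threshold"
            lo = max(lo, x)
        else:                    # oracle says: "at or above the threshold"
            hi = min(hi, x)
    return (lo + hi) / 2

def active_estimate(threshold, n_labels):
    """Always query the most uncertain point: the midpoint of the current interval."""
    lo, hi = 0.0, 1.0
    for _ in range(n_labels):
        x = (lo + hi) / 2
        if x < threshold:
            lo = x
        else:
            hi = x
    return (lo + hi) / 2

def mean_passive_error(threshold, n_labels, trials=200, seed=0):
    rng = random.Random(seed)
    errors = [abs(passive_estimate(threshold, n_labels, rng) - threshold)
              for _ in range(trials)]
    return sum(errors) / trials

threshold = 0.3141
active_error = abs(active_estimate(threshold, 20) - threshold)
passive_error = mean_passive_error(threshold, 20)
```

With the same budget of 20 labels, each active query halves the interval, so the error shrinks exponentially in the number of labels, while the passive learner's error shrinks only roughly in proportion to 1/n.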

Actions internal to NN design

“I sus­pect that less than 10 years from now, all of the DL train­ing/ar­chi­tec­ture tricks that came from the arXiv fire­hose over 2015–2019 will have been en­tirely su­per­seded by au­to­mated search tech­niques. The fu­ture: no alche­my, just clean APIs, and quite a bit of com­pute.”

François Chollet, 2019-01-07

Moving on to more familiar territory, we have hyperparameter optimization using random search or grid search or Bayesian optimization to try training a possible NN, observe interim and final performance, and look for better hyperparameters. But if “hyperparameters are parameters we don’t know how to learn yet”, then we can see the rest of neural network architecture design as being hyperparameters too: what is the principled difference between setting a rate and setting the number of NN layers? Or between setting a learning rate schedule and the width of NN layers or the number of convolutions or what kind of pooling operators are used? There is none; they are all hyperparameters, just that usually we feel it is too difficult for hyperparameter optimization algorithms to handle many options and we limit them to a small set of key hyperparameters and use “grad student descent” to handle the rest of the design. So… what if we used powerful algorithms (viz. neural networks) to design compiled code, neural activations, units like LSTMs, or entire architectures (Castronovo 2016, Ravi & Larochelle 2017, Anonymous 2017, Anonymous 2018, Gupta & Tan 2019)?
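
To make the “they are all hyperparameters” point concrete, here is a minimal random-search sketch in which an optimizer setting (learning rate) and two architecture settings (depth and width) sit in one search space and are sampled by the same loop. The search space, the `toy_validation_loss` formula standing in for “train a NN and measure its performance”, and all names are invented for illustration.

```python
import math
import random

# Hypothetical search space: optimizer and architecture choices side by side.
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, 0),
    "num_layers": lambda rng: rng.randint(1, 8),
    "layer_width": lambda rng: rng.choice([16, 32, 64, 128, 256]),
}

def toy_validation_loss(config):
    """Stand-in for training a NN and measuring validation loss (invented formula)."""
    lr_term = (math.log10(config["learning_rate"]) + 2) ** 2     # best near 1e-2
    depth_term = (config["num_layers"] - 4) ** 2 / 10            # best at 4 layers
    width_term = abs(math.log2(config["layer_width"]) - 6) / 5   # best at width 64
    return lr_term + depth_term + width_term

def random_search(n_trials=50, seed=0):
    """Sample whole configurations at random and keep the best one seen."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        config = {name: sample(rng) for name, sample in SEARCH_SPACE.items()}
        history.append((toy_validation_loss(config), config))
    best_loss, best_config = min(history, key=lambda pair: pair[0])
    return best_loss, best_config, history

best_loss, best_config, history = random_search()
```

Nothing in the loop distinguishes the learning rate from the number of layers; swapping random sampling for a learned proposal policy would turn the same loop into the kind of NN-designs-NN search the paragraph describes.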

The logical extension of these “neural networks all the way down” papers is that an actor like Google/Baidu/Facebook/MS could effectively turn NNs into a black box: a user/developer uploads through an API a dataset of input/output pairs of a specified type and a monetary loss function, and a top-level NN running on a large GPU cluster starts autonomously optimizing over architectures & hyperparameters for the NN design which balances GPU cost and the monetary loss, interleaved with further optimization over the thousands of previously submitted tasks, sharing its learning across all of the datasets/loss functions/architectures/hyperparameters, and the original user simply submits future data through the API for processing by the best NN so far. (Google and Facebook have already taken steps toward this using distributed hyperparameter optimization services which benefit from transfer learning across tasks; Google Vizier/HyperTune, FBLearner Flow.)

Actions external to the agent

Finally, we come to actions in environments which aren’t purely virtual. Adaptive experiments, multi-armed bandits, reinforcement learning etc will outperform any purely supervised learning. For example, AlphaGo was first trained as a pure supervised-learning Tool AI, predicting next moves of human Go games in a dataset, but that was only a prelude to the self-play, which boosted it from professional player to superhuman level; aside from replacing loss functions (a classification loss like log loss vs victory), the AlphaGo NNs were able to explore tactics and positions that never appeared in the original human dataset. The rewards can also help turn an unsupervised problem (what is the structure or label of each frame of a video game?) into more of a semi-supervised problem by providing some sort of meaningful summary: the reward. A DQN Arcade Learning Environment (ALE) agent will, without any explicit image classification, learn to recognize & predict objects in a game which are relevant to achieving a high score.


So to put it con­crete­ly: CNNs with adap­tive com­pu­ta­tions will be com­pu­ta­tion­ally faster for a given ac­cu­racy rate than fixed-it­er­a­tion CNNs, CNNs with at­ten­tion clas­sify bet­ter than CNNs with­out at­ten­tion, CNNs with fo­cus over their en­tire dataset will learn bet­ter than CNNs which only get fed ran­dom im­ages, CNNs which can ask for spe­cific kinds of im­ages do bet­ter than those query­ing their dataset, CNNs which can trawl through Google Im­ages and lo­cate the most in­for­ma­tive one will do bet­ter still, CNNs which ac­cess re­wards from their user about whether the re­sult was use­ful will de­liver more rel­e­vant re­sults, CNNs whose hy­per­pa­ra­me­ters are au­to­mat­i­cally op­ti­mized by an RL al­go­rithm (and pos­si­bly trained di­rectly by a NN) will per­form bet­ter than CNNs with hand­writ­ten hy­per­pa­ra­me­ters, CNNs whose ar­chi­tec­ture as well as stan­dard hy­per­pa­ra­me­ters are de­signed by RL agents will per­form bet­ter than hand­writ­ten CNNs… and so on. (It’s ac­tions all the way down.)

The drawback to all this is that the implementation difficulty is higher, the sample efficiency can be better or worse (individual parts will have greater sample-efficiency but data will be used up training the additional flexibility of other parts), and the computation requirements for training can be much higher; but the asymptotic performance is better, and the gap probably grows as GPUs & datasets get bigger and tasks get more difficult & valuable in the real world.

Why You Shouldn’t Be A Tool

Why does treat­ing all these lev­els as de­ci­sion or re­in­force­ment learn­ing prob­lems help so much?

One answer is that optimizing exploration can often lead to prediction/classification/inference gains, because most points are not near any decision boundary, or are highly predictable and contribute little information. These points need not be computed extensively, nor trained on much, nor collected further. If a particular combination of variables is already being predicted with high accuracy (perhaps because it’s common), adding even an infinite number of additional samples will do little; one sample from an unsampled region far away from the previous samples may be dramatically informative. A model trained on purely supervised data collected from humans or experts may have huge gaping holes in its understanding, because most of its data will be collected from routine use and will not sample many regions of state-space, leading to well-known brittleness and bizarre extrapolations, caused by precisely the fact that the humans/experts avoid the dumbest & most catastrophic mistakes and those situations are not represented in the dataset at all! (Thus, a Tool AI might be ‘safe’ in the sense that it is not an agent, but very unsafe because it is dumb as soon as it goes outside of routine use.) Such flaws in the discriminative model would be exposed quickly in any kind of real world or competitive setting or by RL training.8 You need the right data, not more data. (“39. Re graphics: A picture is worth 10K words - but only those to describe the picture. Hardly any sets of 10K words can be adequately described with pictures.”)

Another answer is the “curse of dimensionality”: in many environments, the tree of possible actions and subsequent rewards grows exponentially, so any sequence of actions over more than a few timesteps is increasingly unlikely to ever be sampled, and sparse rewards will be increasingly unlikely to be observed. Even if an important trajectory is executed at random and a reward obtained, it will be equally unlikely to ever be executed again, whereas some sort of RL agent, whose beliefs affect its choice of actions, can sample the important trajectory repeatedly, and rapidly converge on an estimate of its high value and continue exploring more deeply.
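
The arithmetic behind this can be sketched under simplifying assumptions: a deterministic environment, a fixed number of actions per step, and exactly one action sequence that earns the reward. The first function gives the chance that uniformly random actions reproduce that sequence; the second gives the chance that an epsilon-greedy agent which already prefers the sequence replays it in full. Both function names and the example numbers (4 actions, 10 steps, epsilon of 0.1) are illustrative.

```python
def random_hit_probability(n_actions, horizon):
    """Chance that uniformly random actions reproduce one specific action sequence."""
    return (1.0 / n_actions) ** horizon

def greedy_replay_probability(n_actions, horizon, epsilon):
    """Chance that an epsilon-greedy agent which prefers the sequence replays all of it."""
    # Each step: act greedily with probability 1 - epsilon, otherwise pick uniformly
    # at random, which still hits the preferred action 1 time in n_actions.
    per_step = (1.0 - epsilon) + epsilon / n_actions
    return per_step ** horizon

p_random = random_hit_probability(4, 10)
p_agent = greedy_replay_probability(4, 10, 0.1)
```

Even in this small case the random sampler finds the sequence less than once per million episodes, while the agent whose beliefs steer its actions replays it in a large fraction of its episodes, and the gap widens exponentially with the horizon.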

A dataset of randomly generated sequences of robot arm movements intended to grip an object would likely include no rewards (successful grips) at all, because it requires a long sequence of finely calibrated arm movements; with no successes, how could the Tool AI learn to manipulate an arm? It must be able to make progress by testing its best arm movement sequence candidate, then learn from that and test the better arm movement, and so on, until it succeeds. Without any rewards or ability to home in on good actions, only the initial states will be observed and progress will be extremely slow compared to an agent who can take actions and explore novel parts of the environment (eg sparse-reward games in the Arcade Learning Environment: because of reward sparsity, an epsilon-greedy agent might as well not be an agent compared to some better method of exploring like density estimation).

Or imag­ine train­ing a Go pro­gram by cre­at­ing a large dataset of ran­domly gen­er­ated Go boards, then eval­u­at­ing each pos­si­ble move’s value by play­ing out a game be­tween ran­dom agents from it; this would not work nearly as well as train­ing on ac­tual hu­man-gen­er­ated board po­si­tions which tar­get the van­ish­ingly small set of high­-qual­ity games & moves. The ex­plo­ration homes in on the ex­po­nen­tially shrink­ing op­ti­mal area of the move­ment tree based on its cur­rent knowl­edge, dis­card­ing the enor­mous space of bad pos­si­ble moves. In con­trast, a tool AI can­not lift it­self up by its boot­straps. It merely gives its best guess on the sta­tic cur­rent dataset, and that’s that. If you don’t like the re­sults, you can gather more data, but it prob­a­bly won’t help that much be­cause you’ll give it more of what it al­ready has.

Hence, be­ing a se­cret agent is much bet­ter than be­ing a tool.

See Also

  1. Su­per­in­tel­li­gence, pg148:

    Even if the or­a­cle it­self works ex­actly as in­tend­ed, there is a risk that it would be mis­used. One ob­vi­ous di­men­sion of this prob­lem is that an or­a­cle AI would be a source of im­mense power which could give a de­ci­sive strate­gic ad­van­tage to its op­er­a­tor. This power might be il­le­git­i­mate and it might not be used for the com­mon good. An­other more sub­tle but no less im­por­tant di­men­sion is that the use of an or­a­cle could be ex­tremely dan­ger­ous for the op­er­a­tor her­self. Sim­i­lar wor­ries (which in­volve philo­soph­i­cal as well as tech­ni­cal is­sues) arise also for other hy­po­thet­i­cal castes of su­per­in­tel­li­gence. We will ex­plore them more thor­oughly in Chap­ter 13. Suffice it here to note that the pro­to­col de­ter­min­ing which ques­tions are asked, in which se­quence, and how the an­swers are re­ported and dis­sem­i­nated could be of great sig­nifi­cance. One might also con­sider whether to try to build the or­a­cle in such a way that it would refuse to an­swer any ques­tion in cases where it pre­dicts that its an­swer­ing would have con­se­quences clas­si­fied as cat­a­strophic ac­cord­ing to some rough-and-ready cri­te­ria.

  2. Su­per­in­tel­li­gence, pg152–153, pg158:

    With ad­vances in ar­ti­fi­cial in­tel­li­gence, it would be­come pos­si­ble for the pro­gram­mer to offload more of the cog­ni­tive la­bor re­quired to fig­ure out how to ac­com­plish a given task. In an ex­treme case, the pro­gram­mer would sim­ply spec­ify a for­mal cri­te­rion of what counts as suc­cess and leave it to the AI to find a so­lu­tion. To guide its search, the AI would use a set of pow­er­ful heuris­tics and other meth­ods to dis­cover struc­ture in the space of pos­si­ble so­lu­tions. It would keep search­ing un­til it found a so­lu­tion that sat­is­fied the suc­cess cri­te­ri­on…Rudi­men­tary forms of this ap­proach are quite widely de­ployed to­day…A sec­ond place where trou­ble could arise is in the course of the soft­ware’s op­er­a­tion. If the meth­ods that the soft­ware uses to search for a so­lu­tion are suffi­ciently so­phis­ti­cat­ed, they may in­clude pro­vi­sions for man­ag­ing the search process it­self in an in­tel­li­gent man­ner. In this case, the ma­chine run­ning the soft­ware may be­gin to seem less like a mere tool and more like an agent. Thus, the soft­ware may start by de­vel­op­ing a plan for how to go about its search for a so­lu­tion. The plan may spec­ify which ar­eas to ex­plore first and with what meth­ods, what data to gath­er, and how to make best use of avail­able com­pu­ta­tional re­sources. In search­ing for a plan that sat­is­fies the soft­ware’s in­ter­nal cri­te­rion (such as yield­ing a suffi­ciently high prob­a­bil­ity of find­ing a so­lu­tion sat­is­fy­ing the user-spec­i­fied cri­te­rion within the al­lot­ted time), the soft­ware may stum­ble on an un­ortho­dox idea. For in­stance, it might gen­er­ate a plan that be­gins with the ac­qui­si­tion of ad­di­tional com­pu­ta­tional re­sources and the elim­i­na­tion of po­ten­tial in­ter­rupters (such as hu­man be­ings). Such “cre­ative” plans come into view when the soft­ware’s cog­ni­tive abil­i­ties reach a suffi­ciently high lev­el. 
    When the software puts such a plan into action, an existential catastrophe may ensue….The apparent safety of a tool-AI, meanwhile, may be illusory. In order for tools to be versatile enough to substitute for superintelligent agents, they may need to deploy extremely powerful internal search and planning processes. Agent-like behaviors may arise from such processes as an unplanned consequence. In that case, it would be better to design the system to be an agent in the first place, so that the programmers can more easily see what criteria will end up determining the system’s output.

  3. As the lead author put it, the benefit lies not simply in better prediction but in superior consideration of downstream effects of all recommendations, which are ignored by predictive models: this produced “The largest single launch improvement in YouTube for two years” because “We can really lead the users toward a different state, versus recommending content that is familiar”.

  4. Su­per­in­tel­li­gence, pg151:

    It might be thought that by ex­pand­ing the range of tasks done by or­di­nary soft­ware, one could elim­i­nate the need for ar­ti­fi­cial gen­eral in­tel­li­gence. But the range and di­ver­sity of tasks that a gen­eral in­tel­li­gence could profitably per­form in a mod­ern econ­omy is enor­mous. It would be in­fea­si­ble to cre­ate spe­cial-pur­pose soft­ware to han­dle all of those tasks. Even if it could be done, such a project would take a long time to carry out. Be­fore it could be com­plet­ed, the na­ture of some of the tasks would have changed, and new tasks would have be­come rel­e­vant. There would be great ad­van­tage to hav­ing soft­ware that can learn on its own to do new tasks, and in­deed to dis­cover new tasks in need of do­ing. But this would re­quire that the soft­ware be able to learn, rea­son, and plan, and to do so in a pow­er­ful and ro­bustly cross-do­main man­ner. In other words, it would need gen­eral in­tel­li­gence. Es­pe­cially rel­e­vant for our pur­poses is the task of soft­ware de­vel­op­ment it­self. There would be enor­mous prac­ti­cal ad­van­tages to be­ing able to au­to­mate this. Yet the ca­pac­ity for rapid self­-im­prove­ment is just the crit­i­cal prop­erty that en­ables a seed AI to set off an in­tel­li­gence ex­plo­sion.

  5. While Google Maps was used as a paradigmatic example of a Tool AI, it’s not clear how hard this can be pushed, even if we exclude the road system itself: Google Maps/Waze is, of course, trying to maximize something (traffic & ad revenue). Google Maps, like any Google property, is doubtless constantly running A/B tests on its users to optimize for maximum usage; its users are constantly feeding in data about routes & traffic conditions to Google Maps/Waze through the website interface & smartphone GPS/WiFi geographic logs; and to the extent that users make any use of the information & increase/decrease their use of Google Maps (which many do blindly), Google Maps will get feedback after changing the real world (sometimes to the intense frustration of those affected, who try to manipulate it back)… Is Google Maps/Waze a Tool AI or a large-scale Agent AI?

    It acts in an environment, it has a clear reward function in terms of website traffic, and it has a wide set of actions it continuously explores with randomization from various sources; even though it was designed to be a Tool AI, from an abstract perspective, one would have to consider it to have evolved into an Agent AI due to its commercial context and use in real-world actions, whether Google likes it or not. We might consider Google Maps to be a “secret agent”: it is not a Tool AI but an Agent AI with a hidden & highly opaque reward function. This is probably not an ideal situation.

  6. If the NN is trained to minimize error alone, it’ll simply spend as much time as possible on every problem; so a cost is imposed on each iteration to encourage it to finish as soon as it has a good answer, and learn to finish sooner. And how do we decide what costs to impose on the NN for deciding whether to loop another time or emit its current best guess as good enough? Well, that’ll depend on the cost of GPUs and the economic activity and the utility of results for the humans…

  7. Kyunghyun Cho, 2015:

    One ques­tion I re­mem­ber came from Tiele­man. He asked the pan­elists about their opin­ions on ac­tive learn­ing/­ex­plo­ration as an op­tion for effi­cient un­su­per­vised learn­ing. Schmid­hu­ber and Mur­phy re­spond­ed, and be­fore I re­veal their re­spon­se, I re­ally liked it. In short (or as much as I’m cer­tain about my mem­o­ry), ac­tive ex­plo­ration will hap­pen nat­u­rally as the con­se­quence of re­ward­ing bet­ter ex­pla­na­tion of the world. Knowl­edge of the sur­round­ing world and its ac­cu­mu­la­tion should be re­ward­ed, and to max­i­mize this re­ward, an agent or an al­go­rithm will ac­tive ex­plore the sur­round­ing area (even with­out su­per­vi­sion.) Ac­cord­ing to Mur­phy, this may re­flect how ba­bies learn so quickly with­out much su­per­vis­ing sig­nal or even with­out much un­su­per­vised sig­nal (their way of ac­tive ex­plo­ration com­pen­sates the lack of un­su­per­vised ex­am­ples by al­low­ing a baby to col­lect high qual­ity un­su­per­vised ex­am­ples.)

  8. An example here might be the use of ‘ladders’ or ‘mirroring’ in Go: models trained in a purely supervised fashion on a dataset of Go games can have serious difficulty responding to a ladder or mirror because those strategies are so bad that no human would play them in the dataset. Once the Tool AI has been forced ‘off-policy’, its predictions & inferences may become garbage because it’s never seen anything like those states before; an agent will be better off because it’ll have been forced into them by exploration or adversarial training and have learned the proper responses. This sort of bad behavior leads to quadratically increasing regret with passing time: Ross & Bagnall 2010.