[Meditation on the problem of coordinating reactions to x-risks, and AI risks in particular. To quote Norbert Wiener:
Again and again I have heard the statement that learning machines cannot subject us to any new dangers, because we can turn them off when we feel like it. But can we? To turn a machine off effectively, we must be in possession of information as to whether the danger point has come. The mere fact that we have made the machine does not guarantee we shall have the proper information to do this.
A fire alarm, even if it is not 100% accurate, coordinates human reactions: it becomes permissible to leave the room to investigate, to take precautions, or for everyone to evacuate the building. This is because we all agree that fires usually come with smoke, and that smoke can be objectively detected. But what is the fire alarm for AI? “AI is whatever we can’t do yet”, and whenever AI accomplishes a new feat, people simply move the goalposts and say that that task turned out to be unexpectedly easy to solve. There is no agreement on what “imminent AGI” looks like. You can ask AI researchers, “How would the world look different if we were in fact heading towards AGI in the near future, the next decade or three?”, and they are unable to answer. They do not know what is or is not a ringing alarm bell, the point at which everyone should start taking the prospect very seriously. It was not chess, it was not ImageNet classification, it was not Go…
AI so far resembles other technologies, like airplanes or nuclear bombs, where just years before their invention, the very researchers who would invent them, eminent physicists, and physicists in general were highly uncertain, skeptical, or outright convinced of their impossibility. This was because progress in nuclear physics looked much the same regardless of whether nuclear bombs were possible or impossible. There was a large, ineradicable uncertainty, which appears to have neutered any serious effort to prepare. And yet these matters ought to be dealt with in advance: things like nuclear bombs or AI should not just arrive with no one having done anything to prepare. Or consider pandemics: those who tried to warn the world about the coronavirus will find this essay eerily apt.]
Okay, let’s be blunt here. I don’t think most of the discourse about AGI being far away (or that it’s near) is being generated by models of future progress in machine learning. I don’t think we’re looking at wrong models; I think we’re looking at no models.
I was once at a conference…I got up in Q&A and said, “Okay, you’ve all told us that progress won’t be all that fast. But let’s be more concrete and specific. I’d like to know what’s the least impressive accomplishment that you are very confident cannot be done in the next two years.”
There was a silence.
Eventually, 2 people on the panel ventured replies, spoken in a rather more tentative tone than they’d been using to pronounce that AGI was decades out. They named “A robot puts away the dishes from a dishwasher without breaking them”, and Winograd schemas (pronoun-disambiguation problems that require commonsense reasoning to resolve, e.g., in “The trophy doesn’t fit in the suitcase because it is too small”, what does “it” refer to?)…A few months after that panel, there was unexpectedly a big breakthrough on Winograd schemas. The breakthrough didn’t crack 80%, so three cheers for wide credibility intervals with error margin, but I expect the predictor might be feeling slightly more nervous now with one year left to go…
But that’s not the point. The point is the silence that fell after my question, and that eventually I only got 2 replies, spoken in tentative tones. When I asked for concrete feats that were impossible in the next two years, I think that that’s when the luminaries on that panel switched to trying to build a mental model of future progress in machine learning, asking themselves what they could or couldn’t predict, what they knew or didn’t know. And to their credit, most of them did know their profession well enough to realize that forecasting future boundaries around a rapidly moving field is actually really hard, that nobody knows what will appear on arXiv next month, and that they needed to put wide credibility intervals with very generous upper bounds on how much progress might take place 24 months’ worth of arXiv papers later. (Also, Demis Hassabis was present, so they all knew that if they named something insufficiently impossible, Demis would have DeepMind go and do it.)
…When I observe that there’s no fire alarm for AGI, I’m not saying that there’s no possible equivalent of smoke appearing from under a door. What I’m saying rather is that the smoke under the door is always going to be arguable; it is not going to be a clear and undeniable and absolute sign of fire; and so there is never going to be a fire alarm producing common knowledge that action is now due and socially acceptable…There is never going to be a time before the end when you can look around nervously, and see that it is now clearly common knowledge that you can talk about AGI being imminent, and take action and exit the building in an orderly fashion, without fear of looking stupid or frightened.