It was called a perceptron for a reason, damn it

This Technology Review piece about the inventor of backpropagation in neural networks crystallises some important issues about the current AI boom. Advances (or supposed advances) in AI are often held to put our view of ourselves in question. Its failures are held, by anyone who pays attention to them, to put our view of intelligence in question. This has historically been more useful. Now, though, I think the lesson is about our view of machines.

What do you expect of a machine? Well, you expect it to do some well-defined task, without making mistakes of its own, faster or stronger or for longer than human or animal bodies could do it. We have machines that do this for elementary arithmetic operations – they’re called computers – and it turns out you can do all kinds of stuff with them. As long, of course, as you can break down the task into elementary logical primitives that act on inputs and outputs that you can define unambiguously.

And that’s what programming is. You try to define whatever it is in terms a computer can understand, and organise them in ways that let it use its heroic athletic stupidity. Increment, decrement, compare, conditional branch, jump. That’s why it’s hard.
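To make that concrete, here is a minimal sketch of multiplication built from nothing but those primitives. The function name `multiply` is made up for illustration, it assumes non-negative integers, and Python's `while` stands in for the compare, conditional branch and jump.

```python
def multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers using only increment,
    decrement, compare and (via `while`) conditional branch and jump."""
    result = 0
    outer = b
    while outer != 0:          # compare, then branch out or carry on
        inner = a
        while inner != 0:      # same again, one level down
            result += 1        # increment
            inner -= 1         # decrement
        outer -= 1             # decrement
    return result


print(multiply(6, 7))
```

Nothing in there is clever. It is just a lot of very small steps done without complaint, which is the point.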

Art tried to illustrate artificial intelligence with this distinction. The intelligent computer would be terrifyingly smart, but would be persuaded or confounded or subverted by human empathy or original creativity. The computer they imagined was a big version (sometimes, a literally huge one, missing the microprocessor future) of the mainframes or #5 crossbar switches or IBM 360s of the day; a machine that manufactured thought in the same way a steam engine revolves a wheel.

Around about this time, the first neural networks were developed. It’s important to remember this was already a retreat from the idea of building a reasoning machine directly – creating an analogue to a network of nerves is an effort to copy natural intelligence. Interestingly, the name “perceptron” was coined for them – not a thinking machine, but a perceiving machine.
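For readers who have never seen one, a perceptron can be sketched in a few lines. This is a toy version of the classic learning rule, not a reconstruction of any historical implementation: the names `train_perceptron` and `predict`, the learning rate, the epoch count and the choice of logical AND as the thing to perceive are all assumptions made for illustration.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Nudge weights towards each misclassified example (perceptron rule)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


# Toy task: recognise the pattern "both inputs on" (logical AND).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
print([predict(weights, bias, inputs) for inputs, _ in AND_SAMPLES])
```

Note what it does: it learns to recognise a pattern from examples. There is no step where it reasons about anything.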

Much later we learned how to make use of neural nets and how to build computers powerful enough to make that practical. What we do with them is invariably to perceive stuff. Is there a horse in this photograph? Are these search terms relevant to this document? Recently we’ve taken to calling this artificial intelligence or machine learning, but when it was in the research stage we used to call it machine vision.

If you imagine an intelligent machine you probably imagine that it would think, but it would be a machine – it would do it fast, and perfectly, within the limits of what it could do. It wouldn’t get the wrong day or make trivial errors of arithmetic or logic. It would be something like the reasoning, deliberative self Daniel Kahneman calls System 1, just without its tiresome flaws. It would reason as you might run, but without tiring so long as it had electric power. Awesome.

This is exactly not what deep learning systems deliver. Instead, they automate part of System 2’s tireless, swift, intuitive functions. They just don’t do it very well.

Machines don’t get tired; but then neither does human pattern-recognition. You have to be in dire straits indeed to mistake the man opposite you on the Tube for a dog, or your spouse for a hat. It is true that a machine will not daydream or sneak off for a cigarette, but this is really a statement about work discipline. Airport security screeners zone out, miss stuff, and know that nobody really takes it seriously. Air traffic controllers work shorter shifts, are paid much more, and are well aware that it matters. They seem far more attentive.

Notoriously, people are easily fooled in System 2 mode. There are whole books listing the standard cognitive biases. One of the most fascinating things about deep learning systems, though, is that there is a whole general class of inputs that fools them, and it just looks like…analogue TV interference or bad photo processing. The failure is radical. It’s not that the confidence of detection falls from 99% to 95% or 60%, it’s that they detect something completely, wonderfully, spectacularly wrong with very high confidence.
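One way to get a feel for how a small nudge can produce a confident, wrong answer is a toy linear classifier. This is a deliberately simplified stand-in, not a deep network: the weights, the 100-"pixel" image and the names `sigmoid` and `confidence` are all invented for the sketch. Each pixel moves by only 0.1 on a 0-to-1 scale, but because every pixel moves in the direction the weights care about, the effect adds up.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def confidence(weights, pixels):
    """Model's confidence that the image belongs to the positive class."""
    return sigmoid(sum(w * p for w, p in zip(weights, pixels)))


# Hypothetical model: 100 pixels, weights alternating +1 and -1.
weights = [1.0, -1.0] * 50

# An image the model is very sure about: each pixel leans slightly
# in the direction its weight rewards.
image = [0.5 + 0.05 * w for w in weights]

# The nudge: shift every pixel by 0.1 against its weight.
# (The weights are +1 or -1, so w is its own sign.)
nudged = [p - 0.1 * w for p, w in zip(image, weights)]

print(confidence(weights, image))   # above 0.99
print(confidence(weights, nudged))  # below 0.01
```

The confidence does not sag gracefully. It swings from one extreme to the other on a change that would barely register to the eye.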

You might think that this is rather like one of the classic optical illusions, but it’s worse than that. If you notice that you look at something this way, and then that way, and it looks different, you’ll notice something is odd. This is not something our deep learner will do. Nor is it able to identify any bias that might exist in the corpus of data it was trained on…or maybe it is.

If there is any property of the training data set that is strongly predictive of the training criterion, it will zero in on that property with the ferocious clarity of Darwinism. In the 1980s, an early backpropagating neural network was set to find Soviet tanks in a pile of reconnaissance photographs. It worked, until someone noticed that the Red Army usually trained when the weather was good, and in any case the satellite could only see them when the sky was clear. The medical school at St George’s Hospital in London found theirs had learned that their successful students were usually white.
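The mechanism is easy to reproduce in miniature. The sketch below uses the simplest learner imaginable, one that picks whichever single feature best predicts the label, as a stand-in for a neural network. The data is invented to mimic the tank story and is not drawn from any real data set, and the names `accuracy` and `fit_stump` are hypothetical.

```python
def accuracy(feature, data):
    """Fraction of examples where the feature's value equals the label."""
    return sum(1 for x, y in data if x[feature] == y) / len(data)


def fit_stump(data):
    """'Learn' by choosing the single most predictive feature."""
    features = list(data[0][0].keys())
    return max(features, key=lambda f: accuracy(f, data))


# Invented training set: tanks (label 1) only ever photographed in sunshine.
train = [
    ({"tank_shape": 1, "sunny": 1}, 1),
    ({"tank_shape": 1, "sunny": 1}, 1),
    ({"tank_shape": 0, "sunny": 1}, 1),  # tank half hidden under trees
    ({"tank_shape": 0, "sunny": 0}, 0),
    ({"tank_shape": 0, "sunny": 0}, 0),
    ({"tank_shape": 1, "sunny": 0}, 0),  # a lorry that looks a bit tank-like
]

# Invented test set: tanks under cloud, empty fields in sunshine.
test = [
    ({"tank_shape": 1, "sunny": 0}, 1),
    ({"tank_shape": 0, "sunny": 1}, 0),
    ({"tank_shape": 1, "sunny": 0}, 1),
    ({"tank_shape": 0, "sunny": 1}, 0),
]

chosen = fit_stump(train)
print(chosen, accuracy(chosen, train), accuracy(chosen, test))
```

On the training data the weather is a perfect predictor, so the weather is what gets learned. Once the clouds roll in, the same rule is wrong every single time.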

The success of deep learning systems has given us better machine perception. This is really useful. What it does well is matching or identifying patterns, very fast, for longer than you can reasonably expect people to do. It automates a small part of the glorious wonder of intuition. It also automates everything terrible about it, and adds brilliantly creative mistakes of its own. There is something wonderful about the idea of a machine that gets it completely, hopelessly wrong.

Unfortunately, we have convinced ourselves that it is like System 1 but faster and stronger. This is nonsense.

A really interesting question, meanwhile, would be what a third approach to AI might be like. The high postwar era imagined intelligence in terms of reasoning. Later, we tried perception. That leaves action. Research into natural – you might say real – cognition emphasises this more and more. Action, though, implies responsibility.

8 Comments on "It was called a perceptron for a reason, damn it"

  1. You’ve got it flipped: Kahneman’s System 1 is fast, intuitive thinking; System 2 is slow, analytical thinking.


  2. Great article. I’m investigating the overlap points of technologies like ML and behavioral economics, so really enjoyed what you had to say.

    A minor quibble: I think you have Systems 1 and 2 reversed. System 1 is the fast, effortless, automatic one that makes problematic quick guesses when interpreting input. System 2 is the deliberative, methodical, time/effort-intensive one.


  3. Analytical thinking arose in biology to solve a problem with fast decision systems doing stupid things. I guess AI will need to do the same as it matures.

