Does AI need symbols?

Inspired by the brain, artificial neural networks are core to modern artificial intelligence. Grace Lindsay, author of Models of the Mind, explains concerns over the cognitive limits of these systems.

Computers are well suited for manipulating symbols. Researchers in the early days of artificial intelligence knew this. In the 1950s, scientists tried to map their own intuitions about how to solve problems onto symbols and algorithms that were deployed on emerging digital computers. This led to unprecedented successes, such as programs that could automatically prove mathematical theorems and carry out conversations.

Today, artificial intelligence is having another golden era. Companies such as Google and Facebook invest billions into AI research and create impressive new product features as a result. The algorithms behind this surge, however, are not based on those developed in the 1950s.

That style of engineering, now referred to as ‘good old-fashioned AI’, or GOFAI, has been replaced by a method more directly inspired by the brain. Now, ‘artificial neural networks’ are trained directly from data to see, speak, plan and play. With no explicit symbols in sight, these networks seem to prove that the GOFAI approach was misguided.

However, some researchers disagree. The impressive results of neural networks notwithstanding, they do have weaknesses - weaknesses that seem to align very closely with the strengths of symbolic systems. Does this mean that AI needs to bring back symbols?

History of artificial neural networks

Ironically, artificial neural networks started as symbol processors. In 1943, Walter Pitts and Warren McCulloch published ‘A logical calculus of the ideas immanent in nervous activity’. This paper showed for the first time that neurons - through the way they send electrical signals - could be implementing the rules of binary logic.
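
As a rough illustration of that idea, the sketch below models a neuron as a threshold unit with binary inputs: it fires (outputs 1) when the weighted sum of its inputs reaches a threshold. The function names and the particular weights and thresholds are illustrative assumptions, not taken from the 1943 paper.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of binary inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def and_gate(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return threshold_unit([a, b], [1, 1], threshold=2)

def or_gate(a, b):
    # A single active input is enough to reach a threshold of 1.
    return threshold_unit([a, b], [1, 1], threshold=1)

def not_gate(a):
    # An inhibitory (negative) weight means an active input prevents firing.
    return threshold_unit([a], [-1], threshold=0)

# Print the truth table: a, b, a AND b, a OR b.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b))
```

Because AND, OR and NOT can each be built from such a unit, networks of these units can in principle express any formula of binary logic.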

This finding had a big impact on the history of computing. John von Neumann credited his design of the modern computer architecture to the work of McCulloch and Pitts.

Yet, the form of artificial neural networks used today differs from what McCulloch and Pitts imagined. The direct ancestor of modern artificial neural networks is the Perceptron, created by Frank Rosenblatt in 1958.

In modern artificial neural networks, the activity of an artificial neuron is represented by a positive number. This number gets multiplied by a weight and then serves as input to other neurons. The values of these weights are learned by the network to make it good at whatever task it is charged with. In the case of the Perceptron, for example, Rosenblatt taught the network to classify basic shapes.
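
To make the idea of learned weights concrete, here is a minimal sketch of the classic perceptron learning rule, assuming a toy dataset of two-number inputs in place of Rosenblatt's shapes. The function names, learning rate and data are all illustrative assumptions; modern networks are usually trained with gradient-based methods instead.

```python
def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, thresholded at zero."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Nudge the weights towards the target whenever a prediction is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)  # -1, 0 or +1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Toy, linearly separable data: label 1 only when both inputs are 1.
examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(examples)
print([predict(weights, bias, inputs) for inputs, _ in examples])  # [0, 0, 0, 1]
```

Nothing in the code states the rule being learned; it ends up encoded only in the final values of `weights` and `bias`.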

Once these networks are trained, they work. However, because their connections are learned through a training algorithm, just how they work is not always known. And the activity of the neurons and their interactions are no longer easily mappable to symbols or the rules of logic.

How neural networks work today

From transcribing people’s words to beating professionals in the game of Go, artificial neural networks are making huge gains in a wide range of fields.

Different tasks are best solved using different architectures. The architecture of a neural network describes the number and type of neurons in it as well as how they are connected. When dealing with images, for example, ‘convolutional’ neural networks work best. The architecture of these networks is inspired by the brain’s visual system. As a result, they’ve been successful at vision problems such as diagnosing diseases in medical images.

There are also many options when it comes to how to train a neural network. While ‘supervised’ training requires having matching pairs of inputs and outputs (for example, an image and a label describing what is in it), ‘unsupervised’ training is a powerful way to make use of unlabelled data. For example, by simply training a network to predict the next word in a passage of text, scientists at OpenAI have created a language model capable of producing new text that is almost indistinguishable from human writing.

Today’s neural networks are particularly powerful because they can interface directly with the real world. Given a stream of data, they can learn to make sense of it.

The benefits of symbols

For all their successes, neural networks still fall short on some pretty simple tasks.

Consider, for example, scene understanding. Looking around the room in front of you, you are not only able to identify the objects you see, you also understand their relationships. The book is on the desk; the window is to the right of the door. While artificial neural networks can easily identify these objects, they struggle to explain their relationships or predict the effects of actions on them.

Symbolic systems are designed to explicitly capture relationships as well as compositionality. Compositionality means that parts can be combined into wholes. So, while a neural network may see a car as a car, a symbolic system could be designed to know that a car is made up of an engine, tyres, doors and so on.

Knowing both how parts relate to each other and how they relate to the whole lets symbolic systems extrapolate. When a symbolic system encounters a car it has never seen before, for example, it can still deduce that it has an engine. And it knows that the engine can be inside or outside the car.

That is because in a symbolic system, the concept of outside can be combined with the concept of an engine, even if the system has never specifically seen an engine outside a car. Neural networks, because they are trained on specific data, have trouble processing new situations that are too different from what they’ve seen before.
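
A minimal sketch of this kind of deduction, assuming a hypothetical toy knowledge base, is shown below. The tables `IS_A` and `HAS_PART`, the name `roadster_x` and both helper functions are invented for illustration and do not come from any particular symbolic AI system.

```python
# Hypothetical facts: what kind of thing something is, and what it is made of.
IS_A = {"roadster_x": "car", "car": "vehicle"}
HAS_PART = {"car": {"engine", "tyres", "doors"}, "engine": {"pistons"}}

def parts_of(thing):
    """Collect parts inherited through is-a links and nested through has-part links."""
    parts = set()
    kind = thing
    while kind is not None:
        for part in HAS_PART.get(kind, ()):
            parts.add(part)
            parts |= parts_of(part)  # parts of parts also count
        kind = IS_A.get(kind)        # climb to the more general category
    return parts

def relate(relation, subject, obj):
    """Compose existing symbols into a new statement."""
    return (relation, subject, obj)

# No fact mentions roadster_x having an engine, yet it can be deduced.
print("engine" in parts_of("roadster_x"))  # True

# 'outside' and 'engine' combine freely, even though no such case was stored.
print(relate("outside", "engine", "roadster_x"))
```

The deduction works for any new kind of car added to `IS_A`, without further examples, because the rule is stated over the category rather than learned from instances.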

In effect, symbols allow for reasoning. And reasoning is core to intelligence.

Combined systems

In an effort to have the best of both worlds, research groups worldwide are working on building systems that think with symbols but learn like neural networks.

One example of a ‘neuro-symbolic’ approach is relation networks. Introduced in 2017, relation networks are artificial neural networks trained explicitly to predict the relationship between objects. In doing so, they can achieve much higher performance than traditional networks on scene understanding tasks.
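
The structural idea can be sketched roughly as follows: one function, `g`, scores every pair of objects, the scores are summed, and a second function, `f`, turns the sum into an answer. In a real relation network both functions are neural networks with learned weights; the fixed integer weights, the two-number object descriptions and the choice to include every ordered pair (an object paired with itself too) are simplifying assumptions made here.

```python
from itertools import product

def relu(vector):
    return [max(0, x) for x in vector]

def linear(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

# Hypothetical fixed weights; in a real relation network these are learned.
G_WEIGHTS = [[1, 0, -1, 0], [0, 1, 0, -1], [-1, 0, 1, 0]]
F_WEIGHTS = [[1, 1, 1]]

def g(obj_a, obj_b):
    """Score one pair of objects from their concatenated descriptions."""
    return relu(linear(G_WEIGHTS, obj_a + obj_b))

def relation_network(objects):
    pooled = [0, 0, 0]
    for obj_a, obj_b in product(objects, repeat=2):  # every ordered pair
        pooled = [p + r for p, r in zip(pooled, g(obj_a, obj_b))]
    return linear(F_WEIGHTS, pooled)  # f: map the pooled relations to an output

# Each object is described by two numbers, for example an (x, y) position.
scene = [[0, 0], [2, 1], [5, 3]]
print(relation_network(scene) == relation_network(scene[::-1]))  # True
```

Because the pair scores are summed, the output does not depend on the order in which the objects are listed.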

Another system, known as logic tensor networks (LTN), implements the rules of logic using the maths of linear algebra. The network then learns within the constraints of these rules to represent and manipulate data.
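
One way to picture logic expressed as arithmetic is to let truth values range between 0 and 1 and replace the logical connectives with simple formulas. The sketch below uses the product form of fuzzy logic; it is an illustrative assumption and not the API of any logic tensor network library, and the rule and the numbers are invented.

```python
def fuzzy_not(a):
    return 1.0 - a

def fuzzy_and(a, b):
    return a * b

def fuzzy_or(a, b):
    return a + b - a * b

def fuzzy_implies(a, b):
    # 'a implies b' rewritten as '(not a) or b'
    return fuzzy_or(fuzzy_not(a), b)

def rule_satisfaction(is_car, has_engine):
    """Degree to which 'if it is a car, it has an engine' holds for one object."""
    return fuzzy_implies(is_car, has_engine)

# A network that is fairly sure it sees a car, but doubts it has an engine,
# satisfies the rule poorly. Raising the engine belief raises the score.
print(rule_satisfaction(is_car=0.9, has_engine=0.2))
print(rule_satisfaction(is_car=0.9, has_engine=0.9))
```

Because the score changes smoothly with the inputs, a learning algorithm can adjust a network to increase it, which is one way for learning to stay within the constraints of a rule.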

Differentiable neural computers also aim to combine the benefits of traditional computing systems with the power of neural networks. Here, a neural network implements a Turing machine; that is, it is trained to read from and write to an external memory.

This memory allows it to store data in a more structured way than normal neural networks. Differentiable neural computers have been trained to solve many reasoning problems such as finding the shortest path between two points.
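
A simplified sketch of how such a memory can be read by content rather than by address is given below: each memory row is compared with a query key, and the read result is a weighted blend of the rows, dominated by the best match. The function names, the `sharpness` parameter and the data are illustrative assumptions; a full differentiable neural computer also learns what to write and where, which is omitted here.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def content_read(memory, key, sharpness=10.0):
    """Return soft read weights over the rows and the blended read vector."""
    weights = softmax([sharpness * cosine_similarity(row, key) for row in memory])
    width = len(memory[0])
    read_vector = [sum(w * row[i] for w, row in zip(weights, memory))
                   for i in range(width)]
    return weights, read_vector

# Three stored rows; the key most resembles the first one.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights, read_vector = content_read(memory, key=[0.9, 0.1, 0.0])
print(weights)
print(read_vector)
```

Every step is smooth arithmetic, so the network controlling the memory can be trained in the same way as an ordinary neural network.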

Going back to the brain

The goal of combining neural and symbolic systems assumes that they are separate ways of processing information. Yet it was reflection on their own ways of thinking, combined with insights from cognitive science, that inspired the pioneers of early symbolic AI. That is, symbolic systems aim - to some extent - to mimic the human mind. And the human mind runs on neurons.

When you peek inside a computer, you don’t see symbols. Symbolic processing is implemented through the interaction of transistors and other electrical components. The same is true in the brain. While it is hard to see symbolic processing at the neuron level, it is indeed the interaction of neurons that ultimately gives rise to how humans think, including how they think symbolically.

It is the challenge of future neuroscientists and AI engineers to understand exactly how neurons in the brain support all aspects of intelligence - including learning from data and symbolic thinking. Not only will this advance our understanding of the brain, it will pave the way for truly intelligent computers.