Machine vision now works. Machines can navigate using vision, separate objects from background, and recognise a wide variety of objects and track their motion. These abilities are great spin-offs in their own right, but they are also part of an extended adventure in understanding the nature of intelligence through visual perception.

One general question about intelligent systems is whether they will be dominated by "generative" models, which explain data as a sequence of transformations, or by black-box machines trained on data at ever greater scale. In perception systems this boils down to the comparative roles of two paradigms: analysis-by-synthesis, which explains an image by searching for the model parameters whose rendering best reproduces it, versus empirical recognisers, which learn a direct mapping from images to descriptions.
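The contrast can be made concrete on a toy problem. The sketch below is an illustration assumed here, not material from the lecture: it estimates the position of a bump in a noisy 1-D signal both by analysis-by-synthesis and with an empirical recogniser, using plain linear least squares as a stand-in for a deep network (all names, such as `render` and `analysis_by_synthesis`, are invented for the example).

```python
# A minimal sketch, assuming a toy task: estimate the position of a
# Gaussian bump in a noisy 1-D signal with each paradigm.
import numpy as np

rng = np.random.default_rng(0)
X_AXIS = np.linspace(0.0, 1.0, 64)

def render(pos, width=0.05):
    """Generative model: synthesise a clean signal from the latent position."""
    return np.exp(-0.5 * ((X_AXIS - pos) / width) ** 2)

def analysis_by_synthesis(signal):
    """Explain the data: search for the position whose rendering fits best."""
    candidates = np.linspace(0.1, 0.9, 200)
    errors = [np.sum((render(p) - signal) ** 2) for p in candidates]
    return candidates[int(np.argmin(errors))]

# Empirical recogniser: learn a direct mapping from signals to positions.
# Plain linear least squares stands in here for a deep network.
train_pos = rng.uniform(0.1, 0.9, size=500)
train_sig = np.stack([render(p) + 0.05 * rng.standard_normal(X_AXIS.size)
                      for p in train_pos])
weights, *_ = np.linalg.lstsq(train_sig, train_pos, rcond=None)

true_pos = 0.63
observed = render(true_pos) + 0.05 * rng.standard_normal(X_AXIS.size)
print("true position:        ", true_pos)
print("analysis-by-synthesis:", analysis_by_synthesis(observed))
print("empirical recogniser: ", float(observed @ weights))
```

Analysis-by-synthesis needs no training data but pays for it with a search at every query; the recogniser pays its cost up front, in training, and then answers with a single matrix-vector product.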

Each approach has its strengths, and empirical recognisers in particular have made great strides in performance in the last few years through "deep learning". Exciting progress has already been made on integrating the two approaches. It is also fascinating to speculate about what other new paradigms in learning might transform the speed at which artificial perception can develop.
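As a sketch of one such integration, an assumption for illustration rather than the lecture's own method, the snippet below continues the toy example above (reusing `render`, `observed` and `weights` from the earlier sketch): the fast recogniser proposes an estimate, and analysis-by-synthesis refines it with a narrow local search around that proposal.

```python
# One illustrative way to combine the paradigms: the recogniser's cheap
# estimate initialises a narrow analysis-by-synthesis search.
def hybrid(signal, weights, halfwidth=0.05):
    proposal = float(signal @ weights)          # fast discriminative guess
    local = np.linspace(proposal - halfwidth,   # refine generatively nearby
                        proposal + halfwidth, 50)
    errors = [np.sum((render(p) - signal) ** 2) for p in local]
    return local[int(np.argmin(errors))]

print("hybrid estimate:      ", hybrid(observed, weights))
```

The recogniser supplies speed; the generative refinement supplies accuracy and an explicit measure of how well the model explains the data.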

Watch Professor Andrew Blake's 2017 Lovelace lecture