Deeph Chana, Co-Director of Imperial College’s Institute for Security Science and Technology, talks to Johanna Hamilton AMBCS about machine learning and how it’s changing our lives.

Professor Chana experimented with machine learning as an undergraduate, worked with the security services directing applied machine learning projects in their infancy following the 7/7 London tube bombings, and is now turning what was once a futuristic dream into applied, commercial reality at Imperial College Business School in London.

Tell me about your career

I've spent my academic life playing around with machine learning methods, AI and neural networks. Even when I was doing my PhD and my undergraduate degree in physics, that was a hot topic of research in numerate science and technology. Though, in those days, it was seen as a bit of a toy: we didn't have such a massive explosion of data, and we didn't have the kind of computing power that cloud computing now gives us. Back then, there was a lot of theoretical work being done, but it was difficult to see how it could be applied effectively to anything practical.

Post-PhD, I spent five years in a university physics research environment. I became very interested in how science and evidence were being used to shape government thinking and government policy on major topics, such as the Al Qaeda terror threat. I joined the government shortly after the London tube bombings and one of my first roles was to run a programme for the Department for Transport, looking at ways we could analyse and mitigate the threats to the transport system.

When I joined the government, we were starting to see the possibilities of machine learning and AI for national security applications. However, the problem with trying to use AI and machine learning to prevent terrorist attacks is the fact that you need big data in order to feed machine learning algorithms - and terrorism is, thankfully, very rare.

I currently work at Imperial College London, where I co-direct the Institute for Security Science and Technology and the Centre for Financial Technology. This involves numerous activities, including directing research and education, setting up thought leadership and innovation initiatives and delivering some of the College’s executive education programmes.

Where has machine learning come from?

Machine learning started off quite a long time ago. There was a real buzz of activity about it in the 1950s. It developed slowly over decades and was seen as a challenging mathematical pursuit for researchers trying to drive our knowledge forward.

For a while, it didn't look like it was ever going to be something we could actually use. But then, with the advent of the internet, we ended up with a huge amount of data being generated, coupled with powerful computing made available to everyone through the cloud.

Suddenly, we had a way to unlock the decades of research and use it at scale to make products and services.

Does this technology infringe our civil liberties?

The government machine has been very conscious, for some time, of the need to balance civil liberties and justice with the public use of this technology. Similar discussions were had over ID cards. If you think about it, you don't necessarily need machine learning and facial recognition to track people, as you could just make it mandatory for everyone to have a traceable ID card and make it illegal to leave your house without it.

I think it’s clear that there's a very important debate to be had around the implementation of technology whilst still preserving the underlying values that we hold and strive towards in society.

How do we only use AI for good?

Who decides what's bad and good? If you look at the huge amounts of benefit gained from nuclear power - it has actually transformed the world. Some would argue not for the best, but I would say it's been an amazing innovation. And then on the other side, of course, you've got nuclear weapons. I think the same sorts of concerns play out with AI and machine learning and how it could cause damage as well as benefit.

The big challenge is trying to figure out how we move forward, maximising the good stuff and minimising the potential for bad. People are worried about things like lethal autonomous weapons systems (LAWS). The whole idea of AI and machine-powered robots and AI cyber agents effectively engaging in automated warfare is now not that fanciful.

Should machines have complete autonomy?

When you look at rules-based tasks, like driving, there's a decent amount of structure. So, eventually, we will get to a point where a machine operating a vehicle will probably be a lot safer than a human being. How much safer it has to be before society accepts that the machine drives rather than the human being, I'm not sure, but I think that's an achievable goal.

On, say, problems such as ascribing a criminal risk score to individuals, I think society will be less accepting. There are already systems that try to identify hotspot regions where crime might occur and make predictive analyses of criminal activity. However, a lot of that is potentially very dubious, because what you're doing is using historical data which, on its own, might contain a lot of bias. In fact, we know it contains bias - we know we have problems with bias in law enforcement and criminal investigations.

In that situation, the machine learning agents are likely to be more efficient at reinforcing that bias, resulting in a ‘faulty’ system. So, you now have a problem where past mistakes are being amplified, because the machine isn't equipped with a moral compass. It makes decisions based purely on the historic data.

We might say that an aspiration in society is to make sure that everybody is treated equally under the law. That's a value-driven rather than a data-driven goal, as the data tells us it's not true currently. The fact is, a machine that just acts on data isn't necessarily going to drive towards that goal.

Can you make data non-biased?

In some ways, that’s a job of revisionist history. You have to go back and undertake a process of data correction and data cleansing so that it's in line with our societal values now - that's one way you might be able to address bias in historical data.

The other way is to train AI and machine learning agents in a way that rewards movement towards the values we are looking to reinforce. How exactly you achieve bias elimination is an ongoing debate right now. How do we get to a stage where we can actually build these value systems into these algorithms?

Why not include every bias to make the outcome more ‘human’?

There is an argument that says if you get all the data in the world and you end up capturing all of these data feeds then you’re going to make much better (non-biased) decisions, which are more ‘human-like’. But I challenge that.

Both Amazon and IBM have recently made statements about their intention to bow out of facial recognition, particularly in the domain of police forces using their technology. The reason is that we end up with a data set which is biased towards picking one race over another, treating one as ‘suspicious’ and the other as not. So, if we just keep the original data set and use it in our justice system to make decisions with that kind of in-built bias, we will likely, as a society, not like the outcome.

We have to accept that history and data are already biased, but this can be negated to some extent by introducing a value statement. When you make a decision as a person, you will base it on all of the information that you've collected as a human being up to this point in time - which is your data. But you will also base that decision on value judgments.

There are some things you believe, regardless of whether you’ve experienced them or not. That's not historical, data-based knowledge that you're using; it's about where you're aiming in the future. So, we will need a situation where both of those aspects play together for us to have agents that we can trust and that we are confident will orientate their solutions in a direction that is in line with what we want for society.

Will AI help us to be better humans?

The technology is making us re-inspect who we are as humans, and that is one of the reasons why it seizes the public's imagination. I gave a talk a couple of years ago about the dangers and risks of AI and machine learning. One of the things I tried to point out was that, in fact, it's not really the machines that we're afraid of - it's the mistakes that we’ve made as humans being executed more efficiently by the machines.

Bias will always exist from humans, from our human history, from our actions, from the data that we produce. It's going to be very difficult for us to see a day when there isn't any bias. There will always be bias. It's possibly integral to what we are as human beings. That's the reason that we build societal norms and values. It's a way to make sure that there is some overall direction of travel in spite of all of the different biases we have.

How do you legislate an idea that’s constantly evolving?

We’re trying to govern 21st century ideas based on 19th century policy practices. Some people believe that we just need new laws, new legislation, new regulation. I think we need an entirely new process for the way that we generate laws, regulation and policy.

We need a tighter dialogue between industry, academia and government so there's a better shared understanding between all of those different communities about how each works.

I personally think we should see a smoother movement of individuals between all of those domains as well, throughout their careers, so you end up with inbuilt expertise of all those domains spread through different organisations. That would be a way to end up with a better policy-making space.

Where is machine learning taking us?

It's taking us in any number of different directions. Much of the popular AI and machine learning literature is focused on what we would call supervised learning, where human beings effectively mark the homework for a machine learning agent during its training phase, and then that agent is put out in the world.

Machine learning allows us to look at huge amounts of data, extract patterns and information from that data and potentially predict a little way into the future based on historical events. A lot of the actual implementation tasks for machine learning are around that kind of data workhorse capability.
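
To make that idea concrete, here is a minimal sketch of supervised learning in Python. It uses scikit-learn and made-up toy data purely for illustration - neither is referenced in the interview - but it shows the basic shape: a human supplies labelled examples, the model learns the pattern, and it is then asked about data it has not seen.

    # Minimal supervised learning sketch (illustrative only): a human supplies the
    # labels - "marking the homework" - and the trained model then predicts labels
    # for examples it has never seen.
    from sklearn.linear_model import LogisticRegression

    # Toy training data: each row is a set of measured features; each label is the
    # human-supplied answer the model should learn to reproduce.
    features = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]
    labels = [0, 1, 0, 1]

    model = LogisticRegression()
    model.fit(features, labels)           # training phase: learn patterns from the labelled data

    print(model.predict([[0.85, 0.75]]))  # deployment: label a new, unseen example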

What we are striving for, however - what originally motivated machine learning research - is the ability to develop artificial intelligences: agents that can effectively think and reason for themselves without human input.

That's a direction of travel that will continue for fundamental AI research. The application of machine learning is in its infancy. It’s just the beginning of its use to revolutionise materials design, medicine and health, finance and fintech systems, security and defence - it really will touch every part of our lives.

When is an elephant not an elephant?

Computers deal with binary numbers. It’s relatively easy for people to understand that any sum I write down can be converted into a binary sum and fed into a computer, and we can attack the problem that way. It becomes a little bit more challenging when people think of things like images.

When you take a picture or make a video - digitise an image - you are turning it into a series of numbers. A display will turn those numbers into colours that we perceive as an image when they finally hit our monitor or our TV screen. But within the computing system, it's just a big bunch of numbers that we can manipulate. So, we can transform an image into numbers, and numbers back into images, very easily.
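
As a small illustration of that point, the Python sketch below (the Pillow and NumPy libraries and the file name 'elephant.jpg' are assumptions made for the example) opens an image as an array of numbers, manipulates the numbers and writes them back out as an image.

    # Sketch only: a digitised image is just an array of numbers we can manipulate.
    import numpy as np
    from PIL import Image

    pixels = np.array(Image.open("elephant.jpg"))   # image -> array of numbers
    print(pixels.shape, pixels.dtype)               # e.g. (height, width, 3) values from 0 to 255

    darker = pixels // 2                            # manipulate the numbers...
    Image.fromarray(darker.astype(np.uint8)).save("darker_elephant.jpg")  # ...and back to an image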

So, say you have a picture of an elephant and you turn it into an array of numbers. There are statistical patterns inside that number array. Those patterns are what your eye perceives as the structure of the elephant - shape, ears, trunk, colour - and it's all present as mathematical, statistical patterns in the data.

The machine learning algorithm is then shown lots and lots and lots of pictures of elephants; after a while it starts understanding the salient numerical statistical features, the core features that relate one image to another - the essence of it being an elephant. That's essentially the learning process used, so if you show the agent a new image that it hasn’t seen before, it’s able to determine if it’s an elephant or not.

The more data you give the machine, the more likely it is to correctly identify elephants. However, if you train it exclusively on pictures of adult elephants, then show it a baby elephant, it is likely to say that’s not an elephant: a problem known as overfitting. So, the way in which you feed the data in and train a machine learning agent is very important to how it performs.
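
A minimal sketch of that pitfall, with made-up numbers standing in for real elephant photographs (the two features and the scikit-learn classifier are illustrative assumptions, not anything described in the interview): a model trained only on large, adult-sized examples can misjudge a smaller example drawn from outside its training range.

    # Illustrative sketch: the training data contains only adult-sized elephants,
    # so a smaller, baby-sized example falls outside what the model has learned.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)

    # Hypothetical features, e.g. body size and ear size, scaled 0 to 1.
    adult_elephants = rng.uniform(0.7, 1.0, size=(50, 2))   # large animals only
    not_elephants = rng.uniform(0.0, 0.3, size=(50, 2))

    X = np.vstack([adult_elephants, not_elephants])
    y = np.array([1] * 50 + [0] * 50)                       # 1 = elephant, 0 = not

    model = KNeighborsClassifier(n_neighbors=3).fit(X, y)

    baby_elephant = [[0.35, 0.4]]                            # small - outside the training range
    print(model.predict(baby_elephant))                      # may well answer 'not an elephant'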

About the author

Prof Deeph Chana is currently Co-Director of the Institute for Security Science and Technology and the Centre for Financial Technology at Imperial College London. He has extensive experience of leading cutting-edge science and technology research and development in academia, industry and government, focusing on global risks and security.

He has published on the use of advanced machine learning methods for defensive and offensive cyber security, developed detection systems for national security applications and has consulted globally with multinationals, start-ups and governments on emerging and disruptive technologies.

Having previously worked for the UK Government he has advised four Secretaries of State and established numerous national and international science policies working with authorities around the world. Deeph holds MSci and PhD degrees in Physics from King’s College London.