Meet the augmented security analyst

Martin Borrett FBCS introduces Martin Cooper AMBCS to IBM Watson for cyber security and explains how augmented intelligence is helping cyber security experts to operate more effectively on the front line.

A cyber security analyst. It sounds like a glamorous job: one where there’s never a dull moment, one where you’re always learning something new, one where there’s a chance to really make a difference and, if a cursory skim through a jobs website is to be believed, one that pays well too.

Martin Borrett FBCS, Technical Executive and CTO for IBM’s European security business, doesn’t disagree. But, into this broad-stroke portrait of a security analyst, Borrett adds another word: pressure.

‘We’re under tremendous pressure in the cyber security field,’ he explains. ‘We’re struggling with skills - organisations are finding it hard to find people with the right expertise.

‘And we’re struggling with speed of response. When you’re under attack, there is that real urgency - to borrow a medical analogy - to triage a problem when it arrives. We need that immediate assessment of criticality.’

Along with the pressured need to respond quickly, analysts are also drowning in data. Firstly, the network they’re guarding produces a huge amount of information about its health and wellbeing. Back in 2016, IBM research showed that ‘the average organisation sees over 200,000 pieces of security event data per day, with enterprises spending $1.3 million a year dealing with false positives alone, wasting nearly 21,000 hours.’

At the time of writing, there were just short of 100,000 common vulnerabilities and exposures (or CVEs) in the US Department of Commerce’s National Vulnerability Database.

To make sense of all this data, analysts need to feed themselves with a broad diet of new knowledge. This comes in the form of the latest research papers, blogs and posts that blaze across Twitter.

Summarising the situation, back in 2016, Marc van Zadelhoff, General Manager, IBM Security, said: ‘The volume and velocity of data in security is one of our greatest challenges in dealing with cybercrime.’

‘Working in a security operations centre is a pressured job and it’s a relentless job,’ Borrett says. Burnout isn’t uncommon.

A smarter solution

It’s against this backdrop that, in May 2016, IBM launched Watson for cyber security. The aim, Borrett says, was, and still is, to use Watson’s ability to bring context to huge amounts of unstructured data - amounts of data that humans just can’t process. Through this, Watson provides analysts with insights, knowledge and recommendations. The net result should be faster decision-making and a reduction in pressure.

Watson, in many ways, requires little or no introduction. It entered the popular consciousness back in 2011, when it won the American quiz show Jeopardy!

‘Watson is IBM’s platform that provides cognitive capabilities,’ explains Borrett. Today there isn’t just one Watson. There are many.

‘It is a platform that is trained to operate within a particular industry or sectoral setting and one that provides developers with APIs so they can write enabling applications.’ Most recently, ITNOW explored how Watson is being employed in a health setting.

The theory runs that Watson ingests medical knowledge - journals, papers and studies. Equipped with this, clinicians can provide the system with a patient’s notes and receive Watson’s recommendations about a probable diagnosis.

‘It’s an example of where there’s a huge corpus of knowledge around a subject, a body of knowledge that Watson is taught about and against which you can ask questions,’ Borrett explains.

The probable culprit

Watson isn’t designed to be a black-and-white arbiter though. It doesn’t give definitive answers. Rather, it’s designed to provide human experts with different hypotheses and to give each a probability score. Within a clinical setting Watson Healthcare is seen very much as part of an expert care team - a group of human doctors and experts.

Cognitive computing is pitched as a tool to help, and not one that’s been designed to make expert humans redundant.

‘It provides a confidence level,’ explains Borrett. ‘It does its analysis across its corpus of knowledge... it’s able to understand the information, to understand the question and to understand context and it’s able to do that in natural language. It comes up with a reasoned hypothesis backed with evidence.’
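The idea of returning several scored hypotheses, rather than one verdict, can be sketched in a few lines of Python. This is only an illustration and not how Watson works: the `Hypothesis` type, the evidence weights and the scoring rule (each hypothesis’s share of the total evidence weight) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """A candidate explanation plus its supporting evidence (hypothetical type)."""
    name: str
    evidence: dict[str, float] = field(default_factory=dict)  # evidence item -> weight


def rank_hypotheses(hypotheses: list[Hypothesis]) -> list[tuple[str, float]]:
    """Return (name, confidence) pairs, highest confidence first.

    Confidence is each hypothesis's share of the total evidence weight.
    This scoring rule is an assumption for illustration only.
    """
    totals = {h.name: sum(h.evidence.values()) for h in hypotheses}
    grand_total = sum(totals.values())
    if grand_total == 0:
        return [(name, 0.0) for name in totals]
    ranked = [(name, weight / grand_total) for name, weight in totals.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Invented example data: three explanations for the same set of indicators.
candidates = [
    Hypothesis("known ransomware family", {"vendor advisory": 3.0, "matching file hash": 3.0}),
    Hypothesis("commodity phishing kit", {"blog post": 1.0, "similar sender domain": 2.0}),
    Hypothesis("benign admin activity", {"change ticket": 1.0}),
]

ranking = rank_hypotheses(candidates)
for name, confidence in ranking:
    print(f"{name}: {confidence:.0%}")
```

The analyst still makes the call; the ranking simply shows which explanation the available evidence favours and by how much.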

Borrett feels that, on many levels, there are parallels between health care and cyber security. ‘Similarly,’ he says, ‘there is a tremendous amount of knowledge about cyber security... about past attacks. We know a lot about (criminal) organisations, and about the type of malware they use. We know a lot about their mode of behaviour and how they operate.

‘All this is well documented by experts, by the industry, by government, and by academics. There’s a wealth of knowledge.’ And again, like the medical world, this library of critical information is so vast, and grows so quickly, it’s impossible for one person to consume it, to retain it and use it.

‘As an analyst responding to an incident in my organisation, when I see indicators, I start forming my own theory about how serious it is. Now I can turn to Watson and say: “I see these indicators, have you seen them before, what do you think? Has this affected another organisation too?”’ The bad guys, he says, do re-use code, exploits and attack vectors. What often prevents the good guys from surviving a re-used attack is that they just don’t know how their counterparts reacted and how they solved the problem.

How to feed Watson

The quality of Watson’s cyber advice is, of course, dictated by the quality and the trustworthiness of the information that is provided to it. IBM works hard to ensure that it provides the system with qualified, reliable information. Firstly, Watson is fed with unstructured information: natural language, written information from human-approved sources.

Wise to the fact that not all information is equal, IBM limits what Watson can ingest to certified government repositories, academic and respected industry sources. This process is managed carefully by humans.

Secondly, Watson draws data from IBM X-Force Exchange - a real-time threat intelligence database and sharing platform. It aggregates and broadcasts information about threats and actions.

‘IBM X-Force Exchange holds around 800 terabytes of threat intelligence data,’ Borrett says. ‘It’s information that we have gathered over the last 20 years and it’s being continually updated. For example, as WannaCry emerged during the summer, our threat intelligence - work done by our analysts and work done with other organisations - was being updated constantly.’

As Watson ingests the information, it joins the dots, makes relationships, sees patterns and generates a knowledge graph. These graphs can have billions of nodes. This means, to a degree, that one bad apple won’t ruin Watson’s barrel of knowledge. ‘It can learn,’ Borrett says, ‘which bits of information are most valuable. Not all information is used frequently.’
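A toy version of such a graph helps to show why a single unreliable source need not poison the whole. The sketch below is an assumption-laden illustration, not IBM’s implementation: the statements, the node names and the rule of trusting only links reported by at least two sources are all invented for the example.

```python
from collections import defaultdict

# (source, subject, relation, object) statements; every value here is invented.
statements = [
    ("advisory-A", "malware-X", "exploits", "vuln-1"),
    ("blog-B", "malware-X", "exploits", "vuln-1"),
    ("report-C", "group-Y", "uses", "malware-X"),
    ("forum-D", "malware-X", "exploits", "vuln-9"),  # asserted by one source only
]


def build_graph(statements):
    """Map each (subject, relation, object) edge to the set of sources asserting it."""
    edges = defaultdict(set)
    for source, subject, relation, obj in statements:
        edges[(subject, relation, obj)].add(source)
    return edges


def corroborated(edges, min_sources=2):
    """Keep only edges asserted by at least min_sources distinct sources."""
    return {edge for edge, sources in edges.items() if len(sources) >= min_sources}


def neighbours(edges, node):
    """Return every node directly linked to `node`, in either direction."""
    linked = set()
    for subject, _relation, obj in edges:
        if subject == node:
            linked.add(obj)
        elif obj == node:
            linked.add(subject)
    return linked


graph = build_graph(statements)
print(sorted(neighbours(graph, "malware-X")))
print(corroborated(graph))
```

Here the link reported only by `forum-D` stays in the graph but is filtered out when corroboration is required, which is one simple way a lone bad source can be kept from skewing the picture.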

But what about zero-day attacks: threats that are completely new and previously unseen? ‘There are always new things emerging... but, we can learn rapidly about them and we can share what we’ve learned,’ Borrett says. ‘This is where collaboration is so critical. Because of the way we opened up IBM X-Force Exchange to collaboration, we see people around the world - not just consuming information but sharing that information. And, through that process, we all learn faster.’

When it comes to zero-day attacks then, the answer seems to be: don’t try and predict the future, try and solve the problems in the present. ‘We need to close the window for that type of attack more quickly.’

Borrett is, however, keen not to oversell Watson as a cyber security silver bullet. Cognitive computing is a journey, he says, and it’s one we’re just beginning. ‘There is something tangible here though,’ he concludes. ‘With our early adopters, investigation times have gone down from 60 minutes to a minute. It’ll get faster and it’ll learn more... but it only works in the context of a bigger immune system.’

Whatever Watson’s future in cyber, we’ll still need a set of evolving, multifaceted and carefully orchestrated cyber defences.

Are the bad guys using AI?

It’s a common narrative: cyber security is an arms race. So, should we worry about the bad guys using AI too? Martin Borrett says it’s highly likely. ‘What we see’, he says, ‘is that the bad guys are collaborating - across the dark web they’re sharing tools and techniques.

‘And it would be naive to think that they’re not exploring machine learning, that they don’t have access to data scientists and they’re not exploring these boundaries. We’d be naive to think that they’re not exploring AI in some shape or form; it’s inevitable.’