A threat is looming. In cybersecurity, there is always a threat looming, writes Tim Clark, software engineer, cyber consultant and defender against the dark arts.

An Advanced Persistent Threat has successfully installed malware on one of the development servers in your network. Maybe one of your engineers clicked on a phishing link? Maybe the attackers got in through a vulnerability in your firewall? Maybe an insider snuck in a USB stick loaded with the malware?

That doesn’t matter now. All you can think of is your intellectual property. All of the code you have invested thousands of hours and millions of pounds into is on those servers. You scramble to put together a team to investigate. Meanwhile, the attackers start looking through all that valuable code on the server.

You desperately try to identify the compromised machine and shut it down. You struggle to find it. Should you just pull every plug now? The disruption would cost a fortune, effectively leaving all 135 of your developers unable to work. Meanwhile, the attacker silently disappears back into the internet. They have achieved their objective.

Man versus machine

There are inherent limitations when it comes to securing a network using a human security team. People are expensive. Salaries are almost certainly the largest chunk of your budget, because you pay more for skilled people who know the current state of the threat landscape and can adapt as it shifts.

They must also sleep, take holidays and sometimes fall ill. Round-the-clock monitoring is key to ensuring you are protected from attackers who are never off the scene, but achieving it with a human team is prohibitively expensive for many organisations. And there is still the risk of something being overlooked, or of an insider threat going undetected.

In the world of security, defenders are at a distinct disadvantage. In our new world, we face an avalanche of increasingly sophisticated threats. The devices on our corporate networks are increasingly heterogeneous and may not even be entirely managed by us.

In the face of all that, we have to be secure all of the time: from John in accounting, who needs to avoid clicking on that funny-looking link, to Sara in development, who has to guard against SQL injection vulnerabilities in her code. The threat is persistent and pernicious.

Your attacker only has to be lucky once

Often, attackers can be inside a system without your knowledge for months. In 2020, a supply chain attack on SolarWinds Orion (dubbed ‘Sunburst’) affected at least 200 organisations worldwide. Most notably, the attackers had access to US federal government systems for eight to nine months.

While the idea that an attack could go unnoticed is horrifying, there are steps that can be taken to mitigate the risk. In the world of cybersecurity, a new concept is emerging which aims to support organisations in their fight against existing and emerging threats: defence through machine learning.

Machine learning (ML) is already revolutionising many industries, and is becoming more prevalent in cybersecurity. The key question we hope to answer is ‘why?’

Why should you use ML-based solutions in your security management? And why should you choose them instead of or alongside more traditional solutions?

AI is faster than humans

As you type, each keystroke is transferred over a wired or wireless connection, decoded and mapped to a specific letter, all within milliseconds. Computers are astonishingly fast, and can make decisions in a matter of microseconds.

Humans find traversing and analysing large data sets laborious, and sometimes impossible. We are great at being creative and solving problems, but computers are far better at maths. This is useful in cybersecurity, because we can hand the tedium of searching our log files and network traffic over to ML.

Then, once an anomaly is detected in the data, we can hand it back to a person who can investigate it further and determine what actions need to be taken. This idea is called anomaly detection, and was originally proposed for application to Intrusion Detection Systems (IDS) in 1986 by Dorothy Denning, an American security researcher.
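
To make the idea concrete, here is a minimal Python sketch of that division of labour (a toy illustration, not Denning’s original model): build a per-user baseline from historical login hours, then flag logins that fall far outside it. The log data, user names and three-sigma threshold are all assumptions for illustration.

```python
from statistics import mean, stdev

# Toy history: hour-of-day of each user's past logins. In a real
# system this would be derived from authentication logs. (Hour-of-day
# is cyclic; a production model would handle the midnight wrap-around.)
history = {
    "john.accounting": [8, 9, 9, 10, 8, 9, 9, 9, 8, 10],
    "sara.dev":        [10, 11, 9, 10, 11, 10, 11, 9, 10, 11],
}

def is_anomalous(user: str, login_hour: int, threshold: float = 3.0) -> bool:
    """Flag a login whose hour lies more than `threshold` standard
    deviations from the user's historical mean."""
    hours = history.get(user)
    if not hours or len(hours) < 2:
        return False  # no baseline yet: the cold-start problem discussed later
    mu, sigma = mean(hours), stdev(hours)
    if sigma == 0:
        return login_hour != mu
    return abs(login_hour - mu) / sigma > threshold

# A 3 a.m. login on a nine-to-five account stands out;
# hand it to an analyst for investigation.
print(is_anomalous("john.accounting", 3))  # True
print(is_anomalous("john.accounting", 9))  # False
```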

Instant detection, instant response

In more modern applications of ML to cybersecurity, the computer itself can make decisions, providing an instant response to anomalies.

For example, if we detect that credentials belonging to an employee based in a London office are suddenly being used from a residential IP address in Kolkata, something fishy is probably happening. In response to this anomaly, we could automatically shut down the connection and block the account before the attacker tries to escalate privileges.

This, of course, could be done by a person looking at graphs and log files, but in a large organisation (or a small one with a large IT inventory), you’re going to need a lot of people. The key thing to note here is that we aren’t using the traditional approach of defining and detecting ‘misuse’; instead, we’re constantly analysing data to establish what can be considered ‘normal’ and then detecting things that differ significantly from it.
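
As a sketch of how such an automated response might look, the Python fragment below flags and terminates a session when a login arrives from a country outside the user’s usual set. The functions `geolocate_ip` and `terminate_session` are hypothetical stand-ins for whatever your SIEM or identity provider actually exposes.

```python
# Hypothetical auto-response to the London/Kolkata scenario above.
# `geolocate_ip` and `terminate_session` are placeholders, not any
# real product's API.

KNOWN_LOCATIONS = {"alice@example.co.uk": {"GB"}}

def geolocate_ip(ip: str) -> str:
    """Placeholder: look the address up in a GeoIP database."""
    return "IN" if ip.startswith("103.") else "GB"  # toy rule for the demo

def terminate_session(session_id: str) -> None:
    """Placeholder: revoke the session token via your identity provider."""
    print(f"session {session_id} terminated")

def on_login(user: str, ip: str, session_id: str) -> None:
    country = geolocate_ip(ip)
    usual = KNOWN_LOCATIONS.get(user, set())
    if usual and country not in usual:
        # Block first, review second: a false positive costs one
        # re-login; a false negative could cost the whole network.
        terminate_session(session_id)
        print(f"ALERT: {user} logged in from {country}, expected {usual}")

on_login("alice@example.co.uk", "103.27.8.1", "sess-42")
```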

This is one of the great advantages of applying ML techniques to cybersecurity.

Adapting to unknown threats and new environments

ML is designed to adapt. That’s the great thing about it, and why its use is becoming so widespread: from learning about user activity to tailor content (think Netflix and YouTube), to identifying plant species and performing speech recognition.

Organisations are having to secure an increasingly large and diverse IT estate. With the rise of remote and hybrid work, the attack surface is wider than ever. ML can be used to monitor this wide range of devices and contain any threats that arise. Even simply alerting on threats as they appear provides much-needed visibility, allowing smaller teams to act more quickly. ML can adapt, learn the ‘normal’ patterns of the devices on your network and quickly identify anything out of the ordinary.

This technology can also be applied to our cloud solutions. Whether it be software, infrastructure or even a complete desktop, many things are increasingly provided ‘-as-a-Service’. Protecting cloud accounts means monitoring for unusual behaviour from trusted credentials, to detect possible threats or intrusions. If someone in payroll starts spinning up virtual machines, some credentials have probably been compromised.
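
A minimal sketch of that idea, assuming a simplified audit-event format and illustrative per-role baselines (no particular cloud provider’s API):

```python
# Role-based baselining for cloud audit events. The action names and
# role baselines are illustrative assumptions, not a real provider's API.

ROLE_BASELINE = {
    "payroll": {"ReadReport", "ExportPayslips"},
    "devops":  {"ReadReport", "RunInstances", "CreateVolume"},
}

def check_event(role: str, action: str) -> None:
    allowed = ROLE_BASELINE.get(role, set())
    if action not in allowed:
        print(f"ALERT: role '{role}' performed unusual action '{action}'")

# Someone in payroll spinning up virtual machines is exactly the
# red flag described above.
check_event("payroll", "RunInstances")  # fires an alert
check_event("devops", "RunInstances")   # normal for this role
```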

We don’t know our unknowns. Often, advanced threats hide on a network for a long time, slowly putting the pieces together to stage an attack. Attacks are inevitable in the modern world; the key is to detect intrusions and malicious behaviour as soon as possible and respond. Can you imagine trying to search through every hard drive, every packet, every office to find something out of place? Instead, we can rely on our algorithms to do this for us and detect strange behaviour.

Zero trust out of the box

Unlike people, ML does not inherently trust. Attacks caused by insider threats are becoming more common. Whether it be a scorned employee, an ex-employee who still has access to important systems, or an employee who is wittingly or unwittingly taken advantage of by an external attacker, insider threats can cause huge damage.

The key factor here is trust. We trust our employees to handle sensitive information responsibly, keep passwords secure and not abuse their access to internal systems. When that trust is abused, the damage can be huge. In 2014, the Lotería de Puerto Rico was infiltrated with help from an insider who had access to the servers, and millions of dollars were lost as prizes paid out outstripped ticket income. The scheme was orchestrated by a drug and gun-running cartel that was using the lottery to launder money.

Sometimes insiders abuse their access for personal gain. In February 2022, a Sussex Police officer was caught using the national police database to search for a woman he wanted to date. You’ve doubtless also heard of people installing cryptomining software on corporate machines to make money from company infrastructure (which, of course, also raises the issue of unauthorised software running on company machines).

The solution is zero trust: we ‘never trust, always verify’, and in this sense our ML solutions search for threats everywhere. They don’t assume that a senior engineer is more trustworthy because they’ve been with the company for seven years; they assume nothing. A threat can come from anywhere, so you want your cybersecurity solution to always be on the lookout for malicious activity, regardless of how well-behaved a device or account has been in the past.

ML can also be used to detect when too many permissions have been allocated to a user. If a user only uses certain functionality of a piece of software, their permissions could be limited, automatically or after human review, so that they can access only what they require. It would also be possible to detect inactive accounts and automatically lock them down to prevent misuse. This is vital in reducing the impact of compromised credentials.
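
One way this right-sizing could work in practice is sketched below; the permission names, the 90-day window and the `bob` account are assumptions for illustration.

```python
from datetime import date, timedelta

# Illustrative data: permissions granted versus permissions actually
# exercised in the last 90 days, plus a last-login record.
granted = {"bob": {"read_repo", "merge_pr", "deploy_prod", "admin_panel"}}
used_last_90_days = {"bob": {"read_repo", "merge_pr"}}
last_login = {"bob": date.today() - timedelta(days=200)}

def unused_permissions(user: str) -> set:
    """Permissions held but never exercised: candidates for
    revocation after human review."""
    return granted.get(user, set()) - used_last_90_days.get(user, set())

def is_dormant(user: str, days: int = 90) -> bool:
    """An account nobody has used for `days` days is a liability."""
    return (date.today() - last_login[user]).days > days

print(unused_permissions("bob"))  # {'deploy_prod', 'admin_panel'}
if is_dormant("bob"):
    print("bob is dormant: lock the account pending review")
```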

One of the simplest vectors for compromising a network is email. Through a malicious payload or social engineering, a trusted user can unwittingly have their device or credentials compromised, allowing an attacker to misuse them. This is all too common, and monitoring account activity for misuse is key to identifying and stopping these threats before they penetrate deeper into your infrastructure.

Its learning gets better over time

The most significant advantage of ML is in the name: ML algorithms improve over time as they observe and monitor your infrastructure. Instead of simply trying to block attacks, ML can detect and respond to them. Instead of assuming that malware on our systems will match a particular signature, ML can detect malicious processes and block any extraordinary network traffic coming from them. And, as we know, insider threats are also a possibility; ML can detect that abnormal behaviour too.

Past data can improve the accuracy of future predictions and help determine what normal behaviour looks like. A notable disadvantage of using ML like this is that onboarding new devices or new employees can be difficult: with no history of activity, it is hard to determine what ‘normal’ is.

There are then two possible failure modes: a false positive, where benign activity is mistakenly flagged as malicious, or a false negative, where malicious activity goes undetected. For obvious reasons, most systems will tend toward the former. This can be mitigated by creating ‘profiles’ of different types of user: a new hire in human resources will probably have a very different pattern of activity from someone in development.
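
A sketch of that profile-based mitigation, with illustrative role profiles and an assumed threshold for when an account has accrued enough history of its own:

```python
# Until a new account accrues enough history, score it against a
# role profile instead of its own (empty) baseline. The profiles and
# the event threshold are illustrative assumptions.

ROLE_PROFILES = {
    "human_resources": {"hris_app", "email", "intranet"},
    "development":     {"git", "ci", "email", "ssh"},
}

MIN_HISTORY = 200  # events before an account earns a personal baseline

def expected_services(user_history: list, role: str) -> set:
    if len(user_history) >= MIN_HISTORY:
        return set(user_history)           # learned, per-user baseline
    return ROLE_PROFILES.get(role, set())  # fall back to the role profile

new_hire_events = ["email", "intranet"]    # two days into the job
baseline = expected_services(new_hire_events, "human_resources")
print("ssh" in baseline)  # False: SSH use by a brand-new HR hire gets flagged
```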

Another emerging threat to consider is attacks perpetrated using ML, either on its own or in conjunction with a human adversary. The machine could probe the network for open ports, identify the applications running on them, then cross-match these against reported vulnerabilities so that the adversary (or the ML itself) can attempt to exploit them. This type of attack happens at machine speed, so it requires a machine-speed response; ML can be used to adapt to these threats so that your network survives.
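
To see why machine speed matters, consider how little code such a probe needs. The Python sketch below connect-scans localhost only and matches service banners against a toy lookup table; the banner string and its ‘known issue’ entry are purely hypothetical.

```python
import socket

# Purely hypothetical banner-to-issue table; real tooling would query
# a vulnerability database.
KNOWN_ISSUES = {"OldFTPd/1.0": "hypothetical known vulnerability"}

def probe(host: str, port: int, timeout: float = 0.5):
    """Connect-scan one port and try to grab a service banner."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        if s.connect_ex((host, port)) != 0:
            return None  # closed or filtered
        try:
            banner = s.recv(128).decode(errors="replace").strip()
        except OSError:
            banner = ""  # many services send nothing unsolicited
        return port, banner, KNOWN_ISSUES.get(banner)

# Scan only your own machine; a few lines and a loop are all an
# automated adversary needs to sweep thousands of hosts.
for port in (21, 22, 80):
    result = probe("127.0.0.1", port)
    if result:
        print(result)
```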

ML will continue to revolutionise cybersecurity, just as it has changed so many other areas of IT. Remember, though, that there is no such thing as a silver bullet, and ML should be paired with more traditional security solutions. With a holistic mindset, and with your practices and policies kept up to date, you have every chance of building a strong cybersecurity defence.

About the author

Timothy Clark is a full-stack software engineer who also works as a cybersecurity consultant, defending against the dark arts. Clark is currently chair of the BCS Preston & District branch and sits on the Early Career executive. He is also a journeyman of the Worshipful Company of Information Technologists.