AI in the cloud

July 2018

David Bird FBCS explains how progressive machine-learning capabilities present new challenges for privacy and protective measures.

In 2017, a cloud-based artificial intelligence (AI) strategy was perceived to be important enough for Microsoft to spawn a new cloud AI platform organisation and to introduce a new service for training deep neural networks on the Azure platform.

At present, the recognised dominant cloud AI players are Amazon Web Services (AWS), Azure and Google [1]. A recent prediction indicates that public cloud AI services may become the predominant machine intelligence model compared with traditional datacentre approaches [2].

Cloud AI services

AWS and Microsoft can certainly entice steady demand from enterprise users by giving customers the ability to test their machine-learning algorithms on their platforms; both are also making strides in cognitive services. IBM is in the AI market alongside Google and, unquestionably, Google is the technology leader to date; however, both have yet to achieve a customer cloud adoption rate similar to that of Amazon and Microsoft.

This might start to change, in IBM’s case, with the opening up of IBM’s Watson Application Programming Interfaces (APIs), which are useful to niche industries like healthcare. In the case of Microsoft and Google, Field-Programmable Gate Arrays (FPGAs) and non-programmable application-specific integrated circuits, respectively, are used under the hood. AWS offers a Graphics Processing Unit Elastic Compute Cloud (EC2) option and even FPGAs, with a number of code libraries that enable customers to produce machine-learning algorithms and APIs.

Collaborative ventures are also occurring between AWS and Microsoft’s AI and Research Group, generating the Gluon open-source deep learning API; this will enable developers to prototype, build, train and deploy advanced machine-learning models in the cloud [3]. Google’s TensorFlow library can be used cross-platform for neural network-centred machine-learning applications. Google’s DeepMind is an example of an AI system that employs deep reinforcement learning [4] and neural reasoning with extended memory to lock away data nuggets for recall later.
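To illustrate what these frameworks automate, the following sketch trains a single artificial neuron by gradient descent in plain Python. It is not TensorFlow or Gluon sample code; those libraries handle the differentiation and scale the same principle up to deep networks with many layers.

```python
import math

# Toy training set: logical AND, the classic single-neuron example.
DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(epochs=5000, lr=0.5):
    """Fit one neuron (two weights plus a bias) by gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in DATA:
            p = sigmoid(w[0] * x1 + w[1] * x2 + b)
            err = p - y  # gradient of the log-loss w.r.t. the pre-activation
            w[0] -= lr * err * x1
            w[1] -= lr * err * x2
            b -= lr * err
    return w, b

def predict(w, b, x):
    return 1 if sigmoid(w[0] * x[0] + w[1] * x[1] + b) > 0.5 else 0

w, b = train()
print([predict(w, b, x) for x, _ in DATA])  # learned AND: [0, 0, 0, 1]
```

Deep learning frameworks replace the hand-written gradient lines above with automatic differentiation, and run the arithmetic on GPUs, FPGAs or ASICs rather than a CPU loop.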

Socio-technological issues

In addition to the release of NHS patient data to Google’s DeepMind - allegedly without patient consent [5] - AI provides opportunities, being pursued by Facebook and Twitter, to target social media users more effectively by analysing patterns in threads. This raises privacy questions over the number of aggregated datasets being accumulated under the ‘big data’ umbrella, and over the potential for data bias [6]. Fail-safes need to be introduced for machine-generated data and machine-interpreted knowledge, so that misinterpreted ‘big data’ cannot cause unexpected decisions or malicious outcomes from analytics [7].

The dangers of bias were exemplified by Microsoft’s Tay, an AI-powered chatbot that, unfortunately, learned predominantly negative interactions and misinformation from Twitter exchanges with humans. In essence, governance and safeguarding controls need to be applied to the AI sphere because of the massive datasets involved. Reassuringly, DeepMind has a form of inbuilt collaborative governance.
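One form such a safeguarding control can take is screening crowd-sourced input before a model is allowed to learn from it. The sketch below is a toy illustration only - the blocklist and the scoring are hypothetical, and Microsoft's actual safeguards are not public - but it shows the principle of gating training data:

```python
# Toy safeguard: screen crowd-sourced messages before a chatbot retrains
# on them. The blocklist is hypothetical and purely illustrative.
BLOCKLIST = {"hate", "slur", "hoax"}

def is_safe(message: str) -> bool:
    """Reject any message containing a blocklisted term."""
    words = {w.strip(".,!?").lower() for w in message.split()}
    return not (words & BLOCKLIST)

def filter_training_batch(messages):
    """Keep only messages that pass the safety screen."""
    return [m for m in messages if is_safe(m)]

batch = ["The weather is nice", "spreading a hoax about vaccines"]
print(filter_training_batch(batch))  # only the first message survives
```

Production systems use learned classifiers and human review rather than keyword lists, but the gating point - between raw public input and the learning loop - is the same.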

The first AI renaissance occurred in the 1990s; the timing of this second one coincides with the European General Data Protection Regulation (GDPR) 2016/679 coming into force. Both cloud service providers (CSPs) and customers have obligations under GDPR. One of its key pillars is ‘consent’: individuals must opt in, which in principle gives data subjects more control. As a consequence, breaches of GDPR could result in organisations being fined up to four per cent of group worldwide turnover.
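The opt-in principle translates directly into system design: processing must be gated on recorded consent, and the absence of a record must mean no processing. A minimal sketch, with hypothetical subject identifiers:

```python
# Toy consent register: process a data subject's record only if opt-in
# consent has been recorded. All identifiers here are hypothetical.
CONSENT_REGISTER = {
    "alice@example.com": True,   # opted in
    "bob@example.com": False,    # explicitly declined
}

def may_process(subject_id: str) -> bool:
    """GDPR-style opt-in: no recorded consent means no processing."""
    return CONSENT_REGISTER.get(subject_id, False)

records = ["alice@example.com", "bob@example.com", "carol@example.com"]
print([r for r in records if may_process(r)])  # ['alice@example.com']
```

Note the default: carol has no entry at all, so she is excluded - under an opt-in regime, silence is not consent.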

CSPs in the UK have been reinforcing the ‘shared security responsibility’ aspect of cloud security whilst providing support to help customers achieve GDPR compliance; they have also been reiterating customers’ cloud responsibilities for accountability. International frameworks, such as the EU-US Privacy Shield, provide attestations from overseas organisations to protect EU Personally Identifiable Information (PII); undoubtedly this will be relevant for both cloud and cloud AI paradigms.

Meanwhile, lessons learned from non-AI cloud incidents warn that cybersecurity needs to be taken seriously in the cloud. The 2014 One More Cloud breach showed that flawed APIs and compromised keys can be the root cause of an attack. In 2015, a developer accidentally published his Simple Storage Service (S3) API keys in his GitHub account for just five minutes - enough time for a mining bot to seize them and for hackers to spin up EC2 servers overnight to mine Bitcoin at his cost.
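The key-leak incident illustrates why secrets should be scanned for before code is published. AWS access key IDs follow a well-known pattern (the prefix AKIA followed by 16 upper-case letters or digits), so even a crude pre-publish check can catch them. This is a sketch, not a substitute for a dedicated secret scanner:

```python
import re

# AWS access key IDs follow a documented pattern: 'AKIA' + 16 chars [A-Z0-9].
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

def find_leaked_keys(source_text: str):
    """Return any substrings that look like AWS access key IDs."""
    return AWS_KEY_RE.findall(source_text)

# AWS's own documented example key, as it might appear in committed code.
snippet = 'aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"  # oops, committed!'
print(find_leaked_keys(snippet))  # ['AKIAIOSFODNN7EXAMPLE']
```

Mining bots run exactly this kind of pattern match against public repositories at scale, which is why a five-minute exposure was enough.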

The recent newsworthy TIO Networks breach proved that vulnerabilities in cloud platforms present an attack surface which, in PayPal’s case, might have enabled unauthorised access to potentially 1.6 million PII records.

Uses: both darkness and light

As the cybersecurity arms race continues, threat actors are also expected to use AI to seek out weaknesses and vulnerabilities across exposed online IT product lines. In 2016, it was predicted that AI will be used by cybercriminals, whether via social engineering through chatbots or by employing their own AI systems as cybercrime actors [8].

However, the concept of machine intelligence for cyber defence has been heralded for some years now. It can be used to recognise normal patterns of network traffic in order to detect the abnormal [9]. Today such capabilities are conjoined and enhanced for use on cloud platforms in various contexts:

  • Specialist vendors like FireEye offer cloud AI as a back-end, securely connected to front-end sensors, enabling them to detect, alert and monitor in real time - countering progressive asymmetric dangers such as advanced persistent threats.
  • Cyber defensive measures such as Microsoft’s threat-intelligence machine learning are available to protect Windows 10 clients and Azure public cloud customers when agents are deployed into customer virtual instances; this provides a defensive ecosystem with visibility of the threat horizon. AI has the potential to go further and perform predictive analytics [10].
  • Google has embraced AI in its Gmail service as a mechanism to protect users by searching for and blocking emails laden with malware or phishing campaigns [11].
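The ‘recognise normal to detect the abnormal’ idea behind these products can be sketched with a simple statistical baseline. Commercial tools use far richer models, and the traffic figures below are invented, but the principle is the same: learn a profile of normal behaviour, then flag large deviations from it.

```python
import statistics

def build_baseline(byte_counts):
    """Learn the 'normal' profile: mean and spread of per-minute traffic."""
    return statistics.mean(byte_counts), statistics.stdev(byte_counts)

def is_anomalous(value, mean, stdev, threshold=3.0):
    """Flag traffic more than `threshold` standard deviations from normal."""
    return abs(value - mean) > threshold * stdev

# Hypothetical per-minute byte counts observed on a quiet internal host.
normal_traffic = [980, 1010, 995, 1005, 990, 1000, 1015, 985]
mean, stdev = build_baseline(normal_traffic)

print(is_anomalous(1002, mean, stdev))    # an ordinary minute: False
print(is_anomalous(50_000, mean, stdev))  # exfiltration-sized burst: True
```

Cloud-hosted versions of this idea replace the mean-and-deviation baseline with learned models over many features, but the detection decision still reduces to ‘how far is this from normal?’.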

Although in Google’s case this could be viewed cynically, since the company had previously scanned personal emails to target users with advertisements, it is clear that machine intelligence in the cloud brings distinct advantages when it comes to crunching large datasets.

Adoption of AI

The internet of things (IoT) brings popular AI-powered engines for speech recognition such as Apple’s Siri, Microsoft’s Cortana, Google’s Cloud Speech API and Amazon’s Alexa; these are examples of cloud capabilities [12] that perform speech recognition sourced from iOS, Windows 10, Google Home and Amazon Echo virtual assistant [13] front-ends. In this age of IoT, edge systems can be the conduit that sends insights to cloud back-ends - large datasets can be processed, filtered, searched or mined using AI technologies.
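The edge-to-cloud pattern can be sketched as follows: the device filters and summarises raw readings locally, and only the insight travels to the back-end. The device name, field names and threshold below are all hypothetical, chosen purely for illustration:

```python
import json

def summarise_at_edge(readings, threshold=30.0):
    """Filter raw sensor readings on the device; keep only the insight."""
    alerts = [r for r in readings if r["temp_c"] > threshold]
    return {
        "device": "greenhouse-7",  # hypothetical device identifier
        "samples": len(readings),
        "alerts": len(alerts),
        "max_temp_c": max(r["temp_c"] for r in readings),
    }

# Hypothetical raw temperature stream captured at the edge.
raw = [{"temp_c": t} for t in (21.5, 22.0, 34.2, 23.1, 36.8)]
payload = json.dumps(summarise_at_edge(raw))
print(payload)  # the compact summary sent to the cloud, not the raw stream
```

Shipping the summary rather than the stream keeps bandwidth costs down, while the aggregated summaries arriving in the cloud become the large datasets that AI services mine.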

For example, the European Space Agency (ESA) has been using AWS EC2 since 2009 for galaxy mapping purposes; by 2017, ESA had embarked on a journey to assess the feasibility of integrating space-based services driven by cloud AI through the ‘Satellite Imagery and Intelligent IoT for Agriculture’ project.

The advent of AI capabilities in the Cloud 2.0 era brings about a progressive leap in capability and a multitude of opportunities. A recent report has identified that the US Department of Defense can make use of cloud computing and ‘big data’ analytics to exploit large stockpiles of untapped dataset resources.

With technology maturation [14], regulation and control measures, machine learning can certainly benefit mankind and potentially start a new digital information evolution.

References
  1. https://awsinsider.net/articles/2016/10/10/cloud-leaders-diverge-on-ai.aspx
  2. http://searchcloudcomputing.techtarget.com/feature/Cloud-AI-services-slowly-but-surely-lure-enterprise-IT-shops
  3. https://latesthackingnews.com/2017/10/14/gluon-neural-network-based-machine-learning-presented-microsoft-amazon/
  4. Shanahan, M. (2017). Machine Mind. Cheltenham, UK: Times Science Festival.
  5. http://www.wired.co.uk/article/ai-healthcare-gp-deepmind-privacy-problems
  6. http://www.zdnet.com/article/artificial-intelligence-and-privacy-engineering-why-it-matters-now/
  7. http://www.zdnet.com/article/beware-the-midas-touch-how-to-stop-artificial-intelligence-ruining-the-world/
  8. https://www.allaboutcircuits.com/news/ai-predicted-to-commit-more-cyber-crime-than-people-by-2040/
  9. https://www.wired.com/story/firewalls-dont-stop-hackers-ai-might/
  10. http://redmondmag.com/Articles/2017/12/01/Can-AI-Protect-It.aspx
  11. https://gbhackers.com/google-using-machine-learning-block-99-malware-phishing-mails/
  12. https://www.forbes.com/sites/bernardmarr/2017/06/09/why-ai-would-be-nothing-without-big-data/#45c2f564f6d0
  13. https://www.economist.com/news/business/21732125-tech-giants-are-investing-billions-transformative-technology-google-leads-race
  14. https://iapp.org/news/a/why-artificial-intelligence-may-be-the-next-big-privacy-trend/
 

