Business Continuity Planning (BCP) and related IT Disaster Recovery Planning (IT DRP) have become established, respected and important disciplines in business. However, many of the assumptions underlying BCP/DRP today will NOT hold in a pandemic, writes technology risk expert Dr Patrick McConnell DBA MSc BSc FBCS CITP CEng GAICD.

The typical solutions, which essentially involve moving people away from compromised sites to ‘back up’ locations, just won’t work in a pandemic, such as with coronavirus COVID-19: it’s one virus that will not affect computer systems.

Instead, it will be people rather than infrastructure that will become unavailable. And it is the largest firms, with multiple overseas offices and highly centralised support functions, that will be most at risk from staff being unavailable for work.

In a pandemic, computer systems will continue to work, provided that key operational and maintenance staff have uninterrupted access to their control terminals. Telecommunication networks will continue to work - again, provided that network and security managers have the essential monitoring and control capabilities.

‘So, unlike other business continuity situations, where restoring technology is often the greatest problem to be overcome, in a pandemic, technology, especially telecommunications technology, will be the solution, not the problem.’

The nightly television pictures from the city of Wuhan in Hubei, China showed empty roadways and office buildings, with most of the city’s 11 million people hunkered indoors under strict mandatory quarantine rules. But, in the first large epidemic since the development of mobile phones and social media, inhabitants of Wuhan were not completely ‘isolated’: they were able to communicate via apps with family, friends and, importantly, medical authorities. Apps such as WeChat, Weibo and WhatsApp not only allowed video communication but also provided access to online banking services and e-commerce, such as purchasing food. Remaining relatively safe at home has become a viable option in these extreme circumstances.

But what about work? How did firms in Wuhan survive, and how would your firm continue to operate if a large proportion of the staff were not ill, but could not get into work for an extended period?

The concept of ‘working from home’ on an infrequent or even regular basis has become the norm in many firms and especially in IT departments. With the advent of Outsourcing and now Cloud Computing, the notion of people ‘working remotely’ has become commonplace.

Thus, good telecommunications capabilities are key to firms continuing to operate in what will, if / when a full-scale pandemic is declared, be a prolonged disruption.

The COVID-19 pandemic

Although the COVID-19 coronavirus epidemic may not blow up into a full-scale global pandemic, it could be close. Pandemics are far from unique events. This century alone there have been numerous serious outbreaks of deadly virus-borne diseases: the Severe Acute Respiratory Syndrome (SARS) epidemic of 2002-2003; the re-emergence of H5N1 Avian Influenza from 2003; the Middle East Respiratory Syndrome (MERS) epidemic of 2012 (the virus is still circulating); and the deadly Ebola virus outbreaks of 2014, which re-emerged in 2018.

Simply, these and similar diseases emerge when a virus that is circulating widely (but relatively benignly) in the animal kingdom jumps to infect a human in a random event. What happens at that point determines much of what follows: the infected human may fight off the virus and nothing untoward happens, until the next time a jump occurs. Or, the virus may kill the human before the virus jumps to another human, again causing no further occurrences.

It starts to become dangerous when one person infects another, who infects another and so on until, before long, the virus is in the general population. Some viruses, such as measles, are far more infectious than others. Unfortunately, some viruses, such as MERS, are also more deadly to humans than others and people die from the viral infection.

So, the so-called mortality rate of a particular virus is important; a virus with a high infection rate and a high mortality rate can turn an epidemic that cannot be contained locally into one that grows into a worldwide pandemic with disastrous consequences. This occurred in the so-called Spanish Flu pandemic of 1918-1919, which killed more people than did the World War that preceded the outbreak.

While each epidemic is different, driven by the unique characteristics of the particular virus, there are things we do know about all pandemics:

Pandemics are inevitable!

Because the animal kingdom holds a huge reservoir of viruses that are potentially deadly to humans, it is only a matter of time and evolution before a virus mutates to a form that can ‘jump’ and infect humans. The more animals, the more humans and the more viruses, the more likely such an essentially random event will occur.

Some of these viruses will be infectious and will jump to other humans. Inevitably, some of the viruses will spread in the general population (an epidemic) and even may, with modern air travel, appear in different countries almost overnight – where again, the epidemic grows in local populations, becoming a pandemic as defined by the World Health Organization (WHO).

Pandemics are inevitable, but because a lot of things have to go right - for the virus that is - they are thankfully rare.

Pandemics are difficult to stop

It is very difficult to stop a virus outbreak, especially one where the virus is highly contagious. Because the initial symptoms are often similar to those of seasonal flu or common cold, novel viruses can circulate widely in the population before being detected by medical staff. There is often a period of extreme confusion before a new virus is identified and a good diagnostic test is developed. During this ‘onset’ phase, people will inevitably die. Rumours will abound and some of the public will panic.

One obvious way to slow a deadly virus spreading throughout a population is to stop people infecting one another, which means isolating or ‘quarantining’ people who have been, or might be, infected. Isolating whole communities is a pretty big step for authorities to take - but, as the mandatory isolation measures in Wuhan showed at the beginning of the latest virus epidemic, it can be pretty effective. However, isolation does not kill the virus; it merely slows down the rate of infection through the community.

What is needed is a vaccine specially designed to help the body recognise and fight off the particular virus, without harming the person who receives it. The experts in viruses, virologists and microbiologists, know how to do this - but it takes time. Getting from first identifying a new viral strain to delivering an effective vaccine to the majority of the population will take many months.

For that period, quarantine is the only effective mechanism for containing a pandemic. People must stay isolated, at home or elsewhere, possibly for many months.

How long will a pandemic last?

No one knows precisely how a pandemic might unfold. However, the three influenza pandemics of the 20th century give some clues as to what might be expected. Figure 1 shows a rough timeline of how a pandemic might evolve, illustrating the scale of disruption and the highly uncertain time taken by each phase.

Figure 1 - Rough timeline of a pandemic

The key points to note in this diagram (1) are that: during the ‘onset’ of a pandemic, there will be much confusion and panic in the community; there will then be a period of ‘maximum disruption’ that may last several months as the impact of multiple outbreaks works its way through the global economy. Then, as authorities come to grips with the outbreak, a vaccine will be developed; a slow ‘prolonged recovery’ will then begin, as firms strive gradually to mend broken supply chains. It is also possible, as in the devastating Spanish Flu outbreak, that the original virus mutates and a second, possibly even more dangerous, wave of infections begins.

Pandemics, then, are not short, sharp incidents but prolonged and relatively slow-moving events. Their trajectories are highly uncertain and depend on the virulence of the particular virus and the effectiveness of the measures taken to minimise its effects. But one thing is certain: a pandemic will take a long time to work through the global economy. Though an inadequate analogy, management should think of the disruption illustrated in this rough timeline as similar to that caused by severe weather events (snowstorms, floods, hurricanes etc.) but lasting for 6 to even 18 months! Even the best-prepared business must also recognise that its customers and suppliers may take even longer to recover - if they recover at all.

To this point, we have talked about viruses and people with proper regard for their health, but what of commercial businesses and the staff who work in them? Since different firms in different industries will face different problems, there is no ‘one size fits all’ solution - most solutions will concentrate on physically separating workers to keep them safe and relying on telecommunications technology to help them continue to operate the firm. But to be effective, such measures must be planned in advance.

Practical first steps in developing a pandemic plan

The following first steps are suggested for ‘jump-starting’ the development of a business continuity plan specifically designed to mitigate the impact of a possible COVID-19 pandemic.

Corporate governance

It is suggested that firms (if they have not already done so):

  • Immediately, set up a pandemic planning and coordination unit (PPCU) ideally within the enterprise risk management (ERM) department, as part of the existing operational risk / BCP function.
  • Staff this unit with at least the following skills:
    • Medical expertise, to provide independent, objective information on the background, status and potential trajectory of a possible pandemic, such as using part-time experts from a nearby university medical department. Note: additional medical safety measures, such as making high quality face-masks available, will be needed.
    • Communications expertise, to develop material for distribution to customers and staff on the impact of the pandemic.
    • IT experts, to develop and operate public/private web sites and firm-wide communications capabilities, such as videoconferencing and social media sites.
    • Telecommunications experts, to ensure efficient, secure and robust access to corporate information (for staff working remotely) and to develop and promote the effective use of voice and videoconferencing.
    • Security experts, to ensure that premises and the staff remaining in them are secured and to liaise with civil authorities, ensuring compliance with changing regulations.
  • Identify and assign senior executive responsibilities for initially overseeing and, should the need arise, taking control of pandemic planning and coordination activities.
  • Immediately, raise the issue of pandemic planning to the risk committee of the board for detailed oversight, placing the issue on the agenda of every board meeting going forward.
  • Run education workshops for the board and senior executives, to explain the risks of a pandemic and to discuss options for mitigating the risks and to articulate the ‘risk appetite’ of the board.
  • With direction from the board and senior management, develop policies for operating the business during and after a pandemic. Such policies might include: priorities for reducing risks and slowing down businesses; changes to delegated authorities; changes to staff entitlements and remuneration; interacting with the firm’s regulators; and so on.
  • As a demonstration of senior management’s commitment to competent pandemic planning and to ensure that supporting technologies are working satisfactorily:
    • Hold a number of up-coming board meetings entirely by teleconference, i.e. with all board members participating from home or secure remote locations.
    • For each executive committee meeting going forward, ensure that one or more of the members participates from home or secure remote locations.
    • Set aside special areas within the firm’s premises to emulate remote working and ensure that all staff periodically spend time (such as a day) in such an environment, to iron out communications and operating problems.

Identification of key pandemic risks

Identification of key risks in a pandemic should concentrate on the following areas of risk:

  • People and organisation risks - a ‘map’ of the entire organisation should be developed that shows not merely key responsibilities but also the linkages and dependencies between business functions. For each function, a list of key roles and individuals holding (or capable of holding) those roles should be developed and for each function, the degree of actual and potential ‘operational autonomy’ should be evaluated. The goal of such a map would be to identify potential “key people” risks in the current organisation and to highlight where mitigating actions, such as staff transfers and increased decision-making delegation would be beneficial. It should also be recognised that some of these key people may not survive a serious outbreak and multiple options must be available - no one is indispensable.
  • Process risk - a map of all major ‘end-to-end processes’ in the organisation should be developed that shows not merely key operations performed in each process but also potential bottlenecks and critical internal and external dependencies. The goal of such a map would be to identify potential “key process” risks in ‘core’ processes, such as where increased volumes might overwhelm current - never mind depleted - resources and to highlight where mitigating actions, such as increased automation, would be beneficial.
  • Systems risks - a map of all major systems in the organisation should be developed that shows not merely key technical attributes but also dependencies on human intervention, especially IT Operations. The goal of such a map would be to identify potential “key systems” risks, such as where a large degree of human intervention is needed to access information to operate the business and to highlight where mitigating actions, such as increased electronic report production, would be beneficial.
  • Telecommunications risks - a map of the entire telecommunications network supporting the organisation should be developed that shows not merely the network topology but also capacity bottlenecks within the network. The goal of such a map would be to identify potential “key telecommunications” risks in the current telecommunications infrastructure and to highlight where mitigating actions, such as increasing capacity, would be beneficial.
  • Supply chain and outsourcing risks - all outsourcing and supply chain dependencies across the organisation should be evaluated from an operational risk perspective, considering: (a) contractual agreements; (b) current performance; (c) problems with current performance; (d) vulnerability of the supplier / outsourcer to disruption; and (e) vulnerability of the firm to non-performance by the supplier / outsourcer. The goal of such an analysis would be to identify potential “key external” risks in current arrangements and to highlight where mitigating actions, such as reduction in dependencies, would be beneficial.
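The ‘people and organisation’ map described above can be prototyped very simply before any specialist tooling is bought. The following is a minimal sketch in Python, under stated assumptions: the `Role` structure, the function names, the threshold of two capable people per role and all of the sample data are hypothetical illustrations, not part of any established BCP method. It lists, for each key role, the individuals capable of holding it, then flags roles with thin cover and roles that would be left vacant if particular people were unavailable.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Role:
    """One key role within a business function (all names are hypothetical)."""
    function: str
    title: str
    capable_staff: Set[str] = field(default_factory=set)
    must_be_on_site: bool = False


def thin_cover(roles: Iterable[Role], min_cover: int = 2) -> List[Role]:
    """Roles with fewer than `min_cover` people able to hold them."""
    return [r for r in roles if len(r.capable_staff) < min_cover]


def uncovered(roles: Iterable[Role], absent: Iterable[str]) -> List[Role]:
    """Roles left with nobody available once `absent` staff are removed."""
    missing = set(absent)
    return [r for r in roles if not (r.capable_staff - missing)]


# Illustrative data only; not taken from any real organisation.
ROLES = [
    Role("IT Operations", "Shift lead", {"asha", "ben"}, must_be_on_site=True),
    Role("IT Operations", "Batch scheduler", {"ben"}, must_be_on_site=True),
    Role("Payments", "Settlement approver", {"carla", "dev", "asha"}),
]

if __name__ == "__main__":
    for role in thin_cover(ROLES):
        print("Thin cover:", role.function, "/", role.title)
    for role in uncovered(ROLES, absent={"ben"}):
        print("Uncovered if Ben is absent:", role.function, "/", role.title)
```

Running different ‘absent’ scenarios against such a list is one way to show where staff transfers, cross-training or wider delegation would reduce “key people” risk. The same pattern could be extended to processes, systems and suppliers.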

Planning for a pandemic

Since people will inevitably be isolated at home in a pandemic (either by choice or government mandate), it makes sense to plan for staff working from home. But that is much easier said than done.

Today, many workers, especially white-collar staff such as IT professionals, work some of the time from home. It has become common practice for staff such as IT analysts and systems designers to sometimes take work home; usually to get away from office disruptions. Today, inexpensive, fast and reliable telecommunications technology is available to support effective remote working. But such capabilities will almost certainly have to be beefed up considerably to handle the immense increase in demand during a pandemic.
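How much ‘beefing up’ is needed can be roughed out with simple arithmetic before any detailed network study. The sketch below is a back-of-the-envelope estimate only. The function names are hypothetical and every figure passed in (headcount, proportion working remotely, proportion connected at the same time, bandwidth per user) is an assumption that each firm would need to replace with its own measurements.

```python
def required_mbps(staff: int, remote_fraction: float,
                  concurrency: float, per_user_mbps: float) -> float:
    """Estimated peak remote-access bandwidth in Mbit/s.

    staff           -- total headcount
    remote_fraction -- share of staff working from home (0 to 1)
    concurrency     -- share of remote staff connected at the same time (0 to 1)
    per_user_mbps   -- assumed average bandwidth per connected user
    """
    return staff * remote_fraction * concurrency * per_user_mbps


def shortfall_mbps(required: float, installed: float) -> float:
    """Extra capacity needed; zero if the installed capacity is sufficient."""
    return max(0.0, required - installed)


if __name__ == "__main__":
    # Hypothetical firm: compare a normal week with a pandemic scenario.
    installed = 500.0  # assumed capacity of the remote-access gateway, Mbit/s
    normal = required_mbps(1000, 0.1, 0.5, 2.0)
    pandemic = required_mbps(1000, 0.8, 0.5, 2.0)
    print("Normal demand (Mbit/s):", normal)
    print("Pandemic demand (Mbit/s):", pandemic)
    print("Pandemic shortfall (Mbit/s):", shortfall_mbps(pandemic, installed))
```

Bandwidth is only one constraint. Remote-access licences, authentication services and help-desk capacity would need the same kind of estimate.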

However, many staff just cannot work from home and must be physically present in an office or a facility, such as a computer centre. Staff who interact directly with customers must be in secure, recognised premises and those who accept and dispatch deliveries must be present in warehouses.

One of the best ways to find out which staff are absolutely essential to the continued operations of a particular business is to visit the firm’s premises on the weekend or on a public holiday. There you will find the staff that keep the firm ticking over, day by day: the IT operations staff; security staff, maintenance engineers, warehouse staff and cleaners. Without these often-overlooked staff, a firm will grind to a halt fairly quickly.

It is the staff at the bottom of the organisational hierarchy that turn out to be most valuable when considering the impact of a pandemic. And for a business to survive, these workers must all be taken care of!

Definitely not business as usual

‘The most important thing to recognise about a full-blown pandemic is that operations cannot be considered to be business as usual (BAU) but must become business as survival (BAS).’

During BAS, the full resources of the firm should be directed towards ensuring that the core operations of the firm keep running - albeit at reduced capacity. Pandemic planning must, therefore, be focused on keeping core businesses operating, specifically ensuring that there are always sufficient knowledgeable, competent and trained staff to keep the firm afloat.

In order to achieve the necessary ‘resilience’ then, executives must clearly identify what businesses and functions in the firm are essential to operating the core functions of the company. This, in turn, means that the board and executive of a firm must clearly and unequivocally identify the business units that are critical to the firm’s survival (a tough and highly political undertaking).

However, this does not necessarily mean focusing on the most profitable businesses; rather those that the firm most relies upon to keep its licence to operate. In banking, for example, proprietary trading on the firm’s own account may generate considerable profits but, if necessary, it can be wound down or even mothballed for a few months, whenever the support functions are in danger. Likewise, acquiring new customers will not be a high priority in a pandemic, but retaining and supporting existing profitable customers will be.

In the IT organisation, IT architects and strategists (while essential to the long-term survival of a company) could quite easily ‘stand down’ for several months and the firm would survive. On the other hand, if the IT architects were to use their down time to develop new strategic architectures, that could prove very valuable in the longer term.

Furthermore, IT development projects can be shelved and resurrected later; IT developers and designers could usefully spend their time on the bench not only on the undervalued tasks of documentation and code reviews but also on educational courses, such as the online courses provided by BCS. Provided they have the necessary spreadsheet capabilities AND access to the firm’s computer files, IT accountants and project managers can also work away from the office.

However, all of this must be planned in advance to minimise confusion when a pandemic emerges.

Managing contact risk

The first activity facing senior management when planning to manage a pandemic is to identify which roles, functions and business lines can be temporarily side-lined and which can operate remotely. Executives will soon realise that they too can work remotely for a considerable period of time in a dispersed organisation, provided that the proper communications technologies are available and temporary organisation structures are created.

This leaves the staff who must be on-site: those people who will be most at risk when travelling to and from their workplaces and, when at their desks/workstations, would be working in close proximity to others. Probably the most important example in an IT context will be IT operations and the supporting hardware maintenance functions. Note: this is true whether the IT systems environment is in the cloud or not, as systems still have to be operated!

In order to protect IT ops staff during a pandemic, managers must ensure that physical interactions between operators will be minimised and that staff are provided with the necessary protective equipment, such as highest quality FFP (filtering face piece) protective masks and adequate washing facilities. Additional cleaning staff will be needed to reduce the likelihood of secondary infections and trained medical staff will be needed to ensure that no new infections are introduced, for example during shift handovers.

Customer first

‘Modern call centres are a petri dish for the efficient transmission of viruses.’

From an IT and business perspective, the most difficult areas to manage will be in those functions that have contact with customers and suppliers either face to face or (paradoxically) by phone. Today, modern call centres and trading floors tend to be large rooms packed with people constantly coming and going, under flexible working practices. They are a petri dish for the efficient transmission of viruses.

Again, technology can (at least partially) help to solve this problem, through increased usage of websites, email, direct messaging and outgoing call technology. It should be recognised, however, that during a pandemic, call centre traffic will increase dramatically because alternative information channels, such as ATMs, may not be accessible for various reasons (not least due to absence of trained staff).

In a pandemic, a firm’s website and social media channels will become its window to the world and staff must be allocated and trained to communicate using these modern mechanisms. Some of the IT and business staff ‘on the bench’ can be used to produce and publish the necessary up-to-date information. But the key will be to make use of call recording and returning calls later, using staff working at home or in strict isolation in a firm’s premises.

Summary

Ensuring the survival of an organisation during (and immediately after) a pandemic will mean first setting up the organisational infrastructure needed to react to the onset of a pandemic and then planning for continued operations of the firm, at a (potentially significantly) degraded level.

It is important to realise that in a pandemic, it cannot be business as usual but must be business as survival. Information technology, especially telecommunications, will be key to a firm’s capacity to survive a pandemic and planning how this technology will be used effectively will be critical. Providing the capabilities to work remotely should be the overarching thrust of this technology, but it will not be the type of working from home familiar today. Sending all staff home with PCs to operate business as usual just won’t work, not least because some of the most critical staff, such as IT Operations, cannot work remotely.

Of course, when a pandemic emerges is certainly not the time to begin installing the necessary hardware and systems to support remote working. As with any BCP exercises, planning must be done in advance and people must be educated on the potential changes to their working practices and organisational structures in the event of a disruption.

Unfortunately, as pandemics don’t come along very often, there will be a tendency over time to downplay the risks, so board-level commitment will be necessary to keep such an inevitability on the radar, if not actually always front of mind.

(It should be noted that this article was written in late February 2020 and, in the nature of pandemics, events move quickly and though the prescriptions remain valid, some of the detail will become outdated quickly.)

References

(1) See detailed explanation of Figure 1 in 'Banks and avian flu: planning for a possible pandemic'

About the author

Dr Patrick McConnell DBA MSc BSc FBCS CITP CEng GAICD is an expert on technology risk, having worked as a CIO, senior manager and consultant for large corporations in several countries, for over 40 years. He has also been an academic and has written widely on IT and risk management; his latest book is Strategic Technology Risk (Risk Books 2018).