While the average employee sees no reason not to keep every piece of unstructured electronic data, including email and instant messaging, for businesses this accumulation of information can cause considerable headaches.

The average enterprise now stores massive amounts of data, and most of us are loath to delete anything for fear of being unable to find a document or message in the (highly unlikely) event that we need it. As a result, the challenges and costs of managing this data, and of making it searchable and useful on a day-to-day basis, are growing exponentially. Craig Carpenter investigates the true cost of information risk.

When the email administrator knocks on our office door or cube wall, our justification is elegantly simple: we should be able to keep as much email as we want because 'storage is cheap'. And to the untrained user this maxim seems accurate, as storage costs continue to fall. But while 'more data' may seem better to the average employee, it is most definitely not welcome to those of us living in the risk management, compliance and edisclosure worlds.

What is increasingly apparent is that as data volumes soar, the 'information risk' associated with this data rises in parallel, forcing enterprises to completely rethink how data is created, stored, shared and retired on all of their systems.

Storage costs: the ultimate red herring

Storage is indeed cheap - exhibitors give it away at trade shows and one terabyte external hard drives are currently retailing for less than £200. The upshot is that disk space on which to store myriad types of data continues to proliferate - to the tune of 60 per cent data growth every year according to IDC Research. But while the space is increasing it is, in turn, feeding our insatiable appetite to create, share and save data in increasingly large chunks and without a second thought.

However, when it comes to risk management, the direct costs of storage are barely relevant. In fact, they represent a mere fraction of the overall costs of managing enterprise data. At the very time that the direct cost of storing data is decreasing, the indirect costs of managing it are skyrocketing. These indirect costs include data centre space (storage devices require rack space in which to live), energy (the price of which is rapidly rising) and personnel (more devices require more technical staff to manage them). Furthermore, with all this unstructured data now being kept, it is increasingly difficult for employees to find the information that is relevant to their jobs.

A recent survey conducted by global management consultancy firm Accenture revealed that managers currently spend up to two hours a day searching for information that is necessary to their jobs, and when they do find it, it is often wrong. In knowledge-intensive industries, extensive research and referencing are a vital component of a company's day-to-day practices, so organisations need to be sure they have adequate tools to categorise, collate and ultimately find relevant information quickly.

But making this data easy to locate is only one side of a bigger problem - in a climate of increasingly stringent regulation, compliance costs are largely responsible for the added expense of managing enterprise data.

The two-headed monster of compliance and edisclosure

From more humble beginnings, compliance costs have increased dramatically over the past six years, especially in the US - coinciding with seminal legislation including Gramm-Leach-Bliley and Sarbanes-Oxley (SOX) in the wake of the Enron and WorldCom meltdowns. SOX compliance alone has cost US businesses some $32 billion since 2002, with 2007's price tag a cool $6 billion (sources: AMR Research; IDC).

Not to be outdone, the cost of identifying, preserving, collecting, processing, reviewing, analysing and producing data for litigation (otherwise known as edisclosure) is even larger, at $12 billion, and is forecast to reach $22 billion by 2011 (source: IDC). Gartner states that the average edisclosure event - including regulatory investigations and lawsuits - costs $1.5 million, with the average $1 billion revenue US company facing more than 500 lawsuits at any given time. The main factor driving these costs? The amount of data that must be collected, reviewed, analysed and produced. And therein lies the problem.

While businesses in the UK may not yet be subject to the same degree of scrutiny as their US counterparts, it would be wise for these organisations to be fully prepared for dealing with such events. Businesses around the world that trade in the US must already adhere to SOX regulations, and with the Companies Act 2006 coming into force in October this year, the cost of compliance in the UK looks set to soar.

Furthermore, with UK consumers becoming increasingly savvy about the Data Protection Act, customer-focused businesses must ensure they do not fall foul of these regulations just by accidentally storing too much customer information, or by keeping this data for too long, rather than deleting it at the appropriate time. Missing one data record can leave companies in breach of the rules and subject to hefty financial penalties.

In the US, a ballpark rate for edisclosure costs is in the range of $2,000/gigabyte of data, so a case with 200 gigabytes of data will generate $400,000 in edisclosure costs alone. But this is where the data growth phenomenon, driven by ever-cheaper storage costs, comes into play. With data volumes growing 60 per cent every year, edisclosure costs are, by definition, growing commensurately - and quickly outstripping legal department budgets set up to handle these issues. These costs have become so significant that they are increasingly forcing enterprises to rethink their approach to data creation, sharing, storage and deletion.
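The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the $2,000/gigabyte rate and 60 per cent growth figure come from the text above, while the function names, and the assumptions that the per-gigabyte rate stays flat and that a typical case's data volume grows in step with overall data growth, are ours.

```python
# Illustrative sketch only. Rate and growth figures are the ballpark
# numbers quoted in the article; everything else is an assumption.
COST_PER_GB_USD = 2_000   # ballpark US edisclosure rate per gigabyte
ANNUAL_GROWTH = 0.60      # 60 per cent data growth per year


def edisclosure_cost(gigabytes, cost_per_gb=COST_PER_GB_USD):
    """Return the estimated edisclosure cost for a given data volume."""
    return gigabytes * cost_per_gb


def projected_costs(start_gb, years, growth=ANNUAL_GROWTH):
    """Return yearly costs, assuming case volume grows at `growth` per
    year and the per-gigabyte rate stays constant (both assumptions)."""
    costs = []
    volume = float(start_gb)
    for _ in range(years + 1):
        # Round to whole dollars to hide floating-point noise.
        costs.append(round(edisclosure_cost(volume)))
        volume *= 1 + growth
    return costs


if __name__ == "__main__":
    for year, cost in enumerate(projected_costs(200, 3)):
        print(f"Year {year}: ${cost:,}")
```

Under these assumptions, the 200-gigabyte case that costs $400,000 today would cost roughly $1.6 million three years on, a fourfold increase with no change in the number of cases.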

A different approach to data management

Risk managers, compliance officers and litigation managers are taking increasingly active roles in the management of the data life cycle within their companies. Historically, these groups' concerns may have fallen upon deaf ears, but with information risk occupying more and more of enterprise budgets, they are increasingly finding an interested audience at the executive and board level.

In the past these groups have collectively viewed data - inasmuch as they viewed it at all - as the province of individual users, who were allowed to do as they wished so long as they stayed within the bounds of HR guidelines. Today, however, many enterprises are seeking to get the most out of their data while simultaneously getting (and keeping) their 'information house' in order.

First, rather than putting data into separate silos, many are making all or nearly all pieces of information securely searchable to those who have appropriate access rights. This allows risk managers to establish where each piece of data is and how long it should be kept there, while simultaneously increasing the utility of corporate data. Second, many enterprises are deleting redundant data in an effort to regain control over exploding data volumes and storage centres.

Another benefit of this 'deduplication' exercise is that it helps enterprises identify what data should be kept, enabling them to start retiring data that should not be - which is important for responding effectively to investigations and lawsuits. Finally, enterprises are building and scaling their IT infrastructure with risk management needs squarely in mind, including records management, edisclosure, compliance and knowledge management. Although these efforts have only just begun for most, with so much potential risk at issue, proactive enterprises will find themselves with a distinct advantage over their laggard peers.

About the author

Craig Carpenter, general counsel and vice president of marketing, oversees all aspects of marketing at Recommind. He has extensive experience in the enterprise software, information security and e-disclosure industries, and is a frequent speaker and panelist. Craig is also an adjunct faculty member at the University of San Francisco where he teaches graduate classes on high-tech marketing, content management and digital rights management (DRM).