The response time at which services are delivered to an end user is one of the most important considerations for your business, says David Flower, VP EMEA at Gomez. He looks at service level agreements and the principles of measuring performance in the cloud.

As tempting as its cost savings and scalability may be, the risks of the cloud are now coming to light. Availability and security concerns have dominated this discussion, but the performance of the cloud - the response time or speed at which services are delivered to an end-user - may be one of the most important risks to your business.

With broadband at critical mass, and with connection speeds continually increasing, all customers, B2B and B2C, now expect Google.com or Amazon.com-like response times. But should we expect that Amazon’s EC2 service delivers the same speedy performance one experiences when shopping at Amazon.com?

Many small businesses are now jumping on cloud services. However, the CIO or CTO of a large, complex infrastructure has a much longer checklist of performance considerations. Big or small, you require assurances that your cloud provider will deliver the service your business needs. But do the service level agreements (SLAs) offered today by cloud providers meet your expectations for accountability and guarantees?

Levels of cloud services

The cloud promises computing as a utility: off-premises, on-demand, easily scalable (elastic) and paid for only as you use it. It is offered at three levels.

Infrastructure as a Service (IaaS) - this is typically a server and storage connected to the internet - a blank page on which to build the underlying platform and every element and application requirement in your infrastructure. Examples include Amazon EC2, Mosso and 3Tera.

Platform as a Service (PaaS) - here the underlying platform is abstracted out and you’re given an on-demand solution stack, a development environment on which to build your necessary applications. Examples include Google App Engine and Force.com from Salesforce.

Software as a Service (SaaS) - this is the most mature of the cloud options, providing turnkey applications on-demand, usually accessible from a web browser. Examples include SAP, Zoho and Gomez.

Many cloud SLAs promise 99.99 per cent uptime. But what does that mean to your business? Does availability mean that the server is running or that applications are performing to your specifications? Is availability measured from within the cloud’s firewall or from end users’ actual devices?
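To put a figure on that promise, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the function name and the 30-day and 365-day periods are assumptions made for the example, not terms taken from any provider's SLA, and it says nothing about how or where a provider actually measures availability.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Minutes of downtime permitted over a period at a given uptime percentage."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


# Illustrative periods only; a real SLA defines its own measurement window.
for label, days in (("30-day month", 30), ("365-day year", 365)):
    minutes = allowed_downtime_minutes(99.99, days)
    print(f"{label}: {minutes:.2f} minutes of permitted downtime")
```

At 99.99 per cent this works out to roughly 4.32 minutes in a 30-day month and 52.56 minutes in a year. Whether those minutes count a server being unreachable or an application responding too slowly for your users is exactly what the SLA needs to spell out.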

CIOs need to ensure that a cloud SLA addresses the company’s specific business needs. This means that every service in the delivery chain has to have someone responsible and accountable for it - just as they would in a non-cloud infrastructure with its detailed service level objectives (SLOs) from internal teams and SLAs from outside vendors.

But if you’re outsourcing vital portions of your infrastructure to the cloud, many of those elements are handled outside your direct control. So who is accountable if one aspect of that service performs below expectations? Watching for these potential cloud disconnects is an important part of your due diligence in evaluating cloud services.

With IaaS, elasticity is the most promoted benefit. Elasticity equals velocity plus capacity. That means a quick ramp-up during peak customer usage periods - and only during those times. But will this ramp-up be fast at all times of the day and across all geographies? And how much additional capacity can you get? Will it be enough?

Although not an explicit benefit, connectivity is certainly implied. And you assume you’re getting fast servers with redundant internet connectivity in multi-homed data centres with good peering relationships to major networks nationally and internationally.

Similarly with PaaS, what are the implied performance guarantees? With Google App Engine you assume the underlying service is performing at adequate speeds for your business. Velocity and capacity are a given. With PaaS this happens transparently based on the number of customers using the system. But are all APIs functioning at mission-critical levels at all times, or will a spike in usage slow down the underlying performance?

With SaaS, many of the same performance considerations apply. Can you be sure a transaction made in your Paris office is available minutes later for use by the San Francisco team trying to close an end-of-quarter deal?

A CIO who has optimised his existing IT infrastructure wants answers to these questions and others, as he considers migrating to the cloud.

In its 2008 report ‘Is Cloud Computing Ready for the Enterprise?’, Forrester Research said that cloud platforms are maturing but would not be enterprise ready for two or three more years. Part of this maturing, we hope, will be the inclusion of detailed and relevant performance guarantees in cloud SLAs along with real-time performance monitoring by providers.

Remember too that whether your installation is simple or complex, all cloud services have one thing in common: they rely on the internet to satisfy the needs of end-users. So regardless of which provider is engaged, cloud services do not relieve IT managers of the responsibility of conducting their own ongoing performance monitoring of all web applications delivered by the cloud.

Guidelines for measuring the performance of the cloud should always include:

  1. Getting clear on what your business needs from the cloud, then testing based on those expectations, implicit and explicit.
  2. Testing before deployment and continually in production, due to the constantly fluctuating nature of the cloud.
  3. Knowing your end-users - how they connect to the internet, their location, times of day they log on, even which browser they use - and delivering on their expectations.
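As a rough sketch of how these guidelines might be applied, the Python below groups response-time measurements by end-user segment (a location, for instance) and checks each segment's 95th percentile against a target. Everything here is hypothetical: the function names, the nearest-rank percentile method, the two-second target and the sample figures are assumptions chosen for illustration, not values from any provider or monitoring product.

```python
import math
from collections import defaultdict


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_by_segment(measurements, target_seconds, pct=95):
    """Group (segment, response_seconds) pairs and compare each segment with the target."""
    by_segment = defaultdict(list)
    for segment, seconds in measurements:
        by_segment[segment].append(seconds)
    summary = {}
    for segment, values in by_segment.items():
        observed = percentile(values, pct)
        summary[segment] = {
            "samples": len(values),
            "percentile": observed,
            "meets_target": observed <= target_seconds,
        }
    return summary


# Hypothetical measurements, in seconds, taken from end users' locations.
measurements = [
    ("Paris", 0.8), ("Paris", 1.1), ("Paris", 0.9),
    ("San Francisco", 2.4), ("San Francisco", 1.9),
]
for segment, stats in summarize_by_segment(measurements, target_seconds=2.0).items():
    print(segment, stats)
```

With these made-up figures, Paris meets the two-second target and San Francisco does not, a difference that a single overall average would hide. The same grouping could be done by time of day, browser or connection type.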

Once you’ve established your performance benchmarks, demand an SLA from your cloud provider that addresses all your performance needs. Cloud SLAs are a work in progress and will evolve only when IT professionals demand that they do. Right now the client is king, as cloud providers look to fulfil the promise of utility computing, but without the risk.