Rob Aitken FBCS, Director of Technology at Arm, considers how we can tackle the divergent demands of climate change and tech-based climate solutions.
It’s hard to think of any product with a performance curve like that of the silicon chip. Its invention and ongoing miniaturisation have driven vast advances in speed and accuracy and transformed computers from the legendary room-sized behemoths of yesteryear to the highly efficient handheld device you may well be using to read this article.
We are conditioned to expect our devices to get better with every generation and this is, in part, to do with our changing perception of Moore’s Law. It’s been a gradual change, so it’s hard to pinpoint precisely when it happened, but at some point, Moore’s original concept - that the most cost-effective transistor density point doubles roughly every two years (a period often quoted as 18 months) - morphed into a kind of blueprint for progress.
This, in turn, created the expectation that every generation will be faster, cheaper and more power-efficient than the one before. More recently, though, this has shifted again, leading to the thinking that any improvement in transistor density, even at much greater cost, still represents Moore’s Law. Such density increases are necessary for the Moore-style progress we expect, but without a corresponding cost benefit, they won’t give us the results we’re used to.
In any case, we have an expectation of continual, rapid, exponential growth in complexity, which - alas! - is reflected in the increasing carbon cost of building chips. We’re primed to keep wanting more and to expect that it will just happen, at no cost to us.
Unfortunately, these expectations are not compatible with the reality of climate change and the global imperative to reach carbon net zero by 2050.
Sounding climate code red
Climate change is undoubtedly one of the greatest challenges the world has ever faced. As the IPCC recently signalled in their Sixth Assessment Report, we are now at a point of Code Red for humanity, meaning urgent action is required.
Digital technology has long been heralded as a crucial component in climate solutions, capable of driving down emissions by unlocking efficiencies and reducing energy consumption. But even though it holds the key to decarbonising other sectors, the tech industry should not expect any special exemptions for itself. For every digital solution to climate change, there’s an environmental cost - an amount of carbon being emitted - that must be weighed against the benefits created and minimised wherever possible.
As I wrote in a recent blog, the need to decarbonise compute, for the sake of our planet, means the technology roadmap can no longer prioritise processing power alone. To ensure our net contribution is pushing the stats in the right direction, we need to ensure that the underlying technology is as efficient as possible - and that means that our increasingly high-performance chips also need to be as low-powered as possible.
Targeting performance per watt
At some level, this all goes back to heat. We don’t always think about it, but heat is the number one by-product of computing. CPUs and GPUs work by manipulating binary numbers and every time the value of a ‘bit’ of binary information is changed, electric current flows, creating heat.
In general, the more computation is done, the more heat is generated. With your mobile device, that heat is transferred to your local environment (your phone feels hot, for example). In a closed environment like a data centre, that heat has to be actively removed. Every joule of heat energy produced in computing requires at least another joule for cooling, at minimum doubling the total energy needed.
The amount of energy consumed by computation can vary significantly between processors; performance per watt is key, both to the energy consumption of the processor itself and, by extension, its environmental footprint. From a design standpoint, we want to make sure that every watt is being used effectively - from avoiding unnecessary computation, to making sure the power delivery circuits are as efficient as possible.
There are many ways in which we can achieve this. Highly specialised designs, such as custom video processors, encryption engines and neural processors, can considerably reduce energy consumption for their respective workloads, but are much less programmable and less adaptable to new algorithms.
A challenge for today’s architects, at every level in the hardware/software stack, is to enable as much specialisation as possible while retaining enough flexibility to meet future needs - especially in the area of security, where we can be confident that future attacks will require defences we have not yet considered.
Being conscious of where compute happens, relative to data, can be just as important as being conscious of what compute is being done. Moving data uses energy, with about four orders of magnitude difference in the energy required to store a bit in a local memory versus sending it off-chip by radio. In a study at Google, around 5% of data centre energy usage went on simply copying memory from one location to another. That’s why making memory copies energy-efficient is a key component of CPU design.
Other examples include processing in or near memory, where processing moves to data rather than data moving to processors, and spatial or dataflow architectures, where processing structures can be set up to physically mimic the logical flow of data in an algorithm. In addition, advanced packaging techniques, where memory and processing die are stacked vertically, can lower communication power at the chip level.
Getting to net zero
So, if we take efficiency to the extreme, how low-power can we go? Can chips become so efficient that they draw almost no power at all?
The answer is yes, they can - and it’s something Arm’s research team has been working on for a while. Clever system partitioning and shrewd hardware and software design can dramatically reduce power and energy use - but there is, of course, a caveat.
As Star Trek’s Lieutenant Scott famously said: ‘You cannot change the laws of physics.’ And one of those laws is that the energy required to charge a capacitor is proportional to its capacitance and to the square of the voltage it’s charged to. So, the dynamic power of a chip depends on three factors: its operating voltage, the total capacitance being charged - proportional to the number of bits switching - and the frequency at which those bits switch. Reducing power close to zero means tuning those parameters close to zero as well.
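To make the three factors concrete, here is a minimal sketch of the usual first-order relationship, power = capacitance x voltage squared x frequency. The function name and the component values are illustrative assumptions, not figures for any real chip, and the model ignores leakage and other static power.

```python
def dynamic_power(switched_capacitance_f, voltage_v, frequency_hz):
    """First-order dynamic (switching) power in watts: C * V^2 * f.

    switched_capacitance_f: total capacitance charged per cycle, in farads
    voltage_v: operating voltage, in volts
    frequency_hz: switching frequency, in hertz
    """
    return switched_capacitance_f * voltage_v ** 2 * frequency_hz


# Illustrative values only (assumed, not measured).
baseline = dynamic_power(1e-9, 1.0, 1e9)        # 1 nF switched at 1 V and 1 GHz
low_voltage = dynamic_power(1e-9, 0.5, 1e9)     # same chip at half the voltage
slow_and_low = dynamic_power(1e-9, 0.5, 0.5e9)  # half the voltage and half the frequency

print(f"baseline:      {baseline:.3f} W")
print(f"half voltage:  {low_voltage:.3f} W")
print(f"half V and f:  {slow_and_low:.3f} W")
```

Because voltage enters as a square, halving it cuts dynamic power to a quarter, whereas halving frequency or switched capacitance only halves it. That is why voltage tends to be the most rewarding of the three parameters to reduce.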
While zero itself isn’t yet a realistic target, many ultra-low-power chips can potentially be made ‘net-zero’ power, or close to it, by coupling them with their own energy harvesters. Alternatively, for devices plugged into the grid, their activity can be tuned so that their power draw coincides with high renewable availability. There’s usually a surplus of solar power in California around noon, for example.
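As a rough sketch of what tuning activity to renewable availability might look like, the snippet below picks the start hour for a deferrable job so that it runs when the forecast share of renewable power is highest. The function name and the 24-hour forecast are hypothetical. A real system would draw on a grid operator's forecast data and would have deadlines and other constraints to respect.

```python
def best_start_hour(renewable_share_by_hour, job_hours):
    """Return the start hour whose window has the highest total renewable share.

    renewable_share_by_hour: forecast fraction of renewable power for each hour
    job_hours: how many consecutive hours the deferrable job needs
    """
    if not 1 <= job_hours <= len(renewable_share_by_hour):
        raise ValueError("job_hours must fit within the forecast")

    last_start = len(renewable_share_by_hour) - job_hours

    def window_total(start):
        return sum(renewable_share_by_hour[start:start + job_hours])

    # max() returns the earliest start hour if several windows tie.
    return max(range(last_start + 1), key=window_total)


# Hypothetical forecast for one day, peaking around noon (index 12).
forecast = (
    [0.2] * 6
    + [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
    + [0.2] * 6
)

start = best_start_hour(forecast, job_hours=3)
print(f"Run the 3-hour job starting at hour {start}")
```

The same idea applies at any scale, from a plugged-in sensor uploading its data to a batch job in a data centre: the work is the same, but the carbon cost depends on when it runs.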
Decreasing data centre draw
At the other end of the scale, we have data centres which, according to the International Energy Agency, account for around 1% of the world’s total electricity use. There has been a massive increase in the volume of data being handled, along with fears that the ICT industry could use 20% of all electricity by 2025. Yet, thanks to a laser focus on efficiency and a shift to cloud and hyperscale data centres, energy demand remains flat.
There is, of course, no room for complacency; processing demands will increase over time, so we must continually strive for greater and greater efficiency.
Firstly, we must continue to locate infrastructure strategically to take advantage of naturally cool climates and areas where sources of renewable energy are abundant. Secondly, we must - once again - make efficient compute a focus. AWS’s Graviton2 processors, for example, which are based on Arm Neoverse cores, deliver a 40% price performance uplift at the same power consumption. This effectively increases the amount of work achieved per watt while simultaneously lowering the cost - and the carbon footprint.
This is the kind of win-win scenario we need to pursue if we’re to land on the right side of history. But, even here, we must sound a note of caution: we need to guard, as far as possible, against the trap of the Jevons paradox - in which technological progress makes a resource cheaper to use, demand rises in response, and the expected overall savings shrink or disappear altogether.
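The arithmetic behind that caution is simple enough to sketch. Assuming, purely for illustration, that work done per watt improves by 40%, the snippet below shows how total energy use changes as demand for compute grows. The function name and the demand figures are hypothetical.

```python
def energy_change(perf_per_watt_gain, demand_growth):
    """Fractional change in total energy use.

    perf_per_watt_gain: fractional increase in work done per watt (0.4 = 40%)
    demand_growth: fractional increase in the amount of work requested
    """
    return (1 + demand_growth) / (1 + perf_per_watt_gain) - 1


# Illustrative scenarios: the same efficiency gain under different demand growth.
for growth in (0.0, 0.2, 0.4, 0.6):
    change = energy_change(0.4, growth)
    print(f"demand +{growth:.0%}: total energy {change:+.1%}")
```

In this simple model, savings vanish once demand grows by the same fraction as performance per watt, and total energy use rises beyond that point.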
The urgency of the situation demands that we take an ‘all hands on deck’ approach to achieving the world’s net zero goal. No one technique alone is sufficient and no sector can act in isolation. But, to ensure that technology contributes to tackling climate change without exacerbating it, we need compute to be as efficient as possible, wherever it happens.
I believe we’ll see an increasing number of custom chips devoted to improving performance per watt for specific workloads like video and AI, and also for internal data centre operations like job allocation and memory transfer. We’ll see physical partitioning and distribution of systems to reduce communication energy - compute in and near memory, dataflow designs, stacked die and so on.
When Moore’s Law finally slows to a crawl, we may even see a resurgence of techniques like adiabatic clocking and asynchronous circuit design as a means of pushing efficiency through design effort.
Ultimately, delivering more and more compute performance while improving energy efficiency is what Arm’s partners ask from us every day, so decarbonising compute makes both commercial and environmental sense. What’s more, failure isn’t an option. There is no Planet B.
It’s a tremendous challenge, and we’re just one part of the puzzle. Luckily, we have computers to help us figure it out.