Martin Cooper chats with Dr Alastair Donaldson about his recent BCS Roger Needham Award and why parallelism is difficult to achieve yet essential to technology’s advancement.

Dr Alastair Donaldson has been awarded the BCS Roger Needham Award for 2017. Dr Donaldson grew up in Glasgow and, though fascinated by computers, he dreamed of becoming a rock drummer. The dream faded as his band folded but, as he packed away his drum kit, his love of computing was rekindled.

After graduating with a degree in computing and maths from Glasgow University, he completed a PhD in model checking. From there Dr Donaldson joined a firm called Codeplay and was introduced to multi-core and many-core processing. His work focussed on creating tooling for the PlayStation 3 games console, and Sony’s Cell processor, he says, made him appreciate parallel programming’s inherent complexity. This work led him to Oxford for a fellowship and, from there, on to Imperial.

ITNOW chatted with Dr Donaldson about his career, his fascination with parallelism in computing and his views on what multi-core can mean for the tech industry and for society.

Parallel processing is very much your hunting ground. Where’s the attraction?

From a young age I have been really fascinated by the fact that machines can do things so quickly, and parallel programming is one way to get very high performance.

But then also, for my PhD work, I got very interested in software correctness and reliability. Parallel programs are much harder to write correctly than sequential programs. There’s a tension between trying to squeeze out the performance that the processor offers and still maintaining a correct program. And one solution to that is to design programming languages that make the task easier - easier to get performance, easier to get correctness and reliability, or ideally both.

And I guess I find that intersection of programming language design - aiming for programmability, aiming for correctness and aiming for performance - really fascinating. That triangle is what makes me love the field so much.

And can you tell us about parallel processing? Why is it so important historically? Why has it evolved?

Parallel processing is really quite old. Some of the very earliest computers had parallel capabilities. Machines with multiple cores or multiple processors have been around for decades. But, for a very long time, you would only program those machines if you were at the absolute cutting-edge of science.

Moving forward... For a long time, advances in semiconductor technology meant that single-core processors would get faster and faster year on year. But in about the mid-2000s this trend stopped because the physical limits were being reached. It wasn’t possible to increase the frequency of single-core processors without them consuming an exorbitant amount of energy.

Since then, multi-core and so-called many-core processors have been the norm. It is hard to buy a single-core processor these days, and if you want high performance, you really have to do multi-core or many-core programming.

Looking at grand-scale, macro-level technology trends like big data and AI, where does multi-core fit into this landscape?

Multi-core is an enabler for those technologies to scale. Modern datacentres have a whole load of nodes - individual machines - and each machine will then have significant processing capability via multi-core processors and media accelerators, such as GPUs. And the challenge from a programming perspective is: ‘how do you map your task over this computational resource?’

Machine learning is very successful in many domains. It has really been enabled by graphics processing units (GPUs), which are these very highly parallel processors, initially designed for graphics but more recently turned to all kinds of computational tasks.

If you train a deep neural network, in some circumstances you can get a thousand-fold increase in performance by using GPUs. This can make something that just wouldn’t have been feasible to do actually become possible.

Does programming need to catch up with the investment in silicon?

There has always been a lag between the latest architectural capabilities and the mainstream programming languages. The dream is to have programming languages that can give decent performance on a parallel architecture by using advanced compilation techniques.

The real problem I see with current compilers for parallel architectures is that the performance is really unpredictable. As a programmer, you can write some code that looks like it should be parallelisable, but maybe, depending on exactly how you’ve expressed it, the compiler won’t be able to figure out that it can be parallelised. The code will then be run sequentially. Performance can be very brittle: small changes to the program can lead to dramatic differences in how it performs.
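A minimal sketch of the kind of brittleness described here (the function names are illustrative, not taken from the interview): the first loop below has fully independent iterations, so an optimising compiler can usually vectorise or parallelise it, while the second differs by only one subscript yet has a loop-carried dependence that forces sequential execution.

```cpp
#include <cstddef>

// Independent iterations: a good candidate for automatic vectorisation or
// parallelisation. Even this can be brittle in practice - the compiler may
// only transform it if it can prove that 'out' and 'in' never alias.
void scale_independent(float* out, const float* in, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * k;
    }
}

// A superficially similar loop with a loop-carried dependence: each iteration
// reads the result of the previous one, so it has to run sequentially.
void scale_dependent(float* data, std::size_t n, float k) {
    for (std::size_t i = 1; i < n; ++i) {
        data[i] = data[i - 1] * k;
    }
}
```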

Is your work really about efficiency?

The kind of efficiency that I am interested in is programmer efficiency. How can you best spend your time writing software? You don’t want to spend all your time trying to reproduce and debug programming errors; you want to spend your time adding new features to the software you’re developing. My work so far is focused more on efficiency from the software developer’s point of view, rather than the efficiency of getting, say, good energy consumption or fast performance from the processor, though I am interested in that too.

What sort of challenges does multi-core bug hunting present?

The main problem stems from non-determinism. When you write a sequential program, if you run it multiple times on the same input, by and large, you get the same results. The trouble with a parallel program is that - either by design or by accident - it may not have that property. It may not be deterministic.

And bugs in the presence of non-determinism are a real problem. You may have a bug that you know exists but, when you try to reproduce it, it doesn’t show up again, even if you run the program 1,000 times. This is what really got me interested in reasoning about parallel programs.
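As a small illustration of this kind of non-determinism (a sketch, not code from the interview): in the C++ program below, two threads increment a shared counter with no synchronisation, which is a data race, so the value printed can change from one run to the next.

```cpp
#include <iostream>
#include <thread>

int main() {
    int counter = 0;  // shared, unprotected state
    auto work = [&counter] {
        for (int i = 0; i < 100000; ++i) {
            ++counter;  // racy read-modify-write (undefined behaviour in C++)
        }
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    // Often prints less than 200000, and the result varies between runs.
    std::cout << counter << "\n";
    return 0;
}
```

Protecting the counter with a mutex, or making it a std::atomic<int>, would remove the race and restore deterministic results.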

How is your research going to make the world a better place? How and why will parallelism change the world?

The dream for my research is reliability and security by default, and efficiency by default too. It would be fantastic if you could write code in a way that matches the problem you’re solving and have it run efficiently over multiple cores. And to do so seamlessly. And it would also be great if we could have safe programming languages that allow us to write parallel code in a way that doesn’t break, or if it does break, have tools that can give us clear indications of what’s wrong. All this would let us fix those problems and move on to more interesting tasks.

What does winning the Needham Award mean to you?

Winning the award is a huge honour. It is personally really humbling to have my work recognised in this way. I know many of the previous awardees, and I am truly honoured to be on the same list as them.

What will the lecture that you will be delivering be about, in a nutshell?

I’m still thinking about exactly what I want it to cover. I am going to try and take a personal journey through what led me to work on this topic, maybe illustrating some of the problems associated with parallel programming, some of the potential solutions and some of the progress that researchers in the field have made.

I want to make the lecture technical enough for the computer scientists who will be there, but also accessible enough that members of the public can hopefully take something away from it, and understand that this is an exciting field to be working in.