Jon Crowcroft, the Marconi Professor at the University of Cambridge, spoke to Jutta Mackwell about connecting cars, light bulbs and utilities to the internet, running out of internet addresses, major developments in computer science and why the open source / proprietary debate has become obsolete.

Your talk* is going to be about the ‘internet of things’. Can you tell me a bit more about what exactly this is?

The internet of things is an idea that has been kicking around for about 20 years, which is when we realised that we should be able to connect devices to the internet that don’t just store information or act as interfaces.

One extreme case is being able to turn a light bulb on and off. You can get input from light sensors and control light switches, and it would be quite simple to extend that to the internet in terms of the communication protocols. This is kind of a trivialisation of the idea, because you probably don’t want every light bulb in the world to be addressable by every person.
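
Purely to make the light bulb example concrete, here is a minimal sketch assuming a hypothetical networked light that exposes a simple HTTP/JSON switch endpoint. The host name and the /switch path are invented for illustration and do not refer to any real product.

```python
# Minimal sketch: toggling a hypothetical networked light over HTTP.
# The host name and the /switch endpoint are illustrative, not a real API.
import json
import urllib.request

LIGHT_URL = "http://light-bulb.local/switch"  # hypothetical device address

def set_light(on: bool) -> None:
    """Send the desired switch state to the device as a small JSON body."""
    body = json.dumps({"on": on}).encode("utf-8")
    req = urllib.request.Request(
        LIGHT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        print("device replied:", resp.status)

if __name__ == "__main__":
    set_light(True)   # switch the light on
    set_light(False)  # ...and off again
```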

The point is that the protocol is quite general purpose and the internet can certainly run any protocol. So the idea was that the internet should extend to control and sense things - and it seems that there’s some rapid emergence of this.

For example, you can get a new car and not only does it have a number of computer systems that control things within the car for diagnostic reasons, but more recently you can use wireless access to diagnose the condition of the engine, maybe change settings or decide to replace parts. And cars can also report back continuously.

So the idea is that anything could potentially be addressable on the internet...

Well, interestingly, the timing of this is quite amusing. The original design of the internet was never intended to be more than a prototype, so the first design back in the late 70s / early 80s did not consider the idea that you might have billions of systems connected.

Back in those days, even considering a hundred computers was pretty outlandish and millions was not really something sane people predicted. Now we’re way past this point and what’s ironic is that this very week, we’re about to run out of addresses for things on the internet.

Twenty years ago it was becoming obvious that, after 10 years of growth from 1980 to 1990, the net was approaching one million users, and it was clear there was no reason for it to just stop growing. So now we have this big design problem - it’s not a showstopper, it’s just ironic that it’s happening right now.

There are lots of ways you can solve this problem. For example, you don’t have to have an address for every device in the house or the car. You can have one for the house or one for the car and then manage things within an internal network. And there are very good reasons for that in terms of scoping security and reducing the size of the attack surface - making fewer things visible.
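
As a small illustration of that scoping idea, here is a sketch using Python’s standard ipaddress module: the devices sit in a private (RFC 1918) range behind a home gateway that holds the house’s single public address, so they are not directly visible from outside. The addresses and device names are invented for the example.

```python
# Sketch of the 'one address for the house' idea: devices live on a private
# internal network and only the gateway is visible to the wider internet.
import ipaddress

home_net = ipaddress.ip_network("192.168.1.0/24")  # internal home network
devices = {
    "thermostat": ipaddress.ip_address("192.168.1.10"),
    "light switch": ipaddress.ip_address("192.168.1.11"),
    "video recorder": ipaddress.ip_address("192.168.1.12"),
}

for name, addr in devices.items():
    # is_private is True for RFC 1918 ranges such as 192.168.0.0/16, so these
    # devices are only reachable via the home gateway, not from the internet.
    print(f"{name}: {addr} private={addr.is_private} in_home_net={addr in home_net}")
```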

What other examples are there already for the internet of things?

The most common cases at the moment are home entertainment systems and, obviously, security systems, where you can access alarms and motion sensors.

The other area is large scale: people are connecting not just home or personal devices but also utilities like the power grid, which internally have networks for control purposes. And obviously people monitor roads and transport systems, air traffic and so on. For all those things, if you want to build a network to do that, you can use internet technology because it’s off-the-shelf, it’s cheap and it works.

For more than 20 years we’ve had things like webcams - things that let you look at something, or retrieve the status of a device, and so on. But the big difference now is that you can go and change things.

That sounds like there might be some major security issues...

If I change the setting of my digital video recorder over the internet, I’m not doing much harm. If somebody else does it, it’s not going to do much harm either. If I get it wrong, if they get it wrong, or if they do something humorous, it’s not the end of the world and it’s not life-threatening. But if I turn off a nuclear power station, that could be quite bad, or if I change all the traffic lights.

Bizarrely, the government has suddenly realised the big problem and is worrying about cyber security and cyber warfare. But this has been public knowledge for a very long time. In The Italian Job, the movie, decades ago, they take control of all the traffic lights. It’s not something new.

Moving away from the big utility systems, another example: you could be driving along in your car, with nobody out there apart from some other crummy driver who can’t do you any harm. Well, that’s not true, because somebody could, for example, apply your brakes remotely.

This has actually been demonstrated. People at the University of Washington in Seattle took an ‘off-the-shelf’ car, which just happened to have Wi-Fi, and showed that it was possible to apply the brakes remotely. And that is life-threatening.

So what needs to be taken into consideration when thinking about connecting things to the internet?

When you connect things to the internet, having them controllable is really bad, because you’ve got this huge attack surface. If you have that many people able to get at a database, there’s going to be someone whose motives don’t coincide with yours. For example, there was a proposal to put the health records of all UK citizens into a single database.

The NHS has one million employees. The chance that not one of those people will be subject to bribery by a newspaper to extract people’s health records is zero.

If you give the health records to individuals and have them look after their own data, and then give access to their GP or to the emergency services, that would be very much harder for someone to hack, because you can’t search the whole system. And there are not one million people with access to all these records; there are only three people with access to any given record at any time.
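
To make the contrast concrete, here is a toy sketch of that per-record model: each record carries its own short list of readers (the patient, the GP, emergency services) rather than sitting in one database readable by a million staff. The names and the structure are hypothetical, purely for illustration.

```python
# Sketch of per-record access control: each health record lists the few
# principals allowed to read it, instead of one database readable by
# a million employees. All names here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class HealthRecord:
    patient: str
    data: str
    readers: set = field(default_factory=set)  # who may read this record

    def grant(self, principal: str) -> None:
        self.readers.add(principal)

    def read(self, principal: str) -> str:
        if principal not in self.readers:
            raise PermissionError(f"{principal} may not read {self.patient}'s record")
        return self.data

record = HealthRecord(patient="Alice", data="allergy: penicillin")
record.grant("Alice")                # the patient herself
record.grant("Dr. Patel (GP)")       # her GP
record.grant("Emergency services")   # emergency access

print(record.read("Dr. Patel (GP)"))     # allowed: on the record's reader list
try:
    record.read("Curious journalist")    # not on the list for this record
except PermissionError as err:
    print(err)
```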

There needs to be very serious consideration of how you connect things together and a very serious understanding of whether it is necessary and, if it is, whether it should be done in a fully ‘general internet’ way or not. Controlling the scope of access - managing who, and in what role, is allowed to get at things and perform operations on them - has to be the first point of design.

What fools people about the net is that it looks cool, because you connect everything and it just works. It’s very easy. Most first-year undergraduates can program a web service, and most kids who’ve done any computing and learned to program can do this very quickly.

The problem is that when you build a power grid or design a car, you’re concentrating on the primary business. For a car, that’s price, performance, safety, efficiency and so on. For a grid, high-level resilience might be a consideration.

So you design an appropriate control system for it and that’s done by people who have a very strong understanding of the automotive industry or power systems. And they work with people who have a very good understanding of control systems. But those people really don’t have an understanding of the risk they take when they connect the system to the net.

So is there a way to make the internet of things safe?

There are people who have connected systems to the net without making terrible mistakes. I would claim that the travel industry and banking are big successes. Internet banking has completely taken over for some people, and it’s fast, efficient and relatively safe.

The travel industry has also done a very nice job. There’s obviously less of a security problem in booking your journey because, of course, the bottom line is that you show up and have to identify yourself. But the point is that people have done a very reasonable job.

It is possible to set up secure virtual private networks over the internet, and there are many internet service providers who will set up a virtual private network service that is secure. Denial-of-service attacks, phishing attacks and man-in-the-middle attacks are ruled out. That’s a good solution for things that don’t face the public at all.

The other thing is that, in any of these systems, there are always insider attacks, so the software on any device has to be more and more robust. Again, that’s something where, in recent years, there have been a lot of improvements and advances.

Do you think then that there needs to be a greater push to understand security issues for people who are involved in setting up systems and services?

Most degree programmes I know of have security courses, and they’re quite attractive to people who want to study computer science anyway. It goes back to the origins of computer science and cryptography and amateur code breaking and so on - it’s got some glamour about it. So I think there’s reasonably good training in university teaching.

One of the problems is that some of the businesses I mentioned are not populated by computer scientists. Typically, the people who design the embedded software for a car, a power grid system or a generating system are engineers who learned programming as part of their engineering course, but they are not trained computer scientists or computer engineers. The point is that someone whose primary job is understanding control theory is not someone who knows anything about software vulnerabilities.

This is part of a cultural problem, which is that large organisations have their primary business and people who are skilled in that primary business. If those people have skills in the secondary business, say software systems, it’s assumed they can cover all the territory, but that’s completely wrong. You could certainly argue it the other way round too: if you’re doing a control system for your power grid, you probably wouldn’t get Microsoft in - they’ve got a lot of good software engineers, but they don’t know anything about power grids.

There’s a culture clash, and there are not that many outfits out there that cover the territory. So basically what’s happening is a massive collision between the Titanic and an iceberg, if you like - the iceberg is the internet, two thirds of which is under water with all the bad people clinging to it, and the Titanic is the utility companies or the automotive industry or whatever, which don’t have anyone who understands about carrying enough lifeboats or radios or anything like that.

So if you want to connect a big utility service to the internet, it’s important to have people with security skills do the design and an overall system analysis - a threat analysis - so you get it right by design. One would hope that, if we’re connecting systems that are potentially life-threatening to the net, we would start from a position that doesn’t make such mistakes.

Moving on to some more general questions - how did you get into computer science?

I studied natural sciences in the 70s at Cambridge and the first job I got was as an assistant programmer at North London Polytechnic. I worked for the program advisory group at the Department for Maths and Computing. I was literally paid two and a half times what any of my friends were being paid, so that was quite attractive.

I did that for two years and I thought, I need some training in this. I mean, I could program, but I couldn’t claim I’d been trained in computer science, so I went to do a Masters programme at UCL, which I paid for out of my own money. And then I got hired there as a research assistant.

It was 1981 and UCL was the first place outside the US that was really doing internet research. I did a lot of network research and did a PhD and got to be a lecturer and then moved to Cambridge ten years ago.

If you want the history, the real ancient history is that when I was in primary school in North London I had a maths teacher who used to do programming, and he took the whole class, once a week for a whole term, to the Royal Free Hospital, where they had a PDP-8 computer, and we learned to program it in binary machine code on paper tape.

What do you think are the most important changes in computer science over the last few years?

On the negative side, at least until pretty recently, there has been a continual downturn in the number of people who want to study computer science at school and at GCSE level, in the UK and pretty much around the world, except in China. That was disappointing, though it seems to be just turning up again.

On the plus side, I’ve seen the UK and Europe trying to re-jig their syllabuses to get back to the excitement level of 20-odd years ago, when kids really got into computing because it was exciting. It is even more exciting now, and in more obvious ways - computer games and films and mobile systems and apps for smartphones. There’s a huge number of things that young kids are noticing, but that are not part of the school syllabus.

On the technical side of the discipline, a lot has happened. If you think about a system like Facebook, with 500 million users, which wasn’t around four years ago, or half a dozen Google services - they just work.

So we really have got a lot better at robust software. And one reason is that people are using functional languages more and more, which make for much clearer designs. It’s not the only tool for the job, but it’s a sort of watershed that functional programming is now taught as the first style of programming in most computer science courses in Europe and North America.

The other side of the coin is that we’ve got these program verification tools. Byron Cook at Microsoft started this fantastic work on proving that programs terminate - and that’s really brilliant, because low-level software systems, device drivers and operating systems can now be subjected to these tools, and suddenly you’re in a different world. For the discipline, those are major results.

There are also lots of cool theoretical results, for example in the application of computer science to mathematics - automated proof systems and machine-checked proofs, where the proof is automatically checked by a program. Those are very cool and very nice results.

In my own area - networks and operating systems - we did a lot of virtualisation work, and that’s pretty much done now. The next stage is dealing with programming - what we call multi-scale programming.

We now have desktop systems with multiple cores - not just two or four, but heading for 16, 64, even more - and the parallel programming necessary for that is quite subtle. But we now have much better formal tools in computer science, so programming those systems is becoming more tractable, and dealing with concurrency through transactional memory is becoming doable.
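
As a small illustration of the multicore side (not the transactional memory part), here is a sketch using Python’s standard concurrent.futures to spread an embarrassingly parallel workload across cores; the workload itself is just a stand-in.

```python
# Minimal sketch of spreading an embarrassingly parallel workload across
# cores. Each chunk runs in its own process, so it can occupy its own core.
from concurrent.futures import ProcessPoolExecutor

def sum_of_squares(n: int) -> int:
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    jobs = [2_000_000, 3_000_000, 4_000_000, 5_000_000]
    with ProcessPoolExecutor() as pool:        # one worker process per core
        results = list(pool.map(sum_of_squares, jobs))
    print(results)
```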

The other side of the multi-scale picture is programming large-scale systems over whole datasets. For example, if you want to index a large amount of the web, that’s a lot of processing and you need to do it on a lot of machines in a data centre, continuously. And there are a lot of nice distributed programming languages emerging, so that’s quite fun.
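
And as a toy, single-machine illustration of the pattern behind that kind of large-scale indexing, here is a map/reduce-style word count; real systems shard the same two phases across many machines in a data centre, but the shape of the computation is the same. The documents are invented.

```python
# Toy single-machine sketch of the map/reduce pattern behind large-scale
# indexing: 'map' emits (word, 1) pairs per document, 'reduce' sums the
# counts per word. Real deployments distribute both phases across machines.
from collections import Counter
from itertools import chain

documents = [
    "the internet of things",
    "the internet of everything else",
]

def map_phase(doc: str):
    return [(word, 1) for word in doc.split()]

def reduce_phase(pairs):
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts

all_pairs = chain.from_iterable(map_phase(d) for d in documents)
print(reduce_phase(all_pairs))  # e.g. Counter({'the': 2, 'internet': 2, ...})
```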

For me, those are the robust, simple building blocks for addressing the next challenges. And performance goes on scaling, which people don’t notice. The performance of computers, storage and networking has doubled every 12 to 18 months continuously for 25 years, and everyone just takes it for granted. But that’s an awesome achievement for computer science.

And we’ve done verification of hardware, verification of drivers, verification of applications; signing, self-checking, model checking, provable code - all these things are coming along. Those are pretty awesome, but they’re kind of transparent to the person on the street. People assume that Google voice search is quite easy, or they get their new game, which has got some ridiculously good physics model in it running in real time, but they don’t notice these things.

But we have to keep going. There are only about one billion people in the world with fixed internet access, and most of them are using modest machines at the moment. They’re not doing 3D video or gesture interaction yet, but when they do, that will mean a thousandfold performance improvement. By 2021 you have to see a thousandfold improvement in the storage you have, the speed of interfaces, networking... Most people’s roadmaps run out of steam there.
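
The arithmetic behind that thousandfold figure is just repeated doubling: ten doublings give roughly a factor of 1,000, and at one doubling every 12 to 18 months that takes about 10 to 15 years - which is how you get from 2011 to roughly 2021. A quick check:

```python
# Back-of-the-envelope check of the 'thousandfold by 2021' figure.
doublings = 10
print(2 ** doublings)  # 1024, i.e. roughly a factor of 1,000

for months_per_doubling in (12, 18):
    years = doublings * months_per_doubling / 12
    print(f"{months_per_doubling} months per doubling -> about {years:.0f} years")
```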

Quick questions

Open source or proprietary?
Blimey, that’s not black and white. I don’t believe they’re opposites. For example, you can publish source code with a copyright notice on it. There’s proprietary closed source, proprietary open source, open source that’s documented and open source that’s undocumented; there’s some high-quality open source and some absolute rubbish open source out there, and the same goes for proprietary.

And there are people who will increasingly publish code, or at least let you have code in constrained ways. The big issue for me is in education terms: if you don’t let people have source code, and then you hire them and complain that they can’t program your system when actually they’ve been programming a Linux box or a BSD Unix box, what do you expect?

It’s not obvious to me why closed source happens as much as it does. There’s the argument that, if you open your source, then the bad guys can get it, but the bad guys can do that anyway.

Ross Anderson at Cambridge wrote a nice paper on this in which he analysed the reports on vulnerabilities. He concluded that it’s pretty much a level playing field: the advantage of opening your source - that lots of people can examine it, check it and maybe fix it - is offset by the bad guys seeing stuff sooner, while the proprietary guys have fewer people able to fix it and therefore fewer people able to spot the vulnerabilities ahead of time. It ends up being about the same, as far as anyone can see.

Mobile phones and smartphones have illustrated this again. Publishing the API is the main step. Mobile phones have done a pretty reasonable job at the application level. The low-level stuff is not so good, but there are several complicated reasons for not letting everyone have the low-level driver software on a phone.

One reason is that the internal operating systems are not secured, because they are written in old-fashioned programming languages, and you would be able to examine old memory, which means you could get access to somebody’s private data - their phone book and so on - or even change the radio properties and break the radio network, as a worst case. So there’s a little bit of justifiable paranoia about that.

But on the other hand, you can see that pretty much all the mobile phone platform companies - iPhone, Android, Blackberry - have app stores with gigantic numbers of good applications, written by arbitrary people. My favourite thing on TV last week was a 13-year-old in the middle of nowhere in America who now has the most popular iPhone game. Two million downloads, and he’s just knocked Angry Birds off the top of the list.

He’s an example of the new upcoming generation. He’s not a hacker, he’s obviously a good science student, and when he was interviewed on American TV he said: ‘Somebody suggested I could write a game, so I went to the library and got a book on programming, and I found on the net that everyone recommends a particular physics engine for these games...’ It was great - the way he talked about it, he sounded like someone who’d been in the business for ten years.

So the point is, I think the whole argument is pretty obsolete, I think we’ve stepped past that.

Apple or PC?
We’re a Unix shop at the Computer Lab in Cambridge, so, being a systems person, until about ten years ago I tended to use Linux or BSD Unix. But then Apple finally shipped something that runs a decent operating system, and I can take Unix applications and they just work. So when I buy a Mac and stick it on my desk, it integrates with everything in the lab, and I can run OpenOffice and so on - it all works.

But that’s not the crucial argument: there are six people in my household, and there’s a Windows PC in the house, and no one is using it. I have Mac minis on every floor, each with a nice flat screen, and there’s one Windows machine and it’s been off since Christmas. People only turned it on because somebody got sent a PowerPoint file that wouldn’t view quite right. This is not me - this is a 13-year-old, an 18-year-old, a 21-year-old, my wife, our au pair and my mother. What do they sit down in front of by choice? That’s the user experience...

Blackberry or smartphone?
I use an Android smartphone, but I would happily use a Blackberry. The Blackberry battery life is awesome - absolutely brilliant. And I know some people in Waterloo, Canada, where they build the things, and the designers are very, very good engineers as well. They’re good system designers, which I really like, but they also engineer everything at the lowest level, which is the hardware. Google don’t build the hardware for Android phones and Apple don’t build their hardware either - it’s all subcontracted. Blackberry does everything, from the ground up. And you can tell. So there’s no obvious reason why I don’t use a Blackberry.

Wii, Xbox or Playstation?
I don’t do games. People in my house do, so we have all of them. Have to - no choice!

Geek or nerd?
Neither.

*Jon Crowcroft is the Marconi Professor of Networked Systems in the Computer Laboratory of the University of Cambridge. He is the Principal Investigator in the Computer Lab for the EU Social Networks project, the EPSRC-funded Horizon Digital Economy project, the EPSRC-funded federated sensor nets project FRESNEL (in collaboration with Oxford University), and a new five-year project towards a Carbon Neutral Internet (in collaboration with Leeds University). Jon is a Fellow of the ACM, the BCS, the Royal Academy of Engineering, the IEE and the IEEE. He will be speaking on ‘The Interknot: Unintended Consequences of Internetworking Different Kinds of Networks’ at the Real Time Club on 15 March 2011.