Transcript of the Karen Spärck Jones lecture

Dame Wendy Hall, Professor of Computer Science, University of Southampton, recently gave this year’s Karen Spärck Jones lecture. This is what she had to say.

Karen was a mentor of mine; she was just a fantastic woman and she wasn’t made a professor until quite late in her career and there are whole stories around this issue. Her work lives on in her inverse document frequency work, which is still used in search engines, like Google; she was a pioneer of information retrieval and this is often forgotten.

Women are often the hidden strength behind many of the big achievements in computing so I think this is a really important lecture.

Ursula Martin emailed me and said: ‘You don’t have to talk about women in computing, you can talk about what you do in computing.’ I always say, as do many women, in any field of work, that we want to be promoted; we want to get all the honours and gongs on merit, but I don’t want to be where I am just because I am a woman. I have been a woman in computing and I have experienced that world as a woman and I have developed this lecture as a woman; not that I’ve done it before, but this is where I will tell you my story as part of the work I’ve done and hopefully we’ll all get something out of it. I also love talking about myself - I’m a media tart, so there we go!

There will be other things as we go along, so relax and enjoy. Actually I’ve always had an interest in the issues of memory and how computers can help enhance people’s memories and I did put this slide together when we were talking about how, increasingly, your life is stored digitally, whether you like it or not.

I was born in 1952, in London, seven years after the Second World War. My father, at 19, went into the Air Force; he was a prisoner of war in Germany for five years. It’s a lovely story though; my mum was in the ATS, they got engaged just before he got shot down and didn’t see each other for five years. He came back to the UK in May 1945 on VE day and they got married 16 days later and that’s what it was like in those days. They were married 62 years before he died of dementia.

I think my generation is a much blessed generation because the world was coming out of that war and I was born to parents who were very humble. That picture is taken in my grandmother’s garden in Walthamstow and behind it you can see bomb damage still left.

They came from very humble roots. I hate that word. They were just ordinary people. Nobody in my family had ever been to university, but as I and my brother came along, we had free education from five years old to 24 and free health service and food, and coming out of the rationing from the war we had everything; and parents who just wanted us to do better than they had done and of course at this end of my career I now am in a position where I’ve had a good job, got a house. I have a pension, which the people coming after us won’t have. So I really think that my generation is very blessed and we should be ever grateful.

Also another thing I say is, choose your partner well. There’s a picture of Peter in a minute, but that’s me on my wedding day, which was the happiest day of my life. This is me with the Queen, when I got my CBE, and this is me with Tony Blair; that’s the first time I shook hands with the Prime Minister. First time I met Tony Blair and, in fact, there’s a story there…

We were in India, and I think I was a professor, but I’d just got my EPSRC Senior Fellowship in 1996 or 97, when he’d just become Prime Minister, around 97/98. It was a tour of India with him, an Anglo-Indian ‘let’s collaborate on science’ tour, and I nearly didn’t go. Peter and I went on holiday to Australia and I was supposed to be coming back from Australia. We had booked our tickets and the day we were due back was the day the flight was going out to India, and I said: ‘I can’t do that, I won’t be able to recover from that and go to India’ and Peter said: ‘no, you should be able to do it, let’s do it.’

I met people on that tour who have helped build my career. People who became chief scientific advisers, CEOs of companies; it was a networking experience, par excellence, and ‘seize the day’, has been my motto ever since.  

Here’s another day when I met David Beckham. Nothing to do with work; we met while we were on holiday in Greece. Anyway it’s a long story, but people ask about that photo, and, yes, he is lovely.

To cut a long story short, I was good at maths at school and went off to read maths and I rebelled a bit and wouldn’t apply for Oxford or Cambridge. I have no idea if they’d have let me in. I went to the red brick one on the south coast (Southampton) and loved it. In fact, in 1974, that’s 40 years ago, and I’m still there. I did leave and come back again and we’re celebrating my 40th anniversary this summer and holding a party for the whole university.

They persuaded me to stay on and do a PhD in pure maths and I just loved it. I’m very happy in the abstract world, which is why I think I have never really enjoyed programming. You can think of programming abstractly, of course, but I’m happier in n-dimensions rather than three. If you see me park a car, you’ll know!  Anyway, that’s a bit stereotypical.  

That’s where I met Pete, and I always say ‘choose your partner well’, and that’s what I meant to say earlier. Whether you’re male or female, if you want to have a career in science, which involves a lot of late night working, if you’re into science engineering, there’s never an end to your job. There’s always something to do: another conference to attend; another student to supervise; another class to teach; exam papers to mark. You never actually finish and, if you’re passionate about what you do, and want to get on, then your partner has to be very tolerant. 

This is the 80s and I couldn’t get a job in pure mathematics. Universities were contracting. I went to a couple of places teaching maths to engineers - there’s lots of stories there. Like the first job interview I went for, a temporary post at a university, which I won’t name; a one-year post, teaching maths to engineers. I walked into the interview with a whole lot of men, mostly professors. I’d just finished my PhD. These were the days when you still had the interviews where you sat in a room waiting for them to give you the answer. The head of the department came and offered the job to a man. I thought that maybe he was better qualified than I was. And then he called me into his office and he said: ‘Wendy, I’d have liked to have given you the job, but the panel wouldn’t give it to you because you are a woman.’ He could think it now, but he couldn’t say it; he could say it in those days. The engineering panel didn’t think I could control a class of engineers and the next week I got a job teaching maths to engineers at Oxford Poly. It was fine.

In the 80s I didn’t want to do this type of maths teaching; I wanted to get into the research world, and then personal computers came out. You will have your favourite one, the BBC B or the Sinclair Spectrum. When I was teaching at a teacher training college in 1982, my boss said: ‘Wendy, I want you to teach programming.’ I thought, oh no, and he said: ‘There’s a Commodore PET in the cupboard; I want you to take it home’. I took it home and taught myself BASIC. And, as Edsger Dijkstra once said: ‘if BASIC is your first programming language you’re mentally mutilated for life’, and I am!

I have programmed and I have taught programming, but I have never really enjoyed it. However, what I really got was the idea that suddenly these computers became interactive and we could put pictures on them, graphics, and then I saw a video on a computer from the analogue video discs, (remember those, and laser discs?), and it was a Domesday disc. We were looking through one day and I saw my mother-in-law in the Domesday book, because all the kids had to send in pictures. I don’t know if any of you can remember, you had to send pictures of your town, village or school etc.; a bit like Google do today. And I just got so excited by this and Southampton was advertising a lectureship, and I decided to apply for it, and I got it and I’ve never looked back since really. I had done a masters at City so I had a qualification in computing.

Now where does this man fit into my career, the Earl Mountbatten of Burma?  Here, this is me in the library in Southampton, in the archives, and this is Lord Mountbatten’s archives in 1987; quite a seminal year for me and the world. His archive arrived at the University of Southampton and we were custodians of the archive and here it is and people say it’s my shoe collection, but that’s actually his archives and there’s me standing there looking at a picture.

We didn’t conceive of a world born digital back then, but what we wanted to do was to digitise, to take papers, and to take photos (there were lots of photos in his archives), audio and some film, and the idea was wouldn’t it be fantastic if people coming to look through these boxes could actually look at this stuff and we could send them a disc - this is what we thought about in those days - or maybe online.

We never did all of that. It was about 250,000 papers and 50,000 photos. Not that big, as we think of multimedia archives on the web today, but in the end it wasn’t the cost of digitising, but the copyright that actually defeated us. We weren’t allowed to make the material available online without lots of clearances.

In 1987 Apple released a program for the Mac called ‘HyperCard’ and it was the year of the first hypertext conference. And I was suddenly into this world. Not only could we have pictures, but we could use computers to link the pictures and the text and the text to the video and I started thinking of hypertext and hypermedia ways of access. Paul (Martynenko) has said that I have always been interested in making things easier and giving people access to find information, so that was the sort of thing happening in 1987.

We built this system called Microcosm, which I won’t go into now, but people say it was the precursor to the web. At the time there were lots of hypertext systems and we were pioneering a new type of hypertext system where the links were held separately from the documents.

I wasn’t going to talk much about this, but when I first saw Tim Berners-Lee demo the web, I was aghast because he embedded the links into the documents and I thought it was such a bad way to do it, and they were only one way, and everybody knew hypertext links had to be two way. Our links could be two way, in the good Ted Nelson genre, but the idea was to separate the links from the documents so each link had a beginning and an end, a source and a destination really, and a description.

The idea was they were triples and the idea was you could reason about the links so you could reason about why objects were connected, and actually this was really the precursor not to the web, but to the semantic web, which was always part of Tim’s original vision, the web of linked data. I had no idea that is what I was doing. The word ‘web’ appeared in Vannevar Bush’s paper. The concept of linked data didn’t exist, but we, for sure, were making links on data when most people were either embedding their links in documents, like Tim, using the SGML type approach, or they were linking at the document-to-document level, but not at a data level.
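
As an editorial illustration of the idea described above, here is a minimal sketch of links held as triples outside the documents. It is not Microcosm's actual data format or API: the `Link` and `Linkbase` names, the file names and the `depicts` and `mentions` descriptions are all hypothetical, chosen only to show how a separate store of (source, destination, description) records allows links to be followed in both directions and queried by the reason for the connection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    # Hypothetical field names, not taken from Microcosm itself.
    source: str       # where the link starts
    destination: str  # where the link ends
    description: str  # why the two objects are connected


class Linkbase:
    """Stores links separately, so the documents themselves are never edited."""

    def __init__(self):
        self._links = []

    def add(self, source, destination, description):
        link = Link(source, destination, description)
        self._links.append(link)
        return link

    def links_from(self, anchor):
        # Follow links forwards from a source.
        return [l for l in self._links if l.source == anchor]

    def links_to(self, anchor):
        # Follow links backwards: possible because links live outside the documents.
        return [l for l in self._links if l.destination == anchor]

    def related_by(self, description):
        # A very simple form of reasoning about why objects are connected.
        return [(l.source, l.destination)
                for l in self._links if l.description == description]


# Illustrative data only.
linkbase = Linkbase()
linkbase.add("letter_1947.txt", "photo_0012.jpg", "depicts")
linkbase.add("diary_1947.txt", "photo_0012.jpg", "depicts")
linkbase.add("letter_1947.txt", "diary_1947.txt", "mentions")

for link in linkbase.links_to("photo_0012.jpg"):
    print(link.source, link.description, link.destination)
```

Because the photo file is never touched, asking what points at it is just a query over the link store; with links embedded in the documents, that question would mean searching every document.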

So we had the most fantastically rich hypertext, but it wasn’t on the network; it ran on a PC, and we used to talk about a ‘distributed’ version of it, but our focus was not making it available on the network, which of course Tim’s was. And that’s what multimedia looked like around that time. It’s now on our phones, on our tablets, and we can walk around the street accessing all the things that we were envisaging back then, but there’s a laser disc round the corner somewhere that was a Microcosm workstation in those days. I have no idea what sort of modem or phone it was, whether it was just a phone or office phone that was parked on the floppy disk. Look at the haircut!

This is all part of weaving the story of women in computing into my career as a woman in computing. In 1987 I was working with Gill Lovegrove, who’s another lady who is very well known in BCS circles, at Southampton. We looked at the class list for October 1987. We had a very new Bachelor of Computing. I think I’d arrived the year the first lot graduated, in 1984, and there were only about 20 people in the class in those days, but we realised that we had three years of undergraduate computer scientists with no women at all, none at all.

And we thought what are we doing wrong? Where have all the girls gone? So we started digging around and somebody (I don’t know who it was, but Carole Goble, who is another well-known lady in computing, was very much to do with it) started a mailing list (email was quite new in those days) called Women in Computing, and we had the first ‘Women in Computing’ conference in Lancaster and we said: ‘haven’t you got any women; where are they all?; where have they all gone?’

I was doing this multimedia work in the 1990s and one of the professors at Southampton called me out in public, in the coffee room one day, and told me that there was no future for me if I carried on, and he wanted me out, because it was not computer science, it was not writing compilers or designing programming languages or anything that we should be doing, and so he wanted me out, basically.

Luckily, I had the support of the head of department, David Barron, who saw that this was the future and supported me to do it, and it was quite a difficult time because I really thought that I should be doing something else with my life. David mentored me through all that and gave me the money, because I didn’t have any grants and things in those days, and he gave me the money to go to conferences in the States, where this certainly was computer science.

I also had a fantastic sabbatical at the University of Michigan in Ann Arbor, where multimedia certainly was computer science. And the other point to make here is that the WIC mailing list was abandoned. This is because some men got on to it and started talking. It wasn’t abusive or trolling or anything we might talk about today, they just started talking about stuff we women didn’t want to talk about. It wasn’t actually very interesting, it was about the details of maths A-levels, and actually we wanted to talk about what it was like to be a woman in an all-male department, and all this stuff.

I am a great believer in sometimes having just women-only networks, but generally speaking, and I’m going to come back at the end to say this, but this is not just a women’s issue, this is for everybody.

So I started then thinking about women in computing. I’ve just put the stats up here. Gill and I, for our paper, looked at UCAS, as it was then, at how many men and women were applying for computing. I haven’t bothered to update because it’s too complicated to work out what’s computer science these days, but just going from 1978, where computing largely came out of maths departments, (sometimes out of physics and electronics departments, but largely out of maths), about a third of the class were women.

There were quite a lot of women in computing in those days. Many women in the industry were running computer departments and the figures for maths were about 2:1 men to women and computing was about the same. See what happens here, where the government was putting lots of money into computing and courses were expanding hugely, as deliberate policy: the number of men goes up absolutely and the number of women goes down absolutely, and relatively. Now what happened here in 1984/85/86?

I say personal computers because there was very little you could do on personal computers when they first came out except programming in BASIC or, if you wanted to be more advanced, assembly language, and playing Space Invaders.

Part 2

There actually wasn’t anything that attracted the average female, they didn’t know much about it because it was done behind closed doors by scientists, and it was like a challenge, a bit like mathematics. The other thing was when the ‘toys for the boys’ came out, every advert for personal computers was targeted at men, often to buy them for their sons.

And the other thing we did, and this was all done with the best of intentions, and I tell this story now because we are potentially going to repeat history. The government had this big campaign to put a computer in every school, which is like putting one telephone in every office or, as Seymour Papert used to say, just imagine you put one pencil in a classroom for a day each week; exactly how will that influence the educational experience of those kids?

So the government had a campaign to put a computer in every secondary school and they also then set up centres of excellence. There was a programme - I don’t know what the programme was called - but teachers who knew something about computing were put into centres to potentially help other teachers to learn about how to teach computing, but they, of course, took teachers out of the classroom, so who did the teaching? The boys, who taught themselves or learnt from their fathers or whatever, and suddenly we get the phenomenon of the boys at the front trying to get on to the one computer, showing off that they could do it and the girls going: ‘I’d rather do something else with my life.’ 

This happened in Europe and, I believe, in some ways, there is a North/South divide in Europe and I think this is why there is a rift between Northern Europe and Southern Europe, and it also happened in America. I honestly think that this set-up of a cultural difference, about ‘computing is for boys’, is something that we never really got over; we have not actually changed that.

And the reason why I say this is because the government has, quite rightly, changed the curriculum in this country to put programming back at the heart of it, putting computer programming on the curriculum and getting all the kids to learn how to program.

It is so, so important that we don’t repeat history. We have to make those classes inclusive, particularly in secondary schools, because in primary schools activities tend to be inclusive, but we really, really need to make sure the way we introduce the new curriculum is done in an inclusive way or we will just repeat the mistakes of past years.

I did my sabbatical; this is when I gave my first paper at a conference. This is the European Conference on Hypertext in 1990, in Paris; that was the conference where I first met one Tim Berners-Lee talking about something he called the ‘Worldwideweb’. Then at Christmas 1990, he put the first site up and he and Robert called it the World Wide Web. And the ACM Hypertext 91 in San Antonio, Texas, was the conference that famously rejected Tim and Robert’s paper on the World Wide Web. It’s not a great paper, not a lot of research in it, it was rejected by the reviewers, but it was a really great idea. They also rejected our second paper on Microcosm.

So what do you do when you have your paper rejected? You submit a poster or a demo; that’s what academics and students do. And so Tim and Robert submitted a demo of the World Wide Web and my team submitted a demo on Microcosm and we all trolled out to San Antonio, Texas, and called it the ‘demo slot’.

There is a picture on the web of Tim demoing the web and I remember looking over his shoulder and watching him demo the web, and of course there was no internet connection so he paid a bit for a modem, but mostly he was showing a simulation of what it would be like and that’s when I first thought: ‘all his links are embedded in documents, this is never going to go anywhere.’ But it’s the network, stupid!

The other thing I remember is, because it was Texas, there was a Tequila fountain in the reception area outside and they were all out there drinking Margaritas. Nobody realised what they were seeing; it was the first demo of the web in the US or close to that anyway.

So we were developing Microcosm and I got my first start-up company experience in 1994. We developed a company (Microcosm Ltd) to sell our hyper-media system, the year that the web started to take off. How clever is that? So people would say: ‘what you’re doing is fascinating and I think it could be very interesting, but this one is free, so we’ll try that first’!

I didn’t personally make a lot of money, but the company did very well. We had £13M of investment and it’s still going. The company, it’s called Active Navigation, and it still uses our linked data service, but it didn’t try and compete with the web, but worked with it and developed it into a different type of company; a bit of a niche company.

I became a professor, the first woman professor in engineering at Southampton. I remember the first meeting I walked into; the first committee meeting was all men. As I walked in I saw a man, an aeronautics professor, chairing the session, and he said: ‘hey lads, there’s a woman in the room, we need to watch what we say.’ What! Are you going to tell dirty jokes? What are you going to do? You learn to get through that.

So anyway, to cut a long story short, I’m going to tell you, in a minute, about the work I do, but this is the career piece. The luckiest break I got was getting into EPSRC, that’s our Research Council. The research fellowship was for six years of really not doing a lot of teaching or research, but that’s when I took the idea of Microcosm and started working with Tim on the web and helped with Tony Hey and grew the department in Southampton, because it was tiny when we first joined and Nigel Shadbolt, who many of you will know here, and Nick Jennings, came to Southampton in 2000. I had that fellowship where I was able to grow my research, grow the reputation of Southampton, and grow the team. 

And also, another lucky break, I became a member of the EPSRC Council, and that’s what got me my CBE. You don’t know these things when you start out; I was just so excited to be on the Council.

I took over as the Chair of BCS Publications and didn’t learn how to say ‘no’ and became Vice President of BCS Publications and, because of that, and being on the ACM Publications Board, I never realised I’d become president of both those organisations and, why did I do that serial presidency? Maybe because I like organising things and people. I got my CBE and then did my stint as head of BCS. 

I have always been a part of women’s networks and while I was president, BCS Women became a specialist group, or maybe it was the year before I became president, but I helped push it through the board. We set up the BCS Women’s Forum, the work of which is now being carried on in another way, but is still being carried on in the BCS.

I can talk for hours about this graph; one of my PhD students, from a few years ago, updates it every year for me. Let’s put up the pattern of growth. Most people don’t realise that Google didn’t emerge until maybe ten years after Tim put the first website up, because you think Google’s always been there; you don’t know how it will work until you’ve built it.

We had to build the web before we knew how to develop the tool to actually make it usable, because you can’t do that experiment the way we’d done previously, with computing, where you do the design and build and test and go round that cycle iteratively with people or with a few groups of people. The web works because millions of people use it over the network. And another big reason is because Tim gave it away. People think Tim is as rich as Croesus, but he never made a penny. He does a few appearances nowadays, but basically he gave the web away and he’s never signed up to any one vendor or company. 

His thesis was an experiment - we can’t actually repeat it because we’ve built a web now and it’s hard to know. Maybe if this web dies we will have to build another one and there’s a scenario there, but basically his thesis was, either everybody uses it or nobody will. We had great difficulty getting people to develop our hypertext. We used to pay people to develop hypertext. It’s really hard to get people to put links in between documents because you didn’t know what to link to what.

The web works because you get a return on investment of what you put into it. If you write something or you link to another site, the whole world can read what you’ve written or follow your link and of course Google follows our links, that’s why Google works, it’s because you get that potential return on investment.

Facebook works because potentially anyone in the world, if you want them to be, can be your friend. Twitter works because potentially you can micro-blog to the whole world and you get that return back if you put in; like everything else, it’s the way it started. If it’s free you get more of a chance of people starting to use it.

Last week somebody gave me a document to be the custodian of. The document was signed in Brussels 20 years ago; June 1994. The document is the agreement between Europe and America to set W3C up, the standards for the web; I have this document. This is what took Tim from Europe. Europe lost the web at that time; the Americans had the money to pay for a chair for Tim at MIT. There are a whole lot of stories on that, around innovation, but I was thinking, wow, this is really a piece of history and it’s in my hands.

The justification provided of why it is so important is that it demonstrates ways of getting access to files on the internet, and there is this new thing called the World Wide Web and last year there were 60 sites and this year, (1994) there are now 800 sites; isn’t that amazing? So if you want to get things going on the web you have to give it away, get people to use it, and then think about how to keep it going, be it a company or not-for-profit or whatever. How are you going to get the money to do that; are you going to charge people? Are you going to do advertising or get sponsors? How are you going to do it? It’s a very, very different business model to the ones we had previously. There are lots of stories about that. 

Think about the .com bubble here, the browser wars, of course, and Amazon, and then we get streaming media; that’s how I got started on the web, the video on the web, wow. I still can’t do video on the web the way we did with the laser video disc. I still can’t really do it. You used to be able to point at something on a particular disc and say: ‘what’s this?’, but it’s really hard to do on the web.

So Amazon starts and people start piling in. ‘We’re going to shop here on the internet and we all want to be as rich as Bill Gates, so let’s invest in this company’, and there have been lots of .com booms since, where people think this is a new idea; let’s put money in and completely over-value the price of the company and inevitably it is going to burst and if you look here, when you think about how you got onto the web in these days, there was no Wi-Fi or broadband in our homes. It was alright if you were in a research lab, or big company, but of course companies didn’t let people use the web inside the firewall in those days.

If you had a computer at home you had to use a modem and it would go whrr and if you wanted to download a web page it would go bmp bmp, clunk, and time out. We used to call it the world-wide-wait! And so you’re actually trying to sell into a market that doesn’t exist, if you’re trying to be a shop on the web, and it’s a bit like after the invention of the printing press trying to sell books to a society that couldn’t read, that was a pretty hard business ask as well, but it’s all happening a lot faster.

And so again this is a sort of a website piece and, if you look back as to why things happen, you weren’t going to get shopping on the web until you could have fast access. Pictures, catalogues, you couldn’t do any of that before the technology arrived, you needed a Google as well to find things and so it was inevitable it was going to burst.

But I remember reading one of the broadsheets at the time, in the US, and there was a very sanctimonious reporter saying: ‘that’s the end of shopping on the internet; we’ve now categorically proved nobody wants to shop on the internet’. Of course they were completely wrong. Nobody wanted to shop on the internet in 1999, but in 2014 we have a situation, which we were always going to have, where we have a problem with the high street and everyone’s shopping on the internet. The whole of the retail world is being turned upside down and very exciting it is too.

The way we need to be able to look forward is to look back and see what’s happened in the past. You’ve got to take the whole context into account. We’ve got lots of people using lots of sites and we’ve got a Facebook and we’ve got a Twitter.

We’re now into the world of social networks and again I can tell you lots of stories. I was talking earlier to someone about ‘why hasn’t MySpace grown instead of Facebook?’ There are lots of stories round that and we set our students challenges to look at these issues and all these things emerge as things that techies write a blog on or on Wikipedia; put your video here, put your photos here, and we all start doing them and isn’t that amazing, and then of course who buys them? The big companies do and it all goes a bit different.

This is where I live here in Hampshire and, in fact, my husband was the Chair of the Parish Council and they voted not to have a website, as it’s a waste of public money and that was last year! That’s the sort of village I live in, but somebody’s put us on Wikipedia, we’re in the Domesday Book; we’re a little village in the Domesday Book, and someone has put us on Wikipedia.

I was doing a bit of research recently and they estimated that if you printed Wikipedia out, (they put the count up every so often), there would be nearly 2,000 volumes, if you measured it in Encyclopaedia Britannica type volumes. It would be about 2,000 of those, so imagine following those links; imagine being in the first one saying right now you go to volume 1,563, page number so and so, it’s the end of that link!

I remember doing lectures about hypertext and trying to explain to people about what hypertext was when you only had paper; quite interesting, that sort of thing.  And the last version of Britannica, that was printed, was 32 volumes. Now we’ve written all that and that’s just the English language version!  We have written all that, we the people, and it’s developed its own governance rules; it’s fascinating, and this is not-for-profit. They ask you to give money. Who knows what the future is, but what has certainly happened is we don’t have any printed Britannica any more. You can still get them, they’re still produced and lovingly edited, but they’re not printed.

Hence when the power goes out, which it will because we don’t have enough of it, it’s not just the heating and lighting that goes out, it’s this. You won’t have a book, even if you have a torch or a candle to look at it and you won’t have a book on your shelf that will give you the information and also other things like the travel timetables and the phone numbers to call British Airways for your flight times, and all that info is now on our mobile phones or online…

Part 3

The industry that used to print these things has gone and I think this is an important thing for us to think about as a society and also think about the fact when Wikipedia first came out, we couldn’t use it in a court of law, although I’m not a lawyer. I don’t know if you can, but IT is taken as the source of all knowledge now, all round the world.

Twitter has grown dramatically and is now a very different beast to the one it was when it started. It saves lives, it tells people what’s happening in emergencies, it’s used for marketing; it fans the flames of celebrity.

I didn’t go in for Twitter at first, but Sue Black was one of the earliest people on Twitter; she was on before it was on the web. And this is when Stephen Fry was famously stuck in a lift in London and he tweeted: ‘I’m stuck in a lift’. He just picked up his iPhone 3G. So the technology enabled him to do that tweet on the move. It may only be a correlation, but Twitter starts to really take off when we could tweet on the move, using our 3G or 4G, or whatever they’re going to be, phones.

Paul (Martynenko) mentioned that one of the things we did in Southampton was to work with Tim to develop the semantic web. The story here is, remember what I said about the web: you have to build it before you know what tools you need to use it. It’s the same with the semantic web.

Part of Tim’s original vision had always been a web of linked data. He talked about it at the first conference; it’s in his book, ‘Weaving the Web’. He called it the semantic web because his idea was that you’d link the data semantically, to explain why the relationships were there, and machines can interpret data as long as it’s described to them using ontologies.

Machines can make inferences about data, as long as you’ve got the description and the language. We can look at a document or a picture and interpret what it’s about. It’s much harder for machines to do that, even with language processing and image processing tools, and it is much quicker for us. But it is really much quicker for machines to make sense of data than it is for us. It’s always been part of Tim’s world to have linked data and his argument is that, by having the web help us find information, we will get knowledge we couldn’t have found any other way.

But what happened was that when people started talking about the semantic web it got picked up by the artificial intelligence community, and the AI community went down what I call an AI rat-hole, where people talk a lot of upside-down As and backward Es, and it’s full of logic. It’s very good work. I’m not knocking it, it’s really good work, but it didn’t help progress the semantic web. It was more about the theory than about building it. What he really needed was to put the data out so we could build it and then we could work out how to use it.

Nigel, Tim and I wrote this paper saying: ‘let’s simplify it all down to its basic principles’. You just give everything a uniform resource identifier (URI), label everything, and use the same protocols. We’ve already got HTTP, so come up with a common standard for your data: RDF, the Resource Description Framework. You can use XML, if you want, but it’s easier in RDF. Then link the data and wonderful things will happen. That is basically Tim’s TED talk. If you look at it on YouTube, he’s saying: ‘you gave me your documents, now give me your data’.
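As a rough illustration of those principles (name everything with a URI, describe facts in a common triple form, then follow the links), here is a minimal Python sketch. It is not taken from the paper or the talk: every URI and property name is a hypothetical placeholder under example.org, and the plain tuples stand in for what a real RDF library would manage.

```python
# Hypothetical identifiers: example.org is reserved for illustration,
# and none of these URIs or property names come from the lecture.
BASE = "http://example.org/"
VILLAGE = BASE + "place/my-village"
HAMPSHIRE = BASE + "place/hampshire"
ENGLAND = BASE + "place/england"
DOMESDAY = BASE + "work/domesday-book"
LOCATED_IN = BASE + "vocab/locatedIn"
RECORDED_IN = BASE + "vocab/recordedIn"

# Each fact is a (subject, predicate, object) triple of URIs.
TRIPLES = [
    (VILLAGE, LOCATED_IN, HAMPSHIRE),
    (HAMPSHIRE, LOCATED_IN, ENGLAND),
    (VILLAGE, RECORDED_IN, DOMESDAY),
]


def to_ntriples(triples):
    """Serialise URI-only triples in the line-based N-Triples style."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def follow(start, predicate, triples):
    """Return everything reachable from start by repeatedly following predicate."""
    found = []
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == predicate and o not in found:
                found.append(o)
                frontier.append(o)
    return found


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
    # A simple inference: the village is in Hampshire, Hampshire is in England,
    # so following the links places the village in England as well.
    print(follow(VILLAGE, LOCATED_IN, TRIPLES))
```

The point of the sketch is only that once two data sets use the same identifiers, a machine can join them up and draw a conclusion that neither one states on its own.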

The semantic web is out there, it’s happening, and people are increasingly using linked data to build all sorts of things and it’s going to change the way we manage data. Of course we’ve also got the open data movement, which was led by Tim and Nigel in the UK and Jim Hendler in the US, to get governments to give data out.

The interesting thing is that the public sector was very slow getting on to the first web and the private sector, industry, very quickly got onto the first web and realised they could make a lot of money out of it. Some didn’t of course, and they are no longer in existence.

But in this world, actually, the public sector has got this quicker. The public sector has the idea of putting data out because a lot of the data they have isn’t private data about us; it’s the number of cars that go down a road or where all the bus stops are in the country and those sorts of things. It’s information about our world and it can be made open without anybody screaming about it. It’s much harder for industry to think of doing that, but they’re getting there; we’re getting there, increasingly, at least in terms of business-to-business sharing of data.

I remember when companies used to say: ‘we’re never going to create a website; why would the public want to read all about us?’ You can’t be a company these days without having an amazing website for people to look at, and it will be the same with data eventually. People will put data out. All the stuff in company reports and so on is public knowledge and could all go out as data. There’s no reason not to, and then we can link it all up and see what happens.

The world of big data, linked data, open data - this has happened because of the web and the internet and it’s also helping us solve all sorts of problems to do with health and energy. The big challenge problems we all need to talk about concern data, data, data...

Web science in five minutes

This part about the research I’ve been involved with is really a web science story. We were talking about this when we were writing the paper about the semantic web, with a guy called Danny Weitzner, who is a lawyer and who worked with Tim at MIT on internet policy. We were talking about how the web of data had evolved and how we could get it to evolve. When we looked back at the graph that I showed you about the web and started to think about how things work, we realised how stupid we had been: actually the whole story was a socio-technical one and it’s about so much more than just the technology.

The web grows because of what we do with it and if we stopped using it, it would stop growing. We built this very complex system, a new type of system; it’s networked, and you can use network science to explain it. But actually this ecosystem we built is socio-technical and you can’t understand it without understanding human behaviour and organisational behaviour, the interventions that lawyers or governments make, the policy decisions, the decisions companies make that affect the economy, and particularly the social psychology of people and social behaviour. We need these disciplines involved with what we’re doing, so we called it web science. There are two things I hate about this name: one is ‘web’ and the other is ‘science’!

Well, as someone said, ‘computer science’ was a misnamed subject too. Social scientists think web science is about something very technical, even though they have science in their name as well, and computer scientists think it’s nothing to do with them because it’s about building websites. We’re sort of stuck with it now. It is like computer science in that it has lots of branches; social media and analytics, for example, are a part of web science. There are lots of different things under that big umbrella that all come together to make up our understanding of this world.

We launched it in 2006. That’s us at MIT; Rod Brooks was the Head of the CS & AI lab at the time. I always put that quote up from Eric Schmidt because it says everything about it. They didn’t give us any money, but it was a lovely quote to put on the press release.

Initially this was an initiative between Southampton and MIT, but we clearly didn’t want to keep it to ourselves. This belongs to the world. So, over time, it evolved into a not-for-profit trust, based in the UK, but with a remit to promote the ideas around the world.

This is a butterfly diagram. We call it ‘the butterfly diagram that Nigel drew on the back of an envelope one day’, I think to try and explain to people how many disciplines we were trying to bring together. There’s a lot missing: education should be there, philosophy should be there. But it is not meant to be comprehensive; it’s just saying there are lots of things you might want to be interested in when you’re studying this phenomenon. You don’t have to know all of them, it’s not the union, but it’s more than that pin-prick of intersection.

I have particularly focused on where computer science meets social science, but also the life sciences. I’m just writing a proposal (why aren’t I retiring? Why am I writing a proposal?). You write a proposal now, which assumes you’re going to be doing that work in three years’ time, oh God!

With the life scientists on board we have a new type of project where we want to try to understand each other’s worlds and where they come together. We run conferences and summer schools and workshops around the world and we have a wonderful network of labs. They’re not all on the map. These are labs we found that were already doing this type of work, so we’re happy to put them on the website.

We don’t fund them, they fund themselves. They get money from their local funding agencies, but we work together. We collaborate on joint projects, setting up curricula, organising the conferences and workshops and I’m away a lot, because I’m travelling around the world visiting some of these labs, and I just have such fun.

Talking about the web in China is a completely different experience, but they have a lot of the same issues as well as a lot of different ones, and when I take students with me it’s an eye-opener. I take UK students to Beijing or Shenzhen and get them to talk to the Chinese students, who love the web and don’t think anything of it. It’s different, but they know it’s different. We have our own issues with government doing things on the web that we don’t like as well.

That’s my building. We’re just about to launch the Web Science Institute at the University of Southampton. My pride and joy: the PhD students. We have one of these lovely EPSRC Doctoral Training Centres in web science and the wonderful thing about web science is that it attracts as many women as men. This is an MSc and PhD programme. They come in from different backgrounds: computer science, yes, but also social science, economics, law, humanities, archaeology, politics and accountancy. These are just some of them, not all of them. I think we have 50 at the moment; this year is our peak number. We just got another CDT funded.

I love working with them; they are, as I say, my pride and joy. My passion piece at the moment is building what we’re calling the Web Observatory to study the web, which means doing web science. There’s a paper in IEEE Intelligent Systems. The idea is to think about how we do this research: how you study the web in a longitudinal manner, and you’ve got to study it all around the world. So I think of it like the physicists building telescopes all round the world: they train them on the stars, they get the pictures, analyse them, share the results and share the tools to do that, so we get a map of what is going on in the heavens.

The environmental scientists do that for climate science. They have people measuring the weather, the depths of the oceans and the rates at which the glaciers are melting, and all that stuff, and collecting it and sharing it. But there are issues in climate science because people say: ‘I spent a lot of money collecting that data; I’m not letting you have it till I publish my paper’. This is the new world of research; it’s all about sharing data.

I want to do this for the web; get all the people that are doing work on the web, including working with companies who have a lot of data that may be very commercially sensitive, very private about people, and governments, to collaborate, to share that data and share the tools they’re using to analyse it. It won’t all be open data and some of it will be very, very closed so we have to use access control on this. There are all sorts of issues of privacy and trust and citing people’s data when you do the research.

But I believe if we do this, it will help us do longitudinal research and we will be able to come up with the evidence that we need to give governments and companies about what works and what doesn’t work, what makes the web good for humanity and what would stop it becoming something very bad that you don’t want to go anywhere near. It’s a big, big project, and an ambitious one: to map the digital universe.

You can read about this in the paper. We’re building this thing; it’s a global effort, as I said. We’re going to have levels of sharing: open, shareable, and very private stuff where you need very strict controls, health data and so on. This is at Southampton, where we’re building what I call our ‘telescope’, with lots of people around the world. I’ve got this vision of a map of the world with lots of flags saying: ‘I’m an observatory’, just like you have lots of meteorologists around the world saying: ‘I’m going to send you the temperature on my bit of the planet today and every day’.

All the labs say: we’re an observatory and here’s the data we’ve got, you can look at this straightaway, this lot you have to get permission to use, and this lot you have to be very careful with what you do with it, and so on.
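One way to picture those three levels of sharing is as a small catalogue in which each data set carries an access tier. The following Python sketch is purely illustrative and is not the Web Observatory’s actual design: the tier names, the `Dataset` fields, the example data sets and the rule that the strictest tier needs both permission and separate vetting are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    """Hypothetical sharing levels, loosely following the three described."""
    OPEN = "open"                    # anyone can look straightaway
    BY_PERMISSION = "by permission"  # the owner must grant access first
    STRICT = "strict"                # permission plus extra vetting, e.g. health data


@dataclass
class Dataset:
    name: str
    tier: Tier
    permitted: set = field(default_factory=set)  # researchers granted permission
    vetted: set = field(default_factory=set)     # researchers cleared for strict data


def can_read(dataset, researcher):
    """Apply the assumed rule for each tier."""
    if dataset.tier is Tier.OPEN:
        return True
    if dataset.tier is Tier.BY_PERMISSION:
        return researcher in dataset.permitted
    return researcher in dataset.permitted and researcher in dataset.vetted


def visible_to(catalogue, researcher):
    """List the names of the data sets this researcher may read."""
    return [d.name for d in catalogue if can_read(d, researcher)]


# Made-up catalogue entries and researcher names, for illustration only.
CATALOGUE = [
    Dataset("bus-stops", Tier.OPEN),
    Dataset("forum-posts", Tier.BY_PERMISSION, permitted={"alice"}),
    Dataset("health-records", Tier.STRICT, permitted={"alice", "bob"}, vetted={"bob"}),
]

if __name__ == "__main__":
    for person in ("alice", "bob", "carol"):
        print(person, visible_to(CATALOGUE, person))
```

A real observatory would also need to record who granted access and how the data should be cited, which this sketch leaves out.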

So we’re building, at Southampton, our version of a ‘telescope’ to enable us to store the data we’ve got, and we want other people to use this technology as well, and to share the tools we’re developing. It’s a repository of data sets, together with analytical tools for using that data, for people who want to look.

So social scientists can look at this without having any programming language experience and can put their commentaries on the back of what’s happening and store it and that’s the important thing. We store it, so that, over time, people can add their interpretation of what happened at that time, why that event happened, what happened as a result, and then we link them all up around the world.

That is an amazingly big concept, but you just have to start somewhere, don’t you? So we got the web science labs beginning to set up observatories. This is big data, but distributed; you can’t do all this on one big server because the data is going to stay where it’s been put, a bit like we did with open access actually. The key thing is having common standards, so that when you publish the data you can share it with people.

So that’s the big thing. What next? The web’s turning 25. I mentioned a lot of these things: privacy, trust, cyber security, internet governance, net neutrality, the fall-out from the Snowden affair and developments in personal data; all raise major issues and interesting things going forward.

I’ve just been put on this Global Commission on Internet Governance and we’ve got the first meeting in Sweden this weekend. There are lots of people talking about how we’re going to manage this thing. It’s going to be like climate change because you’ve got the US, Europe and China as the big players; there’s Brazil and others as well, but they all have different and conflicting interests about what they want to do and it’s going to take a long time to get to any agreement. It’s all about rights and responsibilities, and regulation by the right sort of people: not too much regulation, but you don’t want too little either. Tim is talking about a Magna Carta for the internet, to build the web we want; that’s his big campaign at the moment, and very laudable it is too. So ‘keep calm and trust me, I’m an engineer!’

So how do you make a difference globally? I’m trying to do this in my research world. I don’t know if we’ll succeed or not, but we’ll have some fun along the way.

When I became President of the ACM (Association for Computing Machinery) I realised I was making a difference globally, because we set up the India, Europe and China councils and I got ACM-W a seat on the council. You’re working at a level where you think: if we make a change here, that will affect everything about publishing computer science, or maybe influence other people to make sure that their women’s networks are at the top table, and that sort of thing.

How much difference have we really made? 

I find myself looking at glass walls. So I’m a Dame and I’m a Fellow of the Royal Society. Just look how many women there are in that picture. You can see how many men in suits there are and I’m the lady in the white suit here. It’s not because the Royal Society is sexist, it’s because there aren’t enough women at that level coming through. It’s all about filling the pipeline, getting women onto the ladder and supporting them all the way along, and then making sure they go on to the top.

But I will just point out that this is the book that Newton signed. When you become a Fellow of the Royal Society you sign the same book that Isaac Newton and Boyle and all the founders of the Royal Society signed right at the beginning here; it’s quite amazing, with ink, with a quill-tipped pen.

Where are we in terms of women in computing?

I’ve told you some of my stories; so what’s beginning to work? Athena SWAN is beginning to work in universities. Universities are beginning to step up to the plate and say: ‘we are thinking about women. We get judged and we go for bronze, silver and gold’. And there are a lot of people doing it for themselves.

I remember the young community actually saying: ‘we love coding, we love the stuff, it’s great to be women in computing’, and we really need to encourage that, the grass-roots sort of stuff. We really need to have women’s networks too; they really help us to find our voice. I’m very glad this is a mixed audience; I so often talk to an audience which is mostly female.

It has to become an issue for us all because I keep getting asked to go out and be a role model, and I keep getting asked to do this, that and the other, as a woman in computing, and actually it’s not fair, because I just want to be a woman in computing. I want to be doing computing like the men are. The other thing I said: ‘let’s not repeat the mistakes of the past as programming is reintroduced into the classroom’.

This is me in India talking to the ACM student chapter at an engineering college in Chennai; they were computing students. It’s fifty-fifty, men to women, and in some places there are more women than men. This is the cultural difference. They got to computing later than us, but they don’t have the baggage we had when it was introduced in the 80s, although there’s an issue about women going on into a career; it’s a big cultural thing there. You go to the Middle East and you go to Malaysia and Singapore, not in China so much, but you see lots of women desperate to go into computing and coming top of the class.

And then I went to CeBIT, which is the big IT conference, and I walked around. I kept taking pictures, and it was all men, everywhere I went. Every other panel was all men and the audience was almost always men. Then they asked me to run a meeting for women in IT and I walked into the room and there were no men in the room at all, so we’d been segregated, or we’d segregated ourselves, off into another room. So while we were talking about women in computing, the men were all doing the big deals and talking to the CEOs and being on the panels, and you think: ‘no, this is madness, we really mustn’t do this’. It’s got to be an issue for us all.

So my last slide is...

As the wonderful KSJ said: ‘computing is too important to be left to men’ and that’s not to denigrate men; it’s just too important that only half the population does it and my message today is it’s time for men to start making the sacrifices too. There’s a campaign now going round that if you see an all-male panel men should refuse to be on it, and ask: ‘have you invited a woman?’ Men have got to lead this on behalf of us all and that means making a sacrifice and every time a man asks: ‘have you asked a woman?’ they are giving up their place for a woman. This is so important, and we’ve got to break down this stereotype.
