All models are wrong... but some are useful

Welcome to 2017. 2016 was an interesting year, wasn’t it? The consequences of last year’s decisions on Brexit/Trump may well dominate 2017 and the impact on the IT industry of a retreat from globalisation is an area of much uncertainty.

Much has been made of how the polls and predictive analytics called both Trump and Brexit wrongly. I have some sympathy with the pollsters: their results were much closer to reality than the poor interpretation in the media. Some of the people I follow did manage to call both Brexit and Trump correctly based on analytics.

At the same time, Michael Gove's claim that people 'have had enough of experts' struck a chord with many. Think about the following questions:

  1. What will be the price of Brent crude oil on 31/12/2018?
  2. Will Putin still be in power in 2020?
  3. Will China enter a recession before 2020?
  4. What will be the average age of a woman at the birth of her first child in England in 2019?
  5. Who will win the UEFA Champions League in 2018-19?
  6. How many iPhones will be shipped globally in 2018?
  7. What will the average house price be in England and Wales at 31/12/2017?

On how many of these do you think your judgement would match an 'expert' judgement? More importantly, which 'expert' would you see as credible? These are big challenges, and each is worthy of a thesis or two, let alone a blog. My focus here is: on which of these problems can big data and analytics do better than a random answer?

Take the last, house prices, as an example. The RICS survey is a widely respected indicator of short-term trends in house prices. It surveys much of the property chain, and from experience its indicator is highly reliable about six months into the future, because activity at the start of the chain translates into sales down the line. Changes in time to sell, and in gaps between asking price and sale price, are very useful indicators of shifts in sentiment.

The real challenge is 'events'. If the supervolcano in Italy blows, or air space is closed by an Icelandic eruption, sentiment can shift suddenly, as it did after 9/11. Furthermore, the assassination of a European political leader, for instance, is unlikely to be considered relevant to six of the seven questions above, at least until it is. I have used George E. P. Box's dictum in this blog before: 'all models are wrong, but some are useful.'

Judgements about the future can be considered as a spectrum from pure luck to insight and expertise. The Saturday night lottery numbers are at one end, but how far can the questions above move away from random towards expertise?

The work of Philip Tetlock, and in particular his book 'Superforecasting', spells out in great detail, with evidence, what we know about forecasting. Here in the UK we often moan about the quality of weather forecasts, but they are among the most reliable forecasts we get in our daily lives.

The pundits in magazines, on TV and in books who are paid large sums for their views are probably no more reliable on the questions above than a reasonable reader of this blog. Is Michael Gove onto something? The evidence strongly points to overconfidence in all areas: doctors forecasting how long a terminally ill cancer patient will live, or economists forecasting inflation levels in the US in 2017, overestimate their own insight.

There is a model, much beloved of futurists, derived from the work of the Oxford philosopher Isaiah Berlin. It describes two types of people: hedgehogs and foxes. A hedgehog has one dominant worldview, a single lens through which everything is seen; the lens may be religious, political or an academic discipline, but a hedgehog follows just one. An animal rights campaigner (a hedgehog) might know nothing about football, yet still know why Leicester City are called the Foxes.

Foxes, by contrast, have a variety of strategies and worldviews. In my experience they are great value in quiz teams. Most people have a mix of both, but in organisational roles the hedgehog tendency tends to dominate.

What Philip Tetlock’s work demonstrates is that in many areas, foxes outperform hedgehogs at forecasting. It isn’t what you think or what you know, but how you think that makes the difference. Importantly, it is possible to improve your forecasting ability.
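Tetlock's work makes forecasting skill measurable: his Good Judgment Project scores forecasters with the Brier score, the squared gap between a probability forecast and the 0-or-1 outcome. A minimal sketch (the function names here are mine, for illustration only):

```python
def brier_score(forecast, outcome):
    """Squared error between a probability forecast (0.0-1.0) and the
    actual outcome (0 or 1). Lower is better: 0.0 is a perfect call,
    and 0.25 is what a permanent 'don't know' forecast of 50% scores."""
    return (forecast - outcome) ** 2

def mean_brier(forecasts_and_outcomes):
    """Average Brier score over many (forecast, outcome) pairs --
    the measure on which Tetlock's 'superforecasters' stand out."""
    pairs = list(forecasts_and_outcomes)
    return sum(brier_score(f, o) for f, o in pairs) / len(pairs)
```

Note how the score punishes overconfidence: a forecaster who said 90 per cent and was right scores 0.01, but one who said 90 per cent and was wrong scores 0.81, far worse than simply admitting uncertainty.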

The societal challenge we face is that machine learning can match and outperform some experts. Algorithms exist that, for instance, are better at predicting whether an individual patient in Intensive Care has a chance of survival than most doctors. Even if this is true, will society accept ending treatment because the computer ‘said no’?

However, my suspicion is that the more accurate the hedgehog worldview, the better and more reliable algorithms and big data will be. The higher the 'fox' content of a problem, the more computationally complex the predictions will become.

So, look again at the questions above. Which do you believe are tractable to predictive analytics? My own view would be 1, 4, 6 and 7. We could probably look at question 5, narrow it down to 12-16 teams in Europe and be confident that the winner would be one of them; many teams, such as Walsall or Wolves, could be safely eliminated. We might be 99 per cent confident that it would be one of our 16, and maybe 50 per cent confident around, say, four of the 16. Even here, a tragedy such as the Munich air disaster or the recent Chapecoense crash could eliminate a clear favourite.

Even with 1, 4, 6 and 7, there will be times when the model we believe in fails to work. A global flu pandemic, for instance, might disrupt all aspects of the economy and society. How many of the seven questions above would have their principal forecasts altered by such an exogenous event?

If experts suffer from overconfidence, how sure can we be that our predictive analytics models won't in time become similarly overconfident, or that we won't become over-reliant on them working? Remember, a turkey survives every day until it doesn't. Each day it survives, conventional wisdom suggests it should grow more confident that it will survive the next. It only has to be wrong once (and it will be). Yet every day the sun rises does give us more confidence that it will rise tomorrow.
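The turkey's misplaced confidence can be made precise with Laplace's classic rule of succession, which turns n successes in n trials into a probability of (n+1)/(n+2) for the next trial. A sketch, with the function name my own:

```python
from fractions import Fraction

def prob_next_success(successes, trials):
    """Laplace's rule of succession: given `successes` out of `trials`,
    the estimated probability that the next trial also succeeds."""
    return Fraction(successes + 1, trials + 2)

# After 1,000 days of being fed, the turkey's survival estimate is
# 1001/1002 -- about 99.9 per cent -- on the very eve of Christmas.
turkey = prob_next_success(1000, 1000)
```

The rule never reaches certainty, which is the right instinct, but the deeper problem stands: the model contains no term at all for the one event that actually matters.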

The moral and ethical issues that this throws up need strong public discourse. Let me illustrate.

How long do you think it will be before we have a reliable way of slowing down the progress of dementia? How long before a cure? How confident would you be with a model derived by machine learning? If you or a loved one were diagnosed with dementia this year how confident would you be that you could live long enough for the treatment to be readily available? Are you prepared to accept the risk to your quality of life should the forecast be wrong?

My forecast for 2017 is that this will be the year of the fox. Can we construct an AI that behaves like a fox and produces multiple forecasts with reliable weightings, e.g. 30 per cent A, 25 per cent B, 40 per cent C, 5 per cent D?
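One way to read that question: a 'fox' AI might pool several hedgehog models, each with its single worldview, into one weighted distribution over outcomes. A minimal sketch of such pooling (a simple linear opinion pool; the model outputs and weights below are invented for illustration):

```python
def fox_forecast(forecasts, weights):
    """Combine several models' probability distributions over the same
    outcomes into one weighted forecast (a linear opinion pool)."""
    total = sum(weights)
    combined = {}
    for forecast, w in zip(forecasts, weights):
        for outcome, p in forecast.items():
            combined[outcome] = combined.get(outcome, 0.0) + p * w / total
    return combined

# Three hypothetical hedgehog models, weighted by past track record.
models = [
    {"A": 0.5, "B": 0.3, "C": 0.2},
    {"A": 0.2, "B": 0.2, "C": 0.6},
    {"A": 0.1, "B": 0.3, "C": 0.6},
]
pooled = fox_forecast(models, weights=[2.0, 1.0, 1.0])
```

If each input distribution sums to one, so does the pooled result; the harder, unsolved part is keeping the weights honest as the models' track records evolve.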

Have a great 2017.

About the author
Chris Yapp is a technology and policy futurologist. He has been in the IT industry since 1980, in roles spanning Honeywell, ICL, HP, Microsoft and Capgemini. He is a Fellow of the BCS and a Fellow of the RSA.

See all posts by Chris Yapp
December 2017
