What will you be discussing during your Turing lecture speech?
I will be focusing primarily on algorithmic decision-making and how to train learning algorithms to make decisions in a fair way. I will be looking at various notions of fairness. For instance, we would want decisions to be non-discriminatory and without any bias. We would want them to be transparent, and we would want the outcomes to be diverse. These all sound like fine goals, but the key question is: how should we think about this? What does it mean to be 'non-discriminatory', and how do we train algorithms to be non-discriminatory?
How can we train artificial intelligence to be unbiased?
The important thing here is to essentially understand the word unbiased; it’s a word that carries with it a lot of different interpretations. What do we really mean by discrimination? One way in which discrimination can manifest itself is when you’re learning or when you’re training algorithms. Suppose that you try to train an algorithm to make the minimum amount of errors in its predictions over an entire population.
Let's say the algorithm that you're trying to train is one that predicts who is going to recidivate, or reoffend, in the near future. There was a lot of discussion in the news about an algorithm called COMPAS, which was being used in several jurisdictions in the U.S. to help judges or parole officers by giving them an assessment of how likely a criminal defendant was to reoffend in the near future.
Now, suppose you were training that algorithm over some existing historical data. You have some historical data that shows which types of criminals have reoffended in the past, and you want to train your algorithm to pick up patterns in this historical training data. That data would contain some features of the criminals, so you would want to see which types of criminals are more likely to reoffend, as opposed to criminals with different features who might be less likely to reoffend. Those are the kinds of patterns that you would want to train your algorithm to pick up.
Traditionally, in learning, when you try to pick up these patterns you tend to specify an objective function. That objective function typically is of the form: ‘I want to pick up the pattern so that when I make predictions I minimise the sum of the errors that I make for all the individuals in the population.’
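To make that concrete, the objective function being described is, in its simplest generic form, something like the following (a standard empirical-risk formulation given purely as an illustration; the symbols here are not taken from the interview):

```latex
% Minimise the total prediction error over all N individuals
% in the training data.
\min_{\theta} \; \sum_{i=1}^{N} \ell\big(f_{\theta}(x_i),\, y_i\big)
```

Here f_θ is the predictor being trained, x_i are the features of individual i, y_i is the observed outcome (for example, whether that person reoffended), and ℓ counts an error whenever the prediction and the outcome disagree.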
Now, that seems like a very reasonable goal because whenever you make a prediction you might go wrong for certain individuals in the population. It seems like a very reasonable thing to do to minimise the errors you would make in predictions for the entire population. The problem occurs when the population has two different sub-groups of people. Say these correspond to gender or race or whatever else.
When you're making decisions this way, you minimise the sum of the errors for all the individuals in the population. If those individuals belong to two different races, it's quite possible that the learning procedure will end up preferring a decision boundary that makes fewer errors for one group, but at the expense of more errors for the other group.
When you’re making certain decisions, you are essentially trading off between the errors that you might make for different individuals or groups of people in the population. It’s actually quite possible that you would be highly accurate for one sub-group of people and highly inaccurate for another. Now that is what might lead to discrimination.
If you want to be non-discriminatory, you need to specify an additional objective beyond minimising the sum of prediction errors for all the individuals in the population. Unless you specify this to an algorithm, it will end up picking up patterns that could be discriminatory.
In our work we have argued, and shown, how to specify these additional objectives in terms of error rates at the level of groups, so as to avoid learning discriminatory decision-making.
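As a rough illustration of what 'error rates at the level of groups' means, here is a minimal sketch (hypothetical function and variable names, not the authors' actual implementation) that compares a trained classifier's error rate across two sub-groups:

```python
import numpy as np

def group_error_rates(y_true, y_pred, group):
    """Misclassification rate computed separately for each sub-group.

    y_true, y_pred : arrays of 0/1 outcomes and predictions
    group          : array of group identifiers (e.g. 0 / 1 for two groups)
    """
    rates = {}
    for g in np.unique(group):
        mask = (group == g)
        rates[g] = float(np.mean(y_true[mask] != y_pred[mask]))
    return rates

# A fairness-aware training procedure would then add a requirement such as
#   |error_rate(group A) - error_rate(group B)| <= epsilon
# on top of minimising the overall error, rather than optimising
# overall accuracy alone.
```

The point of the sketch is simply that overall accuracy can hide a large gap between the per-group error rates, which is exactly the disparity these additional objectives are meant to control.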
Is it possible to measure bias and unfair decision-making in machines with your methodology?
The short answer is yes, but the more nuanced answer is that coming up with those measures is actually one of the fundamental problems. When we're talking about measuring bias, there are two or three ways in which you could think about it. You could measure the bias in the outcomes, or you could measure it in the procedure itself. If you're considering the problem of measuring the bias in outcomes, the results of the decision-making, then the problem is no different from the one you would have with human decision-makers.
Now, the place where it gets a lot trickier is when you have to reason about the bias of the procedures being used to make decisions. With human decision-makers, you could ask them to explain the intent behind their decisions, or you could look at a particular situation and get a sense of whether the errors they make seem like reasonable ones or seem to be driven by some extreme bias.
These are the sorts of things people have a good sense for when the decision-maker is human. But asking the same questions about algorithmic decision-making, and about the procedures by which learning algorithms make decisions, becomes much trickier, because it raises the question of what the intent of an algorithm is when it makes decisions. That's where you have to think more carefully.
What are the biggest challenges you are currently facing in your line of research?
I think the biggest challenge is the fact that this is an interdisciplinary topic, which requires an understanding of the notions of fairness, accountability and transparency. These are topics that have traditionally been studied in the social sciences, where they are approached in a very different manner from the way people in computer science approach them.
The challenge is we would want to look at these notions of fairness, accountability and transparency through a computational lens, from an algorithmic perspective. This requires us to essentially translate some of these notions in formal ways. That is, we want to be able to say: ‘Well, this idea of non-discrimination that you are considering in this decision-making scenario would translate into this particular pattern of making decisions and that pattern is something that you would want to specify formally in the form of an equation, or in the form of a constraint.’ This translation is actually the most difficult thing to do.
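One generic way such a constraint can be written down, given here purely as an illustration of the kind of translation being described (the groups A and B and the tolerance ε are placeholders):

```latex
% Minimise overall prediction error, subject to the average error rates
% of groups A and B staying within a tolerance \epsilon of each other.
\min_{\theta} \; \sum_{i=1}^{N} \ell\big(f_{\theta}(x_i),\, y_i\big)
\quad \text{subject to} \quad
\left| \mathrm{Err}_A(\theta) - \mathrm{Err}_B(\theta) \right| \le \epsilon
```

where Err_A(θ) and Err_B(θ) denote the average error the predictor makes on the individuals of groups A and B respectively.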
What’s your ultimate goal with this line of research?
How would you formalise accuracy so that you can actually learn to make the most accurate decisions from the data? Today there is really just one objective that people focus on, which is to minimise the sum of the errors made for the individual users in the entire data set.
When we make decisions in the real world, they actually account for a number of different types of objectives beyond that. At a high level, my goal is to explore that rich set of objectives that one should have when making decisions, rather than just hang onto this one single objective; that's what I think is needed to make the decision-making fair. My goal is essentially to explore a different way of thinking about the topic of fairness: what would be a fair way of making decisions.
I feel that today, if you look at how algorithms are trained to make decisions, we are failing to capture the rich set of objectives that people usually have. Currently my research is focused on exploring that space of different objectives, and on figuring out ways to formally specify them to learning algorithms, so that we can train algorithms that are fair according to all those objectives. In short, the aim is to learn how to make fair automated decisions.