AI systems are increasingly making important decisions for ever-growing numbers of people. These decisions can literally change people’s lives – and as AI systems become more entwined with real-world processes, the decisions they make need to be trustworthy.

The use of black-box AI systems to make these decisions raises basic questions of public trust. In this note, we’ll examine ways to mitigate these issues without losing the many benefits of AI systems.

Types of Black-box Systems

There are two kinds of black-box AI systems.

The first is the proprietary system: its workings are known, but they are held as a trade secret. The appropriate use of policy (corporate or political) can build trust in such systems, and we can consider this a solved problem that requires only the formulation of a good disclosure policy.

The second kind of black-box AI system is one that is too complicated for any human to comprehend; the decisions made by these systems cannot be fully explained, even by the system’s creators.

Even if the decisions made by such an AI system are generally good, a society or business where important decisions are made by a mysterious entity that cannot truly be held to account runs contrary to democratic and corporate ideals of transparency and accountability.

AI Explainability has emerged as a major area of research that is attempting to answer challenging questions posed by the widespread deployment of the second type of black-box AI system.

Explainability Explained

An Explainable AI System is one whose decisions can be explained to a wide range of users, in a way that those users accept as reasonable, correct, and timely.

Before we go further, however, we should answer the fundamental question of why one should even care about explainability.

Let’s say you get a letter tomorrow telling you that your auto insurance rates are going up by 30%. If the letter provides no explanation, or an inadequate one, you are going to call your provider and demand an explanation.

Depending on what they tell you, you can then act: move to another provider, agree to pay the increased rate, or perhaps negotiate a new one.

Thus, people solicit explanations for everyday or “local” events to help them understand why something happened or is about to happen. An explanation empowers them to take actions that can lead to favorable outcomes.

Explainability is a basis for Trust

The lack of an explanation is funny when the AI gives you a head-scratcher of a song recommendation, but similar AIs can be tasked with picking the right candidate for a job or deciding who gets parole. These decisions demand an explanation, and governments across the world are starting to codify this demand into law.

The European Union came close to including a binding “Right to Explanation” in the General Data Protection Regulation (GDPR), but the version adopted in 2016 contains it only in non-binding form.

It is reasonable to assume that future laws will contain stronger versions of similar rights, and businesses need to be prepared to handle increasing demands for transparency and explainability from their customers as well as the government.

There are many benefits to having the explanation for an AI’s decision be generally accepted as reasonable and correct: user confidence and trust in the system will grow, there is a built-in safeguard against accusations of bias, the system can be shown to meet regulatory standards or policy requirements, and the developers of the system can continue to improve its performance.

Building Explainable AI Systems

There are various ways to build Explainable AI Systems. Three commonly used approaches are listed below, but researchers are constantly developing new ones.

  1. Self-explainable models – the model itself serves as the explanation (for example, a small decision tree or a simple linear model). This is the most interpretable and easily understood approach, and it eliminates the issues associated with black-box models.

  2. Global Explainable AI algorithms – these treat the AI model as a black box that can be queried, and use those queries to build a simpler surrogate model that explains the black box’s overall behavior (a minimal sketch of this idea follows this list). This is not as good as a self-explainable model, but it does allow for a higher degree of confidence in the decisions made by the system.

  3. Per-Decision Explainable AI algorithms – these take a black-box model that can be queried, together with a single decision made by that model, and explain why the model made that particular decision (a second sketch below illustrates the idea).
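
To make the second approach concrete, here is a minimal sketch of a global surrogate explainer, assuming Python with NumPy and scikit-learn. The black-box model, the data, and the feature names are all invented for illustration; the pattern is simply to query the opaque model over many inputs and fit a small, readable decision tree to its answers.

    # A minimal sketch of a global surrogate explainer.
    # "black_box" stands in for any model we can only query; the data and
    # feature names below are made up purely for illustration.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)

    # Pretend this opaque model was handed to us; we can only call .predict().
    X_train = rng.normal(size=(1000, 3))
    y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 0).astype(int)
    black_box = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)

    # Step 1: query the black box over a broad sample of inputs.
    X_query = rng.normal(size=(5000, 3))
    y_query = black_box.predict(X_query)

    # Step 2: fit a small, human-readable model to mimic those answers.
    surrogate = DecisionTreeClassifier(max_depth=3).fit(X_query, y_query)

    # Step 3: the surrogate's rules act as a global explanation, and its
    # agreement with the black box tells us how faithful that explanation is.
    fidelity = (surrogate.predict(X_query) == y_query).mean()
    print(f"Surrogate agrees with the black box on {fidelity:.1%} of queries")
    print(export_text(surrogate, feature_names=["feature_0", "feature_1", "feature_2"]))

The fidelity figure matters: a surrogate that disagrees with the black box too often is explaining something other than the system it claims to explain.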

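A per-decision explanation can be sketched in a similar spirit. The toy below is a rough local-surrogate approach, loosely in the style of methods such as LIME: it perturbs a single input, queries a stand-in black box near that input, and fits a small weighted linear model whose coefficients suggest which features drove that one decision. Again, the model and data are invented assumptions, not a real system.

    # A rough sketch of a per-decision (local surrogate) explanation.
    # The black-box model and data are invented purely for illustration.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(1)

    # Stand-in black box that we can only query.
    X = rng.normal(size=(1000, 3))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
    black_box = RandomForestClassifier(n_estimators=100).fit(X, y)

    # The single decision we want explained.
    instance = np.array([0.8, -0.2, 1.5])

    # Query the black box on small perturbations around that instance.
    neighbors = instance + rng.normal(scale=0.3, size=(500, 3))
    scores = black_box.predict_proba(neighbors)[:, 1]

    # Weight the perturbed points by proximity, then fit a simple linear
    # model locally; its coefficients indicate each feature's local influence.
    weights = np.exp(-np.linalg.norm(neighbors - instance, axis=1) ** 2)
    local_model = Ridge(alpha=1.0).fit(neighbors, scores, sample_weight=weights)

    for name, coef in zip(["feature_0", "feature_1", "feature_2"], local_model.coef_):
        print(f"{name}: local influence {coef:+.3f}")
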
Another way of probing the algorithm comes through the use of counterfactuals – “what if” questions that contrast multiple scenarios. For example, if an AI system takes an input of “Play a pop song from the 2000s” and responds with a certain decision, the response to the counterfactual “What if I ask for a rock song from the 2000s?” can give us a better understanding of the inner workings of the system.
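
As a toy illustration of this kind of probing, the sketch below asks a stand-in recommender the original question and the counterfactual one, then compares the answers. The recommend function and its tiny catalogue are hypothetical, invented only to show the pattern of contrasting “what if” queries against a black box.

    # A toy sketch of counterfactual probing: ask the same black box two
    # contrasting questions and compare its answers. The recommend function
    # and its catalogue are hypothetical stand-ins for a real system.
    CATALOGUE = {
        ("pop", "2000s"): "Hey Ya! - OutKast",
        ("rock", "2000s"): "Seven Nation Army - The White Stripes",
        ("pop", "1980s"): "Take On Me - a-ha",
    }

    def recommend(genre: str, decade: str) -> str:
        """Stand-in for a black-box recommender that we can only query."""
        return CATALOGUE.get((genre, decade), "no recommendation")

    def contrast(original: tuple, counterfactual: tuple) -> None:
        """Show how the decision changes when one part of the request changes."""
        print(f"{original} -> {recommend(*original)}")
        print(f"{counterfactual} -> {recommend(*counterfactual)}")

    # Only the genre differs between the two requests, so any difference in
    # the output can be attributed to that single change.
    contrast(("pop", "2000s"), ("rock", "2000s"))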

Should all AI Systems be Explainable?

Given that it is apparently possible to build explainable AI systems, should all AI systems then be explainable?

On the face of it, this would seem like a reasonable thing to do. After all, can’t human beings explain all the decisions that they make?

Or can they?

As it turns out, human-produced explanations can be quite unreliable. Even more surprisingly, for certain kinds of decisions, forcing a person to provide an explanation actually lowers the quality of the decision!

Gaining expertise in a subject makes the decision-making process increasingly unconscious and automatic, and forcing an explanation interferes with this automatic process.

Elite athletes are an obvious example – Roger Federer instinctively knows where to hit a tennis ball to win the point, but if he started to think about it as he hit it, he’d probably make a mess of it (or not, but then he’s Roger Federer).

This is an obvious analogue to certain black-box AI systems that appear to deliver amazing results, and forcing all such systems to be explainable may mean losing much of their value.

The fundamental issue causing unease is the lack of accountability. It is not possible to hold an AI system accountable for a bad decision in the same way that we can hold a human being accountable. In addition, punitive structures that work with humans cannot be made to work with AI systems.

Human beings constantly self-regulate against a complex background of social and moral norms, business rules, and civil and criminal laws. An AI may be built to try to self-regulate in a similar manner, but if it fails there are no real consequences to that failure.

Trying to hold the creators of the system accountable is unjust on the face of it, since beyond a certain point they themselves do not know why their system behaves the way it does.

When is Explainability necessary?

Since we are primarily concerned with the lack of accountability, it would appear that any system where the decision is important enough to demand accountability needs to be explainable.

Systems that can make life-or-death decisions, or decisions that affect people’s lives in serious ways, must make explainable decisions.

This can be regulated through policy (insurance companies in the US must be able to explain how they arrived at a rate decision) or will end up being regulated through social forces (the pushback against self-driving cars and social media engagement algorithms). There is little appetite for a black-box AI to make decisions of this nature.

However, there are many applications where AI systems may not need to be formally explainable. Many industries have decision-making systems that don’t need perfect reproducibility or explainability.

For example, think about using AI to regulate the HVAC system for a large office building. As long as the temperature and humidity are maintained within certain norms, people are unlikely to ask for precise explanations of why the system set the thermostat to a certain temperature – and the system can lead to large savings in energy costs.

Or take the case of a mapping application. As long as it gets you from point A to point B in a safe and mostly reliable manner, you are not likely to ask why it took Daneel Street instead of Giskard Street. The benefits outweigh the consequences even if the system occasionally gets it wrong.

We do not necessarily have to make a trade-off (if one even exists) between utility and explainability.

As we enter the third decade of the 21st century, we can see the shape of the future to come. AI Systems are a real and important part of this future, and methods to increase the trust in these systems can only make them a more valued part of our society.

And perhaps this is a precursor to that distant dream of true Artificial General Intelligence – but that’s a subject for another time!