Demystifying SHAP: The 2024 Guide I Wish I Had for Explainable AI

SHapley Additive exPlanations (SHAP) is a relatively recent method (less than 10 years old) that seeks to explain the decisions behind ML models in a more direct and intuitive way, avoiding “black box” solutions.

The concept is based on game theory and backed by very robust mathematics. However, to use this methodology in day-to-day work, a complete mathematical understanding is not necessary.

For those who want to learn more about how the theory was developed, I recommend reading this paper.

In this article, I will walk through more practical interpretations of SHAP and explain its main outputs.

Now let's get down to business! But for this, we need a model to interpret, right?

The model used here is a tree-based model for binary prediction of diabetes. In other words, the model predicts whether a person has this pathology.

NOTE: This text is based on my notebook at Kaggle. For those who want to see the complete construction of a machine learning model and understand other model interpretation techniques, click here (notebook in Portuguese).

To build this entire analysis, the shap library was used; it was initially maintained by the author of the paper that originated the method and is now maintained by a vast community.

First, let's calculate the SHAP values by following the package's tutorials:

# Libraries
import shap

# Define the explainer based on your model (here, a tree-based model)
explainer = shap.TreeExplainer(model=model)

# Calculate the SHAP values for the training set
shap_values_train = explainer.shap_values(x_train, y_train)

Note that I defined a TreeExplainer. This is because my model is tree-based, so the library has a specific explainer for this family of models. So far, what we have done is:

  • Define an explainer with the desired parameters (TreeExplainer accepts a variety of parameters, and I recommend checking the options)
  • Calculate the SHAP values for the training set

What are SHAP values?

With the SHAP values already calculated for our training set, we can evaluate how each value of each variable influenced in the result achieved by the predictive model.

In our case, we will be evaluating the model output in terms of probability, that is, the percentage X that the model presented to say whether the correct class is 0 (does not have diabetes) or 1 (has diabetes).

It is worth mentioning that this may vary from model to model: if you use an XGBoost model, the default output will not be a probability, unlike a random forest from the sklearn package.

To get the values in terms of probability, you can set this through the TreeExplainer parameters.
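For example, here is a minimal, hedged sketch of asking TreeExplainer for probability-scale outputs (xgb_model is a hypothetical XGBoost classifier, not the random forest used in this article; probability output requires the "interventional" perturbation mode and a background dataset, here x_train):

# Hedged sketch: an explainer configured to return SHAP values on the probability scale
# (xgb_model is a hypothetical XGBoost classifier used only for illustration)
explainer_proba = shap.TreeExplainer(
    model=xgb_model,
    data=x_train,
    feature_perturbation="interventional",
    model_output="probability",
)
shap_values_proba = explainer_proba.shap_values(x_train)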

But the question remains:

How can I interpret the SHAP values?

To answer this, let's calculate the predicted probabilities on the training set and pick a sample that was predicted as positive:

# Predicted probabilities for the training set
y_pred_train_proba = model.predict_proba(x_train)

# Check the result for sample 3
print('Probability of 0 -', 100 * y_pred_train_proba[3][0].round(2), '%.')
print('Probability of 1 -', 100 * y_pred_train_proba[3][1].round(2), '%.')

The result:

Probability of 0 - 17.0 %.
Probability of 1 - 83.0 %.        

The code above generated the probability given by the model for both classes. Let's now visualize the SHAP values for that sample according to the possible classes:

# SHAP values for sample 3 / class 1
shap_values_train[1][3]
# array([-0.01811709,  0.0807582 ,  0.01562981,  0.10591462,  0.11167778,  0.09126282,  0.05179034, -0.10822825])

# SHAP values for sample 3 / class 0
shap_values_train[0][3]
# array([ 0.01811709, -0.0807582 , -0.01562981, -0.10591462, -0.11167778, -0.09126282, -0.05179034,  0.10822825])

Well, so far it seems like a vector of strange numbers, right?

However, if you look more carefully you will notice that the values are the same in absolute value! What differs is the sign, which flips between the two classes.

If we sum the elements of each of these vectors, what do we find for each class?

# SHAP SUM - Class 1
shap_values_train[1][3].sum().round(2) # 0.33

# SHAP SUM - Class 0
shap_values_train[0][3].sum().round(2) # -0.33        

As I said, the values differ only in sign. We can say that the SHAP values for the positive class are the mirror of those for the negative class (in the binary case).
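As a quick sanity check, the two vectors should cancel each other out element by element; a small sketch, reusing shap_values_train from above:

# The per-feature SHAP values of the two classes are symmetric,
# so adding them element-wise gives (numerically) zero everywhere
(shap_values_train[0][3] + shap_values_train[1][3]).round(6)
# array([0., 0., 0., 0., 0., 0., 0., 0.])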

Keep this result in your DDR5 memory so we can evaluate another important factor to help us understand the SHAP values: the base value.

The base value is extracted from the Explainer itself: technically, it is the model's expected (average) output over the training data, and in practice, for our binary classifier, it plays the role of the prediction threshold the model must cross to choose a class as correct.

Binary classification presents this value intuitively: if our model assigns a probability greater than 50% to a class X, it will say that the sample should be classified as X, because the model has more confidence in that class.

In a situation with three classes to predict, the base value would be around 33%. Note that this does not mean that any probability above 33% guarantees that the model selects that class; the class still needs to hold the largest share of the probability, in addition to being above 33%.

Let's take the base values from our Explainer:

# Get base value
expected_values = explainer.expected_value
base_value_1 = expected_values[1]
base_value_0 = expected_values[0]        

Basically (trust me), for our binary classifier, both values are 0.50!
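If you would rather verify this than trust me, remember that the base value is the explainer's expected model output over the training data, so it should be close to the model's mean predicted probability per class; a quick, hedged check:

# The base value is (approximately) the model's average output over the training data,
# so it should be close to the mean predicted probability of each class
print(expected_values)                                      # base values per class
print(model.predict_proba(x_train).mean(axis=0).round(2))   # mean predicted probabilities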

Note that for both classes, the base value is the same, since this is the minimum value required for a model to define that class as correct.

No binary classification model would choose (by default, using these base values) the class with 49% probability over the class with 51%. By default, the predicted class is simply the one with the higher probability.

It is worth mentioning that it is possible to change these base values in the models so that a class has a better chance of being predicted. This tactic can be used, for example, in class imbalance problems where you need to give some kind of advantage to the minority class.
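As an illustration, here is a hedged sketch of moving the decision threshold in favor of the minority class; the 0.35 cutoff below is an arbitrary, hypothetical value, not something taken from this model:

# Instead of the default 0.50 cutoff, classify as positive whenever the
# predicted probability of class 1 exceeds a lower, hand-picked threshold
custom_threshold = 0.35  # hypothetical value, tuned for the imbalance at hand
y_pred_custom = (model.predict_proba(x_train)[:, 1] >= custom_threshold).astype(int)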

Finally, the sum of the SHAP values for a sample is defined as:


SHAP Interpretation: sum(SHAP values for class i) = P_model(class i) - base_value(class i)

Where i refers to the predicted class (0 or 1).

Let's check this in code:

# SHAP sums recovered from the probabilities and base values for sample 3
mdl_proba_1 = y_pred_train_proba[3][1].round(2)   # 0.83
mdl_proba_0 = y_pred_train_proba[3][0].round(2)   # 0.17
base_value_1 = expected_values[1].round(2)        # 0.50
base_value_0 = expected_values[0].round(2)        # 0.50
shap_sum_1 = mdl_proba_1 - base_value_1           # 0.83 - 0.50 = 0.33
shap_sum_0 = mdl_proba_0 - base_value_0           # 0.17 - 0.50 = -0.33

And here is the take-home lesson: the sum of the SHAP values of a class, added to the base value of that class, gives exactly the probability the model produced at the beginning of this section!
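You can check this directly; a quick sketch, reusing shap_values_train and expected_values from above:

# SHAP sum + base value reproduces the predicted probability for sample 3
print((shap_values_train[1][3].sum() + expected_values[1]).round(2))  # 0.83
print((shap_values_train[0][3].sum() + expected_values[0]).round(2))  # 0.17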

But what about the SHAP values individually, what do they represent?

For this, we will use a bit more code, taking the positive class as the reference. That is, we want to understand the meaning of the SHAP values with respect to class 1 (has diabetes).

Let's go back to the code:

# Per-feature SHAP values for sample 3, class 1
for col, vShap in zip(x_train.columns, shap_values_train[1][3]):
    print('###################', col)
    print('SHAP Value:', 100 * vShap.round(2))

The result will be:

################### Pregnancies
SHAP Value: -2.0
################### Glucose
SHAP Value: 8.0
################### BloodPressure
SHAP Value: 2.0
################### SkinThickness
SHAP Value: 11.0
################### Insulin
SHAP Value: 11.0
################### BMI
SHAP Value: 9.0
################### DiabetesPedigreeFunction
SHAP Value: 5.0
################### Age
SHAP Value: -11.0        

Again, we are looking at sample 3 (the analysis would be the same for any sample):

Positive SHAP values, such as those of Glucose, BloodPressure, SkinThickness, Insulin, BMI and DiabetesPedigreeFunction, pushed the model toward predicting the positive class.

In other words, positive values imply a preference towards the reference class.

Negative values, such as those of Age and Pregnancies, on the other hand, push the prediction toward the opposite (negative) class.

In this example, if these two contributions had also been positive, the model would be almost certain of the positive class. Since this did not happen, they are a large part of the reason the model still leaves 17% against choosing the positive class (the same probability shown at the beginning of this article).

Furthermore, we can quantify, as a percentage, the contribution of each variable to the model's final answer by dividing its SHAP value by the maximum possible contribution, which in this binary case is 50%:

################### Pregnancies
Contribution: -4.0 %
################### Glucose
Contribution: 16.0 %
################### BloodPressure
Contribution: 4.0 %
################### SkinThickness
Contribution: 22.0 %
################### Insulin
Contribution: 22.0 %
################### BMI
Contribution: 18.0 %
################### DiabetesPedigreeFunction
Contribution: 10.0 %
################### Age
Contribution: -22.0 %        
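These percentages can be reproduced with a short loop; a minimal sketch, reusing the class-1 SHAP values for sample 3 and the 0.50 maximum contribution (the SHAP value is rounded first, matching the numbers above):

# Contribution (%) of each feature = SHAP value / maximum possible contribution (0.50)
for col, vShap in zip(x_train.columns, shap_values_train[1][3]):
    print('###################', col)
    print('Contribution:', (100 * vShap.round(2) / 0.50).round(1), '%')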

Here we can see that Insulin, SkinThickness, and BMI together account for an influence of 62%. We can also see that the Age variable alone can cancel out the impact of SkinThickness or Insulin in this sample.

Conclusion

In short, you can think about SHAP values as...

Feature contributions that build up the model's final output

Thus:

  • In this binary case, the sum of the SHAP values cannot exceed 50%
  • Positive values (given a reference class) indicate that the reference class is the correct one in the prediction.
  • Negative values (given a reference class) indicate otherwise: the correct class is not the reference one, but another class.

