- This repository provides a summary of recent empirical/human studies that measure human understanding with machine explanations in human-AI interactions.
- We focused on quantitative measures. Based on our survey, we identified three key concepts for measuring human understanding in human-AI decision making: model decision boundary (g), model error (z), and task decision boundary (f). A short description of these three concepts is given at the end.
- In the table, we show each paper's title, the AI models used, whether the model predictions were shown to or hidden from participants, the explanations provided, and whether the study measures any aspect of g, z, or f. ✔️ (or ✖️) indicates that a study measures (or does not measure) a specific aspect of human understanding.
- Papers are organized chronologically, with the most recent studies listed first.
- Check our paper for more details :sunglasses:! Based on the survey, we proposed a cool theoretical framework to describe the relationship between machine explanations and human understanding, which could be helpful for future work.

✨ Feel free to open pull requests to add more papers! You can also contact me at chacha@uchicago.edu 🍻

Paper (Year) | Model | Prediction | Explanations | g | z | f |
---|---|---|---|---|---|---|
Colin et al. What I cannot predict, I do not understand: A human-centered evaluation framework for explainability methods. (2022) | InceptionV1, ResNet | Hidden | Local feature importance (Saliency, Gradient ⊙ Input, Integrated Gradients, Occlusion, SmoothGrad, and Grad-CAM) | ✔️ | ✖️ | ✖️ |
Taesiri et al. Visual correspondence-based explanations improve AI robustness and human-AI team accuracy. (2022) | ResNet, kNN, other deep learning models | Shown | Confidence score, example-based methods (nearest neighbors) | ✖️ | ✔️ | ✔️ |
Kim et al. HIVE: Evaluating the human interpretability of visual explanations. (2022) | CNN, BagNet, ProtoPNet, ProtoTree | Mixed | Example-based methods (ProtoPNet, ProtoTree), local feature importance (Grad-CAM, BagNet) | ✔️ | ✔️ | ✔️ |
Nguyen et al. The effectiveness of feature attribution methods and its correlation with automatic evaluation scores. (2021) | ResNet | Shown | Model uncertainty (classification confidence (or probability)); Local feature importance (gradient-based, salient-object detection model); Example-based methods (prototypes) | ✖️ | ✔️ | ✔️ |
Buçinca et al. To trust or to think: Cognitive forcing functions can reduce overreliance on AI in AI-assisted decision-making. (2021) | Wizard of Oz | Shown | Model uncertainty (classification confidence (or probability)) | ✖️ | ✔️ | ✔️ |
Chromik et al. I think I get your point, AI! The illusion of explanatory depth in explainable AI. (2021) | Decision trees/random forests | Shown | Local feature importance (perturbation-based SHAP) | ✔️ | ✖️ | ✖️ |
Nourani et al. Anchoring bias affects mental model formation and user reliance in explainable AI systems. (2021) | Other deep learning models | Shown | Local feature importance (video features) | ✔️ | ✔️ | ✔️ |
Liu et al. Understanding the effect of out-of-distribution examples and interactive explanations on human-AI decision making. (2021) | Support-vector machines (SVMs) | Shown | Local feature importance (coefficients) | ✔️ | ✔️ | ✔️ |
Wang et al. Are explanations helpful? A comparative study of the effects of explanations in AI-assisted decision-making. (2021) | Logistic regression | Shown | Example-based methods (Nearest neighbor or similar training instances); Counterfactual explanations (counterfactual examples); Global feature importance (permutation-based) | ✔️ | ✔️ | ✖️ |
Poursabzi et al. Manipulating and measuring model interpretability. (2021) | Linear regression | Shown | Presentation of simple models (linear regression); Information about training data (input features or information the model considers) | ✔️ | ✔️ | ✔️ |
Bansal et al. Does the whole exceed its parts? The effect of AI explanations on complementary team performance. (2020) | RoBERTa; Generalized additive models | Shown | Model uncertainty (classification confidence (or probability)); Local feature importance (perturbation-based (LIME)); Natural language explanations (expert-generated rationales) | ✖️ | ✔️ | ✔️ |
Zhang et al. Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. (2020) | Decision trees/random forests | Shown | Model uncertainty (classification confidence (or probability)); Local feature importance (perturbation-based SHAP); Information about training data (input features or information the model considers) | ✖️ | ✔️ | ✔️ |
Abdul et al. COGAM: Measuring and moderating cognitive load in machine learning model explanations. (2020) | Generalized additive models | Shown | Global feature importance (shape function of GAMs) | ✔️ | ✖️ | ✖️ |
Lucic et al. Why does my model fail? contrastive local explanations for retail forecasting. (2020) | Decision trees/random forests | Hidden | Counterfactual explanations (contrastive or sensitive features) | ✔️ | ✖️ | ✖️ |
Lai et al. "Why is 'Chicago' deceptive?" Towards building model-driven tutorials for humans. (2020) | BERT; Support-vector machines | Shown | Local feature importance (attention); Model performance (accuracy); Global example-based explanations (model tutorial) | ✖️ | ✔️ | ✔️ |
Alqaraawi et al. Evaluating saliency map explanations for convolutional neural networks: a user study. (2020) | Convolutional Neural Networks | Hidden | Local feature importance (propagation-based (LRP), perturbation-based (LIME)) | ✔️ | ✖️ | ✖️ |
Carton et al. Feature-based explanations don’t help people detect misclassifications of online toxicity. (2020) | Recurrent Neural Networks | Shown | Local feature importance (attention) | ✖️ | ✔️ | ✔️ |
Hase et al. Evaluating explainable AI: Which algorithmic explanations help users predict model behavior? (2020) | Other deep learning models | Shown | Local feature importance (perturbation-based (LIME)); Rule-based explanations (anchors); Example-based methods (Nearest neighbor or similar training instances); Partial decision boundary (traversing the latent space around a data input) | ✔️ | ✖️ | ✖️ |
Buçinca et al. Proxy tasks and subjective measures can be misleading in evaluating explainable AI systems. (2020) | Wizard of Oz | Mixed | Example-based methods (Nearest neighbor or similar training instances) | ✔️ | ✔️ | ✔️ |
Kiani et al. Impact of a deep learning assistant on the histopathologic classification of liver cancer. (2020) | Other deep learning models | Shown | Model uncertainty (classification confidence (or probability)); Local feature importance (gradient-based) | ✖️ | ✔️ | ✔️ |
Gonzalez et al. Human evaluation of spoken vs. visual explanations for open-domain QA. (2020) | Other deep learning models | Shown | Extractive evidence | ✖️ | ✔️ | ✔️ |
Lage et al. Human evaluation of models built for interpretability. (2019) | Wizard of Oz | Mixed | Rule-based explanations (decision sets) | ✖️ | ✔️ | ✔️ |
Weerts et al. A human-grounded evaluation of SHAP for alert processing. (2019) | Decision trees/random forests | Hidden | Model uncertainty (classification confidence (or probability)); Local feature importance (perturbation-based SHAP) | ✖️ | ✔️ | ✔️ |
Guo et al. "Visualizing uncertainty and alternatives in event sequence predictions." (2019) | Recurrent Neural Networks | Shown | Model uncertainty (classification confidence (or probability)) | ✖️ | ✖️ | ✔️ |
Friedler et al. "Assessing the local interpretability of machine learning models." (2019) | Logistic regression; Decision trees/random forests; Shallow (1- to 2-layer) neural networks | Hidden | Counterfactual explanations (counterfactual examples); Presentation of simple models (decision trees, logistic regression, one-layer MLP) | ✔️ | ✖️ | ✖️ |
Lai et al. "On human predictions with explanations and predictions of machine learning models: A case study on deception detection." (2019) | Support-vector machines | Shown | Example-based methods (Nearest neighbor or similar training instances); Model performance (accuracy) | ✖️ | ✔️ | ✔️ |
Feng et al. "What can AI do for me: Evaluating machine learning interpretations in cooperative play." (2018) | Generalized additive models | Shown | Model uncertainty (classification confidence (or probability)); Global example-based explanations (prototypes) | ✖️ | ✔️ | ✔️ |
Nguyen et al. "Comparing automatic and human evaluation of local explanations for text classification." (2018) | Logistic regression; Shallow (1- to 2-layer) neural networks | Hidden | Local feature importance (gradient-based, perturbation-based (LIME)) | ✔️ | ✖️ | ✖️ |
Ribeiro et al. "Anchors: High-Precision Model-Agnostic Explanations" (2018) | VQA model (hybrid LSTM and CNN) | Hidden | Rule-based explanations (anchors) | ✔️ | ✖️ | ✖️ |
Chandrasekaran et al. "Do explanations make VQA models more predictable to a human?" (2018) | Convolutional Neural Networks | Hidden | Local feature importance (attention, gradient-based) | ✔️ | ✖️ | ✖️ |
Biran et al. "Human-centric justification of machine learning predictions." (2017) | Logistic regression | Shown | Natural language explanations (model-generated rationales) | ✖️ | ✔️ | ✔️ |
Lakkaraju et al. "Interpretable decision sets: A joint framework for description and prediction." (2016) | Bayesian decision lists | Hidden | Rule-based explanations (decision sets) | ✔️ | ✖️ | ✖️ |
Ribeiro et al. "Why Should I Trust You?: Explaining the predictions of any classifier." (2016) | Support-vector machines; Inception neural network | Shown | Local feature importance (perturbation-based (LIME)) | ✔️ | ✖️ | ✖️ |
Bussone et al. "The Role of Explanations on Trust and Reliance in Clinical Decision Support Systems" (2015) | Wizard of Oz | Shown | Model uncertainty (classification confidence (or probability)) | ✖️ | ✔️ | ✖️ |
Lim et al. "Why and why not explanations improve the intelligibility of context-aware intelligent systems." (2009) | Decision trees/random forests | Shown | Rule-based explanations (tree-based explanation); Counterfactual explanations (counterfactual examples) | ✔️ | ✖️ | ✖️ |
- We use a two-dimensional binary classification problem to illustrate the three concepts (a minimal code sketch follows this list).
- Task decision boundary (f), represented by the dashed line, defines the mapping from inputs to ground-truth labels: inputs on the left are positive and those on the right are negative.
- Model decision boundary (g), represented by the solid line, determines the model's predictions.
- Model error (z), the yellow highlighted region between the two boundaries, is where the model's predictions disagree with the ground-truth labels.
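
Below is a minimal Python sketch (not from the paper; the data and the two boundaries are made-up assumptions, purely for illustration) showing how the three concepts relate on a synthetic two-dimensional problem: `f` plays the role of the task decision boundary, `g` the model decision boundary, and `z` the model-error region where the two disagree.

```python
# Toy illustration of the three concepts on synthetic 2-D data.
# The functions f and g below are hypothetical; any ground-truth rule
# and any classifier could stand in for them.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 2))  # 2-D inputs

def f(x):
    """Task decision boundary: ground truth is positive to the left of x1 = 0."""
    return (x[:, 0] < 0).astype(int)

def g(x):
    """Model decision boundary: a slightly tilted, shifted line, so the model is imperfect."""
    return (x[:, 0] + 0.3 * x[:, 1] < 0.1).astype(int)

y_true = f(X)          # ground-truth labels (task decision boundary f)
y_pred = g(X)          # model predictions (model decision boundary g)
z = y_true != y_pred   # model error: inputs on which f and g disagree

print(f"Model accuracy: {(~z).mean():.2%}")
print(f"Fraction of inputs in the error region z: {z.mean():.2%}")
```

Roughly, studies that measure g ask whether people can simulate `g(x)`, studies that measure z ask whether people can tell when `g(x) != f(x)`, and studies that measure f ask whether people can recover the ground-truth label `f(x)` themselves.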