Algorithms to infer local explanations and their generalization to global ones (post-hoc) and algorithms that are transparent by-design
Local to Global Explanations
Building on the experience gained from the surveys (missing reference), we designed and developed several explanation methods, with a focus on local rule-based explainers. A further goal was to “merge” such local explanations so as to reach a global consensus on the reasons for the decisions taken by an AI decision support system.
1.1 Local Rule-based Explainer
Our first proposal is the LOcal Rule-based Explainer (LORE) presented in (Guidotti et al., 2019). LORE is a model-agnostic local explanation method that returns as an explanation a factual rule, revealing the reasons for a decision, together with a set of counterfactual rules, illustrating how to change the classification outcome. The first peculiarity of LORE is that it adopts a synthetic neighborhood generator based on a genetic algorithm. The second is that it adopts a decision tree as a local surrogate, so that (i) decision rules can naturally be derived from a root-to-leaf path in the tree; and (ii) counterfactuals can be extracted by symbolic reasoning over the tree.
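As a minimal sketch of this second peculiarity (illustrative only, not the authors’ implementation), the factual rule can be read off a scikit-learn surrogate tree as the conjunction of split conditions along the root-to-leaf path followed by the instance; here the synthetic neighborhood and black-box labels are toy stand-ins:

```python
# Toy stand-in for LORE's rule extraction: factual rule = root-to-leaf path
# of the instance in a local surrogate decision tree.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 2))                  # synthetic local neighborhood
y = (Z[:, 0] + Z[:, 1] > 0).astype(int)        # toy stand-in for black-box labels
surrogate = DecisionTreeClassifier(max_depth=3).fit(Z, y)

def factual_rule(tree, x):
    """Follow x from the root to a leaf, collecting the split conditions."""
    t, node, premises = tree.tree_, 0, []
    while t.children_left[node] != -1:          # -1 marks a leaf in sklearn trees
        f, thr = t.feature[node], t.threshold[node]
        if x[f] <= thr:
            premises.append(f"x[{f}] <= {thr:.2f}")
            node = t.children_left[node]
        else:
            premises.append(f"x[{f}] > {thr:.2f}")
            node = t.children_right[node]
    outcome = int(np.argmax(t.value[node]))     # majority class at the leaf
    return premises, outcome

premises, outcome = factual_rule(surrogate, np.array([1.0, 1.0]))
print(" AND ".join(premises), "->", outcome)
```

Counterfactual rules would analogously be read from paths ending in leaves of a different class, picking those whose premises require the fewest changes to the instance.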
1.2 Plausible Data-Agnostic Local Explanations
LORE was initially designed to deal with tabular data and binary classification problems. We extended it to work on other data types and on multiclass problems, and we are currently working on some limitations of LORE related to the stability and actionability of its explanations. In line with LIME, our idea was to extend LORE to work on any data type, i.e., to define a local model-agnostic and data-agnostic explanation framework for the decisions taken by obscure black-box classifiers on specific input instances, tied neither to a specific type of data nor to a specific type of classifier. Besides being model-agnostic, LIME is also data-agnostic. However, LIME employs conceptually different neighborhood generation strategies for tabular data, images, and texts. For images, it randomly replaces actual super-pixels with super-pixels of a fixed color; for texts, it randomly removes words. Thus, for both images and texts, LIME “suppresses” part of the actual information in the data. For tabular data, instead, LIME assumes uniform distributions for categorical attributes and normal distributions for continuous ones. These limitations prevent LIME from basing the local regressor used to extract the explanation on meaningful synthetic instances. Our proposal overcomes them by guaranteeing a comparable synthetic data generation process across all data types, ensuring meaningful synthetic instances on which to learn interpretable local surrogate models. The idea is to extend LORE (Guidotti et al., 2019) by exploiting the latent feature space learned through different types of autoencoders to generate plausible synthetic instances during the neighborhood generation process.
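The distributional assumptions that LIME makes for tabular data can be sketched roughly as follows (a hedged caricature of the general scheme, not LIME’s actual code): continuous features are drawn from a normal distribution fitted on the training data, while categorical features are sampled uniformly over their values.

```python
# Rough sketch of LIME-style tabular neighborhood sampling.
import numpy as np

rng = np.random.default_rng(1)
X_train = rng.normal(loc=[5.0, 100.0], scale=[1.0, 20.0], size=(500, 2))
categories = np.array([0, 1, 2])               # values of a categorical feature

def lime_like_neighborhood(n=100):
    # continuous: normal around training mean/std; categorical: uniform
    cont = rng.normal(X_train.mean(axis=0), X_train.std(axis=0), size=(n, 2))
    cat = rng.choice(categories, size=(n, 1))
    return np.hstack([cont, cat])

Z = lime_like_neighborhood()
print(Z.shape)  # (100, 3)
```

Nothing in this sampling ties the synthetic points to the joint distribution of real data, which is precisely the weakness the latent-space generation described above addresses.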
Given an instance of any type classified by a black box, Latent-LORE (LLORE) allows instantiating a data-specific explainer following the explanation framework structure; the explainer returns a meaningful explanation of the classification reasons. LLORE-based approaches work as follows. First, they generate synthetic instances in the latent feature space using a pre-trained autoencoder (e.g., AAE, VAE). Then, they learn a latent decision tree classifier. Next, they select and decode the synthetic instances respecting the latent local decision rules observed on the decision tree. Finally, independently of the data type, they return an explanation that always consists of a set of synthetic exemplar and counter-exemplar instances, i.e., instances classified with the same label and with a different label than the instance to explain, respectively, which may be visually analyzed to understand the reasons for the classification. Additionally, a data-specific explanation can be built on the exemplars and counter-exemplars. We instantiated LLORE for images (Guidotti et al., 2019; Guidotti et al., 2020), time series (Guidotti et al., 2020), and text (Lampridis et al., 2020), realizing ad-hoc logic-based explanations. A wide experimentation on datasets of different types, explaining different black-box classifiers, empirically demonstrates that LLORE-based explainers outperform existing explanation methods, providing meaningful, stable, useful, and truly understandable explanations. In (missing reference) we employed ABELE, the LLORE instantiation for images, in a case study on skin lesion diagnosis, illustrating how it is possible to provide the practitioner with explanations of the decisions of a Deep Neural Network (DNN). We showed that, once customized and carefully trained, ABELE can produce meaningful explanations that genuinely help practitioners. The latent space analysis also suggests an interesting partitioning of the images over the latent space.
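The four steps above can be sketched end-to-end with PCA standing in for the autoencoder (purely illustrative; the actual methods use adversarial or variational autoencoders, and the black box here is a toy random forest):

```python
# Minimal LLORE-style pipeline sketch: encode, perturb in latent space,
# decode, label with the black box, fit a latent surrogate tree.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = (X[:, :3].sum(axis=1) > 0).astype(int)
black_box = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

encoder = PCA(n_components=4).fit(X)           # stand-in "autoencoder"
x = X[0]
z = encoder.transform(x.reshape(1, -1))

# 1) generate synthetic instances around z in the latent space
Z = z + rng.normal(scale=0.5, size=(200, 4))
# 2) decode them and label them with the black box
X_syn = encoder.inverse_transform(Z)
y_syn = black_box.predict(X_syn)
# 3) learn a latent decision tree surrogate
latent_tree = DecisionTreeClassifier(max_depth=3).fit(Z, y_syn)
# 4) exemplars / counter-exemplars: decoded neighbors with same / different label
label = black_box.predict(x.reshape(1, -1))[0]
exemplars = X_syn[y_syn == label]
counter_exemplars = X_syn[y_syn != label]
```

Because the perturbation happens in the latent space and the decoder maps back to the data manifold, the synthetic neighbors stay plausible, unlike pixel- or word-level suppression.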
The same work (missing reference) also reports a survey, involving both real experts in the health domain and laypeople, that supports the hypothesis that explanation methods without a consistent validation are not useful. As these works highlight, in the context of synthetic data generation for local explanation methods it is important to generate data samples located within “local” areas surrounding specific instances. The problem with generative adversarial networks and autoencoders is that they require a large quantity of data and a non-negligible training time; in addition, such generative approaches are suited only to particular types of data. In (Guidotti & Monreale, 2020) we overcame these drawbacks by proposing DAG, a Data-Agnostic neighborhood Generation approach that, given an input instance and a (small) support set, returns a set of local, realistic synthetic instances. DAG applies a data transformation that enables generation for any type of input data. It is based on a set of generative operators inspired by genetic programming: such operators apply specific vector perturbations following a fast procedure that only requires a small set of instances to support the data generation. A wide experimentation on different types of data (tabular data, images, time series, and texts), against state-of-the-art local neighborhood generators, shows the effectiveness of DAG in producing realistic instances independently of the nature of the data.
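In the spirit of such genetic-programming-style operators (the operator names and forms below are illustrative assumptions, not the paper’s operators), new points can be built by interpolating and recombining vectorized instances from a small support set:

```python
# Hedged sketch of support-set-based generative operators.
import numpy as np

rng = np.random.default_rng(2)
support = rng.normal(size=(20, 5))             # small support set
x = support[0]                                  # instance to explain

def interpolate(x, s, alpha=0.3):
    """Move x a fraction alpha toward a support instance s."""
    return x + alpha * (s - x)

def feature_swap(x, s, p=0.2):
    """Copy a random subset of features from s into x."""
    mask = rng.random(x.shape) < p
    return np.where(mask, s, x)

def generate(n=100):
    ops = [interpolate, feature_swap]
    out = []
    for _ in range(n):
        s = support[rng.integers(len(support))]
        op = ops[rng.integers(len(ops))]
        out.append(op(x, s))
    return np.array(out)

Z = generate()
print(Z.shape)  # (100, 5)
```

Because every synthetic point is derived from real instances, no generative model has to be trained, which is what keeps the procedure fast and data-frugal.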
1.3 Local Explanation 4 Health
In order to enable explainable AI systems to support medical decision-making, XAI techniques must be able to deal with the typical characteristics of healthcare data. We incrementally addressed this problem with the contributions presented in (Panigutti et al., 2019; Panigutti et al., 2020). (Panigutti et al., 2019) presents MARLENA (Multi-lAbel RuLe-based ExplaNAtions), a model-agnostic XAI methodology addressing the outcome explanation problem in the context of multi-label black box outcomes. Building on the insights gained from the experiments carried out in (Panigutti et al., 2019), we developed Doctor XAI (Panigutti et al., 2020), a model-agnostic technique that is suitable for multi-label black box outcomes and is also able to deal with ontologically-linked and sequential data. Two key aspects of the approach are that it exploits the ontology in creating the synthetic neighborhood and employs a novel encoder/decoder scheme for sequential data that preserves the interpretability of the features. The ontological perturbation allows us to create synthetic instances that account for local feature interactions, by perturbing the set of neighbors available in the dataset and masking semantically similar features. We tested Doctor XAI in two scenarios. First, we tested the ability of Doctor XAI, combined with a local-to-global approach, to audit a fictional commercial black box. This resulted in FairLens (Panigutti et al., 2021), a framework for auditing clinical decision support systems. FairLens first stratifies the available patient data according to demographic attributes such as age, ethnicity, gender, and healthcare insurance; it then assesses the model performance on such groups, highlighting the most common misclassifications.
Finally, FairLens allows the expert to examine a misclassification of interest by exploiting Doctor XAI to explain which elements of the affected patients’ clinical histories drive the model error in the problematic group. We validated FairLens’ ability to highlight bias in multi-label clinical DSSs by introducing a multi-label-appropriate metric of disparity and demonstrating its efficacy against other standard metrics. Finally, in (missing reference), we presented the collective effort of our interdisciplinary team of data scientists, human-computer interaction experts, and designers to develop a human-centered, explainable AI system for clinical decision support. Using an iterative design approach that involves healthcare providers as end-users, we presented the first prototyping-testing-redesigning cycle of Doctor XAI and its explanation user interface. We first presented the Doctor XAI concept, which stems from patient data and healthcare application requirements. We then developed the initial prototype of the explanation user interface and performed a user study to test its perceived trustworthiness and collect healthcare providers’ feedback. We finally exploited the users’ feedback to co-design a more human-centered XAI user interface, taking into account design principles such as progressive disclosure of information.
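The stratify-and-assess step of the auditing pipeline can be sketched in a few lines (a toy illustration of the idea, not the released tool): group patients by a demographic attribute and compare per-group error rates to surface the groups the model misclassifies most often.

```python
# Toy sketch of FairLens-style stratified error auditing.
import pandas as pd

df = pd.DataFrame({
    "age_group": ["<40", "<40", "40-65", "40-65", ">65", ">65", ">65"],
    "y_true":    [1, 0, 1, 1, 0, 1, 1],
    "y_pred":    [1, 0, 0, 1, 1, 0, 0],
})
df["error"] = (df["y_true"] != df["y_pred"]).astype(int)
# per-group error rate, worst groups first
report = df.groupby("age_group")["error"].mean().sort_values(ascending=False)
print(report)  # >65 has the highest error rate in this toy example
```

The real system works with multi-label diagnosis codes and hands the worst groups to Doctor XAI for explanation, but the stratification logic is essentially this grouped aggregation.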
1.4 Local to Global Approaches
Local explanations enjoy several properties: they are relatively fast and easy to extract, precise, and possibly diverse. Conversely, global explanations are more cumbersome to extract but, having a larger scope, more general. These two families thus present complementary properties. The Local to Global explanation paradigm (Dino et al., 2018; Pedreschi et al., 2019) is a natural extension of the Local and Global paradigms, and aims to exploit the fidelity and ease of extraction of local explanations to generate faithful, general, and simple global explanations. In our work, we have focused on explanations in the form of axis-parallel decision rules, and have proposed two algorithms to tackle this task, namely Rule Relevance Score (RRS) (Setzu et al., 2020) and GLocalX (Setzu et al., 2021). RRS is a simple scoring framework in which we select, rather than edit, the local explanations: with RRS we construct global explanations by selecting local ones. RRS ranks explanations with a multi-faceted scoring formula combining their fidelity, coverage, and outlier coverage, the latter rewarding rules that explain seldom-explained records. GLocalX relies on three assumptions: (i) logical explainability, that is, explanations are best provided in a logical form that can be reasoned upon; (ii) local explainability, that is, regardless of the complexity of the decision boundary of the black box, locally it can be accurately approximated by an explanation; and (iii) composability, that is, local explanations can be composed by leveraging their logical form. Starting from a set of local explanations in the form of decision rules, which constitute the leaves of a hierarchy, GLocalX iteratively merges explanations in a bottom-up fashion, building a hierarchical merge structure that yields global explanations at its top layer.
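The selection flavor of the paradigm (RRS) can be sketched as follows; the scoring formula below is a simplified, hypothetical combination with equal weights, intended only to convey the fidelity/coverage/outlier-coverage idea, not the exact formula of the paper:

```python
# Sketch of RRS-style selection: score local rules, keep the top-k globally.
import numpy as np

rng = np.random.default_rng(3)
n_records, n_rules = 500, 40
# coverage[i, j] = True if rule j covers record i;
# correct[i, j] = True if it also agrees with the black-box label (toy data)
coverage = rng.random((n_records, n_rules)) < 0.15
correct = coverage & (rng.random((n_records, n_rules)) < 0.8)

fidelity = correct.sum(axis=0) / np.maximum(coverage.sum(axis=0), 1)
cov = coverage.mean(axis=0)
# outlier coverage: reward rules covering records few other rules cover
rarity = 1.0 / np.maximum(coverage.sum(axis=1), 1)        # per record
outlier_cov = (coverage * rarity[:, None]).sum(axis=0)
outlier_cov = outlier_cov / outlier_cov.max()

score = fidelity + cov + outlier_cov          # equal weights: an assumption
global_rules = np.argsort(score)[::-1][:10]   # indices of the top-10 rules
```

The selected rule set is then used directly as the global explanation, with no editing of the rules themselves.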
GLocalX shows a unique balance between fidelity and simplicity, achieving state-of-the-art fidelity while yielding small sets of compact global explanations. It also compares favorably with natively global models, such as Decision Trees and CPAR, which have direct access to the whole training data rather than only to the local explanations. It is only slightly less faithful than the most faithful model (~2% less faithful than a Decision Tree) while being far simpler (an output rule set up to one order of magnitude smaller). When compared with models of similar complexity, such as a pruned Decision Tree, GLocalX is slightly more faithful and less complex.
1.5 Towards Interpretable-by-design Models
In parallel with the activity of designing local and global post-hoc explainers, we also started to explore directions for designing predictive models that are interpretable-by-design, i.e., models that return a prediction and allow us to understand the reasons that lead to it. Indeed, if the machine logic is transparent and accessible, we humans tend to trust more a decision process whose logic resembles our own than a reasoning we can inspect but that lies outside the human way of thinking. In (Guidotti & D’Onofrio, 2021) we present MAPIC, a MAtrix Profile-based Interpretable time series Classifier. MAPIC guarantees high levels of accuracy and efficiency while keeping both the classification and the classification model interpretable. In the design of MAPIC we followed the line of research based on shapelets. However, we replaced the inefficient approaches adopted in the state of the art for the search of the most discriminative subsequences with the patterns that can be extracted from a structure named the Matrix Profile. In short, the Matrix Profile (MP) stores, for every subsequence of a time series, the distance to its nearest-neighbor subsequence. From an MP it is possible to efficiently extract patterns characterizing a time series, such as motifs and discords: motifs are subsequences of a time series that are very similar to each other, while discords are subsequences that are very different from any other subsequence. As a classification model, MAPIC adopts a decision tree classifier due to its intrinsic interpretability. We empirically demonstrate that MAPIC outperforms existing approaches of similar interpretability in terms of both accuracy and running time. These last two results, together with GLocalX, make clear the importance of relying on sound decision tree models.
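The Matrix Profile can be illustrated with a brute-force sketch (O(n²m); dedicated libraries compute it far faster): for each z-normalized subsequence, record the distance to its nearest non-trivial neighbor. The minimum of the profile flags a motif, the maximum a discord.

```python
# Brute-force Matrix Profile sketch with motif and discord extraction.
import numpy as np

def znorm(s):
    return (s - s.mean()) / (s.std() + 1e-8)

def matrix_profile(ts, m):
    n = len(ts) - m + 1
    subs = np.array([znorm(ts[i:i + m]) for i in range(n)])
    mp = np.full(n, np.inf)
    for i in range(n):
        for j in range(n):
            if abs(i - j) >= m:               # exclusion zone: skip trivial matches
                d = np.linalg.norm(subs[i] - subs[j])
                mp[i] = min(mp[i], d)
    return mp

ts = np.sin(np.linspace(0, 8 * np.pi, 200))    # periodic signal
ts[100:110] += 3.0                             # inject an anomaly
mp = matrix_profile(ts, m=10)
motif_idx = int(np.argmin(mp))                 # most repeated shape
discord_idx = int(np.argmax(mp))               # most anomalous shape
```

On this toy series the discord lands on the subsequences straddling the injected jump, while the motif falls on the repeating sine shape; MAPIC uses such patterns as candidate discriminative subsequences for the tree's splits.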
A weak point of traditional decision trees is that they are not very stable, and a common procedure to stabilize them is to merge various trees into a single tree. In a certain sense, this is a form of explanation of a set of decision trees with a single model. Several proposals exist in the literature for traditional decision trees, but merging operations for oblique trees and forests of oblique trees are lacking. Thus, in (Bonsignori et al., 2021) we combine XAI and the merging of decision trees. Given any accurate and complex tree-based classifier, our aim is to approximate it with a single interpretable decision tree that guarantees comparable levels of accuracy and a complexity low enough to let us understand the logic it follows for the classification. We propose a Single-tree Approximation MEthod (SAME) that exploits a procedure for merging decision trees, a post-hoc explanation strategy, and a combination of the two to turn any tree-based classifier into a single, interpretable decision tree. Given a certain tree-based classifier, the idea of SAME is to reduce each approximation problem to another one for which a solution is known, in a sort of “cascade of approximations” with several available alternatives. This allows SAME to turn Random Forests, Oblique Trees, and Oblique Forests into a single decision tree. The implementation of SAME required adapting existing procedures for merging traditional decision trees to oblique trees, moving from an intensional approach to an extensional one for efficiency reasons. An experimentation on eight tabular datasets of different sizes and dimensionalities compares SAME against a baseline approach (PHDT) that directly approximates any classifier with a decision tree. We show that SAME is efficient and that the retrieved single decision tree is at least as accurate as the original non-interpretable tree-based model.
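The post-hoc single-tree approximation idea, closer in spirit to the PHDT baseline than to SAME’s merging procedure, can be sketched as follows: relabel the data with the ensemble’s own predictions and fit one decision tree on those labels, measuring fidelity as agreement with the ensemble.

```python
# Sketch of approximating a forest with a single decision tree (PHDT-style).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
y_forest = forest.predict(X)                  # the ensemble's own labels

single_tree = DecisionTreeClassifier(max_depth=5).fit(X, y_forest)
fidelity = (single_tree.predict(X) == y_forest).mean()
print(f"fidelity to the forest: {fidelity:.2f}")
```

SAME’s contribution is to combine this kind of post-hoc step with actual tree-merging procedures in a cascade, so that oblique trees and forests can also be reduced to one axis-parallel tree.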
References
2021
FairLens: Auditing black-box clinical decision support systems
Cecilia Panigutti, Alan Perotti, André Panisson, Paolo Bajardi, and Dino Pedreschi
Highlights: We present a pipeline to detect and explain potential fairness issues in Clinical DSS. We study and compare different multi-label classification disparity measures. We explore ICD9 bias in MIMIC-IV, an openly available ICU benchmark dataset
@article{PPB2021,author={Panigutti, Cecilia and Perotti, Alan and Panisson, André and Bajardi, Paolo and Pedreschi, Dino},doi={10.1016/j.ipm.2021.102657},issn={0306-4573},journal={Information Processing & Management},line={1,4},month=sep,number={5},open_access={Gold},pages={102657},publisher={Elsevier BV},title={FairLens: Auditing black-box clinical decision support systems},visible_on_website={YES},volume={58},year={2021}}
GLocalX - From Local to Global Explanations of Black Box AI Models
Mattia Setzu, Riccardo Guidotti, Anna Monreale, Franco Turini, Dino Pedreschi, and Fosca Giannotti
Artificial Intelligence (AI) has come to prominence as one of the major components of our society, with applications in most aspects of our lives. In this field, complex and highly nonlinear machine learning models such as ensemble models, deep neural networks, and Support Vector Machines have consistently shown remarkable accuracy in solving complex tasks. Although accurate, AI models often are “black boxes” which we are not able to understand. Relying on these models has a multifaceted impact and raises significant concerns about their transparency. Applications in sensitive and critical domains are a strong motivational factor in trying to understand the behavior of black boxes. We propose to address this issue by providing an interpretable layer on top of black box models by aggregating “local” explanations. We present GLocalX, a “local-first” model agnostic explanation method. Starting from local explanations expressed in form of local decision rules, GLocalX iteratively generalizes them into global explanations by hierarchically aggregating them. Our goal is to learn accurate yet simple interpretable models to emulate the given black box, and, if possible, replace it entirely. We validate GLocalX in a set of experiments in standard and constrained settings with limited or no access to either data or local explanations. Experiments show that GLocalX is able to accurately emulate several models with simple and small models, reaching state-of-the-art performance against natively global solutions. Our findings show how it is often possible to achieve a high level of both accuracy and comprehensibility of classification models, even in complex domains with high-dimensional data, without necessarily trading one property for the other. This is a key requirement for a trustworthy AI, necessary for adoption in high-stakes decision making applications.
@article{SGM2021,author={Setzu, Mattia and Guidotti, Riccardo and Monreale, Anna and Turini, Franco and Pedreschi, Dino and Giannotti, Fosca},doi={10.1016/j.artint.2021.103457},issn={0004-3702},journal={Artificial Intelligence},line={1,4},month=may,open_access={Gold},pages={103457},publisher={Elsevier BV},title={GLocalX - From Local to Global Explanations of Black Box AI Models},visible_on_website={YES},volume={294},year={2021}}
Matrix Profile-Based Interpretable Time Series Classifier
Time series classification (TSC) is a pervasive and transversal problem in various fields ranging from disease diagnosis to anomaly detection in finance. Unfortunately, the most effective models used by Artificial Intelligence (AI) systems for TSC are not interpretable and hide the logic of the decision process, making them unusable in sensitive domains. Recent research is focusing on explanation methods to pair with the obscure classifier to recover this weakness. However, a TSC approach that is transparent by design and is simultaneously efficient and effective is even more preferable. To this aim, we propose an interpretable TSC method based on the patterns, which is possible to extract from the Matrix Profile (MP) of the time series in the training set. A smart design of the classification procedure allows obtaining an efficient and effective transparent classifier modeled as a decision tree that expresses the reasons for the classification as the presence of discriminative subsequences. Quantitative and qualitative experimentation shows that the proposed method overcomes the state-of-the-art interpretable approaches.
@article{GD2021,author={Guidotti, Riccardo and D’Onofrio, Matteo},doi={10.3389/frai.2021.699448},issn={2624-8212},journal={Frontiers in Artificial Intelligence},line={1},month=oct,open_access={Gold},publisher={Frontiers Media SA},title={Matrix Profile-Based Interpretable Time Series Classifier},visible_on_website={YES},volume={4},year={2021}}
Deriving a Single Interpretable Model by Merging Tree-Based Classifiers
Valerio Bonsignori, Riccardo Guidotti, and Anna Monreale
Decision tree classifiers have been proved to be among the most interpretable models due to their intuitive structure that illustrates decision processes in form of logical rules. Unfortunately, more complex tree-based classifiers such as oblique trees and random forests overcome the accuracy of decision trees at the cost of becoming non interpretable. In this paper, we propose a method that takes as input any tree-based classifier and returns a single decision tree able to approximate its behavior. Our proposal merges tree-based classifiers by an intensional and extensional approach and applies a post-hoc explanation strategy. Our experiments shows that the retrieved single decision tree is at least as accurate as the original tree-based model, faithful, and more interpretable.
@inbook{BGM2021,author={Bonsignori, Valerio and Guidotti, Riccardo and Monreale, Anna},booktitle={Discovery Science},doi={10.1007/978-3-030-88942-5_27},isbn={9783030889425},issn={1611-3349},line={1,2},open_access={NO},pages={347–357},publisher={Springer International Publishing},title={Deriving a Single Interpretable Model by Merging Tree-Based Classifiers},visible_on_website={YES},year={2021}}
2020
Explaining Image Classifiers Generating Exemplars and Counter-Exemplars from Latent Representations
Riccardo Guidotti, Anna Monreale, Stan Matwin, and Dino Pedreschi
Proceedings of the AAAI Conference on Artificial Intelligence, Apr 2020
We present an approach to explain the decisions of black box image classifiers through synthetic exemplar and counter-exemplar learnt in the latent feature space. Our explanation method exploits the latent representations learned through an adversarial autoencoder for generating a synthetic neighborhood of the image for which an explanation is required. A decision tree is trained on a set of images represented in the latent space, and its decision rules are used to generate exemplar images showing how the original image can be modified to stay within its class. Counterfactual rules are used to generate counter-exemplars showing how the original image can “morph” into another class. The explanation also comprehends a saliency map highlighting the areas that contribute to its classification, and areas that push it into another class. A wide and deep experimental evaluation proves that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability, besides providing the most useful and interpretable explanations.
@article{GMM2020,author={Guidotti, Riccardo and Monreale, Anna and Matwin, Stan and Pedreschi, Dino},doi={10.1609/aaai.v34i09.7116},issn={2159-5399},journal={Proceedings of the AAAI Conference on Artificial Intelligence},line={1,4},month=apr,number={09},open_access={NO},pages={13665–13668},publisher={Association for the Advancement of Artificial Intelligence (AAAI)},title={Explaining Image Classifiers Generating Exemplars and Counter-Exemplars from Latent Representations},visible_on_website={YES},volume={34},year={2020}}
Explaining Any Time Series Classifier
Riccardo Guidotti, Anna Monreale, Francesco Spinnato, Dino Pedreschi, and Fosca Giannotti
In 2020 IEEE Second International Conference on Cognitive Machine Intelligence (CogMI), Oct 2020
We present a method to explain the decisions of black box models for time series classification. The explanation consists of factual and counterfactual shapelet-based rules revealing the reasons for the classification, and of a set of exemplars and counter-exemplars highlighting similarities and differences with the time series under analysis. The proposed method first generates exemplar and counter-exemplar time series in the latent feature space and learns a local latent decision tree classifier. Then, it selects and decodes those respecting the decision rules explaining the decision. Finally, it learns on them a shapelet-tree that reveals the parts of the time series that must, and must not, be contained for getting the returned outcome from the black box. A wide experimentation shows that the proposed method provides faithful, meaningful and interpretable explanations.
@inproceedings{GMS2020,author={Guidotti, Riccardo and Monreale, Anna and Spinnato, Francesco and Pedreschi, Dino and Giannotti, Fosca},booktitle={2020 IEEE Second International Conference on Cognitive Machine Intelligence (CogMI)},doi={10.1109/cogmi50398.2020.00029},line={1},month=oct,open_access={NO},pages={167–176},publisher={IEEE},title={Explaining Any Time Series Classifier},visible_on_website={YES},year={2020}}
Explaining Sentiment Classification with Synthetic Exemplars and Counter-Exemplars
Orestis Lampridis, Riccardo Guidotti, and Salvatore Ruggieri
We present xspells, a model-agnostic local approach for explaining the decisions of a black box model for sentiment classification of short texts. The explanations provided consist of a set of exemplar sentences and a set of counter-exemplar sentences. The former are examples classified by the black box with the same label as the text to explain. The latter are examples classified with a different label (a form of counter-factuals). Both are close in meaning to the text to explain, and both are meaningful sentences – albeit they are synthetically generated. xspells generates neighbors of the text to explain in a latent space using Variational Autoencoders for encoding text and decoding latent instances. A decision tree is learned from randomly generated neighbors, and used to drive the selection of the exemplars and counter-exemplars. We report experiments on two datasets showing that xspells outperforms the well-known lime method in terms of quality of explanations, fidelity, and usefulness, and that is comparable to it in terms of stability.
@inbook{LGR2020,author={Lampridis, Orestis and Guidotti, Riccardo and Ruggieri, Salvatore},booktitle={Discovery Science},doi={10.1007/978-3-030-61527-7_24},isbn={9783030615277},issn={1611-3349},line={1},open_access={NO},pages={357–373},publisher={Springer International Publishing},title={Explaining Sentiment Classification with Synthetic Exemplars and Counter-Exemplars},visible_on_website={YES},year={2020}}
Data-Agnostic Local Neighborhood Generation
Riccardo Guidotti and Anna Monreale
In 2020 IEEE International Conference on Data Mining (ICDM), Nov 2020
Synthetic data generation has been widely adopted in software testing, data privacy, imbalanced learning, machine learning explanation, etc. In such contexts, it is important to generate data samples located within “local” areas surrounding specific instances. Local synthetic data can help the learning phase of predictive models, and it is fundamental for methods explaining the local behavior of obscure classifiers. The contribution of this paper is twofold. First, we introduce a method based on generative operators allowing the synthetic neighborhood generation by applying specific perturbations on a given input instance. The key factor consists in performing a data transformation that makes applicable to any type of data, i.e., data-agnostic. Second, we design a framework for evaluating the goodness of local synthetic neighborhoods exploiting both supervised and unsupervised methodologies. A deep experimentation shows the effectiveness of the proposed method.
@inproceedings{GM2020,author={Guidotti, Riccardo and Monreale, Anna},booktitle={2020 IEEE International Conference on Data Mining (ICDM)},doi={10.1109/icdm50108.2020.00122},issn={2374-8486},line={1},month=nov,open_access={NO},pages={1040–1045},publisher={IEEE},title={Data-Agnostic Local Neighborhood Generation},visible_on_website={YES},year={2020}}
Doctor XAI: an ontology-based approach to black-box sequential data classification explanations
Cecilia Panigutti, Alan Perotti, and Dino Pedreschi
In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Jan 2020
Several recent advancements in Machine Learning involve blackbox models: algorithms that do not provide human-understandable explanations in support of their decisions. This limitation hampers the fairness, accountability and transparency of these models; the field of eXplainable Artificial Intelligence (XAI) tries to solve this problem providing human-understandable explanations for black-box models. However, healthcare datasets (and the related learning tasks) often present peculiar features, such as sequential data, multi-label predictions, and links to structured background knowledge. In this paper, we introduce Doctor XAI, a model-agnostic explainability technique able to deal with multi-labeled, sequential, ontology-linked data. We focus on explaining Doctor AI, a multilabel classifier which takes as input the clinical history of a patient in order to predict the next visit. Furthermore, we show how exploiting the temporal dimension in the data and the domain knowledge encoded in the medical ontology improves the quality of the mined explanations.
@inproceedings{PPP2020,author={Panigutti, Cecilia and Perotti, Alan and Pedreschi, Dino},booktitle={Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency},collection={FAT* ’20},doi={10.1145/3351095.3372855},line={1,3,4},month=jan,open_access={NO},pages={629–639},publisher={ACM},series={FAT* ’20},title={Doctor XAI: an ontology-based approach to black-box sequential data classification explanations},visible_on_website={YES},year={2020}}
Global Explanations with Local Scoring
Mattia Setzu, Riccardo Guidotti, Anna Monreale, and Franco Turini
Artificial Intelligence systems often adopt machine learning models encoding complex algorithms with potentially unknown behavior. As the application of these “black box” models grows, it is our responsibility to understand their inner working and formulate them in human-understandable explanations. To this end, we propose a rule-based model-agnostic explanation method that follows a local-to-global schema: it generalizes a global explanation summarizing the decision logic of a black box starting from the local explanations of single predicted instances. We define a scoring system based on a rule relevance score to extract global explanations from a set of local explanations in the form of decision rules. Experiments on several datasets and black boxes show the stability, and low complexity of the global explanations provided by the proposed solution in comparison with baselines and state-of-the-art global explainers.
@inbook{SGM2019,author={Setzu, Mattia and Guidotti, Riccardo and Monreale, Anna and Turini, Franco},booktitle={Machine Learning and Knowledge Discovery in Databases},doi={10.1007/978-3-030-43823-4_14},isbn={9783030438234},issn={1865-0937},line={1},open_access={NO},pages={159–171},publisher={Springer International Publishing},title={Global Explanations with Local Scoring},visible_on_website={YES},year={2020}}
2019
Factual and Counterfactual Explanations for Black Box Decision Making
Riccardo Guidotti, Anna Monreale, Fosca Giannotti, Dino Pedreschi, Salvatore Ruggieri, and Franco Turini
The rise of sophisticated machine learning models has brought accurate but obscure decision systems, which hide their logic, thus undermining transparency, trust, and the adoption of artificial intelligence (AI) in socially sensitive and safety-critical contexts. We introduce a local rule-based explanation method, providing faithful explanations of the decision made by a black box classifier on a specific instance. The proposed method first learns an interpretable, local classifier on a synthetic neighborhood of the instance under investigation, generated by a genetic algorithm. Then, it derives from the interpretable classifier an explanation consisting of a decision rule, explaining the factual reasons of the decision, and a set of counterfactuals, suggesting the changes in the instance features that would lead to a different outcome. Experimental results show that the proposed method outperforms existing approaches in terms of the quality of the explanations and of the accuracy in mimicking the black box.
@article{GMG2019,author={Guidotti, Riccardo and Monreale, Anna and Giannotti, Fosca and Pedreschi, Dino and Ruggieri, Salvatore and Turini, Franco},doi={10.1109/mis.2019.2957223},issn={1941-1294},journal={IEEE Intelligent Systems},line={1,4},month=nov,number={6},open_access={Gold},pages={14–23},publisher={Institute of Electrical and Electronics Engineers (IEEE)},title={Factual and Counterfactual Explanations for Black Box Decision Making},visible_on_website={YES},volume={34},year={2019}}
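The explanation-extraction step can be sketched on a toy surrogate tree. The nested-dict tree and helper names below are illustrative only; the actual method learns the surrogate decision tree on a genetic-algorithm-generated synthetic neighborhood and reasons symbolically over it.

```python
# Toy surrogate decision tree for a loan decision (illustrative structure).
TREE = {"feat": "income", "thr": 1500,
        "left":  {"leaf": "deny"},
        "right": {"feat": "age", "thr": 30,
                  "left":  {"leaf": "deny"},
                  "right": {"leaf": "grant"}}}

def factual_rule(tree, x, path=()):
    """Follow the root-leaf path of instance x; the conditions met along the
    way form the factual rule, the leaf gives the (surrogate) decision."""
    if "leaf" in tree:
        return list(path), tree["leaf"]
    f, t = tree["feat"], tree["thr"]
    if x[f] <= t:
        return factual_rule(tree["left"], x, path + ((f, "<=", t),))
    return factual_rule(tree["right"], x, path + ((f, ">", t),))

def counterfactual_rules(tree, outcome, path=()):
    """Collect the premises of every leaf predicting a different outcome:
    each path describes a change that would flip the decision."""
    if "leaf" in tree:
        return [list(path)] if tree["leaf"] != outcome else []
    f, t = tree["feat"], tree["thr"]
    return (counterfactual_rules(tree["left"], outcome, path + ((f, "<=", t),))
            + counterfactual_rules(tree["right"], outcome, path + ((f, ">", t),)))
```

On the toy tree, an applicant with `income=2000, age=45` gets the factual rule `income > 1500 AND age > 30 -> grant`, and two counterfactual rules (`income <= 1500` or `income > 1500 AND age <= 30`) describing how the outcome would become `deny`.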
Explaining Multi-label Black-Box Classifiers for Health Applications
Cecilia Panigutti, Riccardo Guidotti, Anna Monreale, and Dino Pedreschi
Today the state-of-the-art performance in classification is achieved by the so-called “black boxes”, i.e., decision-making systems whose internal logic is obscure. Such models could revolutionize the health-care system; however, their deployment in real-world diagnosis decision support systems is subject to several risks and limitations due to the lack of transparency. The typical classification problem in health-care requires a multi-label approach, since the possible labels are not mutually exclusive, e.g., diagnoses. We propose MARLENA, a model-agnostic method which explains multi-label black box decisions. MARLENA explains an individual decision in three steps. First, it generates a synthetic neighborhood around the instance to be explained using a strategy suitable for multi-label decisions. It then learns a decision tree on such a neighborhood and finally derives from it a decision rule that explains the black box decision. Our experiments show that MARLENA performs well in terms of mimicking the black box behavior while at the same time gaining a notable amount of interpretability through compact decision rules, i.e., rules with limited length.
@inbook{PGM2019,author={Panigutti, Cecilia and Guidotti, Riccardo and Monreale, Anna and Pedreschi, Dino},booktitle={Precision Health and Medicine},doi={10.1007/978-3-030-24409-5_9},isbn={9783030244095},issn={1860-9503},line={1,4},month=aug,pages={97–110},publisher={Springer International Publishing},title={Explaining Multi-label Black-Box Classifiers for Health Applications},visible_on_website={YES},year={2019}}
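The first step, a neighborhood suited to multi-label decisions, can be approximated as follows. This is a simplified sketch under loose assumptions (Gaussian perturbation plus Jaccard-based filtering on the black box's label sets), not the paper's exact generation strategy:

```python
import random

def jaccard_distance(a, b):
    """Distance between two label SETS (0 = identical, 1 = disjoint)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def multilabel_neighborhood(x, black_box, n=100, sigma=0.3, keep=50, seed=0):
    """Perturb numeric instance x with Gaussian noise, query the multi-label
    black box, and keep the neighbors whose label sets are closest (in
    Jaccard distance) to the label set of x."""
    rng = random.Random(seed)
    target = black_box(x)
    candidates = []
    for _ in range(n):
        z = [v + rng.gauss(0, sigma) for v in x]
        candidates.append((jaccard_distance(target, black_box(z)), z))
    candidates.sort(key=lambda p: p[0])  # closest label sets first
    return [z for _, z in candidates[:keep]]
```

Filtering on the distance between label sets, rather than only on the feature space, is what makes the neighborhood meaningful for a multi-label decision: a surrogate decision tree can then be fitted on it to derive a compact rule.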
Meaningful Explanations of Black Box AI Decision Systems
Dino Pedreschi, Fosca Giannotti, Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, and Franco Turini
Proceedings of the AAAI Conference on Artificial Intelligence, Jul 2019
Black box AI systems for automated decision making, often based on machine learning over (big) data, map a user’s features into a class or a score without exposing the reasons why. This is problematic not only for lack of transparency, but also for possible biases inherited by the algorithms from human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong decisions. We focus on the urgent open challenge of how to construct meaningful explanations of opaque AI/ML systems, introducing the local-to-global framework for black box explanation, articulated along three lines: (i) the language for expressing explanations in terms of logic rules, with statistical and causal interpretation; (ii) the inference of local explanations for revealing the decision rationale for a specific case, by auditing the black box in the vicinity of the target instance; (iii) the bottom-up generalization of many local explanations into simple global ones, with algorithms that optimize for quality and comprehensibility. We argue that the local-first approach opens the door to a wide variety of alternative solutions along different dimensions: a variety of data sources (relational, text, images, etc.), a variety of learning problems (multi-label classification, regression, scoring, ranking), a variety of languages for expressing meaningful explanations, a variety of means to audit a black box.
@article{PGG2019,author={Pedreschi, Dino and Giannotti, Fosca and Guidotti, Riccardo and Monreale, Anna and Ruggieri, Salvatore and Turini, Franco},doi={10.1609/aaai.v33i01.33019780},issn={2159-5399},journal={Proceedings of the AAAI Conference on Artificial Intelligence},line={1},month=jul,number={01},pages={9780–9784},publisher={Association for the Advancement of Artificial Intelligence (AAAI)},title={Meaningful Explanations of Black Box AI Decision Systems},visible_on_website={YES},volume={33},year={2019}}
2018
Open the Black Box Data-Driven Explanation of Black Box Decision Systems
Dino Pedreschi, Fosca Giannotti, Riccardo Guidotti, Anna Monreale, Luca Pappalardo, Salvatore Ruggieri, and Franco Turini
Black box systems for automated decision making, often based on machine learning over (big) data, map a user’s features into a class or a score without exposing the reasons why. This is problematic not only for lack of transparency, but also for possible biases hidden in the algorithms, due to human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong decisions. We introduce the local-to-global framework for black box explanation, a novel approach with promising early results, which paves the road for a wide spectrum of future developments along three dimensions: (i) the language for expressing explanations in terms of highly expressive logic-based rules, with a statistical and causal interpretation; (ii) the inference of local explanations aimed at revealing the logic of the decision adopted for a specific instance by querying and auditing the black box in the vicinity of the target instance; (iii) the bottom-up generalization of the many local explanations into simple global ones, with algorithms that optimize the quality and comprehensibility of explanations.
@misc{PGG2018,author={Pedreschi, Dino and Giannotti, Fosca and Guidotti, Riccardo and Monreale, Anna and Pappalardo, Luca and Ruggieri, Salvatore and Turini, Franco},eprint={1806.09936},archiveprefix={arXiv},line={1},month=dec,publisher={arXiv},title={Open the Black Box Data-Driven Explanation of Black Box Decision Systems},year={2018}}