Local to global

1

LINE

From the experience of the surveys [ GMR2018 , BGG2021 ], we developed and designed various explanation methods with a focus on local rule-based explainers. Also, our goal was to “merge” such local explanations to reach a global consensus on the reasons for the decisions taken by an AI decision support system.

1.1 Local Rule-based Explainer

Our first proposal is the LOcal Rule-based Explainer (LORE) presented in [GMG2019]. LORE is a model-agnostic local explanation method that returns as an explanation a “factual rule revealing the reasons for a decision, and a set of counterfactual rules illustrating how to change the classification outcome. The first peculiarity of LORE is that it adopts a synthetic generator based on a genetic algorithm. The second peculiarity of LORE is that it adopts a decision tree as a local surrogate, thus (i) decision rules can naturally be derived from a root-leaf path in a decision tree; and, (ii) counterfactuals can be extracted by symbolic reasoning over the tree.

LINK

1.2 Plausible Data-Agnostic Local Explanations

LORE was initially designed to deal with tabular data and binary classification problems. We extended it to work on other data types and for multiclass problems. Also, we are currently working to solve some limitations of LORE related to the stability and actionability of the explanations. In line with LIME [1], our idea was to extend LORE for working on any data type. Indeed, our objective was to define a local model-agnostic and data-agnostic explanation framework to explain the decisions taken by obscure black-box classifiers on specific input instances. Therefore, our proposal would not be tied to a specific type of data or a specific type of classifier. Besides being model-agnostic, LIME is also data-agnostic. However, LIME employs conceptually different neighborhood generation strategies for tabular data, images, and texts. For images, LIME randomly replaces actual super-pixels with super-pixels containing a fixed color. For texts, it randomly removes words. Thus, both for images and text LIME “suppresses” parts of the actual information in the data. On the other hand, for tabular data, LIME assumes uniform distributions for categorical attributes and normal distributions for the continuous ones. Such limitations prevent LIME from basing the local regressor used to extract the explanation on meaningful synthetic instances. Our proposal allows overcoming these limitations by guaranteeing comparable synthetic data generation among all the different data types, ensuring meaningful synthetic instances to learn interpretable local surrogate models. Our idea was to extend LORE [GMG2019] to overcome the limitations of existing approaches by exploiting the latent feature space learned through different types of autoencoders [2] to generate plausible synthetic instances during the neighborhood generation process. Given an instance of any type classified by a black-box, the Latent-LORE (LLORE) allows instantiating a data-specific explainer following the explanation framework structure. The explainer will be able to return a meaningful explanation for the classification reasons. LLORE-based approaches work as follows. First, they generate synthetic instances in the latent feature space using a pre-trained autoencoder (GAM, AAE, VAE, etc.). Then, they learn a latent decision tree classifier. After that, they select and decode the synthetic instances respecting the latent local decision rules observed on the decision tree. Finally, independently from the data type, they return an explanation that always consists of a set of synthetic exemplars and counter-exemplars instances illustrating, respectively, instances classified with the same label and with a different label than the instance to explain, which may be visually analyzed to understand the reasons for the classification. Additionally, a data-specific explanation can be built on the exemplars and counter-exemplars. We instantiated LLORE for images [GMG2019, GMM2020], time series [GMS2020] and text [LGR2020] realizing ad-hoc logic-based explanations. A wide experimentation on datasets of different types and explaining different black-box classifiers empirically demonstrate that LLORE-based explainers overtakes existing explanation methods providing meaningful, stable, useful, and really understandable explanations. In [MGY2021, MBG2021] we employed ABELE in a case study for skin lesion diagnosis, illustrating how it is possible to provide the practitioner with explanations on the decisions of a Deep Neural Network (DNN). We have proved that after being customized and carefully trained, ABELE can produce meaningful explanations that really help practitioners. The latent space analysis suggests an interesting partitioning of images over the latent space. Still in [MGY2021, MBG2021] is reported a survey involving real experts in the health domain and common people that supports the hypothesis that explanation methods without a consistent validation are not useful. As highlighted by these works, the context of synthetic data generation for local explanation methods it is important to generate data samples located within “local” areas surrounding specific instances. The problem with generative adversarial networks and autoencoders is that they require a large quantity of data, and a not negligible training time. In addition, such generative approaches are suited only for particular types of data. In [GM2020] we overcome these drawbacks proposing DAG, a Data-Agnostic neighborhood Generation approach that, given an input instance and a (small) support set, returns a set of local realistic synthetic instances. DAG applies a data transformation that enables the generation for any type of input data. It is based on a set of generative operators inspired to genetic programming. Such operators work by applying specific vector perturbations by following a fast procedure that only requires a small set of instances to support the data generation. A wide experimentation on different types of data (tabular data, images, time series, and texts) and against state-of-the-art local neighborhood generators shows the effectiveness of DAG in producing realistic instances independently from the nature of the data.

1.3 Local Explanation 4 Health

In order to enable explainable AI systems to support medical decision-making, it is necessary to enable XAI techniques to deal with typical healthcare data characteristics. We incrementally addressed such a problem with the contributions presented in [PGM2019] and [PPP2020]. [PGM2019] presents MARLENA (Multi-lAbel RuLe-based ExplaNAtions), a model-agnostic XAI methodology to address the outcome explanation problem in the context of multi-label black box outcomes. Building on the insights we gained from the experiments carried out in [PGM2019], we developed Doctor XAI [PPP2020], a model-agnostic technique that is suitable for multi-label black box outcomes and it is also able to deal with ontologically-linked and sequential data. Two key aspects of the presented approach are that it exploited the ontology in creating the synthetic neighborhood and employed a novel encoder/decoder scheme for sequential data that preserves the interpretability of the features. The ontological perturbation allows us to create synthetic instances that consider local features interactions by perturbing the set of neighbors available in the dataset masking semantically similar features. We tested Doctor XAI in two scenarios. First, we tested the ability of Doctor XAI combined with a local-to-global approach to audit a fictional commercial black box. This resulted in a framework for auditing clinical decision support systems called FairLens [PPB2021]. FairLens first stratifies the available patient data according to demographic attributes such as age, ethnicity, gender and healthcare insurance; it then assesses the model performance on such groups highlighting the most common misclassifications. Finally, FairLens allows the expert to examine one misclassification of interest by exploiting DoctorXAI to explain which elements of the affected patients' clinical history drive the model error in the problematic group. We validate FairLens' ability to highlight bias in multilabel clinical DSSs introducing a multilabel-appropriate metric of disparity and proving its efficacy against other standard metrics. Finally in [PBF2022], we presented the collective effort of our interdisciplinary team of data scientists, human-computer interaction experts and designers to develop a human-centered, explainable AI system for clinical decision support. Using an iterative design approach that involves healthcare providers as end-users, we present the first cycle of the prototyping-testing-redesigning of DoctorXAI and its explanation-user interface. We first present the DoctorXAI concept that stems from patients data and healthcare application requirements. Then we develop the initial prototype of the explanation user interface, and perform a user study to test its perceived trustworthiness and collect healthcare providers' feedback. We finally exploit the users' feedback to co-design a more human-centered XAI user interface taking into account design principles such as progressive disclosure of information.

1.4 Local to Global Approaches

Local explanations enjoy several properties: they are relatively fast and easy to extract, precise, and possibly diverse. Conversely, global explanations are more cumbersome to extract, and, having a larger scope, more general. Thus, these two families present complementary properties. The Local to Global explanation paradigm [PGG2018, PGG2019] is a natural extension of the Local and Global paradigms, and aims to exploit the fidelity and ease of extraction of Local explanations to generate faithful, general, and simple Global explanations. In our work, we have focused on explanations in the form of axis-parallel decision rules, and have proposed two algorithms to tackle them, namely Rule Relevance Score (RRS) [SGM2019] and GLocalX [SGM2021]. Rule Relevance Score (RRS) [SGM2019] is a simple scoring framework in which we try to select, rather than edit, the local explanations. In other words, with RRS we construct global explanations by selecting local ones. RRS uses a multi-faceted scoring formula in which explanations are ranked according to their fidelity, coverage and outlier coverage, which rewards rules explaining seldomly explained records. GLocalX [SGM2021] relies on three assumptions: (i) logical explainability, that is, explanations are best provided in a logical form that can be reasoned upon; (ii) local explainability, that is, regardless of the complexity of the decision boundary of the black box, locally it can be accurately approximated by an explanation; (iii) composability, that is, we can compose local explanations by leveraging their logical form. Starting from a set of local explanations in the form of decision rules constituting the structure's leaves, GLocalX iteratively merges explanations in a bottom-up fashion to create a hierarchical merge structure that yields global explanations on its top layer. GLocalX shows a unique balance between fidelity and simplicity, having state-of-the-art fidelity and yielding small sets of compact global explanations. When comparing with natively global models, such as Decision Trees and CPAR, who have direct access to the whole training data, rather than the local explanations, GLocalX compares favorably. It’s only slightly less faithful than the most faithful model (~2% less faithful than a Decision Tree) while having a far simpler model (up to 1 order of magnitude smaller set of output rules). When compared with models of similar complexity, such as a Pruned Decision Tree, GLocalX is slightly more faithful and less complex.

1.5 Towards Interpretable-by-design Models

In parallel with the activity of designing local and global post-hoc explainers, in line with [3] we also started to explore directions for designing predictive models which are interpretable-by-design, i.e., they return the prediction and allow us to understand the reasons that lead to that prediction. Indeed, if the machine logic is transparent and accessible, as humans, we tend to trust more a decision process using a logic similar to that of a human being, rather than a reasoning that we can understand but that is outside the human way of thinking [4]. In [GD2021] we present MAPIC a MAtrix Profile-based Interpretable time series Classifier. MAPIC is an interpretable model for time series classification able to guarantee high levels of accuracy and efficiency while maintaining the classification, and the classification model, interpretable. In the design of MAPIC we followed the line of research based on shapelets. However, we replaced the inefficient approaches adopted in the state of the art for the search of the most discriminative subsequences with the patterns that is possible to extract from a model named Matrix Profile [5]. In short, the Matrix Profile (MP) represents the distances between all subsequences and their nearest neighbors. From a MP it is possible to efficiently extract some patterns characterizing a time series such as motifs and discords. Motifs are subsequences of a time series which are very similar to each other, while discords are subsequences of a time series which are very different from any other subsequence. As a classification model, MAPIC adopts a decision tree classifier due to its intrinsic interpretability. We empirically demonstrate that MAPIC overtakes existing approaches having a similar interpretability in terms of both accuracy and running time. For these two last results and from GLocalX it is clear the importance of relying on sound decision tree models. A weak point of traditional decision trees is that they are not very stable and a common procedure to stabilize them is to merge various trees into an unique tree. In a certain sense, this is a form of explanation of a set of decision trees with a single model. Several proposals are present in the literature for traditional decision trees but there is a lack of merging operations for oblique trees and forests of oblique trees. Thus, in [BGM2021] we combine XAI and the merging of decision trees. Given any accurate and complex tree-based classifier, our aim is to approximate it with a single interpretable decision tree that guarantees comparable levels of accuracy and a low complexity that permits us to understand the logic it follows for the classification. We propose a Single-tree Approximation MEthod (SAME) that exploits a procedure for merging decision trees, a post-hoc explanation strategy, and a combination of them to turn any tree-based classifier into a single and interpretable decision tree. Given a certain tree-based classifier, the idea of SAME is to reduce any approximation problem with another one for which a solution is known in a sort of “cascade of approximations” with several available alternatives. This allows SAME to turn Random Forests, Oblique Trees and Oblique Forests into a single decision tree. The implementation of SAME required adapting existing procedures for merging traditional decision trees to oblique trees by moving from an intensional approach to an extensional one for efficiency reasons. An experimentation on eight tabular datasets with different size and dimensionality compares SAME against a baseline approach (PHDT) that directly approximates any classifier with a decision tree. We show that SAME is efficient and that the retrieved single decision tree is at least as accurate as the original non interpretable tree-based model.

Publications

1.

[GMR2018]

A Survey of Methods for Explaining Black Box Models
Guidotti Riccardo, Monreale Anna, Ruggieri Salvatore, Turini Franco, Giannotti Fosca, Pedreschi Dino (2022) - ACM Computing Surveys. In ACM computing surveys (CSUR), 51(5), 1-42.

Abstract

In recent years, many accurate decision support systems have been constructed as black boxes, that is as systems that hide their internal logic to the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work. The proposed classification of approaches to open black box models should also be useful for putting the many research open questions in perspective.

1

LINE

1.1 Local Rule-based Explainer

1.2 Plausible Data-Agnostic Local Explanations

1.3 Local Explanation 4 Health

1.4 Local to Global Approaches

1.5 Towards Interpretable-by-design Models

Publications

1.

2.

3.

4.

5.

6.

7.

8.

9.

12.

13.

15.

16.

23.

24.

25.

27.

29.

33.

34.

35.

36.

38.

39.

40.

41.

42.

45.

48.

51.

52.

53.

54.

55.

56.

57.

58.

59.

60.

63.

64.

Researchers working on this line

RiccardoGuidotti

FrancoTurini

SalvatoreRuggieri

MircoNanni

SalvoRinzivillo

AndreaBeretta

AnnaMonreale

CeciliaPanigutti

MattiaSetzu

FrancescoSpinnato

FrancescaNaretto

FrancescoBodria

CarloMetta

AlessioMalizia

MartaMarchiori Manerba

MartinaCinquini

CristianoLandi

SamueleTonati

AnnaArias Duart

AndreaFedele

FedericoMazzoni

ClaraPunzi

AndreaPugnana

MargheritaLalli

MarzioDi Vece

Riccardo
Guidotti

Franco
Turini

Salvatore
Ruggieri

Mirco
Nanni

Salvo
Rinzivillo

Andrea
Beretta

Anna
Monreale

Cecilia
Panigutti

Mattia
Setzu

Francesco
Spinnato

Francesca
Naretto

Francesco
Bodria

Carlo
Metta

Alessio
Malizia

Marta
Marchiori Manerba

Martina
Cinquini

Cristiano
Landi

Samuele
Tonati

Anna
Arias Duart

Andrea
Fedele

Federico
Mazzoni

Clara
Punzi

Andrea
Pugnana

Margherita
Lalli

Marzio
Di Vece