Case studies

In recent years, the development of AI systems has focused on opening up black-box models through a wide range of explainability methods, with the goal of making users more aware of why the AI produces a given suggestion.

The scientific community’s interest in eXplainable Artificial Intelligence (XAI) has produced a multitude of research on computational methods to make explainability possible. Nevertheless, the final user has received considerably less attention. This research line addresses two main aspects:

  1. the user’s decision-making process with eXplainable AI systems used to support high-stakes decisions;
  2. use cases to test the explanation methods developed within the XAI project.

Since the beginning of the XAI project, we have focused mainly on high-stakes decisions in healthcare. In this application domain, AI and human doctors will have complementary roles reflecting their respective strengths and weaknesses. It is therefore of pivotal importance to develop AI technology able to work synergistically with doctors, yet current AI technologies have many shortcomings that hinder their adoption in the real world.

In recent years, developing methods to explain AI models’ reasoning has become the focus of much of the scientific community, particularly in the field of eXplainable AI (XAI). While several XAI methods have been developed in the past years, only a few consider the specific application domain. Consider, for example, two of the most popular XAI methods: LIME and SHAP. Both are model-agnostic and application-agnostic, meaning that they can extract an explanation from any type of black-box AI model, regardless of the application domain. While the model-agnostic approach offers great flexibility in the use of these methods, the application-agnostic approach implies that the specific needs of the user are not considered. A few works have tried to close this gap in the medical field by involving doctors in the design procedure or by performing exploratory surveys. Despite these recent efforts, most of the research has focused on laypeople.
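To make the model-agnostic property concrete, the following minimal sketch (an illustration only, not code from the project) applies SHAP to an arbitrary scikit-learn classifier; the breast-cancer dataset is merely a stand-in for a clinical task:

    import shap
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier

    data = load_breast_cancer()
    X, y = data.data, data.target
    model = RandomForestClassifier(random_state=0).fit(X, y)

    # The explainer only needs a prediction function and background data:
    # nothing here is specific to medicine, images, or any other domain.
    explainer = shap.Explainer(model.predict_proba, X[:100])
    shap_values = explainer(X[:5])  # per-feature attributions for 5 samples
    print(shap_values.values.shape)

The same call works unchanged for any model that exposes a prediction function, which is precisely why such explainers, by construction, cannot encode the needs of a specific category of users.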

In our first work, we tested the impact of AI explanations on healthcare professionals. Specifically, the context of this work is AI-supported decision-making for clinicians. Imagine, for example, a doctor who wants a second opinion before making a decision about a patient’s risk of myocardial infarction. She forms her opinion based on the patient’s previous visits and symptoms, and then an AI suggestion is presented to her. What happens when she receives this second opinion? Does she trust her own judgment, or is she more prone to follow the algorithmic suggestion when making her final decision? To answer this question, we collected data from 36 healthcare professionals to understand the impact of advice from a clinical Decision Support System (DSS) in two different conditions: one in which the clinical DSS explains the given suggestion, and one in which it does not. We adapted the judge-advisor system framework from Sniezek & Van Swol [Sniezek2001] to evaluate participants’ trust and behavioral intention to use the system in an online estimation task. Our main measure was the Weight of Advice (WoA), which quantifies the degree to which the algorithmic suggestion (with or without explanation) influences the participant’s estimate. To obtain more meaningful insights, we collected both qualitative and quantitative measures.

Our results showed that participants relied more on the advice in the condition with the explanation than in the condition with the suggestion alone. This happened even though participants found the explanation unsatisfying: despite the low perceived explanation quality, they were influenced by it and relied more on the advice of the AI system. This finding might be in line with previous research on automation bias in medicine, i.e., the tendency to over-rely on automation. In the open-ended questions at the end of the study, healthcare professionals showed an aversion to the use of algorithmic advice and a fear of being replaced by such AI systems. The importance of these results is twofold. Firstly, even though the explanation left most of the participants unsatisfied, they were strongly influenced by it and relied more on the accompanying advice. Secondly, the ethnographic method, i.e., the open-ended questions, yielded insights from the participants that quantitative measures alone could not capture.
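In the judge-advisor literature, the Weight of Advice captures what fraction of the distance between the participant’s initial estimate and the advisor’s suggestion is covered by the final estimate. As an illustration only, here is a minimal Python sketch of the usual formulation (the clipping to [0, 1] and the handling of the undefined case are common conventions, not necessarily the exact variant used in our study):

    def weight_of_advice(initial, advice, final):
        """Weight of Advice (WoA): fraction of the distance from the initial
        estimate to the advice that is covered by the final estimate.
        WoA = 0 means the advice was ignored, WoA = 1 that it was fully
        adopted. Undefined when the advice equals the initial estimate."""
        if advice == initial:
            return None
        woa = (final - initial) / (advice - initial)
        # Clip to [0, 1], a common convention in the judge-advisor literature.
        return max(0.0, min(1.0, woa))

    # Example: initial risk estimate 40%, AI advice 70%, final answer 60%
    # -> WoA = (60 - 40) / (70 - 40) ≈ 0.67
    print(weight_of_advice(40, 70, 60))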

A limitation of this study lies in the fact that the decision presented by the AI was always correct. In future work, we aim to carry out a similar study to test whether the overreliance persists even when the suggestions are wrong.

In the second work we performed, we tested how users react to a wrong suggestion when they have to evaluate different types of skin lesion images. The need is to develop AI systems that can assist doctors in making more informed decisions, complementing their own knowledge with the information and suggestions yielded by the AI system [MGY2021, PPP2020]. However, if the logic behind the decisions of AI systems is not available, this goal cannot be accomplished. Skin image classification is a typical example of this problem. Here, the explanation is formed by synthetic exemplars and counter-exemplars of skin lesions, i.e., generated images that the classifier labels with the same class as the image under examination and with a different class, respectively. This explanation offers the practitioner a way to highlight the crucial traits responsible for the algorithmic classification decision.

We conducted a validation survey with 156 domain experts, novices, and laypeople to test whether the explanation increases reliance on, and confidence in, the automatic decision system. The task was organized into ten questions. Each question presented an image of a skin lesion without any label, together with its explanation generated by ABELE: two exemplars, classified in the same class as the presented skin lesion, and two counter-exemplars, classified in another lesion class. Participants had to classify the presented image in a binary decision task, deciding the class of the nevus with the help of the presented explanation. One of the main points was to see how participants regain their trust after receiving a misclassified suggestion from the AI system.

The results showed a slight reduction of trust towards the black box when the presented suggestion is wrong, although there was no statistically significant drop in confidence after receiving wrong advice from the AI model. However, restricting the analysis to the sub-sample of medical experts, we noticed that they were more prone than the other participants (beginners and laypeople) to lower their confidence in the system’s advice, even in the subsequent trials. This study showed that domain experts are more likely to detect a wrong suggestion and adjust their estimates accordingly, an aspect that can be important for the role of the final users of the system. That is to say, explanation methods without a consistent validation may not work as their developers expect.

Healthcare is one of the main areas in which we have put our effort into involving real participants, to gain insight into the effect of AI explanations during the use of clinical assisted decision-making systems. We are focusing on how to improve the explanations of diagnosis forecasts in order to inform the design of healthcare systems, promote human-AI cooperation, avoid algorithm aversion, and improve the overall decision-making process.
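ABELE generates its exemplars and counter-exemplars in the latent space of an adversarial autoencoder. The following minimal sketch illustrates only the general mechanism, under the assumption of two hypothetical functions supplied by the caller, decode (latent code to synthetic image) and predict (the black-box classifier); it is not ABELE’s actual implementation:

    import numpy as np

    def exemplars_and_counter_exemplars(z0, decode, predict, target_class,
                                        n_each=2, sigma=0.1, max_tries=500,
                                        rng=None):
        """Sample latent neighbours of z0, decode them into synthetic images,
        and keep those that the black box labels as target_class (exemplars)
        or as any other class (counter-exemplars)."""
        rng = rng or np.random.default_rng(0)
        exemplars, counters = [], []
        for _ in range(max_tries):
            z = z0 + sigma * rng.standard_normal(z0.shape)  # latent neighbour
            img = decode(z)                                 # synthetic image
            label = predict(img)                            # black-box label
            if label == target_class and len(exemplars) < n_each:
                exemplars.append(img)
            elif label != target_class and len(counters) < n_each:
                counters.append(img)
            if len(exemplars) == n_each and len(counters) == n_each:
                break
        return exemplars, counters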


Research line people

Riccardo Guidotti
Assistant Professor, University of Pisa
R.LINE 1 ▪ 3 ▪ 4 ▪ 5

Mirco Nanni
Researcher, ISTI - CNR Pisa
R.LINE 1 ▪ 4

Luca Pappalardo
Researcher, ISTI - CNR Pisa
R.LINE 4

Salvo Rinzivillo
Researcher, ISTI - CNR Pisa
R.LINE 1 ▪ 3 ▪ 4 ▪ 5

Andrea Beretta
Researcher, ISTI - CNR Pisa
R.LINE 1 ▪ 4 ▪ 5

Anna Monreale
Associate Professor, University of Pisa
R.LINE 1 ▪ 4 ▪ 5

Cecilia Panigutti
PhD Student, Scuola Normale
R.LINE 1 ▪ 4 ▪ 5

Francesco Spinnato
Researcher, Scuola Normale
R.LINE 1 ▪ 4

Francesca Naretto
Postdoctoral Researcher, Scuola Normale
R.LINE 1 ▪ 3 ▪ 4 ▪ 5

Carlo Metta
Researcher, ISTI - CNR Pisa
R.LINE 1 ▪ 2 ▪ 3 ▪ 4

Eleonora Cappuccio
PhD Student, University of Pisa - Bari
R.LINE 3 ▪ 4

Alessio Malizia
Associate Professor, University of Pisa
R.LINE 3 ▪ 4

Samuele Tonati
PhD Student, University of Pisa
R.LINE 4

Gizem Gezici
Researcher, Scuola Normale
R.LINE 4

Francesco Giannini
Research Fellow, Scuola Normale
R.LINE

Iacopo Colombini
PhD Student, Scuola Normale
R.LINE 2 ▪ 4

Mariarita Pierotti
Associate Professor, University of Pisa
R.LINE 4

Giovanni Mauro
Research Fellow, Scuola Normale
R.LINE 4


Line 4 - Publications

2025

  1. Embracing Diversity: A Multi-Perspective Approach with Soft Labels
    Benedetta Muscato, Praveen Bushipaka, Gizem Gezici, Lucia Passaro, Fosca Giannotti, and 1 more author
    Sep 2025
  2. Perspectives in Play: A Multi-Perspective Approach for More Inclusive NLP Systems
    Benedetta Muscato, Lucia Passaro, Gizem Gezici, and Fosca Giannotti
    In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, Sep 2025
  3. The explanation dialogues: an expert focus study to understand requirements towards explanations within the GDPR
    Laura State, Alejandra Bringas Colmenarejo, Andrea Beretta, Salvatore Ruggieri, Franco Turini, and 1 more author
    Artificial Intelligence and Law, Jan 2025
  4. A Simulation Framework for Studying Systemic Effects of Feedback Loops in Recommender Systems
    G. Barlacchi, M. Lalli, E. Ferragina, F. Giannotti, and L. Pappalardo
    Dec 2025

2024

  1. Commodity-specific triads in the Dutch inter-industry production network
    Marzio Di Vece, Frank P. Pijpers, and Diego Garlaschelli
    Scientific Reports, Feb 2024
  2. A Frank System for Co-Evolutionary Hybrid Decision-Making
    Federico Mazzoni, Riccardo Guidotti, and Alessio Malizia
    Feb 2024
  3. Multi-Perspective Stance Detection
    Benedetta Muscato, Praveen Bushipaka, Gizem Gezici, Lucia Passaro, and Fosca Giannotti
    Dec 2024
  4. Beyond Headlines: A Corpus of Femicides News Coverage in Italian Newspapers
    Eleonora Cappuccio, Benedetta Muscato, Laura Pollacci, Marta Marchiori Manerba, Clara Punzi, and 5 more authors
    Dec 2024
  5. A survey on the impact of AI-based recommenders on human behaviours: methodologies, outcomes and future directions
    Luca Pappalardo, Emanuele Ferragina, Salvatore Citraro, Giuliano Cornacchia, Mirco Nanni, and 9 more authors
    Dec 2024
  6. XAI in healthcare
    Gizem Gezici, Carlo Metta, Andrea Beretta, Roberto Pellungrini, Salvo Rinzivillo, Dino Pedreschi, and Fosca Giannotti
    Dec 2024

2023

  1. Effects of Route Randomization on Urban Emissions
    Giuliano Cornacchia, Mirco Nanni, Dino Pedreschi, and Luca Pappalardo
    SUMO Conference Proceedings, Jun 2023
  2. Explaining Socio-Demographic and Behavioral Patterns of Vaccination Against the Swine Flu (H1N1) Pandemic
    Clara Punzi, Aleksandra Maslennikova, Gizem Gezici, Roberto Pellungrini, and Fosca Giannotti
    Jun 2023

2022

  1. Understanding the impact of explanations on advice-taking: a user study for AI-based clinical Decision Support Systems
    Cecilia Panigutti, Andrea Beretta, Fosca Giannotti, and Dino Pedreschi
    In CHI Conference on Human Factors in Computing Systems, Apr 2022
  2. Assessing Trustworthy AI in Times of COVID-19: Deep Learning for Predicting a Multiregional Score Conveying the Degree of Lung Compromise in COVID-19 Patients
    Himanshi Allahabadi, Julia Amann, Isabelle Balot, Andrea Beretta, Charles Binkley, and 52 more authors
    IEEE Transactions on Technology and Society, Dec 2022
  3. Explaining Siamese Networks in Few-Shot Learning for Audio Data
    Andrea Fedele, Riccardo Guidotti, and Dino Pedreschi
    Dec 2022
  4. Explaining Crash Predictions on Multivariate Time Series Data
    Francesco Spinnato, Riccardo Guidotti, Mirco Nanni, Daniele Maccagnola, Giulia Paciello, and 1 more author
    Dec 2022
  5. Understanding peace through the world news
    Vasiliki Voukelatou, Ioanna Miliou, Fosca Giannotti, and Luca Pappalardo
    EPJ Data Science, Jan 2022

2021

  1. Intelligenza artificiale in ambito diabetologico: prospettive, dalla ricerca di base alle applicazioni cliniche [Artificial intelligence in diabetology: perspectives, from basic research to clinical applications]
    Emanuele Bosi and Cecilia Panigutti
    il Diabete, Jan 2021
  2. GLocalX - From Local to Global Explanations of Black Box AI Models
    Mattia Setzu, Riccardo Guidotti, Anna Monreale, Franco Turini, Dino Pedreschi, and 1 more author
    Artificial Intelligence, May 2021
  3. FairLens: Auditing black-box clinical decision support systems
    Cecilia Panigutti, Alan Perotti, André Panisson, Paolo Bajardi, and Dino Pedreschi
    Information Processing & Management, Sep 2021
  4. Occlusion-Based Explanations in Deep Recurrent Models for Biomedical Signals
    Michele Resta, Anna Monreale, and Davide Bacciu
    Entropy, Aug 2021

2020

  1. Black Box Explanation by Learning Image Exemplars in the Latent Feature Space
    Riccardo Guidotti, Anna Monreale, Stan Matwin, and Dino Pedreschi
    Aug 2020
  2. Prediction and Explanation of Privacy Risk on Mobility Data with Neural Networks
    Francesca Naretto, Roberto Pellungrini, Franco Maria Nardini, and Fosca Giannotti
    Aug 2020
  3. Explaining Image Classifiers Generating Exemplars and Counter-Exemplars from Latent Representations
    Riccardo Guidotti, Anna Monreale, Stan Matwin, and Dino Pedreschi
    Proceedings of the AAAI Conference on Artificial Intelligence, Apr 2020
  4. Predicting and Explaining Privacy Risk Exposure in Mobility Data
    Francesca Naretto, Roberto Pellungrini, Anna Monreale, Franco Maria Nardini, and Mirco Musolesi
    Apr 2020
  5. Doctor XAI: an ontology-based approach to black-box sequential data classification explanations
    Cecilia Panigutti, Alan Perotti, and Dino Pedreschi
    In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Jan 2020

2019

  1. Factual and Counterfactual Explanations for Black Box Decision Making
    Riccardo Guidotti, Anna Monreale, Fosca Giannotti, Dino Pedreschi, Salvatore Ruggieri, and 1 more author
    IEEE Intelligent Systems, Nov 2019
  2. Explaining Multi-label Black-Box Classifiers for Health Applications
    Cecilia Panigutti, Riccardo Guidotti, Anna Monreale, and Dino Pedreschi
    Aug 2019