The XAI project team had a strong presence at ECML-PKDD 2025 in Porto, presenting multiple contributions that advance the state of the art in interpretable machine learning.
To ensure that machine learning systems produce harmless outcomes, it is crucial to jointly optimize predictive performance and ethical dimensions such as privacy and fairness. However, jointly optimizing these two ethical dimensions while maintaining predictive accuracy remains a fundamental challenge: privacy-preserving techniques may worsen fairness and limit the model’s ability to learn accurate statistical patterns, while fairness-mitigation techniques may inadvertently compromise privacy. To bridge this gap, we propose SafeGen, a fairness-enhancing and privacy-preserving preprocessing method for tabular data. SafeGen employs synthetic data generation through a genetic algorithm to ensure that sensitive attributes are protected while the necessary statistical properties are maintained. We assess our method across multiple datasets, comparing it against state-of-the-art privacy-preserving and fairness approaches through a threefold evaluation: privacy preservation, fairness enhancement, and plausibility of the generated data. Extensive experiments show that SafeGen consistently achieves strong anonymization while preserving or improving dataset fairness across several benchmarks. Moreover, through hybrid privacy-fairness constraints and the use of a genetic synthesizer, SafeGen ensures the plausibility of synthetic records while minimizing discrimination. Our findings demonstrate that modeling fairness and privacy within a unified generative method yields significantly better outcomes than addressing these constraints separately, reinforcing the importance of integrated approaches when multiple ethical objectives must be satisfied simultaneously.
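As a rough illustration of the genetic-synthesis idea described in the abstract (not SafeGen itself: the toy data, the fitness function, and all names below are our own assumptions), the following sketch evolves a small synthetic table toward a fitness that rewards statistical plausibility and penalizes a demographic-parity gap on a sensitive column:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "real" dataset: columns = [feature, sensitive attribute (0/1), outcome (0/1)]
real = rng.random((100, 3))
real[:, 1] = (real[:, 1] > 0.5).astype(float)
real[:, 2] = (real[:, 0] + 0.3 * real[:, 1] > 0.7).astype(float)  # biased outcome

def fitness(cand):
    """Higher is better: match the real column means (a plausibility proxy)
    minus the demographic-parity gap across the sensitive groups (a fairness proxy)."""
    plaus = -np.abs(cand.mean(axis=0) - real.mean(axis=0)).sum()
    g0, g1 = cand[cand[:, 1] < 0.5], cand[cand[:, 1] >= 0.5]
    gap = abs(g0[:, 2].mean() - g1[:, 2].mean()) if len(g0) and len(g1) else 1.0
    return plaus - gap

def evolve(pop_size=30, n_rows=50, generations=40):
    """Minimal genetic loop: elitist selection, row-wise crossover, Gaussian mutation."""
    pop = [rng.random((n_rows, 3)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]                   # keep the fitter half
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.choice(len(survivors), 2, replace=False)
            mask = rng.random((n_rows, 1)) < 0.5           # row-wise crossover
            child = np.where(mask, survivors[a], survivors[b])
            child += rng.normal(0, 0.05, child.shape)      # mutation
            children.append(np.clip(child, 0, 1))
        pop = survivors + children
    return max(pop, key=fitness)
```

A real method would of course use stronger privacy and fairness criteria than these one-line proxies; the point is only how a single genetic loop can trade them off jointly rather than sequentially.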
@article{CMM2025,
  author = {Cinquini, Martina and Marchiori Manerba, Marta and Mazzoni, Federico and Pratesi, Francesca and Guidotti, Riccardo},
  title = {SafeGen: safeguarding privacy and fairness through a genetic method},
  journal = {Machine Learning},
  volume = {114},
  number = {10},
  month = sep,
  year = {2025},
  doi = {10.1007/s10994-025-06835-9},
  issn = {1573-0565},
  publisher = {Springer Science and Business Media LLC},
  open_access = {Gold},
  line = {5},
  visible_on_website = {YES}
}
MASCOTS: Model-Agnostic Symbolic COunterfactual Explanations for Time Series
Dawid Płudowski, Francesco Spinnato, Piotr Wilczyński, Krzysztof Kotowski, Evridiki Vasileia Ntagiou, and 2 more authors
Counterfactual explanations provide an intuitive way to understand model decisions by identifying minimal changes required to alter an outcome. However, applying counterfactual methods to time series models remains challenging due to temporal dependencies, high dimensionality, and the lack of an intuitive human-interpretable representation. We introduce MASCOTS, a method that leverages the Bag-of-Receptive-Fields representation alongside symbolic transformations inspired by Symbolic Aggregate Approximation. By operating in a symbolic feature space, it enhances interpretability while preserving fidelity to the original data and model. Unlike existing approaches that either depend on model structure or autoencoder-based sampling, MASCOTS directly generates meaningful and diverse counterfactual observations in a model-agnostic manner, operating on both univariate and multivariate data. We evaluate MASCOTS on univariate and multivariate benchmark datasets, demonstrating comparable validity, proximity, and plausibility to state-of-the-art methods, while significantly improving interpretability and sparsity. Its symbolic nature allows for explanations that can be expressed visually, in natural language, or through semantic representations, making counterfactual reasoning more accessible and actionable.
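MASCOTS operates in a symbolic space built on Symbolic Aggregate Approximation (SAX). As a hedged refresher on that underlying representation (the standard SAX pipeline, not the authors' implementation), the sketch below z-normalizes a series, compresses it with Piecewise Aggregate Approximation, and discretizes the segment means with equiprobable Gaussian breakpoints:

```python
import numpy as np

# Equiprobable Gaussian breakpoints (standard SAX lookup table, rounded)
BREAKPOINTS = {3: [-0.43, 0.43],
               4: [-0.67, 0.0, 0.67],
               5: [-0.84, -0.25, 0.25, 0.84]}

def sax_transform(series, n_segments=8, alphabet_size=4):
    """Discretize a 1-D series into a symbolic word via PAA + SAX."""
    x = np.asarray(series, dtype=float)
    std = x.std()
    x = (x - x.mean()) / std if std > 0 else x - x.mean()  # 1. z-normalize
    # 2. Piecewise Aggregate Approximation: mean of each segment
    paa = np.array([seg.mean() for seg in np.array_split(x, n_segments)])
    # 3. Map segment means to letters a, b, c, ... via the breakpoint table
    symbols = np.searchsorted(BREAKPOINTS[alphabet_size], paa)
    return "".join(chr(ord("a") + int(s)) for s in symbols)
```

Because each letter corresponds to a value band over a time segment, a counterfactual edit in this space ("raise segment 3 from b to d") reads naturally as a human-interpretable change to the series.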
@inbook{PSW2025,
  author = {Płudowski, Dawid and Spinnato, Francesco and Wilczyński, Piotr and Kotowski, Krzysztof and Ntagiou, Evridiki Vasileia and Guidotti, Riccardo and Biecek, Przemysław},
  title = {MASCOTS: Model-Agnostic Symbolic COunterfactual Explanations for Time Series},
  booktitle = {Machine Learning and Knowledge Discovery in Databases. Research Track},
  pages = {94--112},
  month = sep,
  year = {2025},
  doi = {10.1007/978-3-032-06078-5_6},
  isbn = {9783032060785},
  issn = {1611-3349},
  publisher = {Springer Nature Switzerland},
  open_access = {Gold},
  line = {1},
  visible_on_website = {YES}
}
Group Explainability Through Local Approximation
Mattia Setzu, Riccardo Guidotti, Dino Pedreschi, and Fosca Giannotti
Machine learning models are becoming increasingly complex and widely adopted. Interpretable machine learning allows us to not only make predictions but also understand the rationale behind automated decisions through explanations. Explanations are typically characterized by their scope: local explanations are generated by local surrogate models for specific instances, while global explanations aim to approximate the behavior of the entire black-box model. In this paper, we break this dichotomy of locality to explore an underexamined area that lies between these two extremes: meso-level explanations. The goal of meso-level explainability is to provide explanations using a set of meso-level interpretable models, which capture patterns at an intermediate level of abstraction. To this end, we propose GrouX, an explainable-by-design algorithm that generates meso-level explanations in the form of feature importance scores. Our approach includes a partitioning phase that identifies meso groups, followed by the training of interpretable models within each group. We evaluate GrouX on a collection of tabular datasets, reporting both the accuracy and complexity of the resulting meso models, and compare it against other meso-level explainability algorithms. Additionally, we analyze the algorithm’s sensitivity to its hyperparameters to better understand its behavior and robustness.
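The partition-then-explain pipeline described above can be sketched in a few lines. This is an illustrative stand-in (k-means grouping plus per-group logistic surrogates whose coefficients serve as feature importances), not the GrouX algorithm, and the function name is ours:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def meso_explanations(X, y, n_groups=3, seed=0):
    """Partition instances into meso groups, then fit one interpretable
    surrogate per group and read its coefficients as feature importances."""
    groups = KMeans(n_clusters=n_groups, random_state=seed, n_init=10).fit_predict(X)
    explanations = {}
    for g in range(n_groups):
        idx = groups == g
        if len(np.unique(y[idx])) < 2:       # degenerate group: constant label
            explanations[g] = np.zeros(X.shape[1])
            continue
        model = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
        explanations[g] = model.coef_[0]     # per-group feature importances
    return groups, explanations
```

Each group's coefficient vector explains a coherent subpopulation, sitting between a single global surrogate and one surrogate per instance, which is the meso-level idea the paper formalizes.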
@inbook{SGP2025,
  author = {Setzu, Mattia and Guidotti, Riccardo and Pedreschi, Dino and Giannotti, Fosca},
  title = {Group Explainability Through Local Approximation},
  booktitle = {ECAI 2025},
  pages = {952--958},
  month = oct,
  year = {2025},
  doi = {10.3233/faia250902},
  isbn = {9781643686318},
  issn = {1879-8314},
  publisher = {IOS Press},
  open_access = {Gold},
  line = {2},
  visible_on_website = {YES}
}
An Interpretable Data-Driven Unsupervised Approach for the Prevention of Forgotten Items
Luca Corbucci, Javier Alejandro Borges Legrottaglie, Francesco Spinnato, Anna Monreale, and Riccardo Guidotti
Accurately identifying items forgotten during a supermarket visit and providing clear, interpretable explanations for recommending them remains an underexplored problem within the Next Basket Prediction (NBP) domain. Existing NBP approaches typically only focus on forecasting future purchases, without explicitly addressing the detection of unintentionally omitted items. This gap is partly due to the scarcity of real-world datasets that allow for the reliable estimation of forgotten items. Furthermore, most current NBP methods rely on black-box models, which lack transparency and limit the ability to justify recommendations to end users. In this paper, we formally introduce the forgotten-item prediction task and propose two novel interpretable-by-design algorithms. These methods are tailored to identify forgotten items while offering intuitive, human-understandable explanations. Experiments on a real-world retail dataset show our algorithms outperform state-of-the-art NBP baselines by 10–15% across multiple evaluation metrics.
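To make the task concrete, here is a toy frequency heuristic of our own devising (not one of the paper's two algorithms): flag items a customer buys in most past visits but omitted from the current basket, each with a human-readable justification.

```python
from collections import Counter

def forgotten_items(history, current_basket, min_support=0.6):
    """Flag items bought in at least `min_support` of past baskets but
    missing from the current one, with a frequency-based explanation."""
    counts = Counter(item for basket in history for item in basket)
    n = len(history)
    flagged = []
    for item, c in counts.items():
        support = c / n
        if support >= min_support and item not in current_basket:
            flagged.append({"item": item, "support": support,
                            "why": f"purchased in {c} of {n} past visits"})
    # Most habitual omissions first
    return sorted(flagged, key=lambda d: d["support"], reverse=True)
```

The `why` string is the interpretability payoff: every recommendation comes with an explicit, checkable reason, which is the property the paper's interpretable-by-design algorithms pursue with far richer purchase models.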
@inbook{CBS2025,
  author = {Corbucci, Luca and Borges Legrottaglie, Javier Alejandro and Spinnato, Francesco and Monreale, Anna and Guidotti, Riccardo},
  title = {An Interpretable Data-Driven Unsupervised Approach for the Prevention of Forgotten Items},
  booktitle = {ECAI 2025},
  month = oct,
  year = {2025},
  doi = {10.3233/faia250912},
  isbn = {9781643686318},
  issn = {1879-8314},
  publisher = {IOS Press},
  open_access = {Gold},
  line = {1},
  visible_on_website = {YES}
}