JUICE: JUstIfied Counterfactual Explanations
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0002-5460-2491
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0002-1357-1967
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0001-7713-1381
2022 (English). In: Discovery Science: 25th International Conference, DS 2022, Montpellier, France, October 10–12, 2022, Proceedings / [ed] Pascal Poncelet, Dino Ienco. Springer, 2022, p. 493-508. Conference paper, Published paper (Refereed)
Abstract [en]

Complex, highly accurate machine learning algorithms support decision-making processes with large and intricate datasets. However, these models have low explainability. Counterfactual explanation is a technique that tries to find a set of feature changes on a given instance to modify the model's prediction output from an undesired to a desired class. To obtain better explanations, it is crucial to generate faithful counterfactuals, supported by and connected to observations and the knowledge constructed on them. In this study, we propose a novel counterfactual generation algorithm that provides faithfulness by justification, which may increase developers' and users' trust in the explanations by supporting the counterfactuals with a known observation. The proposed algorithm guarantees justification for mixed-features spaces, and we show it performs similarly to state-of-the-art algorithms across other metrics such as proximity, sparsity, and feasibility. Finally, we introduce the first model-agnostic algorithm to verify counterfactual justification in mixed-features spaces.
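
As a rough illustration of the terms used in the abstract, the sketch below picks the closest observed instance that the model already assigns to the desired class and reports its proximity and sparsity. It is not the JUICE algorithm: the classifier `predict`, the feature names, and the data are hypothetical, "supported by a known observation" is simplified to "is itself an observation", and only numerical features are handled (a mixed-features space would need a different distance).

```python
from math import sqrt

def predict(x):
    """Hypothetical black-box classifier (assumption): 1 is the desired class."""
    income, debt = x
    return 1 if income - 2.0 * debt >= 1.0 else 0

def proximity(x, cf):
    """Euclidean distance between the instance and the counterfactual."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, cf)))

def sparsity(x, cf):
    """Number of features whose value differs from the original instance."""
    return sum(1 for a, b in zip(x, cf) if a != b)

def nearest_observed_counterfactual(x, observations, desired=1):
    """Return the closest observation predicted as the desired class, or None."""
    candidates = [o for o in observations if predict(o) == desired]
    if not candidates:
        return None
    return min(candidates, key=lambda o: proximity(x, o))

# Toy data (assumption): (income, debt) pairs
observations = [(3.0, 0.5), (5.0, 1.0), (2.0, 1.0), (4.0, 2.0)]
instance = (2.0, 1.0)  # currently predicted as the undesired class 0

cf = nearest_observed_counterfactual(instance, observations)
print(cf, proximity(instance, cf), sparsity(instance, cf))
```

Because the returned point is an actual observation, it is trivially backed by known data, but it may change more features than necessary; the trade-off between such support and proximity or sparsity is what the metrics above are meant to expose.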

Place, publisher, year, edition, pages
Springer, 2022. p. 493-508
Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349 ; 13601
Keywords [en]
Machine learning, Interpretability, Counterfactuals, Faithfulness, Justification, Mixed-features space
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-211894
DOI: 10.1007/978-3-031-18840-4_35
ISBN: 978-3-031-18840-4 (electronic)
ISBN: 978-3-031-18839-8 (print)
OAI: oai:DiVA.org:su-211894
DiVA, id: diva2:1714114
Conference
Discovery Science 25th International Conference, DS 2022, 10-12 October, 2022, Montpellier, France
Available from: 2022-11-28 Created: 2022-11-28 Last updated: 2024-09-10 Bibliographically approved
In thesis
1. Orange Juice: Enhancing Machine Learning Interpretability
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

In the current state of AI development, it is reasonable to think that AI will continue to expand and be increasingly utilized across different fields, highly impacting every aspect of humanity's welfare and livelihood. However, different AI researchers and institutions agree that AI has the potential to be extremely beneficial but also may pose existential threats to humanity. It is therefore necessary to develop tools to open the so-called black-box AI algorithms and increase their understandability and trustworthiness, in order to avoid conceivably harmful future scenarios.

The lack of interpretability of AI is a challenge to its own development: it is an obstacle equivalent to those that triggered previous AI winters, such as hardware or technological constraints or public over-expectation. In other words, research in interpretability and model understanding, both from theoretical and pragmatic perspectives, will help avoid a third AI winter, which could be devastating for the current world economy.

Specifically, from the theoretical perspective, the subfields of local explainability and algorithmic fairness require some improvements in order to enhance the explanation output. Local explainability refers to the algorithms that attempt to extract useful explanations for the output of machine learning models for individual instances, while algorithmic fairness refers to the study of biases or fairness issues among different groups of people, whenever the datasets refer to humans. Providing a higher level of explanation accuracy, explanation fidelity and explanation support for the observations of each dataset would help improve the overall level of trustworthiness and the understandability of the explanations. The explainability methods should also be applied to practical scenarios. In the area of autonomous driving, for example, providing confidence intervals on the positioning estimates and positioning errors is important for vehicle operations, and machine learning models coupled with conformal prediction may provide a solution that focuses on the confidence of these estimates, prioritizing safety.  

This thesis contributes to research in the field of AI interpretability, focusing mainly on the algorithms related to local explainability, algorithmic fairness and conformal prediction. Specifically, the thesis targets the improvement of counterfactual and local surrogate explanation algorithms. These explainability methods may also reveal the existence of biases, and therefore the study of algorithmic fairness is a relevant part of interpretability. This thesis focuses on the topic of machine learning fairness assessment through the use of local explainability methods, proposing two novel elements: a single accuracy-based and counterfactual-based bias detection measure and a counterfactual generation method for groups intended for bias detection and fair recommendations across groups. Finally, the idea behind interpretability is to be able to eventually implement such methods in real-world applications. This thesis presents an application of the conformal prediction framework to a regression problem related to autonomous vehicle localization systems. In this application, the framework is able to output the predicted positioning error of a vehicle and its confidence interval with some level of significance.
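
To make the last point concrete, here is a minimal sketch of split conformal prediction for regression, one common variant of the conformal prediction framework. It is not taken from the thesis: the function name, the calibration values, and the choice of absolute residuals as the nonconformity score are all assumptions made for illustration.

```python
import math

def split_conformal_interval(cal_true, cal_pred, new_pred, alpha=0.1):
    """Interval around new_pred built from held-out calibration residuals.

    Under exchangeability the interval covers the true value with
    probability at least 1 - alpha.
    """
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Rank of the conformal quantile among the calibration scores
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial interval is valid
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (new_pred - q, new_pred + q)

# Toy calibration set (assumption): true positioning errors and model predictions
residuals = [0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25]
cal_true = [10.0] * len(residuals)
cal_pred = [10.0 - r for r in residuals]

interval = split_conformal_interval(cal_true, cal_pred, new_pred=5.0, alpha=0.2)
print(interval)
```

The width of the interval depends only on the calibration residuals and the chosen significance level `alpha`, so the underlying regression model can be treated as a black box.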

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2024. p. 72
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 24-013
Keywords
artificial intelligence, machine learning, interpretability, explainability, counterfactual, fairness
National Category
Computer Systems
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-233360 (URN)
978-91-8014-929-7 (ISBN)
978-91-8014-930-3 (ISBN)
Public defence
2024-11-14, L30, NOD-huset, Borgarfjordsgatan 12, Kista, 09:00 (English)
Opponent
Supervisors
Available from: 2024-10-22 Created: 2024-09-10 Last updated: 2024-10-08 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Kuratomi Hernandez, Alejandro; Miliou, Ioanna; Lee, Zed; Lindgren, Tony; Papapetrou, Panagiotis