ORANGE: Opposite-label soRting for tANGent Explanations in heterogeneous spaces
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0002-5460-2491
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0002-1357-1967
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0001-7713-1381
2023 (English). In: 2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA), IEEE conference proceedings, 2023, p. 1-10. Conference paper, Published paper (Refereed)
Abstract [en]

Most real-world datasets have a heterogeneous feature space composed of binary, categorical, ordinal, and continuous features. However, currently available local surrogate explainability algorithms do not account for this, and can generate infeasible neighborhood centers that may yield erroneous explanations. To overcome this issue, we propose ORANGE, a local surrogate explainability algorithm that generates high-accuracy and high-fidelity explanations in heterogeneous spaces. ORANGE has three main components: (1) it searches for the closest feasible counterfactual point to a given instance of interest, considering only feasible feature values, so that the explanation is built around the closest feasible instance rather than an arbitrary, potentially non-existent point in space; (2) it generates a set of neighboring points around this closest feasible point based on the correlations among features, so that the relationships among features are preserved inside the neighborhood; and (3) it weights the generated instances, first by their distance to the decision boundary and second by the disagreement between the labels predicted by the global model and by a surrogate model trained on the neighborhood. Our extensive experiments on synthetic and public datasets show that ORANGE achieves best-in-class performance in both explanation accuracy and fidelity.
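
The three components described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy black-box model (`global_score`), the candidate pool, the rounding of binary features, the exponential boundary weighting, and the choice to double the weight of disagreeing points are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def global_score(X):
    # Hypothetical black-box model: signed score, positive means class 1.
    return X[:, 0] + 2.0 * X[:, 1] - 1.5

def global_predict(X):
    return (global_score(X) > 0).astype(int)

def closest_feasible_counterfactual(x, candidates, binary_cols):
    # Component 1: nearest candidate with the opposite label and valid binary values.
    target = 1 - global_predict(x[None, :])[0]
    feasible = np.all(np.isin(candidates[:, binary_cols], (0.0, 1.0)), axis=1)
    mask = feasible & (global_predict(candidates) == target)
    if not mask.any():
        raise ValueError("no feasible counterfactual among the candidates")
    pool = candidates[mask]
    return pool[np.argmin(np.linalg.norm(pool - x, axis=1))]

def correlated_neighborhood(center, data, binary_cols, n, rng):
    # Component 2: sample with the empirical covariance of the data,
    # then snap binary columns back to {0, 1} (an assumed feasibility rule).
    cov = np.cov(data, rowvar=False)
    Z = rng.multivariate_normal(center, cov, size=n)
    Z[:, binary_cols] = np.clip(np.rint(Z[:, binary_cols]), 0.0, 1.0)
    return Z

def fit_weighted_linear(Z, y, w):
    # Weighted least-squares linear surrogate with an intercept term.
    A = np.hstack([Z, np.ones((len(Z), 1))])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef

def surrogate_predict(coef, Z):
    A = np.hstack([Z, np.ones((len(Z), 1))])
    return (A @ coef > 0.5).astype(int)

def two_stage_weights(Z):
    # Component 3: weight by closeness to the decision boundary, then
    # up-weight points where surrogate and global labels disagree (assumed direction).
    y = global_predict(Z)
    w = np.exp(-np.abs(global_score(Z)))
    coef = fit_weighted_linear(Z, y, w)
    disagree = surrogate_predict(coef, Z) != y
    w = np.where(disagree, 2.0 * w, w)
    return w, fit_weighted_linear(Z, y, w)

# Toy data: one continuous feature and one binary feature.
rng = np.random.default_rng(0)
data = np.column_stack([rng.normal(1.0, 1.0, 200),
                        rng.integers(0, 2, 200).astype(float)])
x = np.array([0.2, 0.0])  # instance of interest, predicted class 0
cf = closest_feasible_counterfactual(x, data, [1])
Z = correlated_neighborhood(cf, data, [1], 100, rng)
w, coef = two_stage_weights(Z)
```

Here the observed data doubles as the candidate pool, which guarantees that the neighborhood center is a feasible instance. A real search over heterogeneous features would need explicit feasibility rules for categorical and ordinal columns as well.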

Place, publisher, year, edition, pages
IEEE conference proceedings, 2023. p. 1-10
Keywords [en]
Correlation, Predictive models, Data science, Sorting
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-224013; DOI: 10.1109/DSAA60987.2023.10302474; Scopus ID: 2-s2.0-85178999467; ISBN: 979-8-3503-4503-2 (electronic); OAI: oai:DiVA.org:su-224013; DiVA, id: diva2:1814155
Conference
International Conference on Data Science and Advanced Analytics (DSAA), Thessaloniki, Greece, October 9-13, 2023
Available from: 2023-11-23 Created: 2023-11-23 Last updated: 2024-10-16 Bibliographically approved
In thesis
1. Orange Juice: Enhancing Machine Learning Interpretability
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

In the current state of AI development, it is reasonable to think that AI will continue to expand and be increasingly utilized across different fields, highly impacting every aspect of humanity's welfare and livelihood. However, different AI researchers and institutions agree that AI has the potential to be extremely beneficial but also may pose existential threats to humanity. It is therefore necessary to develop tools to open the so-called black-box AI algorithms and increase their understandability and trustworthiness, in order to avoid conceivably harmful future scenarios.

The lack of interpretability of AI is a challenge to its own development: it is an obstacle equivalent to those that triggered previous AI winters, such as hardware or technological constraints or public over-expectation. In other words, research in interpretability and model understanding, both from theoretical and pragmatic perspectives, will help avoid a third AI winter, which could be devastating for the current world economy.

Specifically, from the theoretical perspective, the subfields of local explainability and algorithmic fairness need improvement in order to enhance the explanations they produce. Local explainability refers to algorithms that attempt to extract useful explanations for the output of machine learning models on individual instances, while algorithmic fairness refers to the study of biases or fairness issues among different groups of people, whenever the datasets refer to humans. Providing a higher level of explanation accuracy, explanation fidelity and explanation support for the observations of each dataset would help improve the overall trustworthiness and understandability of the explanations. The explainability methods should also be applied to practical scenarios. In the area of autonomous driving, for example, providing confidence intervals on the positioning estimates and positioning errors is important for vehicle operations, and machine learning models coupled with conformal prediction may provide a solution that focuses on the confidence of these estimates, prioritizing safety.

This thesis contributes to research in the field of AI interpretability, focusing mainly on the algorithms related to local explainability, algorithmic fairness and conformal prediction. Specifically, the thesis targets the improvement of counterfactual and local surrogate explanation algorithms. These explainability methods may also reveal the existence of biases, and therefore the study of algorithmic fairness is a relevant part of interpretability. This thesis focuses on the topic of machine learning fairness assessment through the use of local explainability methods, proposing two novel elements: a single accuracy-based and counterfactual-based bias detection measure and a counterfactual generation method for groups intended for bias detection and fair recommendations across groups. Finally, the idea behind interpretability is to be able to eventually implement such methods in real-world applications. This thesis presents an application of the conformal prediction framework to a regression problem related to autonomous vehicle localization systems. In this application, the framework is able to output the predicted positioning error of a vehicle and its confidence interval with some level of significance.
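
The last point, a predicted value with an interval at a chosen significance level, can be illustrated with split (inductive) conformal prediction for regression. The abstract does not say which conformal variant or underlying model the thesis uses, so the variant, the linear model, and the synthetic data below are assumptions for illustration only.

```python
import numpy as np

def split_conformal_interval(predict, X_cal, y_cal, X_new, alpha=0.1):
    # Absolute residuals on a held-out calibration set act as nonconformity scores.
    residuals = np.abs(y_cal - predict(X_cal))
    n = len(residuals)
    # Finite-sample corrected rank of the (1 - alpha) quantile.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        raise ValueError("calibration set too small for this alpha")
    q = np.sort(residuals)[k - 1]
    center = predict(X_new)
    return center - q, center + q

# Synthetic stand-in for a positioning-error regression task (hypothetical data).
rng = np.random.default_rng(0)

def make(n):
    X = rng.uniform(0.0, 10.0, n)
    return X, 0.5 * X + rng.normal(0.0, 1.0, n)

X_tr, y_tr = make(300)
X_cal, y_cal = make(500)
X_te, y_te = make(500)

# Any regression model works; a straight-line fit keeps the sketch short.
slope, intercept = np.polyfit(X_tr, y_tr, 1)

def predict(X):
    return slope * X + intercept

lo, hi = split_conformal_interval(predict, X_cal, y_cal, X_te, alpha=0.1)
coverage = np.mean((y_te >= lo) & (y_te <= hi))
print(f"empirical coverage at alpha=0.1: {coverage:.3f}")
```

With exchangeable calibration and test data, intervals built this way cover the true value with probability at least 1 - alpha on average. The width is constant across inputs in this simple variant.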

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2024. p. 72
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 24-013
Keywords
artificial intelligence, machine learning, interpretability, explainability, counterfactual, fairness
National Category
Computer Systems
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-233360 (URN); 978-91-8014-929-7 (ISBN); 978-91-8014-930-3 (ISBN)
Public defence
2024-11-14, L30, NOD-huset, Borgarfjordsgatan 12, Kista, 09:00 (English)
Available from: 2024-10-22 Created: 2024-09-10 Last updated: 2024-10-08 Bibliographically approved

Authority records

Kuratomi Hernandez, Alejandro; Lee, Zed; Miliou, Ioanna; Lindgren, Tony; Papapetrou, Panagiotis