SMILE: A feature-based temporal abstraction framework for event-interval sequence classification
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0002-4632-4815
2021 (English). In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 35, no 1, p. 372-399. Article in journal (Refereed), Published
Abstract [en]

In this paper, we study the problem of classifying sequences of temporal intervals. Our main contribution is a novel framework, which we call SMILE, for extracting relevant features from interval sequences to construct classifiers. SMILE introduces the notion of random temporal abstraction features, which we define as e-lets, as a means of capturing information about class-discriminatory events that occur across the span of complete interval sequences. Our empirical evaluation covers a wide array of benchmark datasets and fourteen novel datasets for adverse drug event detection. We demonstrate how the introduction of simple sequential features, followed by progressively more complex features, each improves classification performance. Importantly, this investigation demonstrates that SMILE significantly improves AUC performance over the current state of the art. The investigation also reveals that the choice of underlying classification algorithm is important for achieving superior predictive performance, and shows how the number of features influences the performance of our framework.

Place, publisher, year, edition, pages
2021. Vol. 35, no 1, p. 372-399
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-189137
DOI: 10.1007/s10618-020-00719-3
ISI: 000591944900001
OAI: oai:DiVA.org:su-189137
DiVA, id: diva2:1518909
Available from: 2021-01-18 Created: 2021-01-18 Last updated: 2022-09-15 Bibliographically approved
In thesis
1. Learning from Complex Medical Data Sources
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Large, varied, and time-evolving data sources can be observed across many domains and present a unique challenge for classification problems, in which traditional machine learning approaches must be adapted to accommodate the complex nature of such data. Across most domains, there is also a need for machine learning models that are both well-performing and interpretable, so that they can provide explanations of a model's decisions that stakeholders can trust and act on appropriately.

In the medical domain, complex Electronic Health Record (EHR) data consists of longitudinal records of patient histories spanning structured and unstructured data types. Exploiting such complex medical data is vital for gaining useful medical insights and predictions, and establishing stakeholder trust through useful explanations is critical. This thesis has focused, first, on producing state-of-the-art classification methods for exploiting the heterogeneity and temporality of complex data; second, on developing novel interpretability methods to aid in the understanding of model predictions from such complex data; and finally, on ensuring the medical applicability of the developed methods and other novel methods, particularly for the medical problem of adverse drug event (ADE) prediction.

In the first part of this thesis, several state-of-the-art classification frameworks for exploiting complex medical data are outlined, with their utility demonstrated through comparative empirical evaluations against competing frameworks. In the second part of this thesis, novel interpretability methods are developed and their applicability across domains is demonstrated. In the third part of this thesis, the applicability of interpretability and explainability methods for complex medical data is investigated, refined, and assessed for validity in connection with the use case of ADE prediction. Main contributions of this thesis include: two novel classification frameworks, including SMILE, demonstrating significantly improved AUC performance over the main framework competitors and other selected competitor approaches; novel generalised ‘time-series tweaking’ methods delivering optimized counterfactual explanations in the time series domain; and findings that attention-based explanations from interpretable deep learning models and the post-hoc SHAP techniques can be leveraged for medical insight and explanations for ADE predictions.

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2022. p. 114
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 22-004
Keywords
Machine Learning, Data Science, Healthcare, Complex Data, Explainable AI, Deep Learning
National Category
Computer and Information Sciences
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-209359 (URN)
978-91-8014-014-0 (ISBN)
978-91-8014-015-7 (ISBN)
Public defence
2022-10-28, Lilla hörsalen, NOD-huset, Borgarfjordsgatan 12, Kista, 13:00 (English)
Available from: 2022-10-05 Created: 2022-09-15 Last updated: 2022-09-28 Bibliographically approved

Authority records

Rebane, Jonathan; Karlsson, Isak; Papapetrou, Panagiotis
