Discovering, selecting and exploiting feature sequence records of study participants for the classification of epidemiological data on hepatic steatosis
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0002-4632-4815
2018 (English). In: Proceedings of the 33rd Annual ACM Symposium on Applied Computing, Association for Computing Machinery (ACM), 2018, p. 6-13. Conference paper, Published paper (Refereed)
Abstract [en]

In longitudinal epidemiological studies, participants undergo repeated medical examinations and are thus represented by a potentially large number of short examination outcome sequences. Some of those sequences may contain important information in various forms, such as patterns, with respect to the disease under study, while others may be on features of little relevance to the outcome. In this work, we propose a framework for Discovery, Selection and Exploitation (DiSelEx) of longitudinal epidemiological data, aiming to identify informative patterns among these sequences. DiSelEx combines sequence clustering with supervised learning to identify sequence groups that contribute to class separation. Newly derived and old features are evaluated and selected according to their redundancy and informativeness regarding the target variable. The selected feature set is then used to learn a classification model on the study data. We evaluate DiSelEx on cohort participants for the disorder "hepatic steatosis" and report on the impact on predictive performance when using sequential data in comparison to utilizing only the basic classifier.
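
The abstract does not name the clustering algorithm or the selection criteria, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes plain k-means as a stand-in for sequence clustering and mutual information as a stand-in for both informativeness and redundancy; the thresholds, the toy data, and all names (`cluster_sequences`, `select_features`, `seq_cluster`, `last_high`, `baseline_flag`, `sex`) are hypothetical. The final classification step is omitted.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete value lists."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def redundancy(xs, ys):
    """Mutual information normalised to [0, 1] by the larger entropy."""
    denom = max(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / denom if denom > 0 else 0.0


def cluster_sequences(seqs, k, iters=10):
    """Plain k-means on equal-length numeric sequences (assumed stand-in).

    Seeded with the first k distinct sequences so the result is deterministic.
    """
    centroids = []
    for s in seqs:
        if list(s) not in centroids:
            centroids.append(list(s))
        if len(centroids) == k:
            break
    assign = [0] * len(seqs)
    for _ in range(iters):
        # Assign every sequence to its nearest centroid (squared Euclidean).
        assign = [
            min(range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(s, centroids[c])))
            for s in seqs
        ]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [s for s, a in zip(seqs, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def select_features(features, target, min_info=0.1, max_redundancy=0.9):
    """Greedy selection: most informative first, skipping redundant features."""
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    ranked = sorted(features, key=relevance.get, reverse=True)
    selected = []
    for name in ranked:
        if relevance[name] < min_info:
            continue  # too little information about the target
        if all(redundancy(features[name], features[s]) <= max_redundancy
               for s in selected):
            selected.append(name)
    return selected


# Toy data (hypothetical): one short examination sequence per participant.
target = [1, 1, 1, 0, 0, 0]
sequences = [(1, 3, 5), (1, 4, 6), (2, 4, 6), (3, 3, 3), (2, 2, 2), (3, 2, 3)]
features = {
    "seq_cluster": cluster_sequences(sequences, k=2),   # newly derived feature
    "last_high": [int(s[-1] > 4) for s in sequences],   # duplicates the cluster
    "baseline_flag": [1, 1, 0, 0, 0, 0],                # existing feature
    "sex": [0, 1, 0, 1, 0, 1],                          # existing feature
}
selected = select_features(features, target)
print(selected)
```

In this toy setting the cluster membership becomes a derived feature that competes with the existing ones; a feature that merely duplicates it is discarded as redundant, and a feature that carries almost no information about the target is discarded as uninformative.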

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2018. p. 6-13
Keywords [en]
medical data mining, patient similarity, time-series clustering, feature selection, classification, epidemiological studies, hepatic steatosis
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-149420
DOI: 10.1145/3167132.3167162
ISI: 000455180700002
ISBN: 978-1-4503-5191-1 (electronic)
OAI: oai:DiVA.org:su-149420
DiVA, id: diva2:1161588
Conference
The 33rd Annual ACM Symposium on Applied Computing, Pau, France, April 09 - 13, 2018
Available from: 2017-11-30; Created: 2017-11-30; Last updated: 2022-02-28; Bibliographically approved

Authority records

Papapetrou, Panagiotis
