Detecting Healthcare-Associated Infections in Electronic Health Records: Evaluation of Machine Learning and Preprocessing Techniques
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institutet, Sweden.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
2014 (English). In: Proceedings of the 6th International Symposium on Semantic Mining in Biomedicine (SMBM 2014), University of Aveiro, 2014, 3-10 p. Conference paper, Published paper (Refereed)
Abstract [en]

Healthcare-associated infections (HAI) are infections that patients acquire in the course of medical treatment. Being a severe public health problem, detecting and monitoring HAI in healthcare documentation is an important topic to address. Research on automated systems has increased over the past years, but performance is yet to be enhanced. The dataset in this study consists of 214 records obtained from a Point-Prevalence Survey. The records are manually classified into HAI and NoHAI records. Nine different preprocessing steps are carried out on the data. Two learning algorithms, Random Forest (RF) and Support Vector Machines (SVM), are applied to the data. The aim is to determine which of the two algorithms is more applicable to the task and if preprocessing methods will affect the performance. RF obtains the best performance results, yielding an F1-score of 85% and AUC of 0.85 when lemmatisation is used as a preprocessing technique. Irrespective of which preprocessing method is used, RF yields higher recall values than SVM, with a statistically significant difference for all but one preprocessing method. Regarding each classifier separately, the choice of preprocessing method led to no statistically significant improvement in performance results.

Place, publisher, year, edition, pages
University of Aveiro, 2014. 3-10 p.
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-108679
DOI: 10.5167/uzh-98982
OAI: oai:DiVA.org:su-108679
DiVA: diva2:760057
Conference
Sixth International Symposium on Semantic Mining in Biomedicine (SMBM 2014), Aveiro, Portugal, October 6-7, 2014
Available from: 2014-11-03. Created: 2014-11-03. Last updated: 2014-11-24. Bibliographically approved.

By author/editor
Ehrentraut, Claudia; Kvist, Maria; Dalianis, Hercules
