Deep Medical Entity Recognition for Swedish and Spanish
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Number of Authors: 4
2018 (English)
In: 2018 IEEE International Conference on Bioinformatics and Biomedicine: Proceedings / [ed] Huiru (Jane) Zheng, Zoraida Callejas, David Griol et al., Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 1595-1601
Conference paper, Published paper (Refereed)
Abstract [en]

Clinical texts, although challenging to process, are rich in valuable information, and named entity recognition is an important element in any system designed to extract relevant information from such texts. Recently, improved performance for named entity recognition has been achieved through deep learning methods, and here, a recurrent neural network is evaluated for medical named entity recognition in clinical texts in two different languages, Spanish and Swedish. An important factor for any machine learning model is the input representation: how the features are preprocessed and presented to the model. Therefore, a number of different embeddings derived from large corpora of clinical texts, and several strategies for combining embeddings, have been evaluated for this task. Combining a bidirectional LSTM with embeddings derived from words and lemmas improved the average F-measure by more than three points over using only shallow learning methods for both languages, while at the same time reducing the dependency on external resources and feature engineering, showing this approach to be suitable for medical named entity recognition. An average F-measure of 74.87 is obtained for Spanish using lemma embeddings, and of 76.04 for Swedish when concatenated lemma and word embeddings are used.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2018. p. 1595-1601
Keywords [en]
Clinical text mining, Unstructured Electronic Medical Records, Medical Named Entity Recognition, Recurrent Neural Network
National Category
Language Technology (Computational Linguistics)
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-176288
DOI: 10.1109/BIBM.2018.8621282
ISBN: 978-1-5386-5489-7 (print)
ISBN: 978-1-5386-5488-0 (electronic)
OAI: oai:DiVA.org:su-176288
DiVA, id: diva2:1373707
Conference
2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Madrid, Spain, 3-6 December, 2018
Available from: 2019-11-27 Created: 2019-11-27 Last updated: 2019-12-02
Bibliographically approved

Search in DiVA

By author/editor
Weegar, Rebecka