The impact of simple feature engineering in multilingual medical NER
2016 (English). In: Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP), 2016, W16-4201. Conference paper (Refereed)
The goal of this paper is to examine the impact of simple feature engineering mechanisms before applying more sophisticated techniques to the task of medical NER. Papers that use scientifically sound techniques sometimes report raw baselines that could be improved by adding simple, cheap features. This work focuses on entity recognition in the clinical domain for three languages: English, Swedish and Spanish. The task is tackled using simple features, starting with window size, capitalization and prefixes, and moving on to POS and semantic tags. The work demonstrates that a simple initial step of feature engineering can improve the baseline results significantly. The contributions of this paper are therefore, first, a short list of guidelines well supported by experimental results on three languages and, second, a detailed description of the relevance of these features for medical NER.
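To make the feature families named in the abstract concrete, the following is a minimal Python sketch of per-token feature extraction covering a context window, capitalization and prefixes. It is an illustration only, not the authors' implementation: the function names, feature keys, the `<PAD>` marker and the default values for `window` and `prefix_len` are all assumptions. POS and semantic tags are left out because they would come from an external tagger or lexicon that the abstract does not specify.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int,
                   window: int = 2, prefix_len: int = 3) -> Dict[str, object]:
    """Build a feature dict for tokens[i].

    All feature names and defaults here are hypothetical; they only
    illustrate the kinds of features the abstract lists.
    """
    word = tokens[i]
    feats: Dict[str, object] = {
        "word.lower": word.lower(),
        # Capitalization features
        "is_capitalized": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        # Prefix feature: the first prefix_len characters, lowercased
        "prefix": word[:prefix_len].lower(),
    }
    # Context window: neighbouring words on each side, padded at the edges
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        inside = 0 <= j < len(tokens)
        feats[f"word[{offset:+d}]"] = tokens[j].lower() if inside else "<PAD>"
    return feats


def sentence_features(tokens: List[str], window: int = 2,
                      prefix_len: int = 3) -> List[Dict[str, object]]:
    """Extract one feature dict per token in a sentence."""
    return [token_features(tokens, i, window, prefix_len)
            for i in range(len(tokens))]


if __name__ == "__main__":
    example = ["Patient", "denies", "chest", "pain"]
    for token, feats in zip(example, sentence_features(example)):
        print(token, feats)
```

Feature dicts of this shape can be passed to a sequence labeller that accepts per-token dictionaries; varying `window` and `prefix_len` is one simple way to probe how much each feature family contributes.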
Research subject: Computer and Systems Sciences
Identifiers: URN: urn:nbn:se:su:diva-137494; OAI: oai:DiVA.org:su-137494; DiVA: diva2:1062769
Clinical Natural Language Processing Workshop, Osaka, Japan, December 11-17, 2016