Influence of Module Order on Rule-Based De-identification of Personal Names in Electronic Patient Records Written in Swedish
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
2010 (English). In: Proceedings of the Seventh International Conference on Language Resources and Evaluation, LREC 2010, Valletta, Malta, May 19-21, 2010. European Language Resources Association (ELRA), 2010, p. 3442-3446.
Conference paper, Published paper (Other academic)
Abstract [en]

Electronic patient records (EPRs) are a valuable resource for research, but for confidentiality reasons they cannot be used freely. To make EPRs available to a wider group of researchers, sensitive information such as personal names has to be removed. De-identification is the process that makes this possible. Both rule-based methods and statistical or machine-learning-based methods exist for de-identification, but the latter require annotated training material, which is very sparse for patient names. It is therefore necessary to use rule-based methods for de-identification of EPRs. Not much is known, however, about the order in which the various rules should be applied or how the different rules influence precision and recall. This paper addresses that question by implementing and evaluating four common rules for de-identification of personal names in EPRs written in Swedish: (1) dictionary name matching, (2) title matching, (3) common words filtering and (4) learning from previous modules. The results show that, to obtain the highest recall and precision, the rules should be applied in the following order: title matching, common words filtering and dictionary name matching.
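
To illustrate why module order can matter, here is a minimal toy sketch in Python. It is not the paper's implementation: the word lists, the function names, the `<NAME>` placeholder and the rule that a module never overrides an earlier module's decision are all assumptions made for this example, and the fourth rule (learning from previous modules) is left out. It shows only that the same three modules can give different output when run in a different order.

```python
# Toy resources, invented for illustration only (not from the paper).
TITLES = {"dr", "doktor"}
NAME_DICTIONARY = {"anna", "björn"}
COMMON_WORDS = {"björn"}  # "björn" is also the Swedish word for "bear"


def title_matching(tokens, labels):
    # Label the undecided token directly after a title as a name.
    for i in range(1, len(tokens)):
        if labels[i] is None and tokens[i - 1].lower().rstrip(".") in TITLES:
            labels[i] = "NAME"


def common_words_filtering(tokens, labels):
    # Protect undecided common words from being labelled as names later.
    for i, tok in enumerate(tokens):
        if labels[i] is None and tok.lower() in COMMON_WORDS:
            labels[i] = "NOT_NAME"


def dictionary_name_matching(tokens, labels):
    # Label undecided tokens found in the name dictionary as names.
    for i, tok in enumerate(tokens):
        if labels[i] is None and tok.lower() in NAME_DICTIONARY:
            labels[i] = "NAME"


def deidentify(text, modules):
    # Assumption: whitespace tokenisation, and the first module to decide
    # on a token wins; later modules only see undecided tokens.
    tokens = text.split()
    labels = [None] * len(tokens)
    for module in modules:
        module(tokens, labels)
    return " ".join(
        "<NAME>" if lab == "NAME" else tok for tok, lab in zip(tokens, labels)
    )


text = "Dr Björn såg en björn med Anna"  # "Dr Björn saw a bear with Anna"

# Order reported as best in the abstract.
print(deidentify(text, [title_matching, common_words_filtering,
                        dictionary_name_matching]))
# Dr <NAME> såg en björn med <NAME>

# Dictionary matching first also masks the common noun "björn".
print(deidentify(text, [dictionary_name_matching, title_matching,
                        common_words_filtering]))
# Dr <NAME> såg en <NAME> med <NAME>
```

In this toy case, running title matching first lets the name after "Dr" be caught before the common words filter protects the ordinary noun, which is one plausible intuition for the ordering the abstract reports; the paper's actual modules and evaluation may differ.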

Place, publisher, year, edition, pages
European Language Resources Association (ELRA), 2010. p. 3442-3446
National Category
Information Science
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-51937
ISBN: 2-9517408-6-7 (print)
OAI: oai:DiVA.org:su-51937
DiVA: diva2:386419
Conference
Language Resources and Evaluation - LREC, 2010
Available from: 2011-01-12
Created: 2011-01-12
Last updated: 2011-06-23
Bibliographically approved

Open Access in DiVA

No full text

Other links

http://www.lrec-conf.org/proceedings/lrec2010/pdf/46_Paper.pdf
http://daisy.dsv.su.se/fil/visa?id=41368

Search in DiVA

By author/editor
Dalianis, Hercules
By organisation
Department of Computer and Systems Sciences
Information Science
