Pseudonymisation of Personal Names and other PHIs in an Annotated Clinical Swedish Corpus
2012 (English). In: LREC 2012, Eighth International Conference on Language Resources and Evaluation / [ed] Nicoletta Calzolari et al., 2012.
Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]
Today a large number of patient records are produced, and these records contain valuable information, often in free text, about the medical treatment of patients. Since these records contain information that can reveal the identity of patients, known as protected health information (PHI), they cannot easily be made available to the research community. In this research we have pseudonymised a PHI-annotated clinical corpus written in Swedish. Pseudonymisation means replacing sensitive information with fictive information; for example, real personal names are replaced with fictive personal names based on the gender of the real names and on family relations. We have evaluated our results, and our five respondents, of whom three were clinicians, found that the clinical text looks real and is readable. We have also added pseudonymisation for telephone numbers, locations, health care units, dates and ages. In this paper we also present the entire de-identification and pseudonymisation process for a sample clinical text.
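The gender-based name replacement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the name lists, the gender lookup table, the `First_Name` label and the `(start, end, label)` span format are all hypothetical, and family relations and the other PHI classes are not handled.

```python
import random

# Hypothetical lexicons; the name lists actually used in the paper are not given here.
FEMALE_NAMES = ["Anna", "Karin", "Eva"]
MALE_NAMES = ["Erik", "Lars", "Johan"]
# Hypothetical gender lookup for real first names.
GENDER_OF = {"Maria": "F", "Sven": "M"}


def pseudonymise(text, spans, seed=0):
    """Replace annotated first names with fictive names of the same gender.

    spans: list of (start, end, label) character offsets, assumed non-overlapping.
    The label "First_Name" is an assumed annotation class.
    """
    rng = random.Random(seed)
    mapping = {}  # real name -> pseudonym, so repeated mentions stay consistent
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        out.append(text[cursor:start])
        original = text[start:end]
        if label == "First_Name":
            if original not in mapping:
                gender = GENDER_OF.get(original)
                if gender == "F":
                    pool = FEMALE_NAMES
                elif gender == "M":
                    pool = MALE_NAMES
                else:
                    pool = FEMALE_NAMES + MALE_NAMES  # unknown gender: any name
                candidates = [name for name in pool if name != original]
                mapping[original] = rng.choice(candidates)
            out.append(mapping[original])
        else:
            out.append(original)  # other PHI classes are left untouched in this sketch
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


text = "Maria träffade Sven. Maria mår bra."
spans = [(0, 5, "First_Name"), (15, 19, "First_Name"), (21, 26, "First_Name")]
print(pseudonymise(text, spans))
```

Keeping a per-document mapping means both mentions of the same real name receive the same fictive name, which helps the text stay readable; whether the original system does this is an assumption of the sketch.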
Place, publisher, year, edition, pages
2012.
Keywords [en]
Protected Health Information (PHI), Electronic Patient Records (EPRs), De-identification, Pseudonym, Swedish
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-82254
ISBN: 978-2-9517408-7-7 (print)
OAI: oai:DiVA.org:su-82254
DiVA, id: diva2:567232
Conference
LREC 2012, May 23-24-25, 2012, Istanbul, Turkey
Note
Poster presentation at The Third Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM 2012)
2012-11-12; 2012-11-12; 2022-02-24; Bibliographically approved