Identifying Synonymy between SNOMED Clinical Terms of Varying Length Using Distributional Analysis of Electronic Health Records
2013 (English) In: AMIA Conference Proceedings Archive, American Medical Informatics Association, 2013, 600-609 p. Conference paper (Refereed)
Medical terminologies and ontologies are important tools for natural language processing of health record narratives. To account for the variability of language use, synonyms need to be stored in a semantic resource as textual instantiations of a concept. Developing such resources manually is, however, prohibitively expensive and likely to result in low coverage. To facilitate and expedite the process of lexical resource development, distributional analysis of large corpora provides a powerful data-driven means of (semi-)automatically identifying semantic relations, including synonymy, between terms. In this paper, we demonstrate how distributional analysis of a large corpus of electronic health records – the MIMIC-II database – can be employed to extract synonyms of SNOMED CT preferred terms. A distinctive feature of our method is its ability to identify synonymous relations between terms of varying length.
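The abstract does not spell out the procedure, so the following is only a minimal sketch of how distributional analysis might surface synonym candidates between terms of different lengths. It assumes random indexing (named in the keywords) as the semantic space model and assumes that multiword terms are joined into single tokens before analysis; the toy corpus, the term list, the parameter values, and all function names are hypothetical and are not taken from the paper or from MIMIC-II.

```python
import math
import random
from collections import defaultdict

DIM = 64      # dimensionality of the reduced space (illustrative value)
NONZERO = 4   # number of +1/-1 entries per index vector (illustrative value)
WINDOW = 2    # context window size on each side (illustrative value)

# Hypothetical multiword terms and toy sentences, not real clinical data.
MULTIWORD_TERMS = ["heart attack", "myocardial infarction", "chest pain"]
CORPUS = [
    "patient admitted with heart attack yesterday",
    "patient admitted with myocardial infarction yesterday",
    "chest pain resolved after aspirin",
]

def merge_multiword(sentence, terms=MULTIWORD_TERMS):
    """Lowercase a sentence and join each known multiword term into one token."""
    text = sentence.lower()
    # Naive substring replacement, longest terms first; enough for a sketch.
    for term in sorted(terms, key=len, reverse=True):
        text = text.replace(term, term.replace(" ", "_"))
    return text.split()

def index_vector(rng):
    """Sparse ternary random vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1, 1))
    return vec

def build_space(sentences, seed=0):
    """Sum the index vectors of neighbouring tokens into each token's context vector."""
    rng = random.Random(seed)
    index = {}
    context = defaultdict(lambda: [0] * DIM)
    for sentence in sentences:
        tokens = merge_multiword(sentence)
        for tok in tokens:
            if tok not in index:
                index[tok] = index_vector(rng)
        for i, tok in enumerate(tokens):
            lo, hi = max(0, i - WINDOW), min(len(tokens), i + WINDOW + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx = context[tok]
                    for k, value in enumerate(index[tokens[j]]):
                        ctx[k] += value
    return dict(context)

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar(space, term, k=3):
    """Rank all other terms by cosine similarity to the query term."""
    scores = [(other, cosine(space[term], vec))
              for other, vec in space.items() if other != term]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]

if __name__ == "__main__":
    space = build_space(CORPUS)
    for candidate, score in most_similar(space, "heart_attack"):
        print(candidate, round(score, 3))
```

Because the joined token `heart_attack` and the joined token `myocardial_infarction` occur in identical contexts in this toy corpus, their context vectors coincide. This is the property such an approach would rely on when ranking synonym candidates of varying length.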
Place, publisher, year, edition, pages
American Medical Informatics Association, 2013. 600-609 p.
Keywords
distributional semantics, semantic space, random indexing, clinical text, electronic health records, synonymy, multiword terms
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-97218
OAI: oai:DiVA.org:su-97218
DiVA: diva2:676262
37th Annual Symposium of the American Medical Informatics Association (AMIA), November 16-20, 2013, Washington, DC, USA