Creating a rule based system for text mining of Norwegian breast cancer pathology reports
2015 (English). In: LOUHI, Association for Computational Linguistics, 2015. Conference paper (Refereed)
National cancer registries collect cancer-related information from multiple sources and make it available for research. Part of this information originates from pathology reports, and this pre-study investigates the possibility of a system for automatic extraction of information from Norwegian pathology reports. A set of 40 pathology reports describing breast cancer tissue samples was used to develop a rule-based system for information extraction. To validate the performance of this system, its output was compared to the data produced by experts who manually encoded the same pathology reports. On average, a precision of 80%, a recall of 98% and an F-score of 86% were achieved, showing that such a system is indeed feasible.
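The validation described in the abstract compares system output with expert encoding using precision, recall and F-score. A minimal sketch of how such scores could be computed for a single report follows. The field names and values are hypothetical and are not taken from the paper, and the paper's actual evaluation procedure, including how averages were taken, may differ.

```python
def precision_recall_f1(predicted, gold):
    """Score a set of extracted items against a set of expert-coded items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of extracted items that the experts also coded.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of expert-coded items that the system extracted.
    recall = true_positives / len(gold) if gold else 0.0
    # F-score: harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical (field, value) pairs for one report; illustration only.
system_output = {
    ("laterality", "left"),
    ("grade", "2"),
    ("er_status", "positive"),
    ("her2", "negative"),
    ("tumor_size_mm", "14"),  # extracted by the system, absent from expert coding
}
expert_coding = {
    ("laterality", "left"),
    ("grade", "2"),
    ("er_status", "positive"),
    ("her2", "negative"),
}

p, r, f = precision_recall_f1(system_output, expert_coding)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy case the system finds everything the experts coded but adds one extra item, so recall is higher than precision, the same direction as the averages reported in the abstract.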
Place, publisher, year, edition, pages
Association for Computational Linguistics, 2015.
Keywords: pathology reports, information extraction
Research subject: Computer and Systems Sciences
Identifiers: URN: urn:nbn:se:su:diva-122800; ISBN: 978-1-941643-32-7; OAI: oai:DiVA.org:su-122800; DiVA: diva2:868648