Pathology text mining - on Norwegian prostate cancer reports
2016 (English). In: 32nd International Conference on Data Engineering, IEEE Computer Society Digital Library, 2016. Conference paper (Refereed)
Pathology reports are written by pathologists, skilled physicians who know how to interpret disorders in various tissue samples from the human body. To obtain valuable statistics on the outcome of disorders, for example cancer, and on the effect of treatment, cancer pathology reports are interpreted and coded into databases at cancer registries. In Norway, this task is carried out by 25 different human coders at the Cancer Registry of Norway (Kreftregisteret). There is a need to automate this process. The authors of this article received 25 prostate cancer pathology reports written in Norwegian from the Cancer Registry of Norway, each documenting various stages of prostate cancer, together with the corresponding correct manual coding. A rule-based algorithm was produced that processed the reports in order to prototype automation. The output of the algorithm was compared to the output of the manual coding. The evaluation showed an average F-score of 0.94 on four of the data points, namely Total Malign, Primary Gleason, Secondary Gleason and Total Gleason, and a lower average F-score of 0.76 on all ten data points. The results are in line with previous research.
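As an illustration only, the kind of rule-based extraction and F-score comparison described in the abstract can be sketched as follows. This is not the authors' algorithm: the regular expression, the sample phrasing ("Gleason score 3+4=7"), the field names, and the way the F-score is computed per data point are all assumptions made for this sketch.

```python
import re

# Hypothetical rule: matches phrasing such as "Gleason score 3+4=7"
# or "Gleason grad 4 + 3". The phrasing actually used in the Norwegian
# reports is not given in the abstract.
GLEASON_RE = re.compile(
    r"gleason\D{0,20}?(\d)\s*\+\s*(\d)(?:\s*=\s*(\d{1,2}))?",
    re.IGNORECASE,
)


def extract_gleason(text):
    """Return primary, secondary and total Gleason as a dict, or {} if no rule fires."""
    match = GLEASON_RE.search(text)
    if match is None:
        return {}
    primary = int(match.group(1))
    secondary = int(match.group(2))
    # Fall back to the sum when the report does not state the total explicitly.
    total = int(match.group(3)) if match.group(3) else primary + secondary
    return {
        "primary_gleason": primary,
        "secondary_gleason": secondary,
        "total_gleason": total,
    }


def f_score(predicted, gold):
    """F1 for one data point over parallel lists; None means 'no value'."""
    true_pos = sum(1 for p, g in zip(predicted, gold) if p is not None and p == g)
    n_pred = sum(1 for p in predicted if p is not None)
    n_gold = sum(1 for g in gold if g is not None)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In this sketch, each report would be passed through `extract_gleason`, and the extracted values for one field would be collected across all reports and compared with the manually coded values using `f_score`.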
Place, publisher, year, edition, pages
IEEE Computer Society Digital Library, 2016.
Keywords: Clinical text mining, rule based, pathology reports, prostate cancer, Norwegian
Research subject: Computer and Systems Sciences
Identifiers: URN: urn:nbn:se:su:diva-136597; OAI: oai:DiVA.org:su-136597; DiVA: diva2:1055462