General-Purpose Text Categorization Applied to the Medical Domain
2007 (English) Report (Other (popular science, discussion, etc.))
This paper presents work in which a general-purpose text categorization method was applied to categorize medical free text. The purpose of the experiments was to examine how such a method performs without any domain-specific knowledge, hand-crafting, or tuning. Additionally, we compare the results from the general-purpose method with results from runs in which a medical thesaurus and automatically extracted keywords were used when building the classifiers. We show that standard text categorization techniques using stemmed unigrams as the basis for learning can be applied directly to categorize medical reports, yielding an F-measure of 83.9 and outperforming the more sophisticated methods.
Identifiers
URN: urn:nbn:se:su:diva-12121
OAI: oai:DiVA.org:su-12121
DiVA: diva2:178641