Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
General-Purpose Text Categorization Applied to the Medical Domain
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
2007 (English)Report (Other (popular science, discussion, etc.))
Abstract [en]

This paper presents work where a general-purpose text categorization method was applied to categorize medical free-texts. The purpose of the experiments was to examine how such a method performs without any domain-specific knowledge, hand-crafting or tuning. Additionally, we compare the results from the general-purpose method with results from runs in which a medical thesaurus as well as automatically extracted keywords were used when building the classifiers. We show that standard text categorization techniques using stemmed unigrams as the basis for learning can be applied directly to categorize medical reports, yielding an F-measure of 83.9, and outperforming the more sophisticated methods.

Place, publisher, year, edition, pages
2007.
Identifiers
URN: urn:nbn:se:su:diva-12121OAI: oai:DiVA.org:su-12121DiVA: diva2:178641
Available from: 2008-01-16 Created: 2008-01-16Bibliographically approved

Open Access in DiVA

No full text

Search in DiVA

By author/editor
Alemu Argaw, Atelach
By organisation
Department of Computer and Systems Sciences

Search outside of DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric score

urn-nbn
Total: 50 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf