Detection of Spelling Errors in Swedish Clinical Text
2014 (English). In: NorWEST2014, 2014. Conference paper (Refereed)
Spelling errors are common in clinical text because such text is written under time pressure and is mostly used for internal communication. To improve text mining and other types of text processing tools, spelling error detection and correction are needed. In this paper we count spelling errors in Swedish clinical text. The developed algorithm detects errors using word lists: a Swedish general dictionary, a medical dictionary and a list of abbreviations. The final algorithm was tested on a Swedish clinical corpus and yielded a spelling error rate of 12 per cent. Error analysis showed that many correctly spelled words were flagged as errors because of inadequate word lists and faulty preprocessing, such as lemmatization and compound splitting. After these correct words were manually removed from the list of detected errors, the spelling error rate decreased to 7.6 per cent.
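The word-list lookup described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `detect_spelling_errors`, the tokenization rule, the toy word lists, and the definition of the error rate as flagged tokens divided by all tokens are assumptions made for illustration. The lemmatization and compound splitting steps mentioned in the abstract are omitted.

```python
import re


def detect_spelling_errors(text, word_lists):
    """Flag tokens that appear in none of the given word lists.

    Hypothetical helper: returns (flagged_tokens, error_rate), where
    error_rate is assumed to be flagged tokens / all tokens.
    """
    # Merge all word lists into one lowercase lookup set.
    known = set()
    for words in word_lists:
        known.update(w.lower() for w in words)

    # Keep runs of letters only (including å, ä, ö); drop digits and punctuation.
    tokens = re.findall(r"[^\W\d_]+", text.lower())

    # A token counts as a suspected spelling error if no list contains it.
    flagged = [t for t in tokens if t not in known]
    rate = len(flagged) / len(tokens) if tokens else 0.0
    return flagged, rate


# Toy word lists standing in for the three resources named in the abstract.
general_dictionary = {"har", "ont", "i"}
medical_dictionary = {"buken"}
abbreviations = {"pat"}

flagged, rate = detect_spelling_errors(
    "Pat har ont i bukne",
    [general_dictionary, medical_dictionary, abbreviations],
)
print(flagged, rate)
```

In a lookup of this kind, a correctly spelled word that is missing from every list is flagged just like a real misspelling, which is the kind of false detection the error analysis attributes to inadequate word lists.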
Keywords: Swedish clinical text, spelling detection, text mining, information extraction
Research subject: Computer and Systems Sciences
Identifiers: URN: urn:nbn:se:su:diva-110972; OAI: oai:DiVA.org:su-110972; DiVA: diva2:773746
1st Nordic workshop on evaluation of spellchecking and proofing tools (NorWEST2014), SLTC 2014, Uppsala