Medical disorders and findings are important information in health record text. Developing methods that automatically extract these entities makes the information more accessible to automated processing. A mention of a disorder or finding in the health record, however, does not necessarily imply that it has been observed in the patient, because disorders that are ruled out and findings that are not observed are also mentioned.
This licentiate thesis investigates the possibility of automatically extracting disorders and findings from Swedish health record text and the possibility of automatically determining whether these findings and disorders are negated or not.
A rule- and terminology-based system was constructed and evaluated for extracting disorders, findings and body structures mentioned in Swedish clinical text; it uses several Swedish medical terminologies, including SNOMED CT and ICD-10. Moreover, NegEx, an English rule-based system for negation detection, was adapted to Swedish and evaluated on clinical text written in Swedish.
The evaluation showed that disorders and findings were recognised with low recall, whereas body structures were recognised with comparatively good results. The negation detection system that was adapted to Swedish achieved the same recall as the English system, but lower precision.
The evaluated systems are accurate enough to be useful in some applications, but need to be further developed, especially when it comes to recognising disorders and findings.
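As a rough illustration of the kind of rule-based negation detection that NegEx performs, the following minimal sketch checks whether a negation trigger precedes an entity within a fixed token window. The trigger list, the termination token, the window size and the function names are all assumptions made for this example; they are not the lists or rules used in the thesis or in NegEx itself.

```python
import re

# Illustrative lists only (assumptions); not the trigger lists used in the thesis.
PRE_NEGATION_TRIGGERS = {"ingen", "inga", "inte", "ej"}
TERMINATION_TOKENS = {"men"}  # "but": ends the scope of a preceding trigger
WINDOW = 5  # assumed maximum number of tokens inspected before the entity


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def is_negated(sentence, entity):
    """Return True if a negation trigger occurs within WINDOW tokens before
    the entity and no termination token lies between trigger and entity."""
    tokens = tokenize(sentence)
    entity_tokens = tokenize(entity)
    n = len(entity_tokens)
    if n == 0:
        return False
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] != entity_tokens:
            continue
        # Walk backwards from the entity towards the start of the sentence.
        for i in range(start - 1, max(start - 1 - WINDOW, -1), -1):
            if tokens[i] in TERMINATION_TOKENS:
                break
            if tokens[i] in PRE_NEGATION_TRIGGERS:
                return True
    return False
```

For example, `is_negated("Patienten har ingen feber", "feber")` treats "feber" as negated, whereas in "Ingen hosta men feber" the termination token "men" stops the trigger from reaching "feber".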
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2012. 79 p.