Finding Cervical Cancer Symptoms in Swedish Clinical Text using a Machine Learning Approach and NegEx
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institutet, Sweden. ORCID iD: 0000-0002-5780-0063
2015 (English). In: AMIA Annual Symposium Proceedings, American Medical Informatics Association, 2015, p. 1296-1305. Conference paper, Published paper (Refereed)
Resource type
Text
Abstract [en]

Detection of early symptoms in cervical cancer is crucial for early treatment and survival. To find symptoms of cervical cancer in clinical text, Named Entity Recognition is needed. In this paper the Clinical Entity Finder, a machine-learning tool trained on annotated clinical text from a Swedish internal medicine emergency unit, is evaluated on cervical cancer records. The Clinical Entity Finder identifies entities of the types body part, finding and disorder and is extended with negation detection using the rule-based tool NegEx, to distinguish between negated and non-negated entities. To measure the performance of the tools on this new domain, two physicians annotated a set of clinical notes from the health records of cervical cancer patients. The inter-annotator agreement for finding, disorder and body part obtained an average F-score of 0.677 and the Clinical Entity Finder extended with NegEx had an average F-score of 0.667.
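
The negation step described in the abstract can be illustrated with a minimal sketch of NegEx-style, rule-based negation detection. This is not the tool evaluated in the paper: the trigger list, the termination terms, the five-token window and the function name `is_negated` are all assumptions made for illustration, and English examples are used here although the paper works on Swedish text. NegEx itself uses a much larger lexicon and also handles triggers that follow the entity.

```python
import re

# Hypothetical, heavily abbreviated lexicon (assumption, not the NegEx lexicon).
PRE_NEGATION_TRIGGERS = {"no", "not", "denies", "without"}
TERMINATION_TERMS = {"but", "however"}
WINDOW = 5  # assumed maximum number of tokens between trigger and entity


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def is_negated(sentence, entity):
    """Return True if a negation trigger precedes the entity within WINDOW
    tokens and no termination term lies between the trigger and the entity."""
    tokens = tokenize(sentence)
    entity_tokens = tokenize(entity)
    n = len(entity_tokens)
    # Locate the first occurrence of the entity in the sentence.
    start = next(
        (i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == entity_tokens),
        None,
    )
    if start is None:
        return False
    # Walk backwards from the entity, at most WINDOW tokens.
    for i in range(start - 1, max(start - 1 - WINDOW, -1), -1):
        if tokens[i] in TERMINATION_TERMS:
            return False  # negation scope was closed before the entity
        if tokens[i] in PRE_NEGATION_TRIGGERS:
            return True
    return False


print(is_negated("Patient denies abdominal pain", "abdominal pain"))
print(is_negated("No fever but reports bleeding", "bleeding"))
```

In this sketch the entity spans are passed in by hand; in the setup the abstract describes, they would come from the named entity recognition step.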

Place, publisher, year, edition, pages
American Medical Informatics Association, 2015. p. 1296-1305
Series
AMIA Annual Symposium Proceedings, ISSN 1559-4076, E-ISSN 1942-597X
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-123947
PubMedID: 26958270
OAI: oai:DiVA.org:su-123947
DiVA, id: diva2:878591
Conference
AMIA 2015 Annual Symposium, San Francisco, CA, November 14 - 18, 2015
Available from: 2015-12-09 Created: 2015-12-09 Last updated: 2022-02-23 Bibliographically approved
In thesis
1. Mining Clinical Text in Cancer Care
2020 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Health care and clinical practice generate large amounts of text detailing symptoms, test results, diagnoses, treatments, and outcomes for patients. This clinical text, documented in health records, is a potential source of knowledge and an underused resource for improved health care. The focus of this work has been text mining of clinical text in the domain of cancer care, with the aim of developing and evaluating methods for extracting relevant information from such texts. Two different types of clinical documentation have been included: clinical notes from electronic health records in Swedish and Norwegian pathology reports.

Free text, and clinical text in particular, is considered unstructured information, which is difficult to process automatically. Information extraction can therefore be applied to create a more structured representation of a text, making its content more accessible for machine learning and statistics. To this end, this thesis describes the development of an efficient and accurate tool for information extraction for pathology reports.

Another application for clinical text mining is risk prediction and diagnosis prediction. The goal for such prediction is to create a machine learning model capable of identifying patients at risk of a specific disease or some other adverse outcome. The motivation for cancer diagnosis prediction is that an early diagnosis can be beneficial for the outcome of treatment. Here, a disease prediction model was developed and evaluated for prediction of cervical cancer. To create this model, health records of patients diagnosed with cervical cancer were processed in two steps. First, clinical events were extracted from free text clinical notes through the use of named entity recognition. The extracted events were next combined with other event types, such as diagnosis codes and drug codes from the same health records. Finally, machine learning models were trained for predicting cervical cancer, and evaluation showed that events extracted from the free text records were the most informative event type for the diagnosis prediction.
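
The event-combination step described above can be sketched as a simple bag-of-events representation. This is an illustrative assumption and not the thesis's implementation: the function names `build_event_features` and `to_vector`, the type prefixes, and the example code strings are hypothetical, and the resulting vector could be passed to any classifier.

```python
from collections import Counter


def build_event_features(text_events, diagnosis_codes, drug_codes):
    """Merge three event types from one health record into a single count
    mapping. Each feature name is prefixed with its event type so that a
    text-derived event cannot collide with a code of the same spelling."""
    features = Counter()
    for event in text_events:
        features["text:" + event.strip().lower()] += 1
    for code in diagnosis_codes:
        features["diagnosis:" + code] += 1
    for code in drug_codes:
        features["drug:" + code] += 1
    return features


def to_vector(features, vocabulary):
    """Project the count mapping onto a fixed, ordered vocabulary."""
    return [features.get(name, 0) for name in vocabulary]


# Hypothetical record: entity strings and codes are placeholders only.
record = build_event_features(
    text_events=["Bleeding", "pain", "bleeding"],
    diagnosis_codes=["N93.9"],
    drug_codes=["G03AA07"],
)
vocabulary = sorted(record)
print(vocabulary)
print(to_vector(record, vocabulary))
```

Keeping the event type in the feature name makes it possible to train on one event type at a time and compare how informative each type is, which is the kind of comparison the abstract reports.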

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2020. p. 64
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 20-001
Keywords
text mining, natural language processing, electronic health records, clinical text mining, information extraction
National Category
Computer and Information Sciences
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-176282 (URN)
978-91-7797-911-1 (ISBN)
978-91-7797-912-8 (ISBN)
Public defence
2020-01-27, L30, NOD-huset, Borgarfjordsgatan 12, Kista, 13:00 (English)
Note

At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 4: Accepted. Paper 5: Submitted.

Available from: 2019-12-19 Created: 2019-11-28 Last updated: 2022-02-26 Bibliographically approved

Open Access in DiVA

No full text in DiVA

PubMed

Authority records

Weegar, Rebecka; Kvist, Maria; Dalianis, Hercules
