  • 1.
    Allvin, Helen
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Carlsson, Elin
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Dalianis, Hercules
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Danielsson-Ojala, Riitta
    Daudaravičius, Vidas
    Hassel, Martin
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kokkinakis, Dimitrios
    Lundgrén-Laine, Heljä
    Nilsson, Gunnar H.
    Nytrø, Øystein
    Salanterä, Sanna
    Skeppstedt, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Suominen, Hanna
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Characteristics of Finnish and Swedish intensive care nursing narratives: a comparative analysis to support the development of clinical language technologies. 2011. In: Journal of Biomedical Semantics, ISSN 2041-1480, Vol. 2, no S1, 1-11 p. Article in journal (Refereed)
    Abstract [en]

    Background: Free text is helpful for entering information into electronic health records, but reusing it is a challenge. The need for language technology for processing Finnish and Swedish healthcare text is therefore evident; however, Finnish and Swedish are linguistically very dissimilar. In this paper we present a comparison of characteristics in Finnish and Swedish free-text nursing narratives from intensive care. This creates a framework for characterising and comparing clinical text and lays the groundwork for developing clinical language technologies. Methods: Our material included daily nursing narratives from one intensive care unit in Finland and one in Sweden. Inclusion criteria for patients were an inpatient period of at least five days and an age of at least 16 years. We performed a comparative analysis as part of a collaborative effort between Finnish- and Swedish-speaking healthcare and language technology professionals that included both qualitative and quantitative aspects. The qualitative analysis addressed the content and structure of three average-sized health records from each country. In the quantitative analysis 514 Finnish and 379 Swedish health records were studied using various language technology tools. Results: Although the two languages are not closely related, nursing narratives in Finland and Sweden had many properties in common. Both made use of specialised jargon and their content was very similar. However, many of these characteristics were challenging regarding development of language technology to support producing and using clinical documentation. Conclusions: The way Finnish and Swedish intensive care nursing was documented was not country or language dependent, but shared a common context, principles and structural features and even similar vocabulary elements. Technology solutions are therefore likely to be applicable to a wider range of natural languages, but they need linguistic tailoring.
    Availability: The Finnish and Swedish data can be found at: http://www.dsv.su.se/hexanord/data/

  • 2. Chapman, Wendy W.
    et al.
    Hilert, Dieter
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Skeppstedt, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Chapman, Brian
    Conway, Michael
    Tharp, Melissa
    Mowery, Danielle L.
    Deleger, Louise
    Extending the NegEx Lexicon for Multiple Languages. 2013. In: Proceedings of the 14th World Congress on Medical and Health Informatics / [ed] Christoph Ulrich Lehmann, Elske Ammenwerth, Christian Nøhr, IOS Press, 2013, Vol. 192, 677-681 p. Conference paper (Refereed)
    Abstract [en]

    We translated an existing English negation lexicon (NegEx) to Swedish, French, and German and compared the lexicon on corpora from each language. We observed Zipf’s law for all languages, i.e., a few phrases occur a large number of times, and a large number of phrases occur fewer times. Negation triggers “no” and “not” were common for all languages; however, other triggers varied considerably. The lexicon is available in OWL and RDF format and can be extended to other languages. We discuss the challenges in translating negation triggers to other languages and issues in representing multilingual lexical knowledge.

  • 3.
    Dalianis, Hercules
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Henriksson, Aron
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Weegar, Rebecka
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    HEALTH BANK - A Workbench for Data Science Applications in Healthcare. 2015. In: Industry Track Workshop, CEUR Workshop Proceedings, 2015, Vol. 1381, 1-18 p. Conference paper (Refereed)
    Abstract [en]

    The enormous amounts of data that are generated in the healthcare process and stored in electronic health record (EHR) systems are an underutilized resource that, with the use of data science applications, can be exploited to improve healthcare. To foster the development and use of data science applications in healthcare, there is a fundamental need for access to EHR data, which is typically not readily available to researchers and developers. A relatively rare exception is the large EHR database, the Stockholm EPR Corpus, comprising data from more than two million patients, that has been made available to a limited group of researchers at Stockholm University. Here, we describe a number of data science applications that have been developed using this database, demonstrating the potential reuse of EHR data to support healthcare and public health activities, as well as facilitate medical research. However, in order to realize the full potential of this resource, it needs to be made available to a larger community of researchers, as well as to industry actors. To that end, we envision the provision of an infrastructure around this database called HEALTH BANK – the Swedish Health Record Research Bank. It will function both as a workbench for the development of data science applications and as a data exploration tool, allowing epidemiologists, pharmacologists and other medical researchers to generate and evaluate hypotheses. Aggregated data will be fed into a pipeline for open e-access, while non-aggregated data will be provided to researchers within an ethical permission framework. We believe that HEALTH BANK has the potential to promote a growing industry around the development of data science applications that will ultimately increase the efficiency and effectiveness of healthcare.

  • 4.
    Dalianis, Hercules
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    De-identifying Swedish clinical text - refinement of a gold standard and experiments with Conditional random fields. 2010. In: Journal of Biomedical Semantics, ISSN 2041-1480, Vol. 1:6. Article in journal (Refereed)
    Abstract [en]

    Background

    In order to perform research on the information contained in Electronic Patient Records (EPRs), access to the data itself is needed. This is often very difficult due to confidentiality regulations. The data sets need to be fully de-identified before they can be distributed to researchers. De-identification is a difficult task where the definitions of annotation classes are not self-evident.

    Results

    We present work on the creation of two refined variants of a manually annotated Gold standard for de-identification, one created automatically, and one created through discussions among the annotators. The data is a subset from the Stockholm EPR Corpus, a data set available within our research group. These are used for the training and evaluation of an automatic system based on the Conditional Random Fields algorithm. Evaluating with four-fold cross-validation on sets of around 4-6 000 annotation instances, we obtained very promising results for both Gold Standards: F-score around 0.80 for a number of experiments, with higher results for certain annotation classes. Moreover, 49 false positives that were verified true positives were found by the system but missed by the annotators.

    Conclusions

    Our intention is to make this Gold standard, The Stockholm EPR PHI Corpus, available to other research groups in the future. Despite being slightly more time-consuming we believe the manual consensus gold standard is the most valuable for further research. We also propose a set of annotation classes to be used for similar de-identification tasks.

  • 5.
    Dalianis, Hercules
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    How Certain are Clinical Assessments?: Annotating Swedish Clinical Text for (Un)certainties, Speculations and Negations. 2010. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation, LREC 2010 / [ed] Nicoletta Calzolari, 2010, 3071-3075 p. Conference paper (Other academic)
    Abstract [en]

    Clinical texts contain a large amount of information. Some of this information is embedded in contexts where e.g. a patient status is reasoned about, which may lead to a considerable amount of statements that indicate uncertainty and speculation. We believe that distinguishing such instances from factual statements will be very beneficial for automatic information extraction. We have annotated a subset of the Stockholm Electronic Patient Record Corpus for certain and uncertain expressions as well as speculative and negation keywords, with the purpose of creating a resource for the development of automatic detection of speculative language in Swedish clinical text. We have analyzed the results from the initial annotation trial by means of pairwise Inter-Annotator Agreement (IAA) measured with F-score. Our main findings are that IAA results for certain expressions and negations are very high, but for uncertain expressions and speculative keywords results are less encouraging. These instances need to be defined in more detail. With this annotation trial, we have created an important resource that can be used to further analyze the properties of speculative language in Swedish clinical text. Our intention is to release this subset to other research groups in the future after removing identifiable information.

  • 6.
    Grigonyte, Gintare
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Henriksson, Aron
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Swedification patterns of Latin and Greek affixes in clinical text. 2016. In: Nordic Journal of Linguistics, ISSN 0332-5865, E-ISSN 1502-4717, Vol. 39, no 1, 5-37 p. Article in journal (Refereed)
    Abstract [en]

    Swedish medical language is rich with Latin and Greek terminology which has undergone a Swedification since the 1980s. However, many original expressions are still used by clinical professionals. The goal of this study is to obtain precise quantitative measures of how the foreign terminology is manifested in Swedish clinical text. To this end, we explore the use of Latin and Greek affixes in Swedish medical texts in three genres: clinical text, scientific medical text and online medical information for laypersons. More specifically, we use frequency lists derived from tokenised Swedish medical corpora in the three domains, and extract word pairs belonging to types that display both the original and Swedified spellings. We describe six distinct patterns explaining the variation in the usage of Latin and Greek affixes in clinical text. The results show that to a large extent affixes in clinical text are Swedified and that prefixes are used more conservatively than suffixes.

  • 7.
    Grigonyté, Gintaré
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institutet, Sweden.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Improving Readability of Swedish Electronic Health Records through Lexical Simplification: First Results. 2014. In: Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), Stroudsburg, USA: Association for Computational Linguistics, 2014, 74-83 p. Conference paper (Refereed)
    Abstract [en]

    This paper describes part of an ongoing effort to improve the readability of Swedish electronic health records (EHRs). An EHR contains systematic documentation of a single patient’s medical history across time, entered by healthcare professionals with the purpose of enabling safe and informed care. Linguistically, medical records exemplify a highly specialised domain, which can be superficially characterised as having telegraphic sentences involving displaced or missing words, abundant abbreviations, spelling variations including misspellings, and terminology. We report results on lexical simplification of Swedish EHRs, by which we mean detecting the unknown, out-of-dictionary words and trying to resolve them either as compounded known words, abbreviations or misspellings.

  • 8.
    Grigonyté, Gintaré
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institute, Sweden.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Spelling Variation of Latin and Greek words in Swedish Medical Text. 2014. Conference paper (Refereed)
  • 9.
    Isenius, Niklas
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Initial Results in the Development of SCAN: a Swedish Clinical Abbreviation Normalizer. 2012. In: CLEFeHealth 2012: The CLEF 2012 Workshop on Cross-Language Evaluation of Methods, Applications, and Resources for eHealth Document Analysis / [ed] Hanna Suominen, Canberra, Australia: NICTA, National ICT Australia and The Australian National University, 2012. Conference paper (Refereed)
    Abstract [en]

    Abbreviations are common in clinical documentation, as this type of text is written under time-pressure and serves mostly for internal communication. This study attempts to apply and extend existing rule-based algorithms that have been developed for English and Swedish abbreviation detection, in order to create an abbreviation detection algorithm for Swedish clinical texts that can identify and suggest definitions for abbreviations and acronyms. This can be used as a pre-processing step for further information extraction and text mining models, as well as for readability solutions.

    Through a literature review, a number of heuristics were defined for automatic abbreviation detection. These were used in the construction of the Swedish Clinical Abbreviation Normalizer (SCAN). The heuristics were: a) freely available external resources: a dictionary of general Swedish, a dictionary of medical terms and a dictionary of known Swedish medical abbreviations, b) maximum word lengths (from three to eight characters), and c) heuristics for handling common patterns such as hyphenation. For each token in the text, the algorithm checks whether it is a known word in one of the lexicons, and whether it fulfills the criteria for word length and the created heuristics. The final algorithm was evaluated on a set of 300 Swedish clinical notes from an emergency department at the Karolinska University Hospital, Stockholm. These notes were annotated for abbreviations, a total of 2,050 tokens. This set was annotated by a physician accustomed to reading and writing medical records.

    The algorithm was tested in different variants, where the word lists were modified, heuristics adapted to characteristics found in the texts, and different combinations of word lengths. The best performing version of the algorithm achieved an F-Measure score of 79%, with 76% recall and 81% precision, which is a considerable improvement over the baseline where each token was only matched against the word lists (51% F-measure, 87% recall, 36% precision). Not surprisingly, precision results are higher when the maximum word length is set to the lowest (three), and recall results higher when it is set to the highest (eight).

    Algorithms for rule-based systems, mainly developed for English, can be successfully adapted for abbreviation detection in Swedish medical records. System performance relies heavily on the quality of the external resources, as well as on the created heuristics. In order to improve results, part-of-speech information and/or local context is needed for disambiguation. In the case of Swedish, compounding also needs to be handled.
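
    The per-token check described in this abstract can be illustrated with a minimal sketch. It is not the SCAN implementation: the function name, the toy word lists, the order in which the checks are combined, and the treatment of hyphenated tokens are all assumptions made for illustration. Only the ingredients (lexicon lookup, a maximum word length between three and eight characters, and a hyphenation heuristic) come from the abstract.

```python
def is_abbreviation_candidate(token, known_words, known_abbreviations, max_len=8):
    """Return True if `token` looks like an abbreviation (illustrative heuristic only).

    Assumed combination of the checks named in the abstract:
    known abbreviation list, general/medical word lists, maximum length,
    and a simple rule for hyphenated tokens.
    """
    t = token.lower().rstrip(".")
    # Ignore empty tokens and tokens without letters (numbers, punctuation).
    if not t or not any(c.isalpha() for c in t):
        return False
    # A token found in the abbreviation dictionary is accepted directly.
    if t in known_abbreviations:
        return True
    # A token found in the word lists is treated as an ordinary word.
    if t in known_words:
        return False
    # Hypothetical hyphenation rule: a hyphenated token whose parts are all
    # known words is treated as an ordinary compound.
    parts = [p for p in t.split("-") if p]
    if len(parts) > 1 and all(p in known_words for p in parts):
        return False
    # Remaining unknown tokens count as candidates only if they are short.
    return len(t) <= max_len


# Toy lexicons (hypothetical, not the resources used in the study).
KNOWN_WORDS = {"patient", "har", "feber"}
KNOWN_ABBREVIATIONS = {"pat"}

candidates = [
    tok for tok in ["Pat.", "har", "feber", "ua", "12"]
    if is_abbreviation_candidate(tok, KNOWN_WORDS, KNOWN_ABBREVIATIONS)
]
```

    Lowering `max_len` makes the final length check stricter, which is consistent with the reported trade-off of higher precision at length three and higher recall at length eight.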

  • 10. Kelly, Liadh
    et al.
    Goeuriot, Lorraine
    Suominen, Hanna
    Schreck, Tobias
    Leroy, Gondy
    Mowery, Danielle
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Chapman, Wendy W.
    Martinez, David
    Zuccon, Guido
    Palotti, João
    Overview of the ShARe/CLEF eHealth Evaluation Lab 2014. 2014. In: Information Access Evaluation. Multilinguality, Multimodality, and Interaction: 5th International Conference of the CLEF Initiative, CLEF 2014, Sheffield, UK, September 15-18, 2014. Proceedings / [ed] Evangelos Kanoulas, Cham: Springer, 2014, Vol. 8685, 172-191 p. Conference paper (Refereed)
  • 11.
    Kvist, Maria
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    SCAN: a Swedish Clinical Abbreviation Normalizer: further Development and Adaptation to Radiology. 2014. In: Information Access Evaluation. Multilinguality, Multimodality, and Interaction: 5th International Conference of the CLEF Initiative, CLEF 2014, Sheffield, UK, September 15-18, 2014. Proceedings, Cham: Springer, 2014, 62-73 p. Conference paper (Refereed)
  • 12.
    Kvist, Maria
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institutet, Sweden.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Professional Language in Swedish Radiology Reports – Characterization for Patient-Adapted Text Simplification. 2013. In: Scandinavian Conference on Health Informatics 2013 / [ed] Gustav Bellika et al., Linköping: Linköping University Electronic Press, 2013, 55-59 p. Conference paper (Refereed)
    Abstract [en]

    In health care, there is a need for patient adaption of clinical text, so that patients can understand their own health records. As a base for construction of automated text simplification tools, characterization of the clinical language is needed. We describe a corpus of 0.43 mill. radiology reports from a University Hospital, characterize it quantitatively and perform a qualitative content analysis. The results show that a limited set of words and phrases are recurrent in the reports and can be used for exchange to more easy-to-read vocabulary. Semantic categories such as body parts, findings, procedures, and administrative information can be used in the simplification process. This study investigates the potentials and the pitfalls for text simplification of medical Swedish into general Swedish for laymen.

  • 13. Lövestam, Elin
    et al.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Abbreviations in Swedish Clinical Text - use by three professions. 2014. In: Studies in Health Technology and Informatics, ISSN 1879-8365, Vol. 205, 720-724 p. Article in journal (Refereed)
  • 14. Mowery, Danielle L.
    et al.
    South, Brett R.
    Christensen, Lee
    Leng, Jianwei
    Peltonen, Laura-Maria
    Salantera, Sanna
    Suominen, Hanna
    Martinez, David
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Elhadad, Noemie
    Savova, Guergana
    Pradhan, Sameer
    Chapman, Wendy W.
    Normalizing acronyms and abbreviations to aid patient understanding of clinical texts: ShARe/CLEF eHealth Challenge 2013, Task 2. 2016. In: Journal of Biomedical Semantics, ISSN 2041-1480, E-ISSN 2041-1480, Vol. 7, 43. Article in journal (Refereed)
    Abstract [en]

    Background: The ShARe/CLEF eHealth challenge lab aims to stimulate development of natural language processing and information retrieval technologies to aid patients in understanding their clinical reports. In clinical text, acronyms and abbreviations, also referenced as short forms, can be difficult for patients to understand. For one of three shared tasks in 2013 (Task 2), we generated a reference standard of clinical short forms normalized to the Unified Medical Language System. This reference standard can be used to improve patient understanding by linking to web sources with lay descriptions of annotated short forms or by substituting short forms with a more simplified, lay term. Methods: In this study, we evaluate 1) accuracy of participating systems' normalizing short forms compared to a majority sense baseline approach, 2) performance of participants' systems for short forms with variable majority sense distributions, and 3) report the accuracy of participating systems' normalizing shared normalized concepts between the test set and the Consumer Health Vocabulary, a vocabulary of lay medical terms. Results: The best systems submitted by the five participating teams performed with accuracies ranging from 43 to 72 %. A majority sense baseline approach achieved the second best performance. The performance of participating systems for normalizing short forms with two or more senses with low ambiguity (majority sense greater than 80 %) ranged from 52 to 78 % accuracy, with two or more senses with moderate ambiguity (majority sense between 50 and 80 %) ranged from 23 to 57 % accuracy, and with two or more senses with high ambiguity (majority sense less than 50 %) ranged from 2 to 45 % accuracy. With respect to the ShARe test set, 69 % of short form annotations contained common concept unique identifiers with the Consumer Health Vocabulary. For these 2594 possible annotations, the performance of participating systems ranged from 50 to 75 % accuracy. 
    Conclusion: Short form normalization continues to be a challenging problem. Short form normalization systems perform with moderate to reasonable accuracies. The Consumer Health Vocabulary could enrich its knowledge base with missed concept unique identifiers from the ShARe test set to further support patient understanding of unfamiliar medical terms.

  • 15. Mowery, Danielle L.
    et al.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Chapman, Wendy W.
    Medical diagnosis lost in translation – Analysis of uncertainty and negation expressions in English and Swedish clinical texts. 2012. In: BioNLP '12: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing, Stroudsburg, USA: Association for Computational Linguistics, 2012, 56-64 p. Conference paper (Refereed)
    Abstract [en]

    In the English clinical and biomedical text domains, negation and certainty usage are two well-studied phenomena. However, few studies have made an in-depth characterization of uncertainties expressed in a clinical setting, and compared this between different annotation efforts. This preliminary, qualitative study attempts to 1) create a clinical uncertainty and negation taxonomy, 2) develop a translation map to convert annotation labels from an English schema into a Swedish schema, and 3) characterize and compare two data sets using this taxonomy. We define a clinical uncertainty and negation taxonomy and a translation map for converting annotation labels between two schemas and report observed similarities and differences between the two data sets.

  • 16. Mowery, Danielle
    et al.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    South, Brett R.
    Christensen, Lee
    Martinez, David
    Kelly, Liadh
    Goeuriot, Lorraine
    Elhadad, Noemie
    Pradhan, Sameer
    Savova, Guergana
    Chapman, Wendy W.
    Task 2: ShARe/CLEF eHealth Evaluation Lab 2014. 2014. In: CLEFeHealth eHealth Evaluation Lab 2014, WISU Verlag Aachen, 2014. Conference paper (Refereed)
  • 17. Mowery, Danielle
    et al.
    Wiebe, Janyce
    Ross, Mindy
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Meystre, Stephane
    Chapman, Wendy
    Generating Patient Problem Lists from the ShARe Corpus using SNOMED CT/SNOMED CT CORE Problem List. 2014. In: Proceedings of BioNLP 2014, Stroudsburg: Association for Computational Linguistics, 2014, 54-58 p. Conference paper (Refereed)
  • 18. Suominen, Hanna
    et al.
    Salanterä, Sanna
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Chapman, Wendy W.
    Savova, Guergana
    Elhadad, Noemie
    Pradhan, Sameer
    South, Brett R.
    Mowery, Danielle L.
    Jones, Gareth J.F.
    Leveling, Johannes
    Kelly, Liadh
    Goeuriot, Lorraine
    Martinez, David
    Zuccon, Guido
    Overview of the ShARe/CLEF eHealth Evaluation Lab 2013. 2013. In: Information Access Evaluation. Multilinguality, Multimodality, and Visualization: Proceedings / [ed] Forner, P., Müller, H., Paredes, R., Rosso, P., Stein, B., Springer Berlin/Heidelberg, 2013, 212-231 p. Conference paper (Refereed)
    Abstract [en]

    Discharge summaries and other free-text reports in healthcare transfer information between working shifts and geographic locations. Patients are likely to have difficulties in understanding their content, because of their medical jargon, non-standard abbreviations, and ward-specific idioms. This paper reports on an evaluation lab with an aim to support the continuum of care by developing methods and resources that make clinical reports in English easier to understand for patients, and which helps them in finding information related to their condition. This ShARe/CLEFeHealth2013 lab offered student mentoring and shared tasks: identification and normalisation of disorders (1a and 1b) and normalisation of abbreviations and acronyms (2) in clinical reports with respect to terminology standards in healthcare as well as information retrieval (3) to address questions patients may have when reading clinical reports. The focus on patients’ information needs as opposed to the specialised information needs of physicians and other healthcare workers was the main feature of the lab distinguishing it from previous shared tasks. De-identified clinical reports for the three tasks were from US intensive care and originated from the MIMIC II database. Other text documents for Task 3 were from the Internet and originated from the Khresmoi project. Task 1 annotations originated from the ShARe annotations. For Tasks 2 and 3, new annotations, queries, and relevance assessments were created. 64, 56, and 55 people registered their interest in Tasks 1, 2, and 3, respectively. 34 unique teams (3 members per team on average) participated with 22, 17, 5, and 9 teams in Tasks 1a, 1b, 2 and 3, respectively. The teams were from Australia, China, France, India, Ireland, Republic of Korea, Spain, UK, and USA. Some teams developed and used additional annotations, but this strategy contributed to the system performance only in Task 2. 
    The best systems had the F1 score of 0.75 in Task 1a; Accuracies of 0.59 and 0.72 in Tasks 1b and 2; and Precision at 10 of 0.52 in Task 3. The results demonstrate the substantial community interest and capabilities of these systems in making clinical reports easier to understand for patients. The organisers have made data and tools available for future research and development.

  • 19.
    Svee, Eric-Oluf
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institute, Sweden.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Capturing and Representing Values for Requirements of Personal Health Records. 2013. In: PoEM Short Papers: Short Paper Proceedings of the 6th IFIP WG 8.1 Working Conference on the Practice of Enterprise Modeling (PoEM 2013) / [ed] Janis Grabis, Marite Kirikova, Jelena Zdravkovic, Janis Stirna, 2013, 166-175 p. Conference paper (Refereed)
    Abstract [en]

    Patients’ access to their medical records in the form of Personal Health Records (PHRs) is a central part of the ongoing shift in health policy, where patient empowerment is in focus. A survey was conducted to gauge patients' requirements, as stakeholders, for PHR functionality. From this survey, models from goal-oriented requirements engineering were created to express the values and preferences patients hold with regard to PHRs. The present study concludes that patient values can be extracted from survey data, allowing the incorporation of values in the common workflow of requirements engineering without extensive reworking.

  • 20.
    Tanushi, Hideyuki
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Dalianis, Hercules
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Duneld, Martin
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska University Hospital.
    Skeppstedt, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Negation Scope Delimitation in Clinical Text Using Three Approaches: NegEx, PyConTextNLP and SynNeg. 2013. In: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013) / [ed] Stephan Oepen, Kristin Hagen, Janne Bondi Johannessen, Linköping: Linköping University Electronic Press, 2013, 387-474 p. Conference paper (Refereed)
    Abstract [en]

    Negation detection is a key component in clinical information extraction systems, as health record text contains reasoning in which the physician excludes different diagnoses by negating them. Many systems for negation detection rely on negation cues (e.g. not), but only a few studies have investigated whether the syntactic structure of the sentences can be used for determining the scope of these cues. In this paper we compare three systems for negation detection in Swedish clinical text (NegEx, PyConTextNLP and SynNeg), which take different approaches to determining the scope of negation cues. NegEx uses the distance between the cue and the disease, PyConTextNLP relies on a list of conjunctions limiting the scope of a cue, and in SynNeg the boundaries of the sentence units, provided by a syntactic parser, limit the scope of the cues. The three systems produced similar results, detecting negation with an F-score of around 80%, but using a parser had advantages when handling longer, complex sentences or short sentences with contradictory statements.
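
    The first two scope strategies described in this abstract can be illustrated with a minimal sketch. It is not the actual NegEx or PyConTextNLP code: the cue list, the conjunction list, the window size and the function names are all hypothetical, and English tokens stand in for Swedish clinical text. The parser-based strategy is omitted because it would need a syntactic parser.

```python
import re

# Hypothetical, heavily simplified lexicons (not the real system lexicons).
NEGATION_CUES = {"no", "not", "without"}
SCOPE_TERMINATORS = {"but", "however", "although"}


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def negated_by_distance(tokens, target, window=5):
    """Distance-based scope: negated if a cue occurs within `window` tokens before the target."""
    idx = tokens.index(target)
    return any(tok in NEGATION_CUES for tok in tokens[max(0, idx - window):idx])


def negated_by_conjunction(tokens, target):
    """Conjunction-limited scope: a cue's scope runs forward until a terminating conjunction."""
    in_scope = False
    for tok in tokens:
        if tok in NEGATION_CUES:
            in_scope = True
        elif tok in SCOPE_TERMINATORS:
            in_scope = False
        elif tok == target:
            return in_scope
    return False


tokens = tokenize("No pneumonia but probable asthma")
print(negated_by_distance(tokens, "asthma"))     # the cue is within the window
print(negated_by_conjunction(tokens, "asthma"))  # "but" ends the cue's scope
```

    In this toy sentence the distance-based rule marks "asthma" as negated, while the conjunction-limited rule does not, which is the kind of contradictory short sentence the abstract mentions.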

  • 21.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Shades of Certainty: Annotation and Classification of Swedish Medical Records. 2012. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Access to information is fundamental in health care. This thesis presents research on Swedish medical records with the overall goal of building intelligent information access tools that can aid health personnel, researchers and other professions in their daily work, and, ultimately, improve health care in general.

    The issue of ethics and identifiable information is addressed by creating an annotated gold standard corpus and porting an existing de-identification system to Swedish from English. The aim is to move towards making textual resources available to researchers without risking exposure of patients’ confidential information. Results for the rule-based system are not encouraging, but results for the gold standard are fairly high.

    Affirmed, uncertain and negated information needs to be distinguished when building accurate information extraction tools. Annotation models are created, with the aim of building automated systems. One model distinguishes certain and uncertain sentences, and is applied on medical records from several clinical departments. In a second model, two polarities and three levels of certainty are applied on diagnostic statements from an emergency department. Overall results are promising. Differences are seen depending on clinical practice, annotation task and level of domain expertise among the annotators.

    Using annotated resources for automatic classification is studied. Encouraging overall results using local context information are obtained. The fine-grained certainty levels are used for building classifiers for real-world e-health scenarios.

    This thesis contributes two annotation models of certainty and one of identifiable information, applied on Swedish medical records. A deeper understanding of the language use linked to conveying certainty levels is gained. Three annotated resources that can be used for further research have been created, and implications for automated systems are presented.

  • 22.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Automatic Classification of Factuality Levels: A Case Study on Swedish Diagnoses and the Impact of Local Context. 2011. In: The Fourth International Symposium on Languages in Biology and Medicine, Singapore, 2011. Conference paper (Refereed)
    Abstract [en]

    Clinicians express different levels of knowledge certainty when reasoning about a patient’s status. Automatic extraction of relevant information is crucial in the clinical setting, which means that factuality levels need to be distinguished. We present an automatic classifier using Conditional Random Fields, which is trained and tested on a Swedish clinical corpus annotated for factuality levels at a diagnosis statement level: the Stockholm EPR Diagnosis-Factuality Corpus. The classifier obtains promising results (best overall results are 0.699 average F-measure using all classes, 0.762 F-measure using merged classes), using simple local context features. Preceding context is more useful than posterior, although best results are obtained using a window size of +/-4. Lower levels of certainty are more problematic than higher levels, which was also the case for the human annotators in creating the corpus. A manual error analysis shows that conjunctions and other higher-level features are common sources of errors.
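
    As an illustration of what "simple local context features" with a window of +/-4 might look like, the sketch below builds a feature dictionary for one token from its neighbours. The feature names and the function are assumptions made for illustration; the abstract does not specify the feature templates, and the example tokens are English rather than Swedish.

```python
def window_features(tokens, index, size=4):
    """Collect the lowercased words within `size` positions of tokens[index].

    Hypothetical feature template: one feature per relative offset.
    """
    features = {"word": tokens[index].lower()}
    for offset in range(-size, size + 1):
        if offset == 0:
            continue
        pos = index + offset
        if 0 <= pos < len(tokens):  # skip positions outside the sentence
            features[f"word[{offset:+d}]"] = tokens[pos].lower()
    return features


tokens = ["Patient", "has", "probable", "angina", "pectoris"]
print(window_features(tokens, 3, size=2))
```

    Dictionaries of this shape, one per token, are a common input format for sequence labellers such as Conditional Random Fields; setting the preceding or following half of the window to zero would allow the kind of comparison between preceding and posterior context that the abstract reports.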

  • 23.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Towards A Better Understanding of Uncertainties and Speculations in Swedish Clinical Text – Analysis of an Initial Annotation Trial. 2010. In: Proceedings of the Workshop on Negation and Speculation in Natural Language Processing, University of Antwerpen, 2010, 14-22 p. Conference paper (Other academic)
    Abstract [en]

    In view of the increasing need to facilitate processing the content of scientific papers, we present an annotation scheme for annotating full papers with zones of conceptualisation, reflecting the information structure and knowledge types which constitute a scientific investigation. The latter are the Core Scientific Concepts (CoreSCs) and include Hypothesis, Motivation, Goal, Object, Background, Method, Experiment, Model, Observation, Result and Conclusion. The CoreSC scheme has been used to annotate a corpus of 265 full papers in physical chemistry and biochemistry and we are currently automating the recognition of CoreSCs in papers. We discuss how the CoreSC scheme relates to other views of scientific papers and indeed how the former could be used to help identify negation and speculation in scientific texts.

  • 24.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Dalianis, Hercules
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Hassel, Martin
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Nilsson, Gunnar
    Developing a standard for de-identifying electronic patient records written in Swedish: precision, recall and F-measure in a manual and computerized annotation trial. 2009. In: International Journal of Medical Informatics, ISSN 1386-5056, E-ISSN 1872-8243, Vol. 78, no 12, e19-e26 p. Article in journal (Refereed)
    Abstract [en]

    Background

    Electronic patient records (EPRs) contain a large amount of information written in free text. This information is considered very valuable for research but is also very sensitive since the free text parts may contain information that could reveal the identity of a patient. Therefore, methods for de-identifying EPRs are needed. The work presented here aims to perform a manual and automatic Protected Health Information (PHI)-annotation trial for EPRs written in Swedish.

    Methods

    This study consists of two main parts: the initial creation of a manually PHI-annotated gold standard, and the porting and evaluation of an existing de-identification software written for American English to Swedish in a preliminary automatic de-identification trial. Results are measured with precision, recall and F-measure.

    Results

    This study reports fairly high Inter-Annotator Agreement (IAA) results on the manually created gold standard, especially for specific tags such as names. The average IAA over all tags was 0.65 F-measure (0.84 F-measure highest pairwise agreement). For name tags the average IAA was 0.80 F-measure (0.91 F-measure highest pairwise agreement). Porting a de-identification software written for American English to Swedish directly was unfortunately non-trivial, yielding poor results.

    Conclusion

    Developing gold standard sets as well as automatic systems for de-identification tasks in Swedish is feasible. However, discussions about and definitions of identifiable information are needed, as well as further development of both the tag sets and the annotation guidelines, in order to get a reliable gold standard. Completely new de-identification software needs to be developed.
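
    Inter-annotator agreement reported as an F-measure can be computed by treating one annotator as the reference and the other as the response. The sketch below assumes exact matching of (start, end, tag) spans; the matching criterion, the function name and the tag names in the example are assumptions, since the abstract does not state them.

```python
def pairwise_f_measure(reference, response):
    """F-measure between two annotators' span sets, using exact span-and-tag matches."""
    ref, res = set(reference), set(response)
    if not ref or not res:
        return 0.0
    matches = len(ref & res)
    precision = matches / len(res)  # share of response spans found in the reference
    recall = matches / len(ref)     # share of reference spans found in the response
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: (start offset, end offset, tag)
annotator_a = [(0, 4, "First_Name"), (10, 18, "Location"), (25, 35, "Date")]
annotator_b = [(0, 4, "First_Name"), (10, 18, "Health_Care_Unit")]
print(pairwise_f_measure(annotator_a, annotator_b))
```

    Because the F-measure is symmetric in precision and recall, swapping the two annotators gives the same score, which is why it is convenient for pairwise agreement.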

  • 25.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Dalianis, Hercules
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Factuality Levels of Diagnoses in Swedish Clinical Text. 2011. In: User Centred Networked Health Care - Proceedings of MIE 2011 / [ed] Anne Moen, Stig Kjær Andersen, Jos Aarts, Petter Hurlen, 2011, 559-563 p. Conference paper (Refereed)
    Abstract [en]

    Different levels of knowledge certainty, or factuality levels, are expressed in clinical health record documentation. This information is currently not fully exploited, as the subtleties expressed in natural language cannot easily be machine analyzed. Extracting relevant information from knowledge-intensive resources such as electronic health records can be used for improving health care in general by e.g. building automated information access systems. We present an annotation model of six factuality levels linked to diagnoses in Swedish clinical assessments from an emergency ward. Our main findings are that overall agreement is fairly high (0.7/0.58 F-measure, 0.73/0.6 Cohen's κ, Intra/Inter). These distinctions are important for knowledge models, since only approx. 50% of the diagnoses are affirmed with certainty. Moreover, our results indicate that there are patterns inherent in the diagnosis expressions themselves conveying factuality levels, showing that certainty is not only dependent on context cues.
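
    Cohen's κ, reported here alongside the F-measure, corrects observed agreement for the agreement expected by chance: κ = (p_o - p_e) / (1 - p_e). A minimal sketch follows; the label names in the example are hypothetical and do not reproduce the six factuality levels of the annotation model.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the two annotators' marginal label proportions.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


first_pass = ["certain", "certain", "probable", "possible"]
second_pass = ["certain", "probable", "probable", "possible"]
print(round(cohens_kappa(first_pass, second_pass), 3))
```

    The same function serves for intra-annotator agreement (one annotator, two passes) and inter-annotator agreement (two annotators), depending on which label sequences are passed in.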

  • 26.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Duneld, Martin
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Henriksson, Aron
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Skeppstedt, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Dalianis, Hercules
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Louhi 2014: Special issue on health text mining and information analysis: introduction. 2015. In: BMC Medical Informatics and Decision Making, ISSN 1472-6947, E-ISSN 1472-6947, Vol. 2, no SI, 1-3 p. Article in journal (Refereed)
  • 27.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Duneld, Martin
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Henriksson, Aron
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Skeppstedt, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Dalianis, Hercules
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Louhi 2014: Special issue on health text mining and information analysis. 2015. Conference proceedings (editor) (Refereed)
  • 28.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Hassel, Martin
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Dalianis, Hercules
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Finding the Parallel: Automatic Dictionary Construction and Identification of Parallel Text Pairs. 2010. In: Using Corpora in Contrastive and Translation Studies / [ed] Richard Xiao, Newcastle: Cambridge Scholars Publishing, 2010. Chapter in book (Other academic)
  • 29.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Ibrahim, Omran
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institute, Sweden.
    Functions for personal health records in Sweden – patient perspectives. 2013. In: Scandinavian Conference on Health Informatics 2013: Copenhagen, Denmark, August 20, 2013 / [ed] Gustav Bellika et al., Linköping: Linköping University Press, 2013, 95-95 p. Conference paper (Refereed)
    Abstract [en]

    As part of the ongoing shift in health policy, with focus on patient empowerment, the Swedish government prioritizes the patients’ access to their medical records. Different models for personal health records (PHR) are suggested.

    Studies have shown that patients have difficulty navigating and understanding the information in their records. Electronic health record systems are physician-oriented and do not include patient-oriented functions. One problem with medical records is that they contain a lot of data, usually kept as unstructured text in narrative form; this information overload needs to be structured and presented in a manner that patients understand. Furthermore, for the PHR to be a supporting tool for patients, the key functions that should be implemented need to be identified. Usage of PHRs is highly dependent on the information offered and on whether the available functions meet patient needs. In Sweden, little research has been conducted regarding PHR functions preferred by patients. This study addresses the research question “Which PHR functions are preferred by patients living in Sweden?”.

  • 30.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Fine-grained Certainty Level Annotations Used for Coarser-grained E-health Scenarios: Certainty Classification of Diagnostic Statements in Swedish Clinical Text. 2012. In: Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II / [ed] Alexander Gelbukh, Berlin/Heidelberg: Springer Berlin/Heidelberg, 2012, 450-461 p. Conference paper (Refereed)
    Abstract [en]

    An important task in information access methods is distinguishing factual information from speculative or negated information. Fine-grained certainty levels of diagnostic statements in Swedish clinical text are annotated in a corpus from a medical university hospital. The annotation model has two polarities (positive and negative) and three certainty levels. However, there are many e-health scenarios where such fine-grained certainty levels are not practical for information extraction. Instead, more coarse-grained groups are needed. We present three scenarios: adverse event surveillance, decision support alerts and automatic summaries, and collapse the fine-grained certainty level classifications into coarser-grained groups. We build automatic classifiers for each scenario and analyze the results quantitatively. Annotation discrepancies are analyzed qualitatively through manual corpus analysis. Our main findings are that it is feasible to use a corpus of fine-grained certainty level annotations to build classifiers for coarser-grained real-world scenarios: 0.89, 0.91 and 0.8 F-score (overall average).
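
    Collapsing a fine-grained label set into coarser groups amounts to a mapping from each (polarity, certainty) pair to a group label. The sketch below is only an illustration: the certainty level names, the grouping names and the groupings themselves are hypothetical and are not the mappings used for the three scenarios in the paper.

```python
# Two polarities and three certainty levels give six fine-grained classes.
# The certainty level names are assumed; the abstract does not name them.
POLARITIES = ("positive", "negative")
CERTAINTY_LEVELS = ("certain", "probable", "possible")
FINE_LABELS = [(p, c) for p in POLARITIES for c in CERTAINTY_LEVELS]


def collapse(label, grouping):
    """Map a fine-grained (polarity, certainty) label to a coarser group.

    Both groupings are hypothetical examples, not the paper's scenarios.
    """
    polarity, certainty = label
    if grouping == "polarity_only":
        return polarity
    if grouping == "certain_vs_uncertain":
        return "certain" if certainty == "certain" else "uncertain"
    raise ValueError(f"unknown grouping: {grouping}")


for label in FINE_LABELS:
    print(label, "->", collapse(label, "certain_vs_uncertain"))
```

    A classifier for a coarse-grained scenario can then be trained on the collapsed labels without re-annotating the corpus.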

  • 31.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. University of Utah, USA.
    Mowery, Danielle L.
    Abdelrahman, Samir
    Christensen, Lee
    Chapman, Wendy W.
    BluLab: Temporal Information Extraction for the 2015 Clinical TempEval Challenge. 2015. In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), Association for Computational Linguistics, 2015, 815-819 p. Conference paper (Refereed)
    Abstract [en]

    The 2015 Clinical TempEval Challenge addressed the problem of temporal reasoning in the clinical domain by providing an annotated corpus of pathology and clinical notes related to colon cancer patients. The challenge consisted of six subtasks: TIMEX3 and event span detection, TIMEX3 and event attribute classification, document relation time and narrative container relation classification. Our BluLab team participated in all six subtasks. For the TIMEX3 and event subtasks, we developed a ClearTK support vector machine pipeline using mainly simple lexical features along with information from rule-based systems. For the relation subtasks, we employed a conditional random fields classification approach, with input from a rule-based system for the narrative container relation subtask. Our team ranked first for all TIMEX3 and event subtasks, as well as for the document relation subtask.

  • 32.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Mowery, Danielle L
    South, Brett R.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Dalianis, Hercules
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis. 2015. In: IMIA Yearbook of Medical Informatics, ISSN 0943-4747, Vol. 10, 183-193 p. Article in journal (Refereed)
  • 33.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Skeppstedt, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institutet, Sweden.
    Mowery, Danielle
    Chapman, Brian E.
    Dalianis, Hercules
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Chapman, Wendy W.
    Cue-based assertion classification for Swedish clinical text - Developing a lexicon for pyConTextSwe. 2014. In: Artificial Intelligence in Medicine, ISSN 0933-3657, E-ISSN 1873-2860, Vol. 61, no 3, 137-144 p. Article in journal (Refereed)
    Abstract [en]

    Objective: The ability of a cue-based system to accurately assert whether a disorder is affirmed, negated, or uncertain is dependent, in part, on its cue lexicon. In this paper, we continue our study of porting an assertion system (pyConTextNLP) from English to Swedish (pyConTextSwe) by creating an optimized assertion lexicon for clinical Swedish. Methods and material: We integrated cues from four external lexicons, along with generated inflections and combinations. We used subsets of a clinical corpus in Swedish. We applied four assertion classes (definite existence, probable existence, probable negated existence and definite negated existence) and two binary classes (existence yes/no and uncertainty yes/no) to pyConTextSwe. We compared pyConTextSwe's performance with and without the added cues on a development set, and improved the lexicon further after an error analysis. On a separate evaluation set, we calculated the system's final performance. Results: Following integration steps, we added 454 cues to pyConTextSwe. The optimized lexicon developed after an error analysis resulted in statistically significant improvements on the development set (83% F-score, overall). The system's final F-score on the evaluation set was 81% (overall). For the individual assertion classes, F-score results were 88% (definite existence), 81% (probable existence), 55% (probable negated existence), and 63% (definite negated existence). For the binary classifications existence yes/no and uncertainty yes/no, final system performance was 97%/87% and 78%/86% F-score, respectively. Conclusions: We have successfully ported pyConTextNLP to Swedish (pyConTextSwe). We have created an extensive and useful assertion lexicon for Swedish clinical text, which could form a valuable resource for similar studies, and which is publicly available.

  • 34.
    Velupillai, Sumithra
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Weegar, Rebecka
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Temporal Annotation of Swedish Intensive Care Notes. 2016. Conference paper (Refereed)
    Abstract [en]

    We describe the creation of a corpus of Swedish intensive care unit (ICU) notes annotated for temporal expressions. Clinical notes from an ICU in Stockholm, Sweden were used. The HeidelTime system was adapted to develop Swedish clinical time expression (TIMEX3) resources. Overall micro-average Inter-Annotator Agreement is high (86% F1). We have created Swedish lexical resources with clinically specific time expressions that will be useful for the development of a Swedish clinical text temporal reasoning system.
