Evaluating Pretraining Strategies for Clinical BERT Models
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0003-0165-9926
Number of Authors: 3
2022 (English)
In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), European Language Resources Association, 2022, p. 410-416
Conference paper, Published paper (Refereed)
Abstract [en]

Research suggests that using generic language models in specialized domains may be sub-optimal due to significant domain differences. As a result, various strategies for developing domain-specific language models have been proposed, including techniques for adapting an existing generic language model to the target domain, e.g. through various forms of vocabulary modifications and continued domain-adaptive pretraining with in-domain data. Here, an empirical investigation is carried out in which various strategies for adapting a generic language model to the clinical domain are compared to pretraining a pure clinical language model. Three clinical language models for Swedish, pretrained for up to ten epochs, are fine-tuned and evaluated on several downstream tasks in the clinical domain. A comparison of the language models’ downstream performance over the training epochs is conducted. The results show that the domain-specific language models outperform a general-domain language model, although there is little difference in performance between the various clinical language models. However, compared to pretraining a pure clinical language model with only in-domain data, leveraging and adapting an existing general-domain language model requires fewer epochs of pretraining with in-domain data.
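The abstract describes continued domain-adaptive pretraining, i.e. taking an existing general-domain language model and continuing masked language model pretraining on in-domain clinical text before fine-tuning. The sketch below illustrates what such a setup might look like with the Hugging Face transformers library; the starting checkpoint, corpus path, and hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# A minimal sketch of continued domain-adaptive pretraining via masked language
# modelling. Checkpoint name, corpus file, and hyperparameters are assumptions
# for illustration only.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# Start from an existing general-domain Swedish checkpoint (example choice).
checkpoint = "KB/bert-base-swedish-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# In-domain clinical text, one document per line (hypothetical path).
dataset = load_dataset("text", data_files={"train": "clinical_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Standard MLM objective: randomly mask 15% of the input tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="clinical-bert-dapt",
    num_train_epochs=10,              # the paper pretrains for up to ten epochs
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    save_strategy="epoch",            # keep one checkpoint per epoch for later fine-tuning
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```

Saving a checkpoint per epoch mirrors the paper's evaluation protocol, in which models pretrained for different numbers of epochs are each fine-tuned on the downstream clinical tasks and compared.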

Place, publisher, year, edition, pages
European Language Resources Association, 2022. p. 410-416
Keywords [en]
language models, domain-adaptive pretraining, Swedish clinical text
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-207397
OAI: oai:DiVA.org:su-207397
DiVA, id: diva2:1683487
Conference
Conference on Language Resources and Evaluation (LREC 2022), 21-23 June 2022, Marseille, France.
Available from: 2022-07-15 Created: 2022-07-15 Last updated: 2025-02-07
Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Link to the publication

Authority records

Lamproudis, Anastasios; Henriksson, Aron; Dalianis, Hercules
