Preserving the Privacy of Language Models: Experiments in Clinical NLP
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0001-8988-8226
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

State-of-the-art methods in natural language processing (NLP) increasingly rely on large pre-trained language models. The strength of these models stems from their large number of parameters and the enormous amounts of data used to train them. The datasets are of a scale that makes it difficult, if not impossible, to audit them manually. When unwieldy amounts of potentially sensitive data are used to train large models, an important problem arises: unwelcome memorization of the training data.

All datasets—including those based on publicly available data—can contain personally identifiable information (PII). When models memorize sensitive data, they become vulnerable to privacy attacks. Very few datasets for NLP can be guaranteed to be free of sensitive data. Consequently, most NLP models are susceptible to privacy leakage. This susceptibility is especially concerning in clinical NLP, where the data typically consist of electronic health records (EHRs). Leaking data from EHRs is never acceptable from a privacy perspective. This doctoral thesis investigates the privacy risks of using sensitive data and how these risks can be mitigated while maintaining the data's utility for training.

A BERT model pre-trained using clinical data is subjected to a training data extraction attack. The same model is used to evaluate a membership inference attack that has been proposed to quantify the privacy risks of masked language models. Multiple experiments assess the performance gains from adapting pre-trained models to the clinical domain. Then, the impact of automatic de-identification on the performance of BERT models is evaluated for both pre-training and fine-tuning data. The final experiments of the thesis explore how synthetic training corpora can be generated while limiting the use of sensitive data, and working under computational constraints. The quality of these corpora, and the factors affecting their utility, are explored by training and evaluating BERT models.

The results show that domain adaptation leads to significantly better performance on clinical NLP tasks. They also show that extracting training data from BERT models is difficult and suggest that the risks can be further decreased by automatically de-identifying the training data. Automatic de-identification is found to preserve the utility of the data used for pre-training and fine-tuning BERT models. However, we also find that contemporary membership inference attacks are unable to quantify the privacy benefits of this technique. Similarly, high-quality synthetic corpora can be generated using limited resources, but further research is needed to determine the privacy gains from using them. The results show that automatic de-identification and training data synthesis reduce the privacy risks of using sensitive data for NLP while preserving the utility of the data. However, these benefits are difficult to quantify, and there are no rigorous methods for comparing different privacy-preserving techniques.

Abstract [sv]

The research front in natural language processing relies heavily on large pre-trained language models. Their strength comes from their large number of parameters and the enormous amounts of data used to train them. Their training datasets are so large that it is difficult, if not impossible, to audit them manually. When unwieldy amounts of potentially sensitive data are used to train large models, a troublesome problem arises: unwanted memorization of the training data.

All datasets, including publicly available ones, can contain personally identifiable information (PII). When models memorize such sensitive data, they become vulnerable to various privacy attacks. Very few datasets can be guaranteed to be free of PII. Consequently, most NLP models are also vulnerable to attacks. These vulnerabilities are especially troubling when NLP is applied in the medical domain, where the training data often consist of patient records: data that must never leak. This doctoral thesis investigates the privacy risks that come from using sensitive data and how these risks can be addressed without affecting the utility of the data.

A BERT model trained on patient records is subjected to a training data extraction attack. The same model and data are subjected to a membership inference attack, which has previously been proposed as a method for assessing the privacy risks of masked language models. Several experiments examine the benefits of domain-adapting models with medical data. Further experiments then examine whether automatically de-identified data are suitable for pre-training and fine-tuning language models. The final experiments of the thesis explore how the use of sensitive data can be limited when producing synthetic training data. The quality of these data, and the factors that affect their utility, are assessed by training and evaluating BERT models.

The results clearly show that domain adaptation leads to better-performing models for medical applications. They also show that the risk of training data being extracted from BERT models is small, and that the remaining risks can be limited further by automatically de-identifying the models' training data. Automatic de-identification is also shown to preserve the utility of the datasets when they are used to pre-train and fine-tune BERT models. However, it turns out to be difficult to quantify the privacy gains of this method, and membership inference attacks do not measure the benefits of this privacy-preserving technique. The experiments with synthetic data show that high-quality synthetic data can be produced even with sparing use of sensitive data and limited computational capacity. The thesis shows that automatic de-identification and data synthesis can reduce the risks that come from using sensitive data, while the data retain their utility, but that reliable methods for measuring and comparing different privacy-preserving techniques are lacking.

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2025. p. 126
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 26-001
Keywords [en]
natural language processing, privacy, membership inference, training data extraction, automatic de-identification, synthetic data, named entity recognition, domain adaptation, large language models
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-250015
ISBN: 978-91-8107-462-8 (print)
ISBN: 978-91-8107-463-5 (electronic)
OAI: oai:DiVA.org:su-250015
DiVA, id: diva2:2017117
Public defence
2026-01-13, Lilla hörsalen, NOD-huset, Borgarfjordsgatan 12, Kista, 13:30 (English)
Available from: 2025-12-17. Created: 2025-11-27. Last updated: 2025-12-10. Bibliographically approved.
List of papers
1. Are Clinical BERT Models Privacy Preserving? The Difficulty of Extracting Patient-Condition Associations
2021 (English). In: Proceedings of the AAAI 2021 Fall Symposium on Human Partnership with Medical AI: Design, Operationalization, and Ethics (AAAI-HUMAN 2021) / [ed] Thomas E. Doyle; Aisling Kelliher; Reza Samavi; Barbara Barry; Steven Yule; Sarah Parker; Michael Noseworthy; Qian Yang, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

Language models may be trained on data that contain personal information, such as clinical data. Such sensitive data must not leak for privacy reasons. This article explores whether BERT models trained on clinical data are susceptible to training data extraction attacks. Multiple large sets of sentences generated from the model with top-k sampling and nucleus sampling are studied. The sentences are examined to determine the degree to which they contain information associating patients with their conditions. The sentence sets are then compared to determine if there is a correlation between the degree of privacy leaked and the linguistic quality attained by each generation technique. We find that the relationship between linguistic quality and privacy leakage is weak and that the risk of a successful training data extraction attack on a BERT-based model is small.
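The two decoding strategies named in the abstract, top-k sampling and nucleus (top-p) sampling, are probability-filtering steps applied to a model's next-token distribution before drawing a token. The sketch below is a minimal, model-free illustration; the dictionary of probabilities is a toy stand-in for a language model's output, and the function names are our own:

```python
def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}

def nucleus_filter(probs, p):
    """Keep the smallest set of most-probable tokens whose cumulative
    probability reaches p, then renormalize."""
    kept, cum = {}, 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = pr
        cum += pr
        if cum >= p:
            break
    total = sum(kept.values())
    return {tok: pr / total for tok, pr in kept.items()}
```

Sampling from the filtered distribution instead of the full one trades diversity for fluency; the paper compares how this trade-off interacts with how much memorized training data the generated sentences reveal.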

Series
CEUR Workshop Proceedings, E-ISSN 1613-0073 ; 3068
Keywords
nlp, natural language processing, privacy preserving machine learning, language models, transformers, natural language generation
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-201651 (URN)
Conference
AAAI 2021 Fall Symposium on Human Partnership with Medical AI: Design, Operationalization, and Ethics (AAAI-HUMAN 2021), Virtual Event, November 4-6, 2021
Available from: 2022-01-31. Created: 2022-01-31. Last updated: 2025-11-27. Bibliographically approved.
2. Using Membership Inference Attacks to Evaluate Privacy-Preserving Language Modeling Fails for Pseudonymizing Data
2023 (English). In: 24th Nordic Conference on Computational Linguistics (NoDaLiDa), 2023, p. 318-323. Conference paper, Published paper (Refereed)
Abstract [en]

Large pre-trained language models dominate the current state-of-the-art for many natural language processing applications, including the field of clinical NLP. Several studies have found that these models can be susceptible to privacy attacks, which is unacceptable in the clinical domain, where personally identifiable information (PII) must not be exposed.

However, there is no consensus regarding how to quantify the privacy risks of different models. One prominent suggestion is to quantify these risks using membership inference attacks. In this study, we show that a state-of-the-art membership inference attack on a clinical BERT model fails to detect the privacy benefits from pseudonymizing data. This suggests that such attacks may be inadequate for evaluating token-level privacy preservation of PIIs.
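In its simplest form, a membership inference attack of the kind evaluated here scores each sample (for example by the model's loss on it) and predicts "member" when the score falls below a threshold, since low loss can indicate memorization. The sketch below is a hypothetical illustration of that decision rule, not the specific attack used in the paper:

```python
def loss_threshold_attack(losses, threshold):
    """Predict membership per sample: a low loss suggests the model
    saw (and partly memorized) the sample during training."""
    return [loss < threshold for loss in losses]

def attack_accuracy(member_losses, nonmember_losses, threshold):
    """Fraction of correct member/non-member decisions at a threshold."""
    correct = sum(loss_threshold_attack(member_losses, threshold))
    correct += sum(not m for m in loss_threshold_attack(nonmember_losses, threshold))
    return correct / (len(member_losses) + len(nonmember_losses))
```

An accuracy near 0.5 means the attacker does no better than chance. The paper's finding is that such attacks fail to register the privacy gained by pseudonymizing the training data, which limits their usefulness as an evaluation tool.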

Series
Northern European Association for Language Technology (NEALT), ISSN 1736-8197, E-ISSN 1736-6305 ; 52
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-216681 (URN)
Conference
Nordic Conference on Computational Linguistics
Available from: 2023-04-24. Created: 2023-04-24. Last updated: 2025-11-27. Bibliographically approved.
3. SweClinEval: A Benchmark for Swedish Clinical Natural Language Processing
2025 (English). In: Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), 2025, p. 767-775. Conference paper, Published paper (Refereed)
Abstract [en]

The lack of benchmarks in certain domains and for certain languages makes it difficult to track progress regarding the state-of-the-art of NLP in those areas, potentially impeding progress in important, specialized domains. Here, we introduce the first Swedish benchmark for clinical NLP: SweClinEval. The first iteration of the benchmark consists of six clinical NLP tasks, encompassing both document-level classification and named entity recognition tasks, with real clinical data. We evaluate nine different encoder models, both Swedish and multilingual. The results show that domain-adapted models outperform generic models on sequence-level classification tasks, while certain larger generic models outperform the clinical models on named entity recognition tasks. We describe how the benchmark can be managed despite limited possibilities to share sensitive clinical data, and discuss plans for extending the benchmark in future iterations.

Series
NEALT Proceedings Series, ISSN 1736-8197, E-ISSN 1736-6305
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-240589 (URN)
978-9908-53-109-0 (ISBN)
Conference
The Joint Nordic Conference on Computational Linguistics and Baltic Conference on Human Language Technologies, 2-5 March 2025, Tallinn, Estonia.
Available from: 2025-03-10. Created: 2025-03-10. Last updated: 2025-11-27. Bibliographically approved.
4. Downstream Task Performance of BERT Models Pre-Trained Using Automatically De-Identified Clinical Data
2022 (English). In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), European Language Resources Association, 2022, p. 4245-4252. Conference paper, Published paper (Refereed)
Abstract [en]

Automatic de-identification is a cost-effective and straightforward way of removing personally identifiable information from large and sensitive corpora. However, these systems also introduce errors into the data due to their imperfect precision. These corruptions of the data may negatively impact the utility of the de-identified dataset. This paper de-identifies a very large Swedish clinical corpus either by removing entire sentences containing sensitive data or by replacing sensitive words with realistic surrogates. These two datasets are used to perform domain adaptation of a general Swedish BERT model. The impact of the de-identification techniques is assessed by training and evaluating the models on six clinical downstream tasks. The results are then compared to a similar BERT model domain-adapted using an unaltered version of the clinical corpus. The results show that using an automatically de-identified corpus for domain adaptation does not negatively impact downstream performance. We argue that automatic de-identification is an efficient way of reducing the privacy risks of domain-adapted models and that the models created in this paper should be safe to distribute to other academic researchers.
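Of the two de-identification strategies compared, the first, dropping every sentence in which sensitive data is detected, amounts to a simple filter over the corpus. A minimal sketch, where `detect_pii` is a stand-in for a trained de-identification model rather than anything from the paper:

```python
def filter_sensitive_sentences(sentences, detect_pii):
    """Keep only sentences in which the de-identifier finds no PII.
    `detect_pii` is assumed to return True when a sentence contains
    sensitive data; here it stands in for a NER-based de-identifier."""
    return [s for s in sentences if not detect_pii(s)]
```

The trade-off the paper measures is that filtering discards whole sentences, and thus more context, than surrogate replacement does, yet neither strategy is found to hurt downstream performance.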

Place, publisher, year, edition, pages
European Language Resources Association, 2022
Keywords
Privacy-preserving machine learning, pseudonymization, de-identification, Swedish clinical text, pre-trained language models, BERT, downstream tasks, NER, multi-label classification
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-207395 (URN)
Conference
Conference on Language Resources and Evaluation (LREC 2022), Marseille, France, 21-23 June 2022
Available from: 2022-07-15. Created: 2022-07-15. Last updated: 2025-11-27.
5. End-to-End Pseudonymization of Fine-Tuned Clinical BERT Models: Privacy Preservation with Maintained Data Utility
2024 (English). In: BMC Medical Informatics and Decision Making, E-ISSN 1472-6947, article id 162. Article in journal (Refereed), Published
Abstract [en]

Many state-of-the-art results in natural language processing (NLP) rely on large pre-trained language models (PLMs). These models contain large numbers of parameters that are tuned using vast amounts of training data. These factors cause the models to memorize parts of their training data, making them vulnerable to various privacy attacks. This is cause for concern, especially when these models are applied in the clinical domain, where data are very sensitive.

One privacy-preserving technique that aims to mitigate these problems is training data pseudonymization. This technique automatically identifies and replaces sensitive entities with realistic but non-sensitive surrogates. Pseudonymization has yielded promising results in previous studies. However, no previous study has applied pseudonymization to both the pre-training data of PLMs and the fine-tuning data used to solve clinical NLP tasks.

This study evaluates the predictive performance effects of end-to-end pseudonymization of clinical BERT models on five clinical NLP tasks, compared to pre-training and fine-tuning on unaltered sensitive data. A large number of statistical tests are performed, revealing minimal harm to performance when using pseudonymized fine-tuning data. The results also show no deterioration from end-to-end pseudonymization of pre-training and fine-tuning data. These results demonstrate that pseudonymizing training data to reduce privacy risks can be done without harming data utility for training PLMs.
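Pseudonymization, as described above, replaces detected PII spans with realistic but non-sensitive surrogates. The following is a schematic sketch of that substitution step; the surrogate values, the `(start, end, label)` span format, and the function name are illustrative assumptions, not the thesis pipeline:

```python
# Illustrative surrogate values; a real pseudonymizer draws from
# larger pools so that replacements stay realistic and varied.
SURROGATES = {"NAME": "Alex Andersson", "DATE": "2020-01-01"}

def pseudonymize(text, spans):
    """Replace annotated PII spans (start, end, label) with surrogates.
    The spans are assumed to come from a NER-based de-identifier."""
    out, last = [], 0
    for start, end, label in sorted(spans):
        out.append(text[last:start])
        out.append(SURROGATES.get(label, "[REMOVED]"))
        last = end
    out.append(text[last:])
    return "".join(out)
```

Because the surrogate is grammatically and semantically plausible, the surrounding sentence remains useful as training data, which is why this technique can preserve utility while removing the sensitive values themselves.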

Keywords
Natural language processing, language models, BERT, electronic health records, clinical text, de-identification, pseudonymization, privacy preservation, Swedish
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-232099 (URN)
10.1186/s12911-024-02546-8 (DOI)
38915012 (PubMedID)
2-s2.0-85196757461 (Scopus ID)
Available from: 2024-07-24. Created: 2024-07-24. Last updated: 2025-11-27. Bibliographically approved.
6. Data-Constrained Synthesis of Training Data for De-Identification
2025 (English). In: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) / [ed] Wanxiang Che; Joyce Nabende; Ekaterina Shutova; Mohammad Taher Pilehvar, Association for Computational Linguistics, 2025, p. 27414-27427. Conference paper, Published paper (Refereed)
Abstract [en]

Many sensitive domains — such as the clinical domain — lack widely available datasets due to privacy risks. The increasing generative capabilities of large language models (LLMs) have made synthetic datasets a viable path forward. In this study, we domain-adapt LLMs to the clinical domain and generate synthetic clinical texts that are machine-annotated with tags for personally identifiable information using capable encoder-based NER models. The synthetic corpora are then used to train synthetic NER models. The results show that training NER models using synthetic corpora incurs only a small drop in predictive performance. The limits of this process are investigated in a systematic ablation study — using both Swedish and Spanish data. Our analysis shows that smaller datasets can be sufficient for domain-adapting LLMs for data synthesis. Instead, the effectiveness of this process is almost entirely contingent on the performance of the machine-annotating NER models trained using the original data.
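The pipeline described above hinges on machine-annotating generated text with a NER model trained on the original data, turning raw synthetic sentences into labeled training examples. The sketch below mimics that step with a toy dictionary tagger producing BIO labels; the gazetteer lookup is a deliberately simple stand-in for a real encoder-based NER model, and all names here are hypothetical:

```python
def machine_annotate(tokens, gazetteer):
    """Assign a BIO tag to each token via a lookup-based stand-in for a
    trained NER model; tokens not in the gazetteer are tagged 'O'."""
    return [f"B-{gazetteer[t.lower()]}" if t.lower() in gazetteer else "O"
            for t in tokens]

def build_training_pairs(synthetic_sentences, gazetteer):
    """Turn synthetic sentences into (tokens, tags) pairs suitable for
    training a downstream NER model."""
    return [(s.split(), machine_annotate(s.split(), gazetteer))
            for s in synthetic_sentences]
```

The paper's ablation result follows naturally from this structure: since the labels are only as good as the annotator, the quality of the machine-annotating NER model, not the amount of domain-adaptation data for the generator, dominates the utility of the synthetic corpus.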

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2025
Series
Association for Computational Linguistics (ACL). Annual Meeting Conference Proceedings, ISSN 0736-587X
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-246981 (URN)
10.18653/v1/2025.acl-long.1329 (DOI)
979-8-89176-251-0 (ISBN)
Conference
The 63rd Annual Meeting of the Association for Computational Linguistics, 27 July-1 August, 2025, Vienna, Austria.
Available from: 2025-09-15. Created: 2025-09-15. Last updated: 2025-11-27. Bibliographically approved.

Open Access in DiVA

Preserving the Privacy of Language Models: Experiments in Clinical NLP (fulltext, PDF)

Authority records

Vakili, Thomas
