FoodSafeSum: Enabling Natural Language Processing Applications for Food Safety Document Summarization and Analysis
Athens University of Economics and Business, Athens, Greece.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.ORCID iD: 0000-0002-7938-2747
University of Pisa, Pisa, Italy.
University of Pisa, Pisa, Italy.
Number of Authors: 9
2025 (English). In: Findings of the Association for Computational Linguistics: EMNLP 2025 / [ed] Christos Christodoulopoulos; Tanmoy Chakraborty; Carolyn Rose; Violet Peng, Association for Computational Linguistics, 2025, p. 16786-16804
Conference paper, Published paper (Refereed)
Abstract [en]

Food safety demands timely detection, regulation, and public communication, yet the lack of structured datasets hinders Natural Language Processing (NLP) research. We present and release a new dataset of human-written and Large Language Model (LLM)-generated summaries of food safety documents, plus food safety related metadata. We evaluate its utility on three NLP tasks directly reflecting food safety practices: multilabel classification for organizing documents into domain-specific categories; document retrieval for accessing regulatory and scientific evidence; and question answering via retrieval-augmented generation that improves factual accuracy. We show that LLM summaries perform comparably or better than human ones across tasks. We also demonstrate clustering of summaries for event tracking and compliance monitoring. This dataset enables NLP applications that support core food safety practices, including the organization of regulatory and scientific evidence, monitoring of compliance issues, and communication of risks to the public.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2025. p. 16786-16804
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-250605
DOI: 10.18653/v1/2025.findings-emnlp.911
Scopus ID: 2-s2.0-105028992879
ISBN: 979-8-89176-335-7 (electronic)
OAI: oai:DiVA.org:su-250605
DiVA, id: diva2:2023275
Conference
Conference on Empirical Methods in Natural Language Processing (EMNLP), November 2025, Suzhou, China.
Available from: 2025-12-18 Created: 2025-12-18 Last updated: 2026-02-10
Bibliographically approved

Open Access in DiVA

No full text in DiVA


Authority records

Randl, Korbinian Robert; Henriksson, Aron

Search in DiVA

By author/editor
Randl, Korbinian Robert; Henriksson, Aron; Pavlopoulos, Ioannis
By organisation
Department of Computer and Systems Sciences
Natural Language Processing
