Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction
Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics. ORCID iD: 0000-0003-4040-3544
Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics. ORCID iD: 0000-0002-6027-4156
Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics. ORCID iD: 0000-0002-9447-8544
2018 (English). In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018) / [ed] Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga, European Language Resources Association, 2018, p. 817-824
Conference paper, Published paper (Refereed)
Abstract [en]

This paper describes an approach to identifying speakers and addressees in dialogues extracted from literary fiction, along with a dataset annotated for speaker and addressee. The overall purpose is to annotate dialogue interaction between characters in literary corpora, allowing for enriched search facilities and the construction of social networks from the corpora. To predict speakers and addressees in a dialogue, we use a sequence labeling approach applied to a given set of characters. We use features relating to the current dialogue, the preceding narrative, and the complete preceding context. The results indicate that even with a small amount of training data, it is possible to build a fairly accurate classifier for speaker and addressee identification across different authors, though identifying addressees is the more difficult task.

Place, publisher, year, edition, pages
European Language Resources Association, 2018. p. 817-824
Keywords [en]
literary corpora, speaker identification, addressee identification, quote attribution
National Category
General Language Studies and Linguistics; Natural Language Processing
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:su:diva-154260
ISBN: 979-10-95546-00-9 (print)
OAI: oai:DiVA.org:su-154260
DiVA, id: diva2:1192159
Conference
Language Resources and Evaluation Conference, Miyazaki, Japan, 7–12 May, 2018
Funder
Swedish Research Council, 821-2013-2003
Available from: 2018-03-21 Created: 2018-03-21 Last updated: 2025-02-01 Bibliographically approved

Open Access in DiVA

fulltext (157 kB), 2133 downloads
File information
File name: FULLTEXT01.pdf
File size: 157 kB
Checksum SHA-512: 8f8c0fe426ae9f17c1a5531f4dc4602fc492f4c90b5bc5a6b21a55edd0d73956266693e20d30642492185ea335f5a68b9ebba82aba75d45b157933fc937c428c
Type: fulltext
Mimetype: application/pdf

Authority records

Ek, Adam; Wirén, Mats; Östling, Robert; Nilsson Björkenstam, Kristina; Grigonytė, Gintarė; Gustafson Capková, Sofia
