Inferring the location of authors from words in their texts
Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics. ORCID iD: 0000-0002-6027-4156
Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics. ORCID iD: 0000-0002-0840-1357
2015 (English). In: Proceedings of the 20th Nordic Conference of Computational Linguistics: NODALIDA 2015 / [ed] Beáta Megyesi, Linköping: Linköping University Electronic Press, ACL Anthology, 2015, 211-218 p.
Conference paper, Published paper (Refereed)
Abstract [en]

For the purposes of computational dialectology or other geographically bound text analysis tasks, texts must be annotated with their own location or that of their authors. Many texts are locatable, but most have no explicit annotation of place. This paper describes a series of experiments to determine how positionally annotated microblog posts can be used to learn location-indicating words, which can then be used to locate blog texts and their authors. A Gaussian distribution is used to model the locational qualities of words. We introduce the notion of placeness to describe how locational words are.

We find that modelling word distributions to account for several locations and thus several Gaussian distributions per word, defining a filter which picks out words with high placeness based on their local distributional context, and aggregating locational information in a centroid for each text gives the most useful results. The results are applied to data in the Swedish language.
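
As an illustration of the general idea only, the following is a minimal sketch and not the authors' implementation. It makes several simplifying assumptions that the abstract does not state: each word gets a single isotropic Gaussian (the paper reports that several Gaussians per word work better), latitude and longitude are treated as planar coordinates, and the placeness score is a hypothetical stand-in defined as the inverse of a word's spatial variance. The function names and thresholds are invented for the example.

```python
from collections import defaultdict


def fit_word_gaussians(posts):
    """Fit one isotropic Gaussian per word from geotagged posts.

    posts: iterable of (tokens, (lat, lon)).
    Returns word -> (mean_lat, mean_lon, variance, count).
    Assumption: a single Gaussian per word, coordinates treated as planar.
    """
    coords = defaultdict(list)
    for tokens, (lat, lon) in posts:
        for word in set(tokens):  # count each word once per post
            coords[word].append((lat, lon))

    models = {}
    for word, pts in coords.items():
        n = len(pts)
        mean_lat = sum(p[0] for p in pts) / n
        mean_lon = sum(p[1] for p in pts) / n
        var = sum((p[0] - mean_lat) ** 2 + (p[1] - mean_lon) ** 2 for p in pts) / n
        models[word] = (mean_lat, mean_lon, var, n)
    return models


def placeness(model, eps=1e-6):
    """Hypothetical placeness score: inverse of the word's spatial variance."""
    _, _, var, _ = model
    return 1.0 / (var + eps)


def locate_text(tokens, models, min_placeness=1.0, min_count=2):
    """Estimate a text's location as a placeness-weighted centroid.

    Words that are unseen, too rare, or below the placeness threshold
    are filtered out. Returns (lat, lon), or None if no word qualifies.
    """
    total = lat_sum = lon_sum = 0.0
    for word in tokens:
        model = models.get(word)
        if model is None or model[3] < min_count:
            continue
        score = placeness(model)
        if score < min_placeness:
            continue
        lat_sum += score * model[0]
        lon_sum += score * model[1]
        total += score
    if total == 0.0:
        return None
    return (lat_sum / total, lon_sum / total)
```

In this sketch a word that occurs all over the map gets a large variance and therefore a low score, so the filter drops it before the centroid is computed. The variance is in squared degrees and the thresholds are arbitrary; a real system would need a proper distance measure and tuned cut-offs.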

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, ACL Anthology, 2015. 211-218 p.
Series
Linköping Electronic Conference Proceedings, ISSN 1650-3638 ; 109
National Category
General Language Studies and Linguistics
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:su:diva-127529
ISBN: 978-91-7519-098-3 (print)
OAI: oai:DiVA.org:su-127529
DiVA: diva2:909564
Conference
20th Nordic Conference of Computational Linguistics (NODALIDA 2015), Vilnius, Lithuania, May 11–13, 2015
Projects
SINUS (Spridning av innovationer i nutida svenska)
Funder
Swedish Research Council
Available from: 2016-03-07. Created: 2016-03-07. Last updated: 2016-11-18. Bibliographically approved.

By author/editor
Karlgren, Jussi; Östling, Robert; Parkvall, Mikael