Inferring the location of authors from words in their texts
Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics. ORCID iD: 0000-0002-6027-4156
Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics. ORCID iD: 0000-0002-0840-1357
2015 (English) In: Proceedings of the 20th Nordic Conference of Computational Linguistics: NODALIDA 2015 / [ed] Beáta Megyesi, Linköping: Linköping University Electronic Press, ACL Anthology, 2015, 211-218 p. Conference paper (Refereed)
Abstract [en]

For the purposes of computational dialectology or other geographically bound text analysis tasks, texts must be annotated with their own location or that of their authors. Many texts are locatable, but most have no explicit annotation of place. This paper describes a series of experiments to determine how positionally annotated microblog posts can be used to learn location-indicating words, which can then be used to locate blog texts and their authors. A Gaussian distribution is used to model the locational qualities of words. We introduce the notion of placeness to describe how locational words are.

We find that three choices together give the most useful results: modelling word distributions to account for several locations, and thus several Gaussian distributions per word; defining a filter which picks out words with high placeness based on their local distributional context; and aggregating locational information in a centroid for each text. The results are applied to data in the Swedish language.
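
As a rough illustration of the pipeline the abstract outlines (learn per-word location models from geotagged posts, keep only words with high placeness, aggregate into a centroid per text), a minimal sketch might look as follows. It is not the authors' implementation. The function names, the thresholds, the use of a single isotropic Gaussian per word (the abstract reports that several Gaussians per word work better), the treatment of latitude and longitude as planar coordinates, and the definition of placeness as inverse variance are all assumptions made for this example.

```python
from collections import defaultdict


def fit_word_gaussians(posts):
    """Fit one isotropic Gaussian per word (a simplifying assumption).

    posts: iterable of (tokens, (lat, lon)) pairs.
    Returns a dict: word -> (mean_lat, mean_lon, variance, count).
    """
    coords = defaultdict(list)
    for tokens, latlon in posts:
        for word in set(tokens):  # count each word once per post
            coords[word].append(latlon)

    models = {}
    for word, pts in coords.items():
        n = len(pts)
        mean_lat = sum(p[0] for p in pts) / n
        mean_lon = sum(p[1] for p in pts) / n
        # Mean squared distance from the mean, in squared degrees.
        var = sum((p[0] - mean_lat) ** 2 + (p[1] - mean_lon) ** 2 for p in pts) / n
        models[word] = (mean_lat, mean_lon, var, n)
    return models


def placeness(model, eps=1e-6):
    """Hypothetical placeness score: a narrow spread gives a high score."""
    _, _, var, _ = model
    return 1.0 / (var + eps)


def locate_text(tokens, models, min_count=2, min_placeness=1.0):
    """Placeness-weighted centroid of the words that pass the filter.

    Returns (lat, lon), or None if no word in the text passes.
    """
    total = lat = lon = 0.0
    for word in tokens:
        model = models.get(word)
        if model is None or model[3] < min_count:
            continue  # unseen or too rare to trust
        weight = placeness(model)
        if weight < min_placeness:
            continue  # too geographically diffuse to be informative
        lat += weight * model[0]
        lon += weight * model[1]
        total += weight
    if total == 0.0:
        return None
    return (lat / total, lon / total)


# Toy data (invented for illustration only).
posts = [
    (["malmö", "och"], (55.6, 13.0)),
    (["malmö", "idag"], (55.6, 13.0)),
    (["och", "idag"], (59.3, 18.1)),
    (["och"], (65.6, 22.1)),
]
models = fit_word_gaussians(posts)
estimate = locate_text(["malmö", "och", "idag"], models)
```

In this toy data only "malmö" is concentrated enough to pass the filter, so the estimate is driven by that word alone, while widely spread words such as "och" are ignored.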

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, ACL Anthology, 2015. 211-218 p.
Series
Linköping Electronic Conference Proceedings, ISSN 1650-3638 ; 109
National Category
General Language Studies and Linguistics
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:su:diva-127529
ISBN: 978-91-7519-098-3
OAI: oai:DiVA.org:su-127529
DiVA: diva2:909564
Conference
20th Nordic Conference of Computational Linguistics (NODALIDA 2015), Vilnius, Lithuania, May 11–13, 2015
Projects
SINUS (Spridning av innovationer i nutida svenska)
Funder
Swedish Research Council
Available from: 2016-03-07 Created: 2016-03-07 Last updated: 2016-07-08 Bibliographically approved

By author/editor
Karlgren, Jussi; Östling, Robert; Parkvall, Mikael