Stagger: an Open-Source Part of Speech Tagger for Swedish
Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
ORCID iD: 0000-0002-6027-4156
2013 (English). In: Northern European Journal of Language Technology (NEJLT), ISSN 2000-1533, Vol. 3, p. 1-18.
Article in journal (Refereed), Published
Abstract [en]

This work presents Stagger, a new open-source part of speech tagger for Swedish based on the Averaged Perceptron. By using the SALDO morphological lexicon and semi-supervised learning in the form of Collobert and Weston embeddings, it reaches an accuracy of 96.4% on the standard Stockholm-Umeå Corpus dataset, making it the best single part of speech tagging system reported for Swedish. Accuracy increases to 96.6% on the latest version of the corpus, where the annotation has been revised to increase consistency. Stagger is also evaluated on a new corpus of Swedish blog posts, investigating its out-of-domain performance.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2013. Vol. 3, 1-18 p.
Keyword [en]
PoS tagging, part of speech tagging, Swedish, neural language models
National Category
Language Technology (Computational Linguistics)
Research subject
Linguistics; Computational Linguistics
URN: urn:nbn:se:su:diva-94806
DOI: 10.3384/nejlt.2000-1533.1331
OAI: diva2:656074
Available from: 2013-10-14
Created: 2013-10-14
Last updated: 2013-10-15
Bibliographically approved

Open Access in DiVA

stagger.pdf (1122 kB), 652 downloads
File information
File name: FULLTEXT01.pdf
File size: 1122 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf

Author
Östling, Robert