  • 1.
    Bell, Linda
    et al.
    TeliaSonera (R & D).
    Boye, Johan
    TeliaSonera (R & D).
    Gustafson, Joakim
    TeliaSonera (R & D).
    Heldner, Mattias
    TeliaSonera (R & D).
    Lindström, Anders
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    The Swedish NICE Corpus – Spoken dialogues between children and embodied characters in a computer game scenario. 2005. In: Proceedings Interspeech 2005 - Eurospeech: 9th European Conference on Speech Communication and Technology, Lisbon, Portugal: ISCA, 2005, p. 2765-2768. Conference paper (Refereed)
    Abstract [en]

    This article describes the collection and analysis of a Swedish database of spontaneous and unconstrained children-machine dialogues. The Swedish NICE corpus consists of spoken dialogues between children aged 8 to 15 and embodied fairytale characters in a computer game scenario. Compared to previously collected corpora of children's computer-directed speech, the Swedish NICE corpus contains extended interactions, including three-party conversation, in which the young users used spoken dialogue as the primary means of progression in the game.

  • 2.
    Bell, Linda
    et al.
    TeliaSonera (R & D).
    Boye, Johan
    TeliaSonera (R & D).
    Gustafson, Joakim
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Modality Convergence in a Multimodal Dialogue System. 2000. In: Proceedings of Götalog, 2000, p. 29-34. Conference paper (Other academic)
    Abstract [en]

    When designing multimodal dialogue systems allowing speech as well as graphical operations, it is important to understand not only how people make use of the different modalities in their utterances, but also how the system might influence a user's choice of modality by its own behavior. This paper describes an experiment in which subjects interacted with two versions of a simulated multimodal dialogue system. One version used predominantly graphical means when referring to specific objects; the other used predominantly verbal referential expressions. The purpose of the study was to find out what effect, if any, the system's referential strategy had on the user's behavior. The results provided limited support for the hypothesis that the system can influence users to adopt another modality for the purpose of referring.

  • 3.
    Boye, Johan
    et al.
    TeliaSonera (R & D).
    Gustafson, Joakim
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Robust spoken language understanding in a computer game. 2006. In: Speech Communication, ISSN 0167-6393, E-ISSN 1872-7182, Vol. 48, no 3-4, p. 335-353. Article in journal (Refereed)
    Abstract [en]

    We present and evaluate a robust method for the interpretation of spoken input to a conversational computer game. The scenario of the game is that of a player interacting with embodied fairy-tale characters in a 3D world via spoken dialogue (supplemented by graphical pointing actions) to solve various problems. The player himself cannot directly perform actions in the world, but interacts with the fairy-tale characters to have them perform various tasks, and to get information about the world and the problems to solve. Hence the role of spoken dialogue as the primary means of control is obvious and natural to the player. Naturally, this means that robust spoken language understanding becomes a critical component. To this end, the paper describes a semantic representation formalism and an accompanying parsing algorithm which works off the output of the speech recogniser's statistical language model. The evaluation shows that the parser is robust in the sense of considerably improving on the noisy output of the speech recogniser.

  • 4.
    Boye, Johan
    et al.
    TeliaSonera.
    Wirén, Mats
    TeliaSonera.
    Multi-slot semantics for natural-language call routing systems. 2007. In: Proceedings of Bridging the Gap: Academic and Industrial Research in Dialog Technology, 2007, p. 68-75. Conference paper (Refereed)
    Abstract [en]

    Statistical classification techniques for natural-language call routing systems have matured to the point where it is possible to distinguish between several hundreds of semantic categories with an accuracy that is sufficient for commercial deployments. For category sets of this size, the problem of maintaining consistency among manually tagged utterances becomes limiting, as lack of consistency in the training data will degrade performance of the classifier. It is thus essential that the set of categories be structured in a way that alleviates this problem, and enables consistency to be preserved as the domain keeps changing. In this paper, we describe our experiences of using a two-level multi-slot semantics as a way of meeting this problem. Furthermore, we explore the ramifications of the approach with respect to classification, evaluation and dialogue design for call routing systems.

  • 5.
    Boye, Johan
    et al.
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Negotiative Spoken-Dialogue Interfaces to Databases. 2003. In: Proceedings of Diabruck, Wallerfangen, Germany, 2003. Conference paper (Refereed)
    Abstract [en]

    The aim of this paper is to develop a principled and empirically motivated approach to robust, negotiative spoken dialogue with databases. Robustness is achieved by limiting the set of representable utterance types. Still, the vast majority of utterances that occur in practice can be handled.

  • 6.
    Boye, Johan
    et al.
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Robust parsing and spoken negotiative dialogue with databases. 2008. In: Natural Language Engineering, ISSN 1351-3249, E-ISSN 1469-8110, Vol. 14, no 3, p. 289-312. Article in journal (Refereed)
    Abstract [en]

    This paper presents a robust parsing algorithm and semantic formalism for the interpretation of utterances in spoken negotiative dialogue with databases. The algorithm works in two passes: a domain-specific pattern-matching phase and a domain-independent semantic analysis phase. Robustness is achieved by limiting the set of representable utterance types to an empirically motivated subclass which is more expressive than propositional slot–value lists, but much less expressive than first-order logic. Our evaluation shows that in actual practice the vast majority of utterances that occur can be handled, and that the parsing algorithm is highly efficient and accurate.

  • 7.
    Boye, Johan
    et al.
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Robust Parsing of Utterances in Negotiative Dialogue. 2003. In: Proceedings 8th European Conference on Speech Communication and Technology (Eurospeech), Geneva, Switzerland, 2003. Conference paper (Refereed)
    Abstract [en]

    This paper presents an algorithm for domain-dependent parsing of utterances in negotiative dialogue. To represent such utterances, the algorithm outputs semantic expressions that are more expressive than propositional slot-filler structures. It is very fast and robust, yet precise and capable of correctly combining information from different utterance fragments.

  • 8.
    Boye, Johan
    et al.
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Gustafson, Joakim
    TeliaSonera (R & D).
    Contextual reasoning in multimodal dialogue systems: two case studies. 2004. In: Proceedings of The 8th Workshop on the Semantics and Pragmatics of Dialogue Catalogue'04, Barcelona, 2004, p. 19-21. Conference paper (Refereed)
    Abstract [en]

    This paper describes an approach to contextual reasoning for interpretation of spoken multimodal dialogue. The approach is based on combining recency-based search for antecedents with an object-oriented domain representation in such a way that the search is highly constrained by the type information of the antecedents. By furthermore representing candidate antecedents from the dialogue history and visual context in a uniform way, a single machinery (based on β-reduction in lambda calculus) can be used for resolving many kinds of underspecified utterances. The approach has been implemented in two highly different domains.

  • 9.
    Börstell, Carl
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Mesch, Johanna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Gärdenfors, Moa
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Towards an Annotation of Syntactic Structure in the Swedish Sign Language Corpus. 2016. In: Workshop Proceedings: 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining / [ed] Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie Hochgesang, Jette Kristoffersen, Johanna Mesch, Paris: ELRA, 2016, p. 19-24. Conference paper (Refereed)
    Abstract [en]

    This paper describes on-going work on extending the annotation of the Swedish Sign Language Corpus (SSLC) with a level of syntactic structure. The basic annotation of SSLC in ELAN consists of six tiers: four for sign glosses (two tiers for each signer; one for each of a signer’s hands), and two for written Swedish translations (one for each signer). In an additional step by Östling et al. (2015), all glosses of the corpus have been further annotated for parts of speech. Building on the previous steps, we are now developing annotation of clause structure for the corpus, based on meaning and form. We define a clause as a unit in which a predicate asserts something about one or more elements (the arguments). The predicate can be a (possibly serial) verbal or nominal. In addition to predicates and their arguments, criteria for delineating clauses include non-manual features such as body posture, head movement and eye gaze. The goal of this work is to arrive at two additional annotation tier types in the SSLC: one in which the sign language texts are segmented into clauses, and the other in which the individual signs are annotated for their argument types.

  • 10.
    Cap, Fabienne
    et al.
    Adesam, Yvonne
    Ahrenberg, Lars
    Borin, Lars
    Bouma, Gerlof
    Forsberg, Markus
    Kann, Viggo
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Smith, Aaron
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Nivre, Joakim
    SWORD: Towards Cutting-Edge Swedish Word Processing. 2016. In: Proceedings of SLTC 2016, 2016. Conference paper (Refereed)
    Abstract [en]

    Despite many years of research on Swedish language technology, there is still no well-documented standard for Swedish word processing covering the whole spectrum from low-level tokenization to morphological analysis and disambiguation. SWORD is a new initiative within the SWE-CLARIN consortium aiming to develop documented standards for Swedish word processing. In this paper, we report on a pilot study of Swedish tokenization, where we compare the output of six different tokenizers on four different text types. For one text type (Wikipedia articles), we also compare to the tokenization produced by six manual annotators.

  • 11.
    Carter, David
    et al.
    SRI International.
    Rayner, Manny
    SRI International.
    Eklund, Robert
    TeliaSonera (R & D).
    Kaja, Jaan
    TeliaSonera (R & D).
    Lyberg, Bertil
    TeliaSonera (R & D).
    Sautermeister, Per
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Neumeyer, Leonardo
    SRI International.
    Weng, Fuliang
    SRI International.
    Common speech/language issues. 2000. In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 284-294. Chapter in book (Other academic)
  • 12.
    Carter, David
    et al.
    SRI International.
    Rayner, Manny
    SRI International.
    Eklund, Robert
    TeliaSonera (R & D).
    MacDermid, Catriona
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Evaluation. 2000. In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 297-312. Chapter in book (Other academic)
  • 13.
    Dalianis, Hercules
    et al.
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Weegar, Rebecka
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Special Issue of Selected Contributions from the Seventh Swedish Language Technology Conference (SLTC 2018). 2019. Conference proceedings (editor) (Other academic)
    Abstract [en]

    This Special Issue contains three papers that are extended versions of abstracts presented at the Seventh Swedish Language Technology Conference (SLTC 2018), held at Stockholm University 8–9 November 2018. SLTC 2018 received 34 submissions, of which 31 were accepted for presentation. The number of registered participants was 113, including both attendees at SLTC 2018 and two co-located workshops that took place on 7 November. 32 participants were internationally affiliated, of which 14 were from outside the Nordic countries. Overall participation was thus on a par with previous editions of SLTC, but international participation was higher.

  • 14.
    Ek, Adam
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Distinguishing Narration and Speech in Prose Fiction Dialogues. 2019. In: Proceedings of the Digital Humanities in the Nordic Countries 4th Conference / [ed] Costanza Navarretta, Manex Agirrezabal, Bente Maegaard, CEUR-WS.org, 2019, p. 124-132. Conference paper (Refereed)
    Abstract [en]

    This paper presents a supervised method for a novel task, namely, detecting elements of narration in passages of dialogue in prose fiction. The method achieves an F1-score of 80.8%, exceeding the best baseline by almost 33 percentage points. The purpose of the method is to enable a more fine-grained analysis of fictional dialogue than has previously been possible, and to provide a component for the further analysis of narrative structure in general.

  • 15.
    Ek, Adam
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Grigonytė, Gintarė
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustafson Capková, Sofia
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction. 2018. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018) / [ed] Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga, European Language Resources Association, 2018, p. 817-824. Conference paper (Refereed)
    Abstract [en]

    This paper describes an approach to identifying speakers and addressees in dialogues extracted from literary fiction, along with a dataset annotated for speaker and addressee. The overall purpose of this is to provide annotation of dialogue interaction between characters in literary corpora in order to allow for enriched search facilities and construction of social networks from the corpora. To predict speakers and addressees in a dialogue, we use a sequence labeling approach applied to a given set of characters. We use features relating to the current dialogue, the preceding narrative, and the complete preceding context. The results indicate that even with a small amount of training data, it is possible to build a fairly accurate classifier for speaker and addressee identification across different authors, though the identification of addressees is the more difficult task.

  • 16.
    Eklund, Robert
    et al.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Effects of open and directed prompts on filled pauses and utterance production. 2010. In: Proceedings from Fonetik 2010, Lund, June 2–4, 2010 / [ed] Susanne Schötz and Gilbert Ambrazaitis, Lund: Mediatryck, 2010, p. 23-28. Conference paper (Other academic)
    Abstract [en]

    This paper describes an experiment where open and directed prompts were alternated when collecting speech data for the deployment of a call-routing application. The experiment tested whether open and directed prompts resulted in any differences with respect to the filled pauses exhibited by the callers, which is interesting in the light of the “many-options” hypothesis of filled pause production. The experiment also investigated the effects of the prompts on utterance form and meaning of the callers.

  • 17.
    Eklund, Robert
    et al.
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    ”Njutandes av en Monte Christo no 5 och en iskall Mojito”: Observationer om användning av s-particip. 2006. In: Svenskans beskrivning 28, Förhandlingar vid Tjugoåttonde sammankomsten för svenskans beskrivning, 2006, p. 97-108. Conference paper (Refereed)
  • 18.
    Grigonyte, Gintare
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Henriksson, Aron
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Swedification patterns of Latin and Greek affixes in clinical text. 2016. In: Nordic Journal of Linguistics, ISSN 0332-5865, E-ISSN 1502-4717, Vol. 39, no 1, p. 5-37. Article in journal (Refereed)
    Abstract [en]

    Swedish medical language is rich with Latin and Greek terminology which has undergone a Swedification since the 1980s. However, many original expressions are still used by clinical professionals. The goal of this study is to obtain precise quantitative measures of how the foreign terminology is manifested in Swedish clinical text. To this end, we explore the use of Latin and Greek affixes in Swedish medical texts in three genres: clinical text, scientific medical text and online medical information for laypersons. More specifically, we use frequency lists derived from tokenised Swedish medical corpora in the three domains, and extract word pairs belonging to types that display both the original and Swedified spellings. We describe six distinct patterns explaining the variation in the usage of Latin and Greek affixes in clinical text. The results show that to a large extent affixes in clinical text are Swedified and that prefixes are used more conservatively than suffixes.

  • 19.
    Grigonyté, Gintaré
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institutet, Sweden.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Improving Readability of Swedish Electronic Health Records through Lexical Simplification: First Results. 2014. In: Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), Stroudsburg, USA: Association for Computational Linguistics, 2014, p. 74-83. Conference paper (Refereed)
    Abstract [en]

    This paper describes part of an ongoing effort to improve the readability of Swedish electronic health records (EHRs). An EHR contains systematic documentation of a single patient’s medical history across time, entered by healthcare professionals with the purpose of enabling safe and informed care. Linguistically, medical records exemplify a highly specialised domain, which can be superficially characterised as having telegraphic sentences involving displaced or missing words, abundant abbreviations, spelling variations including misspellings, and terminology. We report results on lexical simplification of Swedish EHRs, by which we mean detecting the unknown, out-of-dictionary words and trying to resolve them either as compounded known words, abbreviations or misspellings.

  • 20.
    Grigonyté, Gintaré
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institute, Sweden.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Spelling Variation of Latin and Greek words in Swedish Medical Text. 2014. Conference paper (Refereed)
  • 21.
    Gustafson, Joakim
    et al.
    KTH Speech, Music and Hearing.
    Bell, Linda
    KTH Speech, Music and Hearing.
    Beskow, Jonas
    KTH Speech, Music and Hearing.
    Boye, Johan
    TeliaSonera (R & D).
    Carlson, Rolf
    KTH Speech, Music and Hearing.
    Edlund, Jens
    KTH Speech, Music and Hearing.
    Granström, Björn
    KTH Speech, Music and Hearing.
    House, David
    KTH Speech, Music and Hearing.
    Wirén, Mats
    TeliaSonera (R & D).
    AdApt — A Multimodal Conversational Dialogue System in an Apartment Domain. 2000. In: Proceedings of the Sixth International Conference on Spoken Language Processing (ICSLP), Beijing, China, 2000, p. 134-137. Conference paper (Refereed)
  • 22.
    Gustafson, Joakim
    et al.
    TeliaSonera (R & D), KTH Speech, Music and Hearing.
    Bell, Linda
    TeliaSonera (R & D), KTH Speech, Music and Hearing.
    Boye, Johan
    TeliaSonera (R & D).
    Edlund, Jens
    KTH Speech, Music and Hearing.
    Wirén, Mats
    TeliaSonera (R & D).
    Constraint Manipulation and Visualization in a Multimodal Dialogue System. 2002. In: Proceedings of the ISCA Workshop on Multimodal Dialogue in Mobile Environments, Kloster Irsee, Germany, 2002. Conference paper (Refereed)
    Abstract [en]

    When interacting with spoken and multimodal dialogue systems, it is often difficult for users to understand and influence how their input is processed by the system. In this paper, we describe how these problems were addressed in the multimodal real-estate dialogue system AdApt. During the course of a dialogue, the user's constraints are translated into symbolic icons that are visualized on the screen and can be manipulated by drag-and-drop operations. Users are thus given a clear picture of how their utterances are understood, and are given a transparent means of controlling the interaction with the system.

  • 23.
    Gustafson, Joakim
    et al.
    TeliaSonera (R & D).
    Bell, Linda
    TeliaSonera (R & D).
    Boye, Johan
    TeliaSonera (R & D).
    Lindström, Anders
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    The NICE fairy-tale game system. 2004. In: Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue at HLT-NAACL 2004, Boston, 2004. Conference paper (Refereed)
    Abstract [en]

    This paper presents the NICE fairy-tale game system, in which adults and children can interact with various animated characters in a 3D world. Computer games are an interesting application for spoken and multimodal dialogue systems. Moreover, for the development of future computer games, multimodal dialogue has the potential to greatly enrich the user's experience. In this paper, we also present some requirements that have to be fulfilled to successfully integrate spoken dialogue technology with a computer game application.

  • 24.
    Kurfali, Murathan
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Zero-shot cross-lingual identification of direct speech using distant supervision. 2020. In: The 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature: Co-located with the 28th International Conference on Computational Linguistics COLING’2020, 2020, p. 105-111. Conference paper (Refereed)
  • 25.
    Kurfali, Murathan
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Sjons, Johan
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    A Multi-Word Expression Dataset for Swedish. 2020. In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille: European Language Resources Association (ELRA), 2020, p. 4402-4409. Conference paper (Refereed)
    Abstract [en]

    We present a new set of 96 Swedish multi-word expressions annotated with degree of (non-)compositionality. In contrast to most previous compositionality datasets we also consider syntactically complex constructions and publish a formal specification of each expression. This allows evaluation of computational models beyond word bigrams, which have so far been the norm. Finally, we use the annotations to evaluate a system for automatic compositionality estimation based on distributional semantics. Our analysis of the disagreements between human annotators and the distributional model reveals interesting questions related to the perception of compositionality, and should be informative to future work in the area.

  • 26.
    Kurfalı, Murathan
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Breaking the Narrative: Scene Segmentation through Sequential Sentence Classification. 2021. In: / [ed] Albin Zehe, Leonard Konle, Lea Dümpelmann, Evelyn Gius, Svenja Guhr, Andreas Hotho, Fotis Jannidis, Lucas Kaufmann, Markus Krug, Frank Puppe, Nils Reiter, Annekea Schreiber, CEUR-WS.org, 2021, p. 49-53. Conference paper (Refereed)
    Abstract [en]

    In this paper, we describe our submission to the Shared Task on Scene Segmentation (STSS). The shared task requires participants to segment novels into coherent segments, called scenes. We approach this as a sequential sentence classification task and offer a BERT-based solution with a weighted cross-entropy loss. According to the results, the proposed approach performs relatively well on the task as our model ranks first and second, in official in-domain and out-domain evaluations, respectively. However, the overall low performances (0.37 F1-score) suggest that there is still much room for improvement.

  • 27.
    Ljunglöf, Peter
    et al.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Syntactic parsing. 2010. In: Handbook of Natural Language Processing / [ed] Nitin Indurkhya & Fred J. Damerau, Boca Raton, Florida: Chapman & Hall/CRC, 2010, 2, p. 59-91. Chapter in book (Other (popular science, discussion, etc.))
    Abstract [en]

    This chapter presents basic techniques for grammar-driven natural language parsing, that is, analyzing a string of words (typically a sentence) to determine its structural description according to a formal grammar. In most circumstances, this is not a goal in itself but rather an intermediary step for the purpose of further processing, such as the assignment of a meaning to the sentence. To this end, the desired output of grammar-driven parsing is typically a hierarchical, syntactic structure suitable for semantic interpretation (the topic of Chapter 5). The string of words constituting the input will usually have been processed in separate phases of tokenization (Chapter 2) and lexical analysis (Chapter 3), which is hence not part of parsing proper.

  • 28. Megyesi, Beáta
    et al.
    Granstedt, Lena
    Johansson, Sofia
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    Prentice, Julia
    Rosén, Dan
    Schenström, Carl-Johan
    Sundberg, Gunlög
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Volodina, Elena
    Learner Corpus Anonymization in the Age of GDPR: Insights from the Creation of a Learner Corpus of Swedish2018In: Proceedings of the 7th Workshop on NLP for Computer Assisted Language Learning at SLTC 2018 (NLP4CALL 2018), Linköping: Linköping University Electronic Press, 2018, p. 47-56, article id 006Conference paper (Refereed)
    Abstract [en]

    This paper reports on the status of learner corpus anonymization for the ongoing research infrastructure project SweLL. The main project aim is to deliver and make available for research a well-annotated corpus of essays written by second language (L2) learners of Swedish. As practice shows, annotation of learner texts is a sensitive process demanding many compromises between ethical and legal demands on the one hand, and research and technical demands on the other. Below is a concise description of the current status of pseudonymization of language learner data to ensure anonymity of the learners, with numerous examples of the above-mentioned compromises.

  • 29.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Björkstrand, Thomas
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Grigonyté, Gintaré
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustafson-Capková, Sofia
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Mesch, Johanna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Schönström, Krister
    Stockholm University, Faculty of Humanities, Department of Linguistics, Swedish as a Second Language for the Deaf.
    Wallin, Lars
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    SWE-CLARIN partner presentation: Natural Language Processing Resources from the Department of Linguistics, Stockholm University2014In: The first Swedish national SWE-CLARIN workshop: LT-based e-HSS in Sweden – taking stock and looking ahead / [ed] Lars Borin, 2014Conference paper (Other academic)
    Abstract [en]

    The aim of the CLARIN Research Infrastructure and SWE-CLARIN is to make it easier for scholars in the humanities and social sciences to access primary data in the form of natural language, and to provide tools for exploring, annotating and analysing these data. This paper gives an overview of the resources and tools developed at the Department of Linguistics at Stockholm University that are planned to be made available within the SWE-CLARIN project. The paper also outlines our collaborations with neighbouring areas in the humanities and social sciences where these resources and tools will be put to use.

  • 30.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustafson Capková, Sofia
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    The Stockholm University Strindberg Corpus: Content and Possibilities2014In: Strindberg on International Stages/Strindberg in Translation / [ed] Roland Lysell, Cambridge: Cambridge Scholars Publishing, 2014Chapter in book (Other academic)
    Abstract [en]

    We have approached the works of August Strindberg from a computational linguistic point of view, resulting in The Stockholm University Strindberg Corpus, consisting of seven of Strindberg's autobiographical works with linguistic annotation. The corpus is freely available for research. We use this corpus for three quantitative studies of Strindberg’s work: in the first, we describe the novels included in the corpus by keywords; in the second, we compare Strindberg’s use of emotionally charged words with selected prose of both his contemporaries and present-day authors; in the third, we explore the semantic prosody of KVINNA (“woman”) and MAN (“man”).

  • 31.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustafson-Capková, Sofia
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Stockholm University Strindberg Corpus: Contents and possibilities2012Conference paper (Other academic)
    Abstract [en]

    The Stockholm University Strindberg Corpus (SUSC) consists of seven novels by August Strindberg annotated for parts-of-speech with morphological analysis and lemmas. The corpus is freely available.

    SUSC consists of approximately 400 000 tokens annotated for parts-of-speech, including morphological analysis and lemmas, using the Stockholm-Umeå Corpus tag set in PAROLE-format. The annotated texts have been converted to XML, which makes the corpus searchable with corpus analysis tools such as Xaira. This allows, for example, searching for concordances with a specific wordform, part-of-speech and/or lemma, as well as pattern matching and collocation extraction.

    The current version of the corpus includes seven works which can be classified as autobiographical:

    • Tjänstekvinnans son (The son of a servant, 1886-87)
    • Han och hon (He and she, 1919)
    • Inferno (Inferno, 1897)
    • Legender and Jakob brottas (Legends and Jacob wrestles, 1898)
    • Fagervik och Skamsund (Fair haven and Foulstrand, 1902)
    • Ensam (Alone, 1903)

    We are aware of three other electronic collections of Strindberg’s works: Projekt Runeberg, Litteraturbanken and Språkbanken. While these are valuable resources, SUSC is an important addition because, unlike the first two, it is linguistically annotated, and unlike the third, the data is available for download and thus can be fully inspected and processed using the researcher’s software of choice. Even more importantly, researchers can add their analyses as new layers of annotation of the corpus.

  • 32.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustavsson, Lisa
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    En korpusstudie om multimodal synkroni i tidig ordinlärning2013Conference paper (Other academic)
    Abstract [en]

    In this study, we investigate synchrony in early multimodal interaction between parents and children. By synchrony we mean recurring patterns or structural regularities (in words, prosody, gaze direction, gestures, and actions) that can reduce complexity in language learning.

    The data consist of recordings of five longitudinal dyads with two children (0;7-2;7 years) and their parents. The recordings are transcribed and annotated with fundamental frequency, gaze direction, gestures, and handling of objects. We investigate synchrony by studying all mentions of two selected objects (two dolls). For each mention, we examine the fundamental frequency and whether the mention is combined with the adult or child looking at, pointing at, or touching the object.

    The child is thought to use basic perceptual processes to pick up on patterns and regularities in the interaction with the adult, both in the acoustic signal and in the physical environment (Gogate & Hollich, 2010). The adult is moreover inclined to highlight linguistic structure when interacting with the child, for example by talking about an object while showing it to the child or letting the child touch it. This synchronized multimodal input helps the child to structure and sort the speech signal and to make connections between words and objects. In this study, we try to capture this kind of multimodal synchrony by studying two specific target words and what the interaction looks like around these words. We expect regularities in prosody, gaze direction, and gestures to be more synchronized when the child is younger and the target words are new than when the children are older and the target words are familiar.

    The study is part of a project in which we try to explain early language learning on the basis of general social and cognitive abilities. By studying early parent-child interaction, we want to investigate how linguistic constructions emerge, what functions they have, and how they correlate with other stimuli in the child's environment.

    Gogate, L., Hollich, G. 2010. Invariance detection within an interactive system: A perceptual gateway to language development. Psychological Review 117(2), 496-516.

  • 33.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Multimodal annotation of parent-child interaction in a free-play setting2013In: Multimodal Corpora 2013: Beyond Audio and Video / [ed] J. Edlund, D. Heylen, P. Paggio, 2013Conference paper (Refereed)
    Abstract [en]

    This paper describes the verbal, non-verbal, and discourse annotation of a longitudinal corpus of parent-child interaction. The verbal annotation includes transcription of child-directed speech and child vocalizations. The non-verbal annotation describes gestures and object-related actions by both parent and child. The verbal and non-verbal annotation is combined in discourse annotation that distinguishes initial from subsequent mentions, and further categorizes initial mentions depending on initiative.

  • 34.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Multimodal Annotation of Synchrony in Longitudinal Parent–Child Interaction2014In: MMC 2014 Multimodal Corpora: Combining applied and basic research targets: Workshop at LREC 2014, European Language Resources Association, 2014Conference paper (Refereed)
    Abstract [en]

    This paper describes the multimodal annotation of speech, gaze and hand movement in a corpus of longitudinal parent–child interaction, and reports results on synchrony, structural regularities which appear to be a key means for parents to facilitate learning of new concepts to children. The results provide additional support for our previous finding that parents display decreasing synchrony as a function of the age of the child.

  • 35.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Reference to Objects in Longitudinal Parent-Child Interaction2012In: Workshop on Language, Action and Perception (APL), 2012Conference paper (Refereed)
    Abstract [en]

    A cognitive model of language learning needs to be dialogue-driven and multimodal to reflect how parent and child interact, using words, eye gaze, and object manipulation.

    In this paper, we present a scheme for multimodal annotation of parent-child interaction. We use this annotation for studying invariance across modalities. Our basic hypothesis is that perception of invariance (or synchrony) in multimodal patterns in auditory-visual speech is the device primarily used to reduce complexity in language learning.

    To this end, we have added verbal and non-verbal annotation to a corpus of longitudinal video and sound recordings of parent-child dyads. We use this data to try to determine if the amount of synchrony across modalities of parent-child interaction decreases as the child grows older and learns more language and gestures.

  • 36.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Variation sets in child-directed speech2015In: / [ed] Ellen Marklund, Iris-Corinna Schwarz, 2015Conference paper (Refereed)
  • 37.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Eklund, Robert
    Linköping University.
    Disfluency in Child-Directed Speech2013In: Proceedings of Fonetik 2013: The XXVIth Annual Phonetics Meeting 12–13 June 2013, Linköping University Linköping, Sweden / [ed] Robert Eklund, Linköping: Department of Culture and Communication, Linköping University, Sweden , 2013, p. 57-60Conference paper (Refereed)
    Abstract [en]

    We report results from a longitudinal study of the rate and location of disfluencies in child-directed speech, using data for children between 0;6 and 2;9 years. We compare these results to adult-directed speech by the same speakers.

  • 38.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Modelling the informativeness and timing of non-verbal cues in parent–child interaction2016In: The 54th Annual Meeting of the Association for Computational Linguistics: Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning, Stroudsburg, PA, USA: Association for Computational Linguistics, 2016, p. 82-90Conference paper (Refereed)
    Abstract [en]

    How do infants learn the meanings of their first words? This study investigates the informativeness and temporal dynamics of non-verbal cues that signal the speaker's referent in a model of early word–referent mapping. To measure the information provided by such cues, a supervised classifier is trained on information extracted from a multimodally annotated corpus of 18 videos of parent–child interaction with three children aged 7 to 33 months. Contradicting previous research, we find that gaze is the single most informative cue, and we show that this finding can be attributed to our fine-grained temporal annotation. We also find that offsetting the timing of the non-verbal cues reduces accuracy, especially if the offset is negative. This is in line with previous research, and suggests that synchrony between verbal and non-verbal cues is important if they are to be perceived as causally related.

  • 39.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Modelling the informativeness of different modalities in parent-child interaction2015In: Workshop on Extensive and Intensive Recordings of Children's Language Environment / [ed] Alex Cristia, Melanie Soderstrom, 2015Conference paper (Refereed)
  • 40. Rayner, Manny
    et al.
    Carter, David
    Bouillon, Pierrette
    Digalakis, Vassilis
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    The spoken language translator2000Collection (editor) (Other academic)
  • 41.
    Rayner, Manny
    et al.
    SRI International.
    Carter, David
    SRI International.
    Bouillon, Pierrette
    University of Geneva, ISSCO.
    Wirén, Mats
    TeliaSonera (R & D).
    Translation using the core language engine2000In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 25-56Chapter in book (Other academic)
  • 42.
    Rayner, Manny
    et al.
    SRI International.
    Carter, David
    SRI International.
    Bretan, Ivan
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Eklund, Robert
    TeliaSonera (R & D).
    Kirchmeier-Andersen, Sabine
    Philp, Christina
    Rational reuse of linguistic data2000In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 212-228Chapter in book (Other academic)
  • 43.
    Rayner, Manny
    et al.
    SRI International.
    Wirén, Mats
    TeliaSonera (R & D).
    Eklund, Robert
    TeliaSonera (R & D).
    Swedish coverage2000In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 180-191Chapter in book (Other academic)
  • 44. Rosén, Dan
    et al.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Volodina, Elena
    Error Coding of Second-Language Learner Texts Based on Mostly Automatic Alignment of Parallel Corpora2018In: CLARIN Annual Conference 2018: Proceedings / [ed] Inguna Skadina, Maria Eskevich, 2018, p. 181-184Conference paper (Refereed)
    Abstract [en]

    Error coding of second-language learner text, that is, detecting, correcting and annotating errors, is a cumbersome task which in turn requires interpretation of the text to decide what the errors are. This paper describes a system with which the annotator corrects the learner text by editing it prior to the actual error annotation. During the editing, the system automatically generates a parallel corpus of the learner and corrected texts. Based on this, the work of the annotator consists of three independent tasks that are otherwise often conflated in error coding: correcting the learner text, repairing inconsistent alignments, and performing the actual error annotation.

  • 45.
    Samuelsson, Christer
    et al.
    Xerox Research Centre, Europe, Grenoble, France.
    Wirén, Mats
    TeliaSonera (R & D).
    Parsing techniques2000In: Handbook of natural language processing / [ed] Robert Dale, Hermann Moisl, Harold Somers, New York: Marcel Dekker, 2000, p. 59-91Chapter in book (Other academic)
  • 46. Volodina, Elena
    et al.
    Granstedt, Lena
    Matsson, Arild
    Megyesi, Beáta
    Pilán, Ildikó
    Prentice, Julia
    Rosén, Dan
    Rudebeck, Lisa
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism.
    Schenström, Carl-Johan
    Sundberg, Gunlög
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    The SweLL Language Learner Corpus: From Design to Annotation2019In: Northern European Journal of Language Technology (NEJLT), ISSN 2000-1533, Vol. 6, p. 67-104, article id 4Article in journal (Refereed)
    Abstract [en]

    The article presents a new language learner corpus for Swedish, SweLL, and the methodology from collection and pseudonymisation to protect personal information of learners to annotation adapted to second language learning. The main aim is to deliver a well-annotated corpus of essays written by second language learners of Swedish and make it available for research through a browsable environment. To that end, a new annotation tool and a new project management tool have been implemented, both with the main purpose of ensuring the reliability and quality of the final corpus. In the article we discuss the reasoning behind metadata selection and the principles of gold corpus compilation, and argue for separating normalization from correction annotation.

  • 47. Volodina, Elena
    et al.
    Granstedt, Lena
    Megyesi, Beáta
    Prentice, Julia
    Rosén, Dan
    Schenström, Carl-Johan
    Sundberg, Gunlög
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Annotation of learner corpora: first SweLL insights2018In: Proceedings of 7th Workshop on NLP for Computer Assisted Language Learning at SLTC 2018, 2018Conference paper (Refereed)
  • 48. Volodina, Elena
    et al.
    Megyesi, Beáta
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Granstedt, Lena
    Prentice, Julia
    Reichenberg, Monica
    Sundberg, Gunlög
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    A Friend in Need? Research agenda for electronic Second Language infrastructure2016Conference paper (Refereed)
    Abstract [en]

    In this article, we describe the research and societal needs as well as ongoing efforts to shape Swedish as a Second Language (L2) infrastructure. Our aim is to develop an electronic research infrastructure that would stimulate empirical research into learners' language development by preparing data and developing language technology methods and algorithms that can successfully deal with deviations in the learner language.

  • 49.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Language and Computers, Markus Dickinson, Chris Brew, Detmar Meurers, Wiley-Blackwell, 20132013In: Computational linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 39, no 3, p. 777-780Article, book review (Other academic)
  • 50.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Roland Schäfer & Felix Bildhauer, Web Corpus Construction (Synthesis Lectures on Human Language Technologies 22)2014In: Nordic Journal of Linguistics, ISSN 0332-5865, E-ISSN 1502-4717, Vol. 37, no 3, p. 457-463Article, book review (Other academic)