  • 1.
    Bell, Linda
    TeliaSonera (R & D).
    Boye, Johan
    TeliaSonera (R & D).
    Gustafson, Joakim
    TeliaSonera (R & D).
    Heldner, Mattias
    TeliaSonera (R & D).
    Lindström, Anders
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    The Swedish NICE Corpus – Spoken dialogues between children and embodied characters in a computer game scenario. 2005. In: Proceedings Interspeech 2005 - Eurospeech: 9th European Conference on Speech Communication and Technology, Lisbon, Portugal: ISCA, 2005, p. 2765-2768. Conference paper (Refereed)
    Abstract [en]

    This article describes the collection and analysis of a Swedish database of spontaneous and unconstrained children-machine dialogues. The Swedish NICE corpus consists of spoken dialogues between children aged 8 to 15 and embodied fairytale characters in a computer game scenario. Compared to previously collected corpora of children's computer-directed speech, the Swedish NICE corpus contains extended interactions, including three-party conversation, in which the young users used spoken dialogue as the primary means of progression in the game.

  • 2.
    Bell, Linda
    TeliaSonera (R & D).
    Boye, Johan
    TeliaSonera (R & D).
    Gustafson, Joakim
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Modality Convergence in a Multimodal Dialogue System. 2000. In: Proceedings of Götalog, 2000, p. 29-34. Conference paper (Other academic)
    Abstract [en]

    When designing multimodal dialogue systems allowing speech as well as graphical operations, it is important to understand not only how people make use of the different modalities in their utterances, but also how the system might influence a user's choice of modality by its own behavior. This paper describes an experiment in which subjects interacted with two versions of a simulated multimodal dialogue system. One version used predominantly graphical means when referring to specific objects; the other used predominantly verbal referential expressions. The purpose of the study was to find out what effect, if any, the system's referential strategy had on the user's behavior. The results provided limited support for the hypothesis that the system can influence users to adopt another modality for the purpose of referring.

  • 3.
    Boye, Johan
    TeliaSonera (R & D).
    Gustafson, Joakim
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Robust spoken language understanding in a computer game. 2006. In: Speech Communication, ISSN 0167-6393, E-ISSN 1872-7182, Vol. 48, no 3-4, p. 335-353. Article in journal (Refereed)
    Abstract [en]

    We present and evaluate a robust method for the interpretation of spoken input to a conversational computer game. The scenario of the game is that of a player interacting with embodied fairy-tale characters in a 3D world via spoken dialogue (supplemented by graphical pointing actions) to solve various problems. The player himself cannot directly perform actions in the world, but interacts with the fairy-tale characters to have them perform various tasks, and to get information about the world and the problems to solve. Hence the role of spoken dialogue as the primary means of control is obvious and natural to the player. Naturally, this means that robust spoken language understanding becomes a critical component. To this end, the paper describes a semantic representation formalism and an accompanying parsing algorithm which works off the output of the speech recogniser's statistical language model. The evaluation shows that the parser is robust in the sense of considerably improving on the noisy output of the speech recogniser.

  • 4.
    Boye, Johan
    TeliaSonera.
    Wirén, Mats
    TeliaSonera.
    Multi-slot semantics for natural-language call routing systems. 2007. In: Proceedings of Bridging the Gap: Academic and Industrial Research in Dialog Technology, 2007, p. 68-75. Conference paper (Refereed)
    Abstract [en]

    Statistical classification techniques for natural-language call routing systems have matured to the point where it is possible to distinguish between several hundreds of semantic categories with an accuracy that is sufficient for commercial deployments. For category sets of this size, the problem of maintaining consistency among manually tagged utterances becomes limiting, as lack of consistency in the training data will degrade performance of the classifier. It is thus essential that the set of categories be structured in a way that alleviates this problem, and enables consistency to be preserved as the domain keeps changing. In this paper, we describe our experiences of using a two-level multi-slot semantics as a way of meeting this problem. Furthermore, we explore the ramifications of the approach with respect to classification, evaluation and dialogue design for call routing systems.

  • 5.
    Boye, Johan
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Negotiative Spoken-Dialogue Interfaces to Databases. 2003. In: Proceedings of Diabruck, Wallerfangen, Germany, 2003. Conference paper (Refereed)
    Abstract [en]

    The aim of this paper is to develop a principled and empirically motivated approach to robust, negotiative spoken dialogue with databases. Robustness is achieved by limiting the set of representable utterance types. Still, the vast majority of utterances that occur in practice can be handled.

  • 6.
    Boye, Johan
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Robust parsing and spoken negotiative dialogue with databases. 2008. In: Natural Language Engineering, ISSN 1351-3249, E-ISSN 1469-8110, Vol. 14, no 3, p. 289-312. Article in journal (Refereed)
    Abstract [en]

    This paper presents a robust parsing algorithm and semantic formalism for the interpretation of utterances in spoken negotiative dialogue with databases. The algorithm works in two passes: a domain-specific pattern-matching phase and a domain-independent semantic analysis phase. Robustness is achieved by limiting the set of representable utterance types to an empirically motivated subclass which is more expressive than propositional slot–value lists, but much less expressive than first-order logic. Our evaluation shows that in actual practice the vast majority of utterances that occur can be handled, and that the parsing algorithm is highly efficient and accurate.

  • 7.
    Boye, Johan
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Robust Parsing of Utterances in Negotiative Dialogue. 2003. In: Proceedings 8th European Conference on Speech Communication and Technology (Eurospeech), Geneva, Switzerland, 2003. Conference paper (Refereed)
    Abstract [en]

    This paper presents an algorithm for domain-dependent parsing of utterances in negotiative dialogue. To represent such utterances, the algorithm outputs semantic expressions that are more expressive than propositional slot-filler structures. It is very fast and robust, yet precise and capable of correctly combining information from different utterance fragments.

  • 8.
    Boye, Johan
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Gustafson, Joakim
    TeliaSonera (R & D).
    Contextual reasoning in multimodal dialogue systems: two case studies. 2004. In: Proceedings of The 8th Workshop on the Semantics and Pragmatics of Dialogue Catalogue'04, Barcelona, 2004, p. 19-21. Conference paper (Refereed)
    Abstract [en]

    This paper describes an approach to contextual reasoning for interpretation of spoken multimodal dialogue. The approach is based on combining recency-based search for antecedents with an object-oriented domain representation in such a way that the search is highly constrained by the type information of the antecedents. By furthermore representing candidate antecedents from the dialogue history and visual context in a uniform way, a single machinery (based on β-reduction in lambda calculus) can be used for resolving many kinds of underspecified utterances. The approach has been implemented in two highly different domains.

  • 9.
    Börstell, Carl
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Mesch, Johanna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Gärdenfors, Moa
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Towards an Annotation of Syntactic Structure in the Swedish Sign Language Corpus. 2016. In: Workshop Proceedings: 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining / [ed] Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie Hochgesang, Jette Kristoffersen, Johanna Mesch, Paris: ELRA, 2016, p. 19-24. Conference paper (Refereed)
    Abstract [en]

    This paper describes on-going work on extending the annotation of the Swedish Sign Language Corpus (SSLC) with a level of syntactic structure. The basic annotation of SSLC in ELAN consists of six tiers: four for sign glosses (two tiers for each signer; one for each of a signer’s hands), and two for written Swedish translations (one for each signer). In an additional step by Östling et al. (2015), all glosses of the corpus have been further annotated for parts of speech. Building on the previous steps, we are now developing annotation of clause structure for the corpus, based on meaning and form. We define a clause as a unit in which a predicate asserts something about one or more elements (the arguments). The predicate can be a (possibly serial) verbal or nominal. In addition to predicates and their arguments, criteria for delineating clauses include non-manual features such as body posture, head movement and eye gaze. The goal of this work is to arrive at two additional annotation tier types in the SSLC: one in which the sign language texts are segmented into clauses, and the other in which the individual signs are annotated for their argument types.

  • 10.
    Cap, Fabienne
    Adesam, Yvonne
    Ahrenberg, Lars
    Borin, Lars
    Bouma, Gerlof
    Forsberg, Markus
    Kann, Viggo
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Smith, Aaron
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Nivre, Joakim
    SWORD: Towards Cutting-Edge Swedish Word Processing. 2016. In: Proceedings of SLTC 2016, 2016. Conference paper (Refereed)
    Abstract [en]

    Despite many years of research on Swedish language technology, there is still no well-documented standard for Swedish word processing covering the whole spectrum from low-level tokenization to morphological analysis and disambiguation. SWORD is a new initiative within the SWE-CLARIN consortium aiming to develop documented standards for Swedish word processing. In this paper, we report on a pilot study of Swedish tokenization, where we compare the output of six different tokenizers on four different text types. For one text type (Wikipedia articles), we also compare to the tokenization produced by six manual annotators.

  • 11.
    Carter, David
    SRI International.
    Rayner, Manny
    SRI International.
    Eklund, Robert
    TeliaSonera (R & D).
    Kaja, Jaan
    TeliaSonera (R & D).
    Lyberg, Bertil
    TeliaSonera (R & D).
    Sautermeister, Per
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Neumeyer, Leonardo
    SRI International.
    Weng, Fuliang
    SRI International.
    Common speech/language issues. 2000. In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 284-294. Chapter in book (Other academic)
  • 12.
    Carter, David
    SRI International.
    Rayner, Manny
    SRI International.
    Eklund, Robert
    TeliaSonera (R & D).
    MacDermid, Catriona
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Evaluation. 2000. In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 297-312. Chapter in book (Other academic)
  • 13.
    Ek, Adam
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Distinguishing Narration and Speech in Prose Fiction Dialogues. 2019. In: Proceedings of the Digital Humanities in the Nordic Countries 4th Conference / [ed] Costanza Navarretta, Manex Agirrezabal, Bente Maegaard, CEUR-WS.org, 2019, p. 124-132. Conference paper (Refereed)
    Abstract [en]

    This paper presents a supervised method for a novel task, namely, detecting elements of narration in passages of dialogue in prose fiction. The method achieves an F1-score of 80.8%, exceeding the best baseline by almost 33 percentage points. The purpose of the method is to enable a more fine-grained analysis of fictional dialogue than has previously been possible, and to provide a component for the further analysis of narrative structure in general.

  • 14.
    Ek, Adam
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Grigonytė, Gintarė
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustafson Capková, Sofia
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction. 2018. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018) / [ed] Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga, European Language Resources Association, 2018, p. 817-824. Conference paper (Refereed)
    Abstract [en]

    This paper describes an approach to identifying speakers and addressees in dialogues extracted from literary fiction, along with a dataset annotated for speaker and addressee. The overall purpose of this is to provide annotation of dialogue interaction between characters in literary corpora in order to allow for enriched search facilities and construction of social networks from the corpora. To predict speakers and addressees in a dialogue, we use a sequence labeling approach applied to a given set of characters. We use features relating to the current dialogue, the preceding narrative, and the complete preceding context. The results indicate that even with a small amount of training data, it is possible to build a fairly accurate classifier for speaker and addressee identification across different authors, though the identification of addressees is the more difficult task.

  • 15.
    Eklund, Robert
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Effects of open and directed prompts on filled pauses and utterance production. 2010. In: Proceedings from Fonetik 2010, Lund, June 2–4, 2010 / [ed] Susanne Schötz and Gilbert Ambrazaitis, Lund: Mediatryck, 2010, p. 23-28. Conference paper (Other academic)
    Abstract [en]

    This paper describes an experiment where open and directed prompts were alternated when collecting speech data for the deployment of a call-routing application. The experiment tested whether open and directed prompts resulted in any differences with respect to the filled pauses exhibited by the callers, which is interesting in the light of the “many-options” hypothesis of filled pause production. The experiment also investigated the effects of the prompts on utterance form and meaning of the callers.

  • 16.
    Eklund, Robert
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    ”Njutandes av en Monte Christo no 5 och en iskall Mojito”: Observationer om användning av s-particip. 2006. In: Svenskans beskrivning 28, Förhandlingar vid Tjugoåttonde sammankomsten för svenskans beskrivning, 2006, p. 97-108. Conference paper (Refereed)
  • 17.
    Grigonyte, Gintare
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Henriksson, Aron
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Swedification patterns of Latin and Greek affixes in clinical text. 2016. In: Nordic Journal of Linguistics, ISSN 0332-5865, E-ISSN 1502-4717, Vol. 39, no 1, p. 5-37. Article in journal (Refereed)
    Abstract [en]

    Swedish medical language is rich with Latin and Greek terminology which has undergone a Swedification since the 1980s. However, many original expressions are still used by clinical professionals. The goal of this study is to obtain precise quantitative measures of how the foreign terminology is manifested in Swedish clinical text. To this end, we explore the use of Latin and Greek affixes in Swedish medical texts in three genres: clinical text, scientific medical text and online medical information for laypersons. More specifically, we use frequency lists derived from tokenised Swedish medical corpora in the three domains, and extract word pairs belonging to types that display both the original and Swedified spellings. We describe six distinct patterns explaining the variation in the usage of Latin and Greek affixes in clinical text. The results show that to a large extent affixes in clinical text are Swedified and that prefixes are used more conservatively than suffixes.

  • 18.
    Grigonyté, Gintaré
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institutet, Sweden.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Improving Readability of Swedish Electronic Health Records through Lexical Simplification: First Results. 2014. In: Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), Stroudsburg, USA: Association for Computational Linguistics, 2014, p. 74-83. Conference paper (Refereed)
    Abstract [en]

    This paper describes part of an ongoing effort to improve the readability of Swedish electronic health records (EHRs). An EHR contains systematic documentation of a single patient’s medical history across time, entered by healthcare professionals with the purpose of enabling safe and informed care. Linguistically, medical records exemplify a highly specialised domain, which can be superficially characterised as having telegraphic sentences involving displaced or missing words, abundant abbreviations, spelling variations including misspellings, and terminology. We report results on lexical simplification of Swedish EHRs, by which we mean detecting the unknown, out-of-dictionary words and trying to resolve them either as compounded known words, abbreviations or misspellings.

  • 19.
    Grigonyté, Gintaré
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Kvist, Maria
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institute, Sweden.
    Velupillai, Sumithra
    Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Spelling Variation of Latin and Greek words in Swedish Medical Text. 2014. Conference paper (Refereed)
  • 20.
    Gustafson, Joakim
    KTH Speech, Music and Hearing.
    Bell, Linda
    KTH Speech, Music and Hearing.
    Beskow, Jonas
    KTH Speech, Music and Hearing.
    Boye, Johan
    TeliaSonera (R & D).
    Carlson, Rolf
    KTH Speech, Music and Hearing.
    Edlund, Jens
    KTH Speech, Music and Hearing.
    Granström, Björn
    KTH Speech, Music and Hearing.
    House, David
    KTH Speech, Music and Hearing.
    Wirén, Mats
    TeliaSonera (R & D).
    AdApt – A Multimodal Conversational Dialogue System in an Apartment Domain. 2000. In: Proceedings of the Sixth International Conference on Spoken Language Processing (ICSLP), Beijing, China, 2000, p. 134-137. Conference paper (Refereed)
  • 21.
    Gustafson, Joakim
    TeliaSonera (R & D), KTH Speech, Music and Hearing.
    Bell, Linda
    TeliaSonera (R & D), KTH Speech, Music and Hearing.
    Boye, Johan
    TeliaSonera (R & D).
    Edlund, Jens
    KTH Speech, Music and Hearing.
    Wirén, Mats
    TeliaSonera (R & D).
    Constraint Manipulation and Visualization in a Multimodal Dialogue System. 2002. In: Proceedings of the ISCA Workshop on Multimodal Dialogue in Mobile Environments, Kloster Irsee, Germany, 2002. Conference paper (Refereed)
    Abstract [en]

    When interacting with spoken and multimodal dialogue systems, it is often difficult for users to understand and influence how their input is processed by the system. In this paper, we describe how these problems were addressed in the multimodal real-estate dialogue system AdApt. During the course of a dialogue, the user's constraints are translated into symbolic icons that are visualized on the screen and can be manipulated by drag-and-drop operations. Users are thus given a clear picture of how their utterances are understood, and are given a transparent means of controlling the interaction with the system.

  • 22.
    Gustafson, Joakim
    TeliaSonera (R & D).
    Bell, Linda
    TeliaSonera (R & D).
    Boye, Johan
    TeliaSonera (R & D).
    Lindström, Anders
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    The NICE fairy-tale game system. 2004. In: Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue at HLT-NAACL 2004, Boston, 2004. Conference paper (Refereed)
    Abstract [en]

    This paper presents the NICE fairy-tale game system, in which adults and children can interact with various animated characters in a 3D world. Computer games are an interesting application for spoken and multimodal dialogue systems. Moreover, for the development of future computer games, multimodal dialogue has the potential to greatly enrich the user's experience. In this paper, we also present some requirements that have to be fulfilled to successfully integrate spoken dialogue technology with a computer game application.

  • 23.
    Ljunglöf, Peter
    Göteborgs universitet.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Syntactic parsing. 2010. In: Handbook of Natural Language Processing / [ed] Nitin Indurkhya & Fred J. Damerau, Boca Raton, Florida: Chapman & Hall/CRC, 2010, 2, p. 59-91. Chapter in book (Other (popular science, discussion, etc.))
    Abstract [en]

    This chapter presents basic techniques for grammar-driven natural language parsing, that is, analyzing a string of words (typically a sentence) to determine its structural description according to a formal grammar. In most circumstances, this is not a goal in itself but rather an intermediary step for the purpose of further processing, such as the assignment of a meaning to the sentence. To this end, the desired output of grammar-driven parsing is typically a hierarchical, syntactic structure suitable for semantic interpretation (the topic of Chapter 5). The string of words constituting the input will usually have been processed in separate phases of tokenization (Chapter 2) and lexical analysis (Chapter 3), which is hence not part of parsing proper.

  • 24.
    Megyesi, Beáta
    Granstedt, Lena
    Johansson, Sofia
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    Prentice, Julia
    Rosén, Dan
    Schenström, Carl-Johan
    Sundberg, Gunlög
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Volodina, Elena
    Learner Corpus Anonymization in the Age of GDPR: Insights from the Creation of a Learner Corpus of Swedish. 2018. In: Proceedings of the 7th Workshop on NLP for Computer Assisted Language Learning at SLTC 2018 (NLP4CALL 2018), Linköping: Linköping University Electronic Press, 2018, p. 47-56, article id 006. Conference paper (Refereed)
    Abstract [en]

    This paper reports on the status of learner corpus anonymization for the ongoing research infrastructure project SweLL. The main project aim is to deliver and make available for research a well-annotated corpus of essays written by second language (L2) learners of Swedish. As practice shows, annotation of learner texts is a sensitive process demanding many compromises between ethical and legal demands on the one hand, and research and technical demands on the other. Below is a concise description of the current status of pseudonymization of language learner data to ensure anonymity of the learners, with numerous examples of the above-mentioned compromises.

  • 25.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Björkstrand, Thomas
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Grigonyté, Gintaré
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustafson-Capková, Sofia
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Mesch, Johanna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Schönström, Krister
    Stockholm University, Faculty of Humanities, Department of Linguistics, Swedish as a Second Language for the Deaf.
    Wallin, Lars
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    SWE-CLARIN partner presentation: Natural Language Processing Resources from the Department of Linguistics, Stockholm University. 2014. In: The first Swedish national SWE-CLARIN workshop: LT-based e-HSS in Sweden – taking stock and looking ahead / [ed] Lars Borin, 2014. Conference paper (Other academic)
    Abstract [en]

    The aim of the CLARIN Research Infrastructure and SWE-CLARIN is to make it easier for scholars in the humanities and social sciences to access primary data in the form of natural language, and to provide tools for exploring, annotating and analysing these data. This paper gives an overview of the resources and tools developed at the Department of Linguistics at Stockholm University that are planned to be made available within the SWE-CLARIN project. The paper also outlines our collaborations with neighbouring areas in the humanities and social sciences where these resources and tools will be put to use.

  • 26.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustafson Capková, Sofia
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    The Stockholm University Strindberg Corpus: Content and Possibilities. 2014. In: Strindberg on International Stages/Strindberg in Translation / [ed] Roland Lysell, Cambridge: Cambridge Scholars Publishing, 2014. Chapter in book (Other academic)
    Abstract [en]

    We have approached the works of August Strindberg from a computational linguistic point of view, resulting in The Stockholm University Strindberg Corpus, consisting of seven of Strindberg's autobiographical works with linguistic annotation. The corpus is freely available for research. We use this corpus for three quantitative studies of Strindberg’s work: in the first, we describe the novels included in the corpus by keywords; in the second, we compare Strindberg’s use of emotionally charged words with selected prose of both his contemporaries and present-day authors; in the third, we explore the semantic prosody of KVINNA (“woman”) and MAN (“man”).

  • 27.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustafson-Capková, Sofia
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Stockholm University Strindberg Corpus: Contents and possibilities. 2012. In: Arvet efter Strindberg - The Strindberg Legacy. The 18th International Strindberg Conference. Stockholm University, May 31 – June 3, 2012, 2012. Conference paper (Other academic)
    Abstract [en]

    The Stockholm University Strindberg Corpus (SUSC) consists of seven novels by August Strindberg annotated for parts-of-speech with morphological analysis and lemmas. The corpus is freely available.

    SUSC consists of approximately 400 000 tokens annotated for parts-of-speech, including morphological analysis and lemmas, using the Stockholm-Umeå Corpus tag set in PAROLE format. The annotated texts have been converted to XML, which makes the corpus searchable with corpus analysis tools such as Xaira. This allows, for example, searching for concordances of a specific wordform, part-of-speech and/or lemma, as well as pattern matching and collocation extraction.

    The current version of the corpus includes seven works which can be classified as autobiographical:

    • Tjänstekvinnans son (The son of a servant, 1886-87)
    • Han och hon (He and she, 1919)
    • Inferno (Inferno, 1897)
    • Legender and Jakob brottas (Legends and Jacob wrestles, 1898)
    • Fagervik och Skamsund (Fair haven and Foulstrand, 1902)
    • Ensam (Alone, 1903)

    We are aware of three other electronic collections of Strindberg’s works: Projekt Runeberg, Litteraturbanken and Språkbanken. While these are valuable resources, SUSC is an important addition because, unlike the first two, it is linguistically annotated, and unlike the third, the data is available for download and thus can be fully inspected and processed using the researcher’s software of choice. Even more importantly, researchers can add their analyses as new layers of annotation of the corpus.

  • 28.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Gustavsson, Lisa
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    En korpusstudie om multimodal synkroni i tidig ordinlärning [A corpus study of multimodal synchrony in early word learning]. 2013. Conference paper (Other academic)
    Abstract [sv] (English translation)

    In this study we investigate synchrony in early multimodal interaction between parents and children. By synchrony we mean recurring patterns or structural regularities (in words, prosody, gaze direction, gestures and actions) that can reduce complexity in language learning.

    The data consist of recordings of five longitudinal dyads with two children (0;7-2;7 years) and their parents. The recordings are transcribed and annotated with fundamental frequency, gaze direction, gestures and handling of objects. We investigate synchrony by studying all mentions of two selected objects (two dolls). For each mention, we examine the fundamental frequency and whether the mention is combined with the adult or child looking at, pointing towards or touching the object.

    The assumption is that the child uses basic perceptual processes to pick up on patterns and regularities in the interaction with the adult, both in the acoustic signal and in the physical surroundings (Gogate & Hollich, 2010). The adult is, moreover, inclined to highlight the linguistic structure in interaction with the child, for example by talking about an object while showing it to the child or letting the child feel it. This synchronised multimodal input helps the child to structure and sort the speech signal and to make connections between words and objects. In this study we try to capture this type of multimodal synchrony by studying two specific target words and what the interaction looks like around these words. We expect regularities in prosody, gaze direction and gestures to be more synchronised when the child is younger and the target words are new than when the children are older and the target words familiar.

    The study is part of a project in which we try to explain early language learning on the basis of general social and cognitive abilities. By studying early parent-child interaction we want to investigate how linguistic constructions emerge, what functions they have and how they correlate with other stimuli in the child's environment.

    Gogate, L., Hollich, G. 2010. Invariance detection within an interactive system: A perceptual gateway to language development. Psychological Review 117(2), 496-516.

  • 29.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Multimodal annotation of parent-child interaction in a free-play setting. 2013. In: Multimodal Corpora 2013: Beyond Audio and Video / [ed] J. Edlund, D. Heylen, P. Paggio, 2013. Conference paper (Refereed)
    Abstract [en]

    This paper describes the verbal, non-verbal, and discourse annotation of a longitudinal corpus of parent-child interaction. The verbal annotation includes transcription of child-directed speech and child vocalizations. The non-verbal annotation describes gestures and object-related actions by both parent and child. The verbal and non-verbal annotation is combined in discourse annotation that distinguishes initial from subsequent mentions, and further categorizes initial mentions depending on initiative.

  • 30.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Multimodal Annotation of Synchrony in Longitudinal Parent–Child Interaction. 2014. In: MMC 2014 Multimodal Corpora: Combining applied and basic research targets: Workshop at LREC 2014, European Language Resources Association, 2014. Conference paper (Refereed)
    Abstract [en]

    This paper describes the multimodal annotation of speech, gaze and hand movement in a corpus of longitudinal parent–child interaction, and reports results on synchrony, structural regularities which appear to be a key means for parents to facilitate learning of new concepts to children. The results provide additional support for our previous finding that parents display decreasing synchrony as a function of the age of the child.

  • 31.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Reference to Objects in Longitudinal Parent-Child Interaction. 2012. In: Workshop on Language, Action and Perception (APL), 2012. Conference paper (Refereed)
    Abstract [en]

    A cognitive model of language learning needs to be dialogue-driven and multimodal to reflect how parent and child interact, using words, eye gaze, and object manipulation.

    In this paper, we present a scheme for multimodal annotation of parent-child interaction. We use this annotation for studying invariance across modalities. Our basic hypothesis is that perception of invariance (or synchrony) in multimodal patterns in auditory-visual speech is the device primarily used to reduce complexity in language learning.

    To this end, we have added verbal and non-verbal annotation to a corpus of longitudinal video and sound recordings of parent-child dyads. We use this data to try to determine if the amount of synchrony across modalities of parent-child interaction decreases as the child grows older and learns more language and gestures.

  • 32.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Variation sets in child-directed speech. 2015. In: / [ed] Ellen Marklund, Iris-Corinna Schwarz, 2015. Conference paper (Refereed)
  • 33.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Eklund, Robert
    Linköping University.
    Disfluency in Child-Directed Speech. 2013. In: Proceedings of Fonetik 2013: The XXVIth Annual Phonetics Meeting 12–13 June 2013, Linköping University, Linköping, Sweden / [ed] Robert Eklund, Linköping: Department of Culture and Communication, Linköping University, Sweden, 2013, p. 57-60. Conference paper (Refereed)
    Abstract [en]

    We report results from a longitudinal study of the rate and location of disfluencies in child-directed speech, using data for children between 0;6 and 2;9 years. We compare these results to adult-directed speech by the same speakers.

  • 34.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Modelling the informativeness and timing of non-verbal cues in parent–child interaction. 2016. In: The 54th Annual Meeting of the Association for Computational Linguistics: Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning, Stroudsburg, PA, USA: Association for Computational Linguistics, 2016, p. 82-90. Conference paper (Refereed)
    Abstract [en]

    How do infants learn the meanings of their first words? This study investigates the informativeness and temporal dynamics of non-verbal cues that signal the speaker's referent in a model of early word–referent mapping. To measure the information provided by such cues, a supervised classifier is trained on information extracted from a multimodally annotated corpus of 18 videos of parent–child interaction with three children aged 7 to 33 months. Contradicting previous research, we find that gaze is the single most informative cue, and we show that this finding can be attributed to our fine-grained temporal annotation. We also find that offsetting the timing of the non-verbal cues reduces accuracy, especially if the offset is negative. This is in line with previous research, and suggests that synchrony between verbal and non-verbal cues is important if they are to be perceived as causally related.

  • 35.
    Nilsson Björkenstam, Kristina
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Modelling the informativeness of different modalities in parent-child interaction. 2015. In: Workshop on Extensive and Intensive Recordings of Children's Language Environment / [ed] Alex Cristia, Melanie Soderstrom, 2015. Conference paper (Refereed)
  • 36.
    Rayner, Manny
    et al.
    Carter, David
    Bouillon, Pierrette
    Digalakis, Vassilis
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    The spoken language translator. 2000. Collection (editor) (Other academic)
  • 37.
    Rayner, Manny
    et al.
    SRI International.
    Carter, David
    SRI International.
    Bouillon, Pierrette
    University of Geneva, ISSCO.
    Wirén, Mats
    TeliaSonera (R & D).
    Translation using the core language engine. 2000. In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 25-56. Chapter in book (Other academic)
  • 38.
    Rayner, Manny
    et al.
    SRI International.
    Carter, David
    SRI International.
    Bretan, Ivan
    TeliaSonera (R & D).
    Wirén, Mats
    TeliaSonera (R & D).
    Eklund, Robert
    TeliaSonera (R & D).
    Kirchmeier-Andersen, Sabine
    Philp, Christina
    Rational reuse of linguistic data. 2000. In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 212-228. Chapter in book (Other academic)
  • 39.
    Rayner, Manny
    et al.
    SRI International.
    Wirén, Mats
    TeliaSonera (R & D).
    Eklund, Robert
    TeliaSonera (R & D).
    Swedish coverage. 2000. In: The spoken language translator / [ed] Manny Rayner, David Carter, Pierrette Bouillon, Vassilis Digalakis, Mats Wirén, Cambridge: Cambridge University Press, 2000, p. 180-191. Chapter in book (Other academic)
  • 40.
    Rosén, Dan
    et al.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Volodina, Elena
    Error Coding of Second-Language Learner Texts Based on Mostly Automatic Alignment of Parallel Corpora. 2018. In: CLARIN Annual Conference 2018: Proceedings / [ed] Inguna Skadina, Maria Eskevich, 2018, p. 181-184. Conference paper (Refereed)
    Abstract [en]

    Error coding of second-language learner text, that is, detecting, correcting and annotating errors, is a cumbersome task which in turn requires interpretation of the text to decide what the errors are. This paper describes a system with which the annotator corrects the learner text by editing it prior to the actual error annotation. During the editing, the system automatically generates a parallel corpus of the learner and corrected texts. Based on this, the work of the annotator consists of three independent tasks that are otherwise often conflated in error coding: correcting the learner text, repairing inconsistent alignments, and performing the actual error annotation.

  • 41.
    Samuelsson, Christer
    et al.
    Xerox Research Centre, Europe, Grenoble, France.
    Wirén, Mats
    TeliaSonera (R & D).
    Parsing techniques. 2000. In: Handbook of natural language processing / [ed] Robert Dale, Hermann Moisl, Harold Somers, New York: Marcel Dekker, 2000, p. 59-91. Chapter in book (Other academic)
  • 42.
    Volodina, Elena
    et al.
    Megyesi, Beata
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Granstedt, Lena
    Prentice, Julia
    Reichenberg, Monica
    Sundberg, Gunlög
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    A Friend in Need? Research agenda for electronic Second Language infrastructure. 2016. Conference paper (Refereed)
    Abstract [en]

    In this article, we describe the research and societal needs as well as ongoing efforts to shape Swedish as a Second Language (L2) infrastructure. Our aim is to develop an electronic research infrastructure that would stimulate empirical research into learners' language development by preparing data and developing language technology methods and algorithms that can successfully deal with deviations in the learner language.

  • 43.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Language and Computers, Markus Dickinson, Chris Brew, Detmar Meurers, Wiley-Blackwell, 2013. 2013. In: Computational linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 39, no 3, p. 777-780. Article, book review (Other academic)
  • 44.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Review of "Web Corpus Construction" by Schäfer & Bildhauer. 2014. In: Nordic Journal of Linguistics, ISSN 0332-5865, E-ISSN 1502-4717, Vol. 37, no 3, p. 457-463. Article, book review (Other academic)
  • 45.
    Wirén, Mats
    et al.
    TeliaSonera (R & D).
    Eklund, Robert
    TeliaSonera (R & D).
    Engberg, Fredrik
    TeliaSonera (CID).
    Westermark, Johan
    TeliaSonera (CID).
    Experiences of an In-Service Wizard-of-Oz Data Collection for the Deployment of a Call-Routing Application. 2007. In: Bridging the Gap: Academic and Industrial Research in Dialog Technologies Workshop Proceedings, Madison, WI: Omnipress, 2007, p. 56-63. Conference paper (Other academic)
    Abstract [en]

    This paper describes our experiences of collecting a corpus of 42,000 dialogues for a call-routing application using a Wizard-of-Oz approach. Contrary to common practice in the industry, we did not use the kind of automated application that elicits some speech from the customers and then sends all of them to the same destination, such as the existing touch-tone menu, without paying attention to what they have said. Contrary to the traditional Wizard-of-Oz paradigm, our data-collection application was fully integrated within an existing service, replacing the existing touch-tone navigation system with a simulated call-routing system. Thus, the subjects were real customers calling about real tasks, and the wizards were service agents from our customer care. We provide a detailed exposition of the data collection as such and the application used, and compare our approach to methods previously used.

  • 46.
    Wirén, Mats
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Matsson, Arild
    Rosén, Dan
    Volodina, Elena
    SVALA: Annotation of Second-Language Learner Text Based on Mostly Automatic Alignment of Parallel Corpora. 2019. In: Selected papers from the CLARIN Annual Conference 2018, Pisa, 8-10 October 2018 / [ed] Inguna Skadina, Maria Eskevich, Linköping: Linköping University Electronic Press, 2019, p. 222-234, article id 023. Conference paper (Refereed)
    Abstract [en]

    Annotation of second-language learner text is a cumbersome manual task which in turn requires interpretation to postulate the intended meaning of the learner’s language. This paper describes SVALA, a tool which separates the logical steps in this process while providing rich visual support for each of them. The first step is to pseudonymize the learner text to fulfil the legal and ethical requirements for a distributable learner corpus. The second step is to correct the text, which is carried out in the simplest possible way by text editing. During the editing, SVALA automatically maintains a parallel corpus with alignments between words in the learner source text and corrected text, while the annotator may repair inconsistent word alignments. Finally, the actual labelling of the corrections (the postulated errors) is performed. We describe the objectives, design and workflow of SVALA, and our plans for further development.

  • 47.
    Wirén, Mats
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    N. Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Östling, Robert
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Modelling the Informativeness of Non-Verbal Cues in Parent–Child Interaction. 2017. In: Proceedings of Interspeech 2017 / [ed] Francisco Lacerda, David House, Mattias Heldner, Joakim Gustafson, Sofia Strömbergsson, Marcin Włodarczak, The International Speech Communication Association (ISCA), 2017, p. 2203-2207. Conference paper (Refereed)
    Abstract [en]

    Non-verbal cues from speakers, such as eye gaze and hand positions, play an important role in word learning. This is consistent with the notion that for meaning to be reconstructed, acoustic patterns need to be linked to time-synchronous patterns from at least one other modality. In previous studies of a multimodally annotated corpus of parent–child interaction, we have shown that parents interacting with infants at the early word-learning stage (7–9 months) display a large number of time-synchronous patterns, but that this behaviour tails off with increasing age of the children. Furthermore, we have attempted to quantify the informativeness of the different non-verbal cues, that is, to what extent they actually help to discriminate between different possible referents, and how critical the timing of the cues is. The purpose of this paper is to generalise our earlier model by quantifying informativeness resulting from non-verbal cues occurring both before and after their associated verbal references.

  • 48.
    Wirén, Mats
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Grigonytė, Gintarė
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Cortes, Elisabet Eir
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Longitudinal Studies of Variation Sets in Child-directed Speech. 2016. In: The 54th Annual Meeting of the Association for Computational Linguistics: Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning, Stroudsburg, PA, USA: Association for Computational Linguistics, 2016, p. 44-52. Conference paper (Refereed)
    Abstract [en]

    One of the characteristics of child-directed speech is its high degree of repetitiousness. Sequences of repetitious utterances with a constant intention, variation sets, have been shown to be correlated with children’s language acquisition. To obtain a baseline for the occurrences of variation sets in Swedish, we annotate 18 parent–child dyads using a generalised definition according to which the varying form may pertain not just to the wording but also to prosody and/or non-verbal cues. To facilitate further empirical investigation, we introduce a surface algorithm for automatic extraction of variation sets which is easily replicable and language-independent. We evaluate the algorithm on the Swedish gold standard, and use it for extracting variation sets in Croatian, English and Russian. We show that the proportion of variation sets in child-directed speech decreases consistently as a function of children's age across Swedish, Croatian, English and Russian.

  • 49.
    Wirén, Mats
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Sjons, Johan
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Tengstrand, Lisa
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Variationsmängder i barnriktat tal [Variation sets in child-directed speech]. 2013. In: XIII Nordiska Barnspråkssymposiet - 2013 Stockholms universitet, Sverige, 8-9 november 2013, 2013. Conference paper (Other academic)
    Abstract [sv] (English translation)

    Child-directed speech has a number of unique properties that all seem to stem from the parents' (unconscious) wish to facilitate language learning for the child as much as possible. One of these properties is its repetitiveness, for example in successive utterances such as the following:

    Var kan Kucka vara då? ("Where could Kucka be, then?")

    Var är Kucka? ("Where is Kucka?")

    Var är kaninen som heter Kucka? ("Where is the rabbit called Kucka?")

    In this paper we study the local repetitiveness of child-directed speech, which in the literature is usually referred to as variation sets. These are interesting in that they show the words and constructions that the parents seem to concentrate on teaching their children at any given time.

    A theoretical framework bearing on this is construction grammar, which assumes that constructions are learnable because a) they constitute conventionalised form–meaning pairs which b) are learned gradually, from holophrases via schematic expressions ("item-based constructions", Tomasello 2003) to the fully abstractable constructions of adult language. Since we have longitudinal data, we can capture the extent to which the successive constructions are adapted according to this pattern as the child grows older.

    Several attempts at formalising the notion of a variation set have been proposed, for example Küntay and Slobin (1996), Brodsky et al. (2007) and Onnis et al. (2008). Common requirements for a variation set are that a) it consists of successive utterances with up to two intervening utterances; b) at least two of the constituent words are repeated; and c) the intention of the utterances is constant. We experiment with different values for a) and, instead of b), use a string-matching method that also takes utterance length into account.

    In the paper we report the outcome of construction types based on data from a longitudinal corpus of child-directed speech for thirteen children aged between 3 and 33 months, distributed over 58 sessions.
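
    The commonly cited surface criteria a) and b) above can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract states that they replace b) with a string-matching method that takes utterance length into account, and criterion c) (constant intention) is not checked here. The function names `tokenize` and `variation_sets` and the parameters `max_gap` and `min_shared` are hypothetical, introduced only for illustration.

```python
import re


def tokenize(utterance):
    """Return the lower-cased word tokens of an utterance (punctuation dropped)."""
    return re.findall(r"\w+", utterance.lower())


def variation_sets(utterances, max_gap=2, min_shared=2):
    """Greedily chain utterances into candidate variation sets.

    An utterance is added to a chain if it shares at least `min_shared`
    word types with the chain's most recent member (criterion b) and at
    most `max_gap` utterances lie between the two (criterion a).
    Returns a list of chains, each a list of utterance indices.
    """
    tokens = [set(tokenize(u)) for u in utterances]
    used = set()      # indices already assigned to some chain
    chains = []
    for start in range(len(tokens)):
        if start in used:
            continue
        chain = [start]
        last = start
        j = last + 1
        # Look ahead while the gap to the last chain member is small enough.
        while j < len(tokens) and j - last - 1 <= max_gap:
            if j not in used and len(tokens[j] & tokens[last]) >= min_shared:
                chain.append(j)
                last = j
            j += 1
        if len(chain) > 1:   # a single utterance is not a variation set
            used.update(chain)
            chains.append(chain)
    return chains


example = [
    "Var kan Kucka vara då?",
    "Var är Kucka?",
    "Var är kaninen som heter Kucka?",
]
print(variation_sets(example))
```

    Applied to the three example utterances quoted in the abstract, the sketch groups all three into one candidate set, since each shares at least "var" and "kucka" with its predecessor. Varying `max_gap` corresponds to experimenting with different values for criterion a).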

  • 50.
    Östling, Robert
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Börstell, Carl
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Gärdenfors, Moa
    Stockholm University, Faculty of Humanities, Department of Linguistics, Sign Language.
    Wirén, Mats
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Universal Dependencies for Swedish Sign Language. 2017. In: Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa / [ed] Jörg Tiedemann, Linköping: Linköping University Electronic Press, 2017, p. 303-308. Conference paper (Refereed)
    Abstract [en]

    We describe the first effort to annotate a signed language with syntactic dependency structure: the Swedish Sign Language portion of the Universal Dependencies treebanks. The visual modality presents some unique challenges in analysis and annotation, such as the possibility of both hands articulating separate signs simultaneously, which has implications for the concept of projectivity in dependency grammars. Our data is sourced from the Swedish Sign Language Corpus, and if used in conjunction these resources contain very richly annotated data: dependency structure and parts of speech, video recordings, signer metadata, and since the whole material is also translated into Swedish the corpus is also a parallel text.
