1 - 100 of 308 hits
  • 1.
    Laskowski, Kornel
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics. Voci Technologies, Inc., USA.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    A Scalable Method for Quantifying the Role of Pitch in Conversational Turn-Taking (2019). In: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue: Proceedings of the Conference, Association for Computational Linguistics, 2019, p. 284-292. Conference paper (Refereed)
    Abstract [en]

    Pitch has long been held as an important signalling channel when planning and deploying speech in conversation, and myriad studies have been undertaken to determine the extent to which it actually plays this role. Unfortunately, these studies have required considerable human investment in data preparation and analysis, and have therefore often been limited to a handful of specific conversational contexts. The current article proposes a framework which addresses these limitations, by enabling a scalable, quantitative characterization of the role of pitch throughout an entire conversation, requiring only the raw signal and speech activity references. The framework is evaluated on the Switchboard dialogue corpus. Experiments indicate that pitch trajectories of both parties are predictive of their incipient speech activity; that pitch should be expressed on a logarithmic scale and Z-normalized, as well as accompanied by a binary voicing variable; and that only the most recent 400 ms of the pitch trajectory are useful in incipient speech activity prediction.
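    The pitch representation the paper argues for (log-scaled, Z-normalized per speaker, accompanied by a binary voicing variable, and truncated to the most recent 400 ms) can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code; the function name and the 100 Hz frame rate are assumptions.

```python
import numpy as np

FRAME_RATE_HZ = 100  # assumed 10 ms frames; the abstract does not specify a rate

def pitch_features(f0_hz, frame_rate_hz=FRAME_RATE_HZ, window_ms=400):
    """Hypothetical sketch: transform a raw F0 track (Hz; 0 where unvoiced)
    into the representation the paper recommends: log-scaled, Z-normalized
    per speaker, plus a binary voicing variable, keeping only the most
    recent `window_ms` of context."""
    f0 = np.asarray(f0_hz, dtype=float)
    voiced = np.isfinite(f0) & (f0 > 0)               # binary voicing variable
    log_f0 = np.where(voiced, np.log(np.where(voiced, f0, 1.0)), np.nan)
    mu = np.nanmean(log_f0)                           # per-speaker statistics
    sigma = np.nanstd(log_f0)
    z = (log_f0 - mu) / sigma                         # Z-normalized log pitch
    z = np.nan_to_num(z, nan=0.0)                     # unvoiced frames -> 0
    n = int(window_ms / 1000 * frame_rate_hz)         # 400 ms -> 40 frames
    return z[-n:], voiced[-n:].astype(float)

z, v = pitch_features([0, 110, 120, 0, 130] * 10)
```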

  • 2. Sundberg, Johan
    et al.
    Salomão, Gláucia Laís
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Scherer, Klaus R.
    Analyzing Emotion Expression in Singing via Flow Glottograms, Long-Term-Average Spectra, and Expert Listener Evaluation (2019). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed)
    Abstract [en]

    Background

    Acoustic aspects of emotional expressivity in speech have been analyzed extensively during recent decades. Emotional coloring is an important, if not the most important, property of sung performance, and is therefore strictly controlled. Hence, emotional expressivity in singing may offer deeper insight into the vocal signaling of emotions. Furthermore, physiological voice source parameters can be assumed to facilitate the understanding of the acoustic characteristics.

    Method

    Three highly experienced professional male singers sang scales on the vowel /ae/ or /a/ in 10 emotional colors (Neutral, Sadness, Tender, Calm, Joy, Contempt, Fear, Pride, Love, Arousal, and Anger). Sixteen voice experts classified the scales in a forced-choice listening test, and the result was compared with long-term-average spectrum (LTAS) parameters and with voice source parameters, derived from flow glottograms (FLOGG) that were obtained from inverse filtering the audio signal.

    Results

    On the basis of component analysis, the emotions could be grouped into four “families”: Anger-Contempt, Joy-Love-Pride, Calm-Tender-Neutral, and Sad-Fear. Recognition of the intended emotion families by listeners reached accuracy levels far beyond chance. Vocal loudness had a paramount influence on all LTAS and FLOGG parameters. Even after partialing out this factor, some significant correlations were found between FLOGG and LTAS parameters. These parameters could be sorted into groups associated with the emotion families.

    Conclusions

    (i) Both LTAS and FLOGG parameters varied significantly with the enactment intentions of the singers. (ii) Some aspects of the voice source are reflected in LTAS parameters. (iii) LTAS parameters affect listener judgment of the enacted emotions and the accuracy of the intended emotional coloring.

  • 3. Suni, Antti
    et al.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Vainio, Martti
    Šimko, Juraj
    Comparative Analysis of Prosodic Characteristics Using WaveNet Embeddings (2019). In: Proceedings of Interspeech 2019 / [ed] Gernot Kubin, Zdravko Kačič, The International Speech Communication Association (ISCA), 2019, p. 2538-2542. Conference paper (Refereed)
    Abstract [en]

    We present a methodology for assessing similarities and differences between language varieties and dialects in terms of prosodic characteristics. A multi-speaker, multi-dialect WaveNet network is trained on a low sample-rate signal retaining only the prosodic characteristics of the original speech. The network is conditioned on labels related to the speakers’ region or dialect. The resulting conditioning embeddings are subsequently used as multi-dimensional characterizations of different language varieties, with results consistent with dialectological studies. The method and results are illustrated on the Swedia 2000 corpus of Swedish dialectal variation.

  • 4.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Carlsson, Denise
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Does lung volume size affect respiratory rate and utterance duration? (2019). In: Proceedings from Fonetik 2019, 2019, p. 97-102. Conference paper (Other academic)
    Abstract [en]

    This study explored whether lung volume size affects respiratory rate and utterance duration. The lung capacity of four women and four men was estimated with a digital spirometer. These subjects subsequently read a nonsense text aloud while their respiratory movements were registered with a Respiratory Inductance Plethysmography (RIP) system. Utterance durations were measured from the speech recordings, and respiratory cycle durations and respiratory rates were measured from the RIP recordings. This experiment did not show any relationship between lung volume size and respiratory rate or utterance duration.

  • 5. Lieberman, Marion
    et al.
    Lohmander, Anette
    Gustavsson, Lisa
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Parents’ contingent responses in communication with 10-month-old children in a clinical group with typical or late babbling (2019). In: Clinical Linguistics & Phonetics, ISSN 0269-9206, E-ISSN 1464-5076. Article in journal (Refereed)
    Abstract [en]

    Parental responsive behaviour in communication has a positive effect on child speech and language development. Absence of canonical babbling (CB) in 10-month-old infants is considered a risk factor for developmental difficulties, yet little is known about parental responsiveness in this group of children. The purpose of the current study was to examine the proportion and type of parental responsive utterances after CB and vocalization utterances, respectively, in a clinical group of children with otitis media with effusion, with or without cleft palate. Audio-video recordings of interactions in free play situations with 22 parents and their 10-month-old infants were used, where 15 infants had reached the CB stage and 7 infants had not. Fifty consecutive child utterances were annotated and categorized as vocalization utterances or CB utterances. The parent’s following contingent response was annotated and labelled as acknowledgements, follow-in comments, imitations/expansions or directives. The average intra-judge agreement was 90%, and the average inter-judge agreement was 84%. There was no significant difference in the proportion of contingent responses after vocalizations and CB, neither when considering all child utterances nor when taking the child’s babbling stage into account. However, imitations/expansions tended to be more common after CB in the typical babbling group, whereas acknowledgements were more common after CB in the late babbling group. Our findings imply that responsiveness is a supportive strategy that is not fully used by parents of children with late babbling. Implications for further research as well as parent-directed intervention for children in clinical groups with late babbling are suggested.

  • 6.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Riad, Tomas
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    Ylinen, Sari
    Prosodically controlled derivations in the mental lexicon (2019). In: Journal of Neurolinguistics, ISSN 0911-6044, E-ISSN 1873-8052, Vol. 52, article id 100856. Article in journal (Refereed)
    Abstract [en]

    Swedish morphemes are classified as prosodically specified or prosodically unspecified, depending on lexical or phonological stress, respectively. Here, we investigate the allomorphy of the suffix -(i)sk, which indicates the distinction between lexical and phonological stress; if attached to a lexically stressed morpheme, it takes a non-syllabic form (-sk), whereas if attached to a phonologically stressed morpheme, an epenthetic vowel is inserted (-isk). Using mismatch negativity (MMN), we explored the neural processing of this allomorphy across lexically stressed and phonologically stressed morphemes. In an oddball paradigm, participants were occasionally presented with congruent and incongruent derivations, created by the suffix -(i)sk, within the repetitive presentation of their monomorphemic stems. The results indicated that the congruent derivation of the lexically stressed stem elicited a larger MMN than the incongruent sequences of the same stem and the derivational suffix, whereas after the phonologically stressed stem a non-significant tendency towards an opposite pattern was observed. We argue that the significant MMN response to the congruent derivation in the lexical stress condition is in line with lexical MMN, indicating a holistic processing of the sequence of lexically stressed stem and derivational suffix. The enhanced MMN response to the incongruent derivation in the phonological stress condition, on the other hand, is suggested to reflect combinatorial processing of the sequence of phonologically stressed stem and derivational suffix. These findings bring a new aspect to the dual-system approach to neural processing of morphologically complex words, namely the specification of word stress.

  • 7.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    RespInPeace: Toolkit for processing respiratory belt data (2019). In: Proceedings of Fonetik 2019, 2019, p. 115-118. Conference paper (Other academic)
    Abstract [en]

    RespInPeace is a Python toolkit for processing respiratory data collected using Respiratory Inductance Plethysmography (RIP). It provides methods for signal normalisation, calibration, parametrisation as well as for detection of respiratory events, such as inhalations, exhalations and breath holds. The paper gives a short overview of the most important functions of the program.
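    The kind of respiratory event detection such a toolkit performs can be illustrated with a self-contained sketch that locates inhalation and exhalation onsets from slope sign changes in a belt signal. The function below is hypothetical and does not reproduce RespInPeace's actual API.

```python
import numpy as np

def respiratory_events(signal):
    """Hypothetical illustration (not the RespInPeace API): locate
    inhalation onsets (local minima) and exhalation onsets (local
    maxima) in a respiratory belt signal via slope sign changes."""
    s = np.asarray(signal, dtype=float)
    d = np.sign(np.diff(s))
    turns = np.diff(d)                         # +2 at minima, -2 at maxima
    inhal_onsets = np.where(turns > 0)[0] + 1  # lung volume starts rising
    exhal_onsets = np.where(turns < 0)[0] + 1  # lung volume starts falling
    return inhal_onsets, exhal_onsets

# One breath cycle: volume rises (inhalation), then falls (exhalation)
t = np.linspace(0, 2 * np.pi, 200)
lung_volume = np.sin(t)
inh, exh = respiratory_events(lung_volume)
```

A real pipeline would first smooth and calibrate the signal, as the abstract notes; the sketch keeps only the event-detection step.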

  • 8.
    Wikse Barrow, Carla
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics. Karolinska Institutet, Sweden.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Strömbergsson, Sofia
    Subjective ratings of age-of-acquisition: exploring issues of validity and rater reliability (2019). In: Journal of Child Language, ISSN 0305-0009, E-ISSN 1469-7602, Vol. 46, no 2, p. 199-213. Article in journal (Refereed)
    Abstract [en]

    This study aimed to investigate concerns of validity and reliability in subjective ratings of age-of-acquisition (AoA), through exploring characteristics of the individual rater. An additional aim was to validate the obtained AoA ratings against two corpora – one of child speech and one of adult speech – specifically exploring whether words over-represented in the child-speech corpus are rated with lower AoA than words characteristic of the adult-speech corpus. The results show that less than one-third of participating informants’ ratings are valid and reliable. However, individuals with high familiarity with preschool-aged children provide more valid and reliable ratings, compared to individuals who do not work with or have children of their own. The results further show a significant, age-adjacent difference in rated AoA for words from the two different corpora, thus strengthening their validity. The study provides AoA data, of high specificity, for 100 child-specific and 100 adult-specific Swedish words.

  • 9.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Rudner, Mary
    Magnusson, Anna
    The role of affective and linguistic prosody in the cognitive emotional appraisal of language (2019). In: Abstract book: Fifth International Conference on Cognitive Hearing Science for Communication, 2019, p. 174-174, article id 60. Conference paper (Refereed)
    Abstract [en]

    Prosody offers a unified expression domain for affective and linguistic communication. Affective prosody (e.g., anger vocalization) reflects pre-cognitive processes, whereas linguistic prosody (e.g., lexical tone) is an acquired cognitive skill. In the present study, we explored the interplay between subcortical affective prosody and cortical linguistic cues during emotional appraisal of speech using stereotyped electroencephalography (EEG) responses. We hypothesized that concurrent affective and linguistic prosody with the same valence will evoke a late positive frontal response, reflecting emotional appraisal supported by complex cognitive processing in frontal cortical areas. Using an auditory oddball paradigm, neural responses to a spoken pair of Swedish words that differed in emotional content due to linguistic prosody were investigated as pronounced with an angry and a neutral voice. The results indicate that when co-occurring, affective and linguistic prosody with the same valence elicit a unique late positive response in the frontal region that is distinct from the neural responses of affective and linguistic prosody alone. This study provides experimental evidence that both affective and linguistic prosody contribute synergistically to the cognitive emotional appraisal of language, and highlights the significance of pre-cognitive affective prosody in language processing, having important implications for both language learning and learning through language.

  • 10.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Beňuš, Štefan
    Gravano, Agustín
    Voice Quality as a Turn-Taking Cue (2019). In: Proceedings of Interspeech 2019 / [ed] Gernot Kubin, Zdravko Kačič, The International Speech Communication Association (ISCA), 2019, p. 4165-4169. Conference paper (Refereed)
    Abstract [en]

    This work revisits the idea that voice quality dynamics (VQ) contribute to conveying pragmatic distinctions, with two case studies to further test this idea. First, we explore VQ as a turn-taking cue, and then as a cue for distinguishing between different functions of affirmative cue words. We employ acoustic VQ measures claimed to be better suited for continuous speech than those in our own previous work. Both cases indicate that the degree of periodicity (as measured by CPPS) is indeed relevant in the production of the different pragmatic functions. In particular, turn-yielding is characterized by lower periodicity, sometimes accompanied by the presence of creaky voice. Periodicity also distinguishes between backchannels, agreements and acknowledgements.

  • 11.
    Gerholm, Tove
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Hörberg, Thomas
    Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics. Stockholm University, Faculty of Social Sciences, Department of Psychology, Perception and psychophysics.
    Tonér, Signe
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Kallioinen, Petter
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Frankenberg, Sofia
    Stockholm University, Faculty of Social Sciences, Department of Child and Youth Studies.
    Kjällander, Susanne
    Stockholm University, Faculty of Social Sciences, Department of Child and Youth Studies.
    Palmer, Anna
    Stockholm University, Faculty of Social Sciences, Department of Child and Youth Studies.
    Lenz Taguchi, Hillevi
    Stockholm University, Faculty of Social Sciences, Department of Child and Youth Studies.
    A protocol for a three-arm cluster randomized controlled superiority trial investigating the effects of two pedagogical methodologies in Swedish preschool settings on language and communication, executive functions, auditive selective attention, socioemotional skills and early maths skills (2018). In: BMC Psychology, E-ISSN 2050-7283, Vol. 6, article id 29. Article in journal (Refereed)
    Abstract [en]

    Background

    During the preschool years, children develop abilities and skills in areas crucial for later success in life. These abilities include language, executive functions, attention, and socioemotional skills. The pedagogical methods used in preschools hold the potential to enhance these abilities, but our knowledge of which pedagogical practices aid which abilities, and for which children, is limited. The aim of this paper is to describe an intervention study designed to evaluate and compare two pedagogical methodologies in terms of their effect on the above-mentioned skills in Swedish preschool children.

    Method

    The study is a randomized controlled trial (RCT) where two pedagogical methodologies were tested to evaluate how they enhanced children’s language, executive functions and attention, socioemotional skills, and early maths skills during an intensive 6-week intervention. Eighteen preschools including 28 units and 432 children were enrolled in a municipality close to Stockholm, Sweden. The children were between 4;0 and 6;0 years old and each preschool unit was randomly assigned to either of the interventions or to the control group. Background information on all children was collected via questionnaires completed by parents and preschools. Pre- and post-intervention testing consisted of a test battery including tests on language, executive functions, selective auditive attention, socioemotional skills and early maths skills. The interventions consisted of 6 weeks of intensive practice of either a socioemotional and material learning paradigm (SEMLA), for which group-based activities and interactional structures were the main focus, or an individual, digitally implemented attention and math training paradigm, which also included a set of self-regulation practices (DIL). All preschools were evaluated with the ECERS-3.

    Discussion

    If this intervention study shows evidence of a difference between group-based learning paradigms and individual training of specific skills in terms of enhancing children’s abilities in fundamental areas like language, executive functions and attention, socioemotional skills and early math, this will have a big impact on the preschool agenda in the future. The potential for different pedagogical methodologies to have different impacts on children of different ages and with different backgrounds invites a wider discussion within the field of how to develop a preschool curriculum suited for all children.

  • 12.
    Cortes, Elisabet Eir
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Šimko, Juraj
    Articulatory Consequences of Vocal Effort Elicitation Method (2018). In: Proceedings of Interspeech 2018 / [ed] B. Yegnanarayana, The International Speech Communication Association (ISCA), 2018, p. 1521-1525. Conference paper (Refereed)
    Abstract [en]

    Articulatory features from two datasets, Slovak and Swedish, were compared to see whether different methods of eliciting loud speech (ambient noise vs. visually presented loudness target) result in different articulatory behavior. The features studied were temporal and kinematic characteristics of lip separation within the closing and opening gestures of bilabial consonants, and of the tongue body movement from /i/ to /a/ through a bilabial consonant. The results indicate larger hyperarticulation in the speech elicited with the visually presented target. While individual articulatory strategies are evident, the speaker groups agree on increasing the kinematic features consistently within each gesture in response to the increased vocal effort. Another concerted strategy is keeping the tongue response considerably smaller than that of the lips, presumably to preserve the acoustic prerequisites necessary for adequate vowel identity. While the method of visually presented loudness target elicits a larger span of vocal effort, the two elicitation methods achieve comparable consistency per loudness condition.

  • 13.
    Aare, Kätlin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Lippus, Pärtel
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Creak in the respiratory cycle (2018). In: Proceedings of Interspeech 2018 / [ed] B. Yegnanarayana, The International Speech Communication Association (ISCA), 2018, p. 1408-1412. Conference paper (Refereed)
    Abstract [en]

    Creakiness is a well-known turn-taking cue and has been observed to systematically accompany phrase and turn ends in several languages. In Estonian, creaky voice is frequently used by all speakers without any obvious evidence for its systematic use as a turn-taking cue. Rather, it signals a lack of prominence and is favored by lengthening and later timing in phrases. In this paper, we analyze the occurrence of creak with respect to properties of the respiratory cycle. We show that creak is more likely to accompany longer exhalations. Furthermore, the results suggest there is little difference in lung volume values regardless of the presence of creak, indicating that creaky voice might be employed to preserve air over the course of longer utterances. We discuss the results in connection to processes of speech planning in spontaneous speech.

  • 14.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wagner, Petra
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Deep throat as a source of information (2018). In: Proceedings Fonetik 2018 / [ed] Åsa Abelin, Yasuko Nagano-Madsen, Gothenburg: University of Gothenburg, 2018, p. 33-38. Conference paper (Other academic)
    Abstract [en]

    In this pilot study we explore the signal from an accelerometer placed on the tracheal wall (below the glottis) for obtaining robust voice quality estimates. We investigate smoothed cepstral peak prominence (CPPS), H1-H2 and alpha ratio for distinguishing between breathy, modal and pressed phonation across six (sustained) vowel qualities produced by four speakers, including a systematic variation of pitch. We show that throat signal spectra are unaffected by vocal tract resonances, F0 and speaker variation while retaining sensitivity to voice quality dynamics. We conclude that the throat signal is a promising tool for studying communicative functions of voice prosody in speech communication.

  • 15.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Exhalatory turn-taking cues (2018). In: Proceedings 9th International Conference on Speech Prosody 2018 / [ed] Katarzyna Klessa, Jolanta Bachan, Agnieszka Wagner, Maciej Karpiński, Daniel Śledziński, Poznań, Poland: The International Speech Communication Association (ISCA), 2018, p. 334-338. Conference paper (Refereed)
    Abstract [en]

    The paper is a study of kinematic features of the exhalation which signal that the speaker is done speaking and wants to yield the turn. We demonstrate that the single most prominent feature is the presence of an inhalation directly following the exhalation. However, several features of the exhalation itself are also found to significantly distinguish between turn holds and yields, such as a slower exhalation rate and a higher lung level at exhalation onset. The results complement the existing body of evidence on respiratory turn-taking cues, which has so far involved mainly inhalatory features. We also show that respiration allows discovering pause interruptions, thus giving access to unrealised turn-taking intentions.

  • 16.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Magnusson, Anna
    Rudner, Mary
    MMN signatures of symbolic and affective prosody (2018). Conference paper (Other academic)
  • 17.
    Zora, Hatice
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Neural correlates of symbolic and affective prosody (2018). Conference paper (Other academic)
  • 18.
    Schwarz, Iris-Corinna
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Clausnitzer, Ann-Christin
    Marklund, Ulrika
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Phonetic correlates of perceived affect in mothers’ and fathers’ speech to Swedish 12-month-olds (2018). In: Abstract Book: Day 1, Sunday, July 1st, 2018, p. 262-263. Conference paper (Refereed)
    Abstract [en]

    Infants prefer to listen to infant-directed speech (IDS) over adult-directed speech (ADS). IDS contains a greater amount of affect than ADS (Singh, Morgan & Best, 2002). Affect in infant-directed speech has been said to foster social bonds, maintain attention and teach language. In order to identify phonetic correlates of affect, prosodic features such as fundamental frequency, pitch range, pitch contour, vowel duration and rhythm have been tried (Katz, Cohn & Moore, 1996; Trainor, Austin & Desjardins, 2000). However, affect ratings are typically carried out on low-pass filtered speech in order to obscure semantic cues to affect. It is possible that more than semantic meaning is distorted by the filtering process. In the present study, acoustic-phonetic correlates of affect were studied in unfiltered short speech segments. One-syllable speech segments were rated on a scale ranging from highly negative via neutral to highly positive affect. Formant (F1, F2, F3), pitch (mean, maximum, minimum, range, contour), and vowel duration measures were obtained from the speech samples, and relations between acoustic measures and rated affect were analyzed. The speech samples were the syllables /mo/, /na/, and /li/ produced by Swedish mothers (n = 29) and fathers (n = 21) when talking to their 12-month-old children. Recordings of IDS took place during free play in a laboratory setting, and the syllables were the names of soft toys that the parents were asked to use when interacting with their child. Parents and children participated in a longitudinal interaction study, and this was their fourth visit to the laboratory, so they were familiar with the task, setting and toys. ADS exemplars of the syllables were also selected from a sub-sample of the mothers (n = 14), recorded at their first visit to the laboratory.
Participants in the perceptual rating experiment (n = 35; 21 female; mean age = 28.6 years; age range = 19-45 years) were presented with one syllable at a time and asked to rate the affect conveyed on a scale from -4 (high negative affect) to +4 (high positive affect), with 0 as midpoint (neutral affect). The experiment was self-paced, and participants could listen to each syllable as many times as they liked. Each experiment session lasted between 30 and 50 minutes. A mixed-effects model was designed with AffectRating as dependent variable, Rater as random effects variable, and RaterGender, RaterHasChildren, F1, F2, F3, MeanPitch, PitchRange as well as VowelDuration as fixed effects variables. Minimum pitch, maximum pitch and pitch contour were excluded from the analysis since they were correlated with pitch range. Significant results were found for F1, F3, MeanPitch, PitchRange and VowelDuration. Higher F1 and/or F3 resulted in more negative perceived affect whereas higher mean pitch, greater pitch range, and/or longer vowel duration resulted in more positive perceived affect. The relation between perceived affect and formant values could be related to differences in perceived affect for different vowels, rather than variations in the formant values per se. It would be interesting to look at variation within separate vowel categories. The relation between positive affect and prosodic exaggerations suggests that some acoustic characteristics of IDS could be a result of parents conveying positive affect to their children.
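    The collinearity screening described above (excluding minimum pitch, maximum pitch and pitch contour because they correlate with pitch range) can be sketched as follows. The function, the variable names and the 0.7 threshold are illustrative assumptions, not the authors' analysis code.

```python
import numpy as np

def drop_collinear(predictors, keep, threshold=0.7):
    """Illustrative sketch (threshold is an assumption): discard any
    predictor whose Pearson correlation with the kept predictor
    exceeds `threshold`, as the study did for min/max pitch vs. range."""
    kept = {keep: predictors[keep]}
    for name, values in predictors.items():
        if name == keep:
            continue
        r = np.corrcoef(predictors[keep], values)[0, 1]
        if abs(r) < threshold:
            kept[name] = values
    return kept

# Synthetic data: max pitch = min pitch + range, so it is collinear with range
rng = np.random.default_rng(0)
pitch_min = rng.normal(180, 20, 500)
pitch_range = rng.normal(120, 30, 500)
pitch_max = pitch_min + pitch_range
duration = rng.normal(150, 25, 500)
kept = drop_collinear(
    {"PitchRange": pitch_range, "MaxPitch": pitch_max,
     "MinPitch": pitch_min, "VowelDuration": duration},
    keep="PitchRange",
)
```

On this synthetic data, MaxPitch is dropped while the independent predictors survive, mirroring the exclusion step the abstract reports before fitting the mixed-effects model.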

  • 19.
    Schwarz, Iris-Corinna
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Lam-Cassettari, Christa
    Marklund, Ulrika
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Positive affect in Swedish and Australian mothers’ speech to their 3- to 12-month-old infants (2018). Conference paper (Refereed)
    Abstract [en]

    Affect is an important feature of infant-directed speech (IDS). IDS towards infants during the first year of life varies in degree of affect. In Australian English (AuE), positive affect in mothers’ IDS increases over age from birth to twelve months, with a dip at nine months (Kitamura & Burnham, 2003).

    This study investigates whether affect in Swedish (Swe) mothers’ IDS towards their infants follows a developmental pattern similar to that in the Australian English data. It also introduces a cross-linguistic perspective on affect perception in IDS, as Swedish native speakers rate both the Swe and AuE IDS samples.

    The adult raters (N=16; 8 female, mean age 36.4 years; SD = 10.1) assessed affect polarity and affect degree in low-pass filtered IDS samples on a scale from -4 to +4 (highly negative to highly positive). The 25 s long samples were cut from interactions between mothers and their infants at three, six, nine and twelve months and low-pass filtered. The Australian material was sampled from the same dataset as used in Kitamura and Burnham (2003); the Swedish material was recorded at Stockholm Babylab (Gerholm et al., 2015).

    Separate repeated measures ANOVAs were conducted on the mean affect ratings of AuE and Swe IDS, with infant age as within-subject factor, followed up with polynomial contrasts. For AuE IDS, a significant main effect was found for age (F(45,3)=10.356; p<.001), with a linear (F(15,1)=20.542; p<.001) and a cubic trend (F(15,1)=7.780; p=.014). For Swe IDS, a significant main effect was found for age (F(45,3)=4.186; p=.011), with a linear (F(15,1)=10.993; p=.005) and a quadratic trend (F(15,1)=6.124; p=.026). In both languages, positive affect decreases over age.
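The polynomial-contrast follow-up reported above amounts to weighting the per-age group means with orthogonal contrast vectors. A minimal sketch with made-up mean ratings (not the study's values) for the four ages:

```python
import numpy as np

# Made-up mean affect ratings at 3, 6, 9 and 12 months, mimicking the
# Swedish pattern: a decrease followed by a late recovery.
means = np.array([2.0, 1.4, 0.9, 1.2])

# Orthogonal polynomial contrast weights for four equally spaced levels.
linear = np.array([-3.0, -1.0, 1.0, 3.0])
quadratic = np.array([1.0, -1.0, -1.0, 1.0])

lin_c = linear @ means       # negative -> overall decline over age
quad_c = quadratic @ means   # positive -> dip-and-recovery (U) shape
print(lin_c, quad_c)
```

For these invented means the linear contrast is negative and the quadratic contrast positive, i.e. the same qualitative combination (linear plus quadratic trend) reported for the Swedish data.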

    Cross-linguistic affect perception of AuE IDS remains broadly similar to the original findings, although Kitamura and Burnham’s data show a more pronounced cubic trend and a general increase of affect in IDS over the first year. In this study, affect development in AuE IDS shows a steep increase from three to six months, followed by a decrease from six to nine months and a slight recovery from nine to twelve months. Affect in Swe IDS follows a different developmental trajectory: it decreases from three to nine months and recovers with an increase from nine to twelve months. This is a first indication of language-specific differences in IDS affect over the first year. Future ratings of the same material by AuE native speakers will show whether the difference in the AuE results is an effect of rater language.

  • 20.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Riad, Tomas
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Ylinen, Sari
    Prosodically controlled suffix alternation in the mental lexicon, 2018, Conference paper (Refereed)
  • 21.
    Traunmüller, Hartmut
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Towards a More Well-Founded Cosmology, 2018, In: Zeitschrift für Naturforschung A - A Journal of Physical Sciences, ISSN 0932-0784, E-ISSN 1865-7109, Vol. 73, no 11, p. 1005-1023, Article in journal (Refereed)
    Abstract [en]

    First, this paper broaches the definition of science and the epistemic yield of tenets and approaches: phenomenological (descriptive only), well founded (solid first principles, conducive to deep understanding), provisional (falsifiable if universal, verifiable if existential), and imaginary (fictitious entities or processes, conducive to empirically unsupported beliefs). The Big Bang paradigm and the ΛCDM 'concordance model' involve such beliefs: the emanation of the universe out of a non-physical stage, cosmic inflation (hardly testable), Λ (fictitious energy), and 'exotic' dark matter. They fail in the confidence check that empirical science requires. They also face a problem in delimiting what expands from what does not. In the more well-founded cosmology that emerges, energy is conserved, the universe is persistent (not transient), and the 'perfect cosmological principle' holds. Waves and other field perturbations that propagate at c (the escape velocity of the universe) expand exponentially with distance. This results from gravitation. The galaxy web does not expand. Potential Φ varies as -H/(cz) instead of -1/r. Inertial forces reflect gradients present in comoving frames of accelerated bodies (interaction with the rest of the universe - not with space). They are increased where the universe appears blue-shifted and decreased more than proportionately at very low accelerations. A cut-off acceleration a0 = 0.168 cH is deduced. This explains the successful description of galaxy rotation curves by "Modified Newtonian Dynamics". A fully elaborated physical theory is still pending. The recycling of energy via a cosmic ocean filled with photons (the cosmic microwave background), neutrinos and gravitons, and the wider implications for science are briefly discussed.

  • 22.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Lacerda, Francisco
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Using rotated speech to approximate the acoustic mismatch negativity response to speech, 2018, In: Brain and Language, ISSN 0093-934X, E-ISSN 1090-2155, Vol. 176, p. 26-35, Article in journal (Refereed)
    Abstract [en]

    The mismatch negativity (MMN) response is influenced by the magnitude of the acoustic difference between standard and deviant, and the response is typically larger to linguistically relevant changes than to linguistically irrelevant changes. Linguistically relevant changes between standard and deviant typically co-occur with differences between the two acoustic signals. It is therefore not straightforward to determine the contribution of each of those two factors to the MMN response. This study investigated whether spectrally rotated speech can be used to determine the impact of the acoustic difference on the MMN response to a combined linguistic and acoustic change between standard and deviant. Changes between rotated vowels elicited an MMN of comparable amplitude to the one elicited by a within-category vowel change, whereas the between-category vowel change resulted in an MMN amplitude of greater magnitude. A change between rotated vowels resulted in an MMN amplitude more similar to that of a within-vowel change than a complex tone change did. This suggests that the MMN amplitude reflecting the acoustic difference between two speech sounds can be well approximated by the MMN amplitude elicited in response to their rotated counterparts, in turn making it possible to estimate the part of the response specific to the linguistic difference.

  • 23. Sundberg, Johan
    et al.
    Salomão, Gláucia Laís
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Scherer, Klaus R.
    What does LTAS tell about the voice source?, 2018, In: 47th Annual Symposium: Care of the Professional Voice: Program Abstracts, 2018, p. 15-15, Conference paper (Refereed)
    Abstract [en]

    Objective: The long-term-average spectrum (LTAS) has been used extensively in voice research. It provides an overall measure of voice characteristics from which a large number of parameters can be derived. A minimalistic set of parameters has been identified which captures the most essential properties [Eyben et al., 2015; 2016; Scherer et al., 2017]. LTAS analysis is typically applied to audio signals of running speech or continuous singing. It reflects the combination of formant frequency and voice source characteristics. Often, e.g. in clinical settings, it is relevant to distinguish between these two sources. Voice source analysis can be performed by means of inverse filtering. The aim of the present work was to analyse the relationships between LTAS and voice source properties.

    Method: Three internationally touring male singers sang scales in eleven different emotional colours. This material was analysed by inverse filtering as well as in terms of LTAS. The correlations between the averages across the scale tones of the flow glottogram parameters and minimalistic set of LTAS parameters were analysed.

    Results/Conclusions: A strong negative correlation was found between spectral slope and the flow glottogram’s maximum flow declination rate (MFDR), and a strong positive correlation between the proportion of spectral energy below 1000 Hz and H1-H2. Somewhat surprisingly, a strong negative correlation was found between equivalent sound level and the normalized and un-normalized amplitude quotients (the ratio between the AC peak-to-peak amplitude of the flow glottogram and MFDR). Thus, these LTAS parameters seem particularly informative with respect to voice source characteristics.

  • 24. Ćwiek, Aleksandra
    et al.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wagner, Petra
    Acoustics and discourse function of two types of breathing signals, 2017, In: Nordic Prosody: Proceedings of the XIIth Conference, Trondheim 2016 / [ed] Jardar Eggesbö Abrahamsen, Jacques Koreman, Wim van Dommelen, Peter Lang Publishing Group, 2017, p. 83-91, Conference paper (Refereed)
    Abstract [en]

    Breathing is fundamental to life and speech, and it has been a subject of linguistic research for years. Recently, there has been renewed interest in the question of possible communicative functions of breathing (e.g. Rochet-Capellan & Fuchs, 2014; Aare, Włodarczak & Heldner, 2014; Włodarczak & Heldner, 2015; Włodarczak, Heldner, & Edlund, 2015). The present study set out to determine the acoustic markedness and communicative functions of pauses accompanied and not accompanied by breathing. We hypothesised that an articulatory reset occurring in breathing pauses and an articulatory freeze in non-breathing pauses differentiate the two types. A production experiment was conducted and some evidence in favour of such a phenomenon was found. Namely, in the case of non-breathing pauses, we observed more coarticulation, evidenced by a more frequent omission of plosive releases. Our findings thus give some evidence in favour of a communicative function of breathing.

  • 25.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Capturing respiratory sounds with throat microphones, 2017, In: Nordic Prosody: Proceedings of the XIIth Conference, Trondheim 2016 / [ed] Jardar Eggesbö Abrahamsen, Jacques Koreman, Wim van Dommelen, Peter Lang Publishing Group, 2017, p. 181-190, Conference paper (Refereed)
    Abstract [en]

    This paper presents the results of a pilot study using throat microphones for recording respiratory sounds. We demonstrate that inhalation noises are louder before longer stretches of speech than before shorter utterances (< 1 s) and in silent breathing. We thus replicate the results from our earlier study which used close-talking head-mounted microphones, without the associated data loss due to cross-talk. We also show that inhalations are louder within than before a speaking turn. Hence, the study provides another piece of evidence in favour of communicative functions of respiratory noises serving as potential turn-taking (for instance, turn-holding) cues. 

  • 26.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Pagmar, David
    Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics.
    Gerholm, Tove
    Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics.
    Gustavsson, Lisa
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Computational simulations of temporal vocalization behavior in adult-child interaction, 2017, In: Proceedings of Interspeech 2017 / [ed] Francisco Lacerda, David House, Mattias Heldner, Joakim Gustafson, Sofia Strömbergsson, Marcin Włodarczak, The International Speech Communication Association (ISCA), 2017, p. 2208-2212, Conference paper (Refereed)
    Abstract [en]

    The purpose of the present study was to introduce a computational simulation of timing in child-adult interaction. The simulation uses temporal information from real adult-child interactions as the default temporal behavior of two simulated agents. Dependencies between the agents’ behavior are added, and the resulting simulated interactions are compared to real interaction data. In the present study, the real data consisted of transcriptions of a mother interacting with her 12-month-old child, and the simulated data consisted of vocalizations. The first experiment shows that although the two agents generate vocalizations according to the temporal characteristics of the interlocutors in the real data, simulated interaction with no contingencies between the two agents’ behavior differs from real interaction data. In the second experiment, a contingency was introduced to the simulation: the likelihood that the adult agent initiated a vocalization if the child agent was already vocalizing. Overall, the simulated data is more similar to the real interaction data when the adult agent is less likely to start speaking while the child agent vocalizes. The results are in line with previous studies on turn-taking in parent-child interaction at comparable ages. This illustrates that computational simulations are useful tools when investigating parent-child interactions.
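The contingency manipulation described above can be illustrated with a toy simulation. Everything below (rates, durations, the single contingency parameter) is an invented stand-in for the authors' model; it shows only the mechanism of suppressing one agent's vocalization onsets while the other agent is vocalizing.

```python
import random

def simulate(duration=600.0, p_adult_starts_during_child=0.1, seed=1):
    """Toy two-agent vocalization simulation (invented parameters)."""
    rng = random.Random(seed)
    child, adult = [], []  # lists of (onset, offset) tuples, in seconds

    t = 0.0
    # Child agent: exponential silent gaps followed by short vocalizations.
    while t < duration:
        t += rng.expovariate(1 / 5.0)
        voc = rng.uniform(0.5, 2.0)
        child.append((t, t + voc))
        t += voc

    def child_vocalizing(time):
        return any(on <= time < off for on, off in child)

    t = 0.0
    # Adult agent: same scheme, but onsets during child vocalization are
    # usually suppressed -- the contingency parameter.
    while t < duration:
        t += rng.expovariate(1 / 4.0)
        if child_vocalizing(t) and rng.random() > p_adult_starts_during_child:
            continue
        voc = rng.uniform(1.0, 3.0)
        adult.append((t, t + voc))
        t += voc
    return child, adult

child, adult = simulate()
onset_overlaps = sum(1 for on, _ in adult
                     if any(c_on <= on < c_off for c_on, c_off in child))
print(len(child), len(adult), onset_overlaps)
```

Lowering `p_adult_starts_during_child` toward zero removes adult onsets during child vocalizations entirely, which is the direction in which the paper found simulated interactions to best match real data.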

  • 27.
    Schwarz, Iris-Corinna
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Marklund, Ulrika
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Contingency differences in parent-infant turn-taking between primary and secondary caregivers in relation to turn-taking experience, 2017, In: Many Paths to Language (MPaL), 2017, p. 59-60, Conference paper (Refereed)
    Abstract [en]

    Contingent turn-taking between parents and infants is positively correlated with child language outcomes (Tamis-LeMonda, Bornstein & Baumwell, 2001; Marklund, Marklund, Lacerda & Schwarz, 2015). Many studies focus exclusively on mothers (e.g., Sung, Fausto-Sterling, Garcia Coll & Seifer, 2013). However, infants in Western countries acquire language with input from both mothers and fathers in varying degrees, depending on how the family chooses to organize its parental leave. Sweden is an ideal country in which to study both mothers and fathers as caregivers for infants.

    Parental contingency is often reported as response frequency within a time window after infant vocalizations (e.g., Johnson, Caskey, Rand, Tucker & Vohr, 2014). In this study, turn-taking contingency is measured as the duration of parent-child and child-parent switching pauses around infant vocalizations with potential communicative intent. Fourteen infants (7 girls) and their primary and secondary caregivers were recorded in the family home when the infant was six months old (M = 5 months 29 days, range: 5 months 3 days – 6 months 16 days). The audio recordings were collected on two different days and lasted approximately ten minutes each. One of the days was a typical weekday on which the primary caregiver – in all cases the mother – was at home with the infant. The other was a typical weekend day on which the secondary caregiver – in all cases the father – was also at home and spent time with the infant. On each of these days, a daylong LENA recording was also made to estimate the amount of exposure to female and male speech input on a typical day. Using Wavesurfer 1.8.5 (Sjölander & Beskow, 2010), the on- and offsets of all infant vocalizations were tagged, as well as the on- and offsets of the surrounding switching pauses. If a parent utterance and an infant vocalization overlapped, switching pause duration received a negative value.

    Two repeated measures ANOVAs were used to determine the effects of caregiver type (primary/secondary) and infant sex (girl/boy) on pause duration in infant-parent and parent-infant switching pauses. A main effect was found for caregiver type in infant-parent switching pauses (F(12,1) = 5.214; p = .041), as primary caregivers responded on average about 500 ms faster to infant vocalizations than secondary caregivers, with no effect of or interaction with infant sex. In parent-infant switching pauses, the main effect for caregiver type was almost significant (F(12,1) = 4.574; p = .054), with no effect of or interaction with infant sex. It is therefore fair to say that turn-taking between primary caregivers and 6-month-olds is more contingent than turn-taking between secondary caregivers and 6-month-olds.

    Four linear regressions were then used to predict parent-infant and infant-parent switching pause duration from the average duration of female speech exposure and the average duration of male speech exposure across the two days, with the assumption that female speech duration equals speech input from the primary caregiver and male speech duration the secondary caregiver. None of the regression analyses turned out to be significant. However, it is likely that the greater contingency between primary caregivers and the infant is a function of greater turn-taking experience, that is, conversational turns rather than mere exposure to speech. Therefore, we will look next at the number of conversational turns for each caregiver separately and investigate whether they predict parental response contingency.

    The present study shows that vocal turn-taking is more contingent between infants and primary caregivers than with secondary caregivers. Primary caregivers respond significantly faster to infant vocalizations than secondary caregivers and in turn, infants have a tendency to respond faster to primary caregivers. It is likely that this relationship is mediated by turn-taking experience, although this could not be shown with regression analyses using LENA estimates of total duration of speech exposure to primary and secondary caregiver.


  • 28. Šimko, Juraj
    et al.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Suni, Antti
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Vainio, Martti
    Coordination between f0, intensity and breathing signals, 2017, In: Nordic Prosody: Proceedings of the XIIth Conference, Trondheim 2016 / [ed] Jardar Eggesbö Abrahamsen, Jacques Koreman, Wim van Dommelen, Peter Lang Publishing Group, 2017, p. 147-156, Conference paper (Refereed)
    Abstract [en]

    This paper presents preliminary results on the temporal coordination of breathing, intensity and fundamental frequency signals using the continuous wavelet transform. We found tendencies towards phase-locking at time scales corresponding to several prosodic units, such as vowel-to-vowel intervals and prosodic words. The proposed method should be applicable to a wide range of problems in which the goal is to find a stable phase relationship in a pair of hierarchically organised signals.

  • 29. Lam-Cassettari, Christa
    et al.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Daddy counts: Australian and Swedish fathers’ early speech input reflects infants’ receptive vocabulary at 12 months, 2017, Conference paper (Other academic)
    Abstract [en]

    Parental input is known to predict language development. This study uses the LENA input duration estimates for female and male voices in two infant language environments, Australian English and Swedish, to predict receptive vocabulary size at 12 months. The Australian English-learning infants were 6 months old (N = 18, 8 girls); the Swedish-learning infants were 8 months old (N = 12, 6 girls). Their language environment was recorded on two days: one weekday in the primary care of the mother, and one weekend day when the father also spent time with the family. At 12 months, parents filled in a CDI form, the OZI for Australian English and the SECDI‐I for Swedish. In multiple regressions across languages, only male speech input duration significantly predicted vocabulary scores (β = .56, p = .01). Analysing boys and girls separately, male speech input predicted only boys’ vocabulary (β = .79, p = .01). Analysing languages separately for boys, the Australian English results were similar (β = .74, p = .02). Discussed in terms of differences in infant age, sample size, sex distribution and language, these findings nevertheless add to the growing list of benefits of talker variability for early language acquisition.

  • 30.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Laskowski, Kornel
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Aare, Kätlin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Improving Prediction of Speech Activity Using Multi-Participant Respiratory State, 2017, In: Proceedings of Interspeech 2017 / [ed] Francisco Lacerda, David House, Mattias Heldner, Joakim Gustafson, Sofia Strömbergsson, Marcin Włodarczak, Stockholm: The International Speech Communication Association (ISCA), 2017, p. 1666-1670, Conference paper (Refereed)
    Abstract [en]

    One consequence of situated face-to-face conversation is the co-observability of participants’ respiratory movements and sounds. We explore whether this information can be exploited in predicting incipient speech activity. Using a methodology called stochastic turn-taking modeling, we compare the performance of a model trained on speech activity alone to one additionally trained on static and dynamic lung volume features. The methodology permits automatic discovery of temporal dependencies across participants and feature types. Our experiments show that respiratory information substantially lowers cross-entropy rates, and that this generalizes to unseen data.

  • 31.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Cortes, Elísabet Eir
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Sjons, Johan
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    MMN responses in adults after exposure to bimodal and unimodal frequency distributions of rotated speech, 2017, In: Proceedings of Interspeech 2017 / [ed] Francisco Lacerda, David House, Mattias Heldner, Joakim Gustafson, Sofia Strömbergsson, Marcin Włodarczak, The International Speech Communication Association (ISCA), 2017, p. 1804-1808, Conference paper (Refereed)
    Abstract [en]

    The aim of the present study is to further the understanding of the relationship between perceptual categorization and exposure to different frequency distributions of sounds. Previous studies have shown that speech sound discrimination proficiency is influenced by exposure to different distributions of speech sound continua varying along one or several acoustic dimensions, both in adults and in infants. In the current study, adults were presented with either a bimodal or a unimodal frequency distribution of spectrally rotated sounds along a continuum (a vowel continuum before rotation). Categorization of the sounds, quantified as the amplitude of the event-related potential (ERP) component mismatch negativity (MMN) in response to two of the sounds, was measured before and after exposure. It was expected that the bimodal group would have a larger MMN amplitude after exposure whereas the unimodal group would have a smaller MMN amplitude after exposure. Contrary to expectations, the MMN amplitude was smaller overall after exposure, and no difference was found between groups. This suggests that either the previously reported sensitivity to frequency distributions of speech sounds is not present for non-speech sounds, or the MMN amplitude is not a sensitive enough measure of categorization to detect an influence from passive exposure, or both.

  • 32.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    MMR categorization effect at 8 months is related to receptive vocabulary size at 12 to 14 months, 2017, In: Many Paths to Language (MPaL), 2017, p. 91-92, Conference paper (Refereed)
  • 33.
    Wlodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory Constraints in Verbal and Non-verbal Communication, 2017, In: Frontiers in Psychology, ISSN 1664-1078, E-ISSN 1664-1078, Vol. 8, article id 708, Article in journal (Refereed)
    Abstract [en]

    In the present paper we address the old question of respiratory planning in speech production. We recast the problem in terms of speakers' communicative goals and propose that speakers try to minimize respiratory effort, in line with the H&H theory. We analyze respiratory cycles coinciding with no speech (i.e., silence), short verbal feedback expressions (SFEs) and longer vocalizations in terms of parameters of the respiratory cycle, and find little evidence for respiratory planning in feedback production. We also investigate the timing of speech and SFEs in the exhalation and contrast it with nods. We find that while speech is strongly tied to the exhalation onset, SFEs are distributed much more uniformly throughout the exhalation and are often produced on residual air. Given that nods, which do not have any respiratory constraints, tend to be more frequent toward the end of an exhalation, we propose a mechanism whereby respiratory patterns are determined by the trade-off between speakers' communicative goals and respiratory constraints.

  • 34.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Magnusson, Anna
    Rudner, Mary
    The effect of visual deprivation on prosodic processing, 2017, Conference paper (Refereed)
  • 35.
    Schwarz, Iris-Corinna
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Botros, Noor
    Lord, Alekzandra
    Marcusson, Amelie
    Tidelius, Henrik
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    The LENA™ system applied to Swedish: Reliability of the Adult Word Count estimate, 2017, In: Proceedings of Interspeech 2017 / [ed] Francisco Lacerda, David House, Mattias Heldner, Joakim Gustafson, Sofia Strömbergsson, Marcin Włodarczak, The International Speech Communication Association (ISCA), 2017, p. 2088-2092, Conference paper (Refereed)
    Abstract [en]

    The Language Environment Analysis system LENA™ is used to capture day-long recordings of children’s natural audio environment. The system performs automated segmentation of the recordings and provides estimates for various measures. One of those measures is Adult Word Count (AWC), an approximation of the number of words spoken by adults in close proximity to the child. The LENA system was developed for and trained on American English, but it has also been evaluated on its performance when applied to Spanish, Mandarin and French. The present study is the first evaluation of the LENA system applied to Swedish, and focuses on the AWC estimate. Twelve five-minute segments were selected at random from each of four day-long recordings of 30-month-old children. Each of these 48 segments was transcribed by two transcribers, and both the number of words and the number of vowels were calculated (inter-transcriber reliability for words: r = .95; vowels: r = .93). Both counts correlated with the LENA system’s AWC estimate for the same segments (words: r = .67; vowels: r = .66). The reliability of the AWC as estimated by the LENA system when applied to Swedish is therefore comparable to its reliability for Spanish, Mandarin and French.
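The reliability figures above are plain Pearson correlations between human counts and the AWC estimate. A minimal sketch with hypothetical per-segment counts (invented numbers, not the study's data):

```python
import numpy as np

# Hypothetical word counts for eight 5-minute segments: a human
# transcriber versus the LENA Adult Word Count (AWC) estimate.
transcriber = np.array([312, 145, 520, 88, 410, 230, 160, 375])
lena_awc = np.array([280, 170, 480, 120, 390, 260, 140, 350])

# Pearson correlation between the two count series.
r = np.corrcoef(transcriber, lena_awc)[0, 1]
print(round(float(r), 2))
```

A correlation in this range would be read the same way as in the abstract: the automated estimate tracks the human count segment by segment, even if the absolute counts differ.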

  • 36.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Lacerda, Francisco
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Vowel categorization correlates with speech exposure in 8-month-olds, 2017, Conference paper (Refereed)
    Abstract [en]

    During the first year of life, infants’ ability to discriminate non-native speech contrasts attenuates, whereas their ability to discriminate native contrasts improves. This transition reflects the development of speech sound categorization, and is hypothesized to be modulated by exposure to spoken language. The ERP mismatch response has been used to quantify discrimination ability in infants, and its amplitude has been shown to be sensitive to the amount of speech exposure at the group level (Rivera-Gaxiola et al., 2011). In the present ERP study, the difference in mismatch response amplitudes for spoken vowels and for spectrally rotated vowels quantifies categorization in 8-month-old infants (N = 15, 7 girls). This categorization measure was tested for correlation with infants’ daily exposure to male speech, female speech, and the sum of male and female speech, as measured by all-day home recordings analyzed using LENA software. A positive correlation was found between the categorization measure and the total amount of daily speech exposure (r = .526, p = .044). The present study is the first to report a relation between speech exposure and speech sound categorization in infants at the subject level, and the first to compensate for the acoustic part of the mismatch response in this context.

  • 37.
    Salomão, Gláucia Laís
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Expressão vocal de emoções [Vocal expression of emotions]: metáfora sonora, fala e canto [Sound metaphors, speech and singing], 2016, In: Sonoridades [Sonorities]: A expressividade da fala, no canto e na declamação [Expressivity in speech, singing, and reciting] / [ed] Jayme Preto, Beatriz Gabriel, Pontíficia Universidade Católica de São Paulo, 2016, p. 31-43, Chapter in book (Refereed)
    Abstract [en]

    The communication of emotions is crucial to social relationships and plays a fundamental role in maintaining the social order between people. In this chapter we look at the communication of emotions through two expressive modalities that use sound as a means of communication, i.e. speech and singing. Throughout the text we argue that the vocal expression of emotions reflects physiological aspects associated with the emotion being expressed; that there are many similarities between the expressive patterns found in speech and in singing; and that singing is expressive because it retains traces of the expressive patterns of speech.

  • 38.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Riad, Tomas
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Lexical Specification of Prosodic Information in Swedish: Evidence from Mismatch Negativity2016In: Frontiers in Neuroscience, ISSN 1662-4548, E-ISSN 1662-453X, Vol. 10, article id 533Article in journal (Refereed)
    Abstract [en]

    Like that of many other Germanic languages, the stress system of Swedish has mainly undergone phonological analysis. Recently, however, researchers have begun to recognize the central role of morphology in these systems. Similar to the lexical specification of tonal accent, the Swedish stress system is claimed to be morphologically determined and morphemes are thus categorized as prosodically specified and prosodically unspecified. Prosodically specified morphemes bear stress information as part of their lexical representations and are classified as tonic (i.e., lexically stressed), pretonic and posttonic, whereas prosodically unspecified morphemes receive stress through a phonological rule that is right-edge oriented, but is sensitive to prosodic specification at that edge. The presence of prosodic specification is inferred from vowel quality and vowel quantity; if stress moves elsewhere, vowel quality and quantity change radically in phonologically stressed morphemes, whereas traces of stress remain in lexically stressed morphemes. The present study is the first to investigate whether stress is a lexical property of Swedish morphemes by comparing mismatch negativity (MMN) responses to vowel quality and quantity changes in phonologically stressed and lexically stressed words. In a passive oddball paradigm, 15 native speakers of Swedish were presented with standards and deviants, which differed from the standards in formant frequency and duration. Given that vowel quality and quantity changes are associated with morphological derivations only in phonologically stressed words, MMN responses are expected to be greater in phonologically stressed words than in lexically stressed words that lack such an association. The results indicated that the processing differences between phonologically and lexically stressed words were reflected in the amplitude and topography of MMN responses. 
Confirming the expectation, MMN amplitude was greater for the phonologically stressed word than for the lexically stressed word and showed a more widespread topographic distribution. The brain not only detected vowel quality and quantity changes but also used them to activate memory traces associated with derivations. The present study therefore implies that morphology is directly involved in the Swedish stress system and that changes in phonological shape due to stress shift cue upcoming stress and the potential addition of a morpheme.

  • 39.
    Wirén, Mats
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Grigonytė, Gintarė
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Cortes, Elisabet Eir
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Longitudinal Studies of Variation Sets in Child-directed Speech2016In: The 54th Annual Meeting of the Association for Computational Linguistics: Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning, Stroudsburg, PA, USA: Association for Computational Linguistics, 2016, p. 44-52Conference paper (Refereed)
    Abstract [en]

    One of the characteristics of child-directed speech is its high degree of repetitiousness. Sequences of repetitious utterances with a constant intention, variation sets, have been shown to be correlated with children’s language acquisition. To obtain a baseline for the occurrences of variation sets in Swedish, we annotate 18 parent–child dyads using a generalised definition according to which the varying form may pertain not just to the wording but also to prosody and/or non-verbal cues. To facilitate further empirical investigation, we introduce a surface algorithm for automatic extraction of variation sets which is easily replicable and language-independent. We evaluate the algorithm on the Swedish gold standard, and use it for extracting variation sets in Croatian, English and Russian. We show that the proportion of variation sets in child-directed speech decreases consistently as a function of children's age across Swedish, Croatian, English and Russian.
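    The surface intuition behind such an extraction procedure can be illustrated with a minimal sketch. The function below is a hypothetical simplification written for this listing, not the algorithm published in the paper: it groups adjacent utterances into a candidate variation set whenever consecutive utterances share at least a threshold number of word types.

    ```python
    def variation_sets(utterances, min_overlap=1):
        """Group adjacent utterances that share at least `min_overlap`
        word types into candidate variation sets (length > 1 only)."""
        sets, current = [], []
        for utt in utterances:
            words = set(utt.lower().split())
            if current and len(words & set(current[-1].lower().split())) >= min_overlap:
                current.append(utt)  # overlaps with the previous utterance
            else:
                if len(current) > 1:
                    sets.append(current)  # close a set of repetitious utterances
                current = [utt]
        if len(current) > 1:
            sets.append(current)
        return sets
    ```

    On a toy dyad like ["look at the ball", "the ball", "where is the ball", "nice doggy"], the sketch returns one set containing the first three utterances. A real, language-independent implementation would additionally need lemmatization or function-word filtering and a tolerance for intervening utterances.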

  • 40.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Perceptual correlates of Turkish word stress and their contribution to automatic lexical access: Evidence from early ERP components2016In: Frontiers in Neuroscience, ISSN 1662-4548, E-ISSN 1662-453X, Vol. 10, article id 7Article in journal (Refereed)
    Abstract [en]

    Perceptual correlates of Turkish word stress and their contribution to lexical access were studied using the mismatch negativity (MMN) component in event-related potentials (ERPs). The MMN was expected to indicate if segmentally identical Turkish words were distinguished on the sole basis of prosodic features such as fundamental frequency (f0), spectral emphasis (SE) and duration. The salience of these features in lexical access was expected to be reflected in the amplitude of MMN responses. In a multi-deviant oddball paradigm, neural responses to changes in f0, SE, and duration individually, as well as to all three features combined, were recorded for words and pseudowords presented to 14 native speakers of Turkish. The word and pseudoword contrast was used to differentiate language-related effects from acoustic-change effects on the neural responses. First, and in line with previous findings, the overall MMN was maximal over frontal and central scalp locations. Second, changes in prosodic features elicited neural responses both in words and pseudowords, confirming the brain’s automatic response to any change in auditory input. However, there were processing differences between the prosodic features, most significantly in f0: While f0 manipulation elicited a slightly right-lateralized frontally-maximal MMN in words, it elicited a frontal P3a in pseudowords. Considering that P3a is associated with involuntary allocation of attention to salient changes, the manipulations of f0 in the absence of lexical processing led to an attentional evaluation of pitch change. f0 is therefore claimed to be lexically specified in Turkish. Rather than combined features, individual prosodic features differentiate language-related effects from acoustic-change effects. The present study confirms that segmentally identical words can be distinguished on the basis of prosodic information alone, and establishes the salience of f0 in lexical access.

  • 41.
    Schwarz, Iris-Corinna
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Dybäck, Matilda
    Wallgren, Johanna
    Uhlén, Inger
    Pupil dilation indicates auditory signal detection - towards an objective hearing test based on eye-tracking2016Conference paper (Refereed)
    Abstract [en]

    Purpose: The long-term objective of this project is to develop an objective hearing threshold test that can be used in early infancy, using pupil dilation as an indicator of hearing. The study purposes are 1) to identify relevant time-windows for analysis of pupillary responses to various auditory stimuli in adults, and 2) to evaluate a trial-minus-baseline approach to deal with unrelated pupillary responses in adults. Method: Participants’ pupil size is recorded using a Tobii T120 eye tracker. In the first test, participants fixate on a blank screen while sound stimuli are presented. From these data, typical pupillary responses and the relevant analysis time-window are determined and used in future tests. In the second test, participants watch movie clips while sound stimuli are presented. Visually identical sound and no-sound trials will be compared in order to isolate the pupillary changes tied to hearing sound from those related to changes in brightness of the visual stimuli. Results and conclusion: Data are currently being collected. Results from the pilot study indicate that the pupillary response related to sound detection occurs at around 900 ms after stimulus onset, and that a trial-minus-baseline approach is a viable option for eliminating unrelated pupillary responses.

  • 42.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory belts and whistles: A preliminary study of breathing acoustics for turn-taking2016In: Proceedings of Interspeech 2016 / [ed] Nelson Morgan, International Speech Communication Association, 2016, p. 510-514Conference paper (Refereed)
    Abstract [en]

    This paper presents first results on using the acoustic intensity of inhalations as a cue to speech initiation in spontaneous multiparty conversations. We demonstrate that inhalation intensity significantly differentiates between breathing cycles coinciding with no speech activity, shorter (< 1 s), and longer stretches of speech. While the model fit is relatively weak, it is comparable to the fit of a model using kinematic features collected with Respiratory Inductance Plethysmography. We also show that incorporating both kinematic and acoustic features further improves the model. Given the ease of capturing breath acoustics, we consider the results a promising first step towards studying the communicative functions of respiratory sounds. We discuss possible extensions to the data collection procedure with a view to improving the predictive power of the model.

  • 43.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory turn-taking cues2016In: Proceedings of Interspeech 2016 / [ed] Nelson Morgan, The International Speech Communication Association (ISCA), 2016, p. 1275-1279Conference paper (Refereed)
    Abstract [en]

    This paper investigates to what extent breathing can be used as a cue to turn-taking behaviour. The paper improves on existing accounts by considering all possible transitions between speaker states (silent, speaking, backchanneling) and by not relying on global speaker models. Instead, all features (including breathing range and resting expiratory level) are estimated in an incremental fashion using the left-hand context. We identify several inhalatory features relevant to turn-management, and assess the fit of models with these features as predictors of turn-taking behaviour.

  • 44.
    Kallioinen, Petter
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics. Lund University, Sweden.
    Olofsson, Jonas
    Stockholm University, Faculty of Social Sciences, Department of Psychology, Perception and psychophysics.
    Nakeva von Mentzer, Cecilia
    Lindgren, Magnus
    Ors, Marianne
    Sahlén, Birgitta S.
    Lyxell, Björn
    Engström, Elisabet
    Uhlén, Inger
    Semantic Processing in Deaf and Hard-of-Hearing Children: Large N400 Mismatch Effects in Brain Responses, Despite Poor Semantic Ability2016In: Frontiers in Psychology, ISSN 1664-1078, E-ISSN 1664-1078, Vol. 7, article id 1146Article in journal (Refereed)
    Abstract [en]

    Difficulties in auditory and phonological processing affect semantic processing in speech comprehension for deaf and hard-of-hearing (DHH) children. However, little is known about brain responses related to semantic processing in this group. We investigated event-related potentials (ERPs) in DHH children with cochlear implants (CIs) and/or hearing aids (HAs), and in normally hearing controls (NH). We used a semantic priming task with spoken word primes followed by picture targets. In both DHH children and controls, cortical response differences between matching and mismatching targets revealed a typical N400 effect associated with semantic processing. Children with CIs had the largest mismatch response despite poor semantic abilities overall; children with CIs also had the largest ERP differentiation between mismatch types, with small effects in within-category mismatch trials (target from the same category as the prime) and large effects in between-category mismatch trials (target from a different category than the prime), compared to matching trials. Children with NH and HAs had similar responses to both mismatch types. While the large and differentiated ERP responses in the CI group were unexpected and should be interpreted with caution, the results could reflect less precision in semantic processing among children with CIs, or a stronger reliance on predictive processing.

  • 45.
    Eriksson, Anders
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Bertinetto, Pier Marco
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Nodari, Rosalba
    Lenoci, Giovanna
    The Acoustics of Lexical Stress in Italian as a Function of Stress Level and Speaking Style2016In: Proceedings of Interspeech 2016 / [ed] Nelson Morgan, The International Speech Communication Association (ISCA), 2016, p. 1059-1063Conference paper (Refereed)
    Abstract [en]

    The study is part of a series of studies describing the acoustics of lexical stress in a way that should be applicable to any language. The present database of recordings includes Brazilian Portuguese, English, Estonian, German, French, Italian and Swedish. The acoustic parameters examined are F0-level, F0-variation, Duration, and Spectral Emphasis. Values for these parameters, computed for all vowels (a little over 24000 vowels for Italian), are the data upon which the analyses are based. All parameters are examined with respect to their correlation with Stress (primary, secondary, unstressed), speaking Style (wordlist reading, phrase reading, spontaneous speech), and Sex of the speaker (female, male). For Italian, Duration was found to be the dominant factor by a wide margin, in agreement with previous studies. Spectral Emphasis was the second most important factor. Spectral Emphasis has not been studied previously for Italian, but intensity, a related parameter, has been shown to correlate with stress. F0-level was also significantly correlated, but not to the same degree. Speaker Sex turned out to be significant in many comparisons. The differences were, however, mainly a function of the degree to which a given parameter was used, not of how it was used to signal lexical stress contrasts.

  • 46.
    Gerholm, Tove
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    The relation between modalities in spoken language acquisition: Preliminary results from the Swedish MINT-project2016Conference paper (Other academic)
  • 47.
    Forssén Renner, Lena
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    The surprised pupil: New perspectives in semantic processing research2016In: ISSBD 2016, 2016Conference paper (Refereed)
    Abstract [en]

    In research on semantic processing and brain activity, the N400 paradigm has long been known to reflect a reaction to unexpected events, for instance the incongruence between visual and verbal information when subjects are presented with a picture and a mismatching word. In the present study, we investigate whether an N400-like reaction to unexpected events can be captured with pupillometry. While earlier research has firmly established a connection between changes in pupil diameter and arousal, these findings have so far not been extended to the domain of semantic processing. Consequently, we measured pupil size change in reaction to a match or a mismatch between a picture and an auditorily presented word. We presented 120 trials to ten native speakers of Swedish. In each trial a picture was displayed for six seconds, and 2.5 seconds into the trial the word was played through loudspeakers. The picture and the word matched in half of the trials, and all stimuli were common high-frequency monosyllabic Swedish words. For the analysis, the baseline pupil size at the sound playback onset was compared against the maximum pupil size in the following time window of 3.5 seconds. The results show a statistically significant difference (t(746) = -2.8, p < 0.01) between the conditions. In line with the hypothesis, the pupil was observed to dilate more in the incongruent condition (on average by 0.03 mm). While the results are preliminary, they suggest that pupillometry could be a viable alternative to existing methods in the field of language processing, for instance across different ages and clinical groups. In the future, we intend to validate the results on a larger sample of participants and to expand the analysis with a functional analysis accounting for temporal changes in the data, which will allow locating the temporal regions of greatest difference between the conditions.
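    The baseline-versus-maximum comparison described in the abstract can be sketched in a few lines. This is an illustrative assumption, not the study's analysis code; the function name, argument names, and sampling rate are invented for the example.

    ```python
    import numpy as np

    def max_dilation(trace, sr, onset_s, window_s=3.5):
        """Baseline-corrected peak pupil dilation after sound onset.

        trace:   1-D array of pupil diameters (mm) for one trial,
                 sampled at `sr` Hz.
        Baseline is the sample at sound onset; the peak is taken over the
        `window_s` seconds that follow, mirroring a baseline-vs-maximum
        comparison per trial.
        """
        onset = int(onset_s * sr)
        baseline = trace[onset]
        window = trace[onset:onset + int(window_s * sr)]
        return window.max() - baseline
    ```

    Per-trial values computed this way for congruent and incongruent trials could then be compared with a standard t-test, as in the abstract's reported statistic.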

  • 48.
    Renner, Lena
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Kallioinen, Petter
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Markelius, Marie
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Sundberg, Ulla
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Brain responses to typical mispronunciations among toddlers2015Conference paper (Refereed)