Search results 1 - 100 of 290
  • 1.
    Gerholm, Tove
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Hörberg, Thomas
    Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics. Stockholm University, Faculty of Social Sciences, Department of Psychology, Perception and psychophysics.
    Tonér, Signe
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Kallioinen, Petter
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Frankenberg, Sofia
    Stockholm University, Faculty of Social Sciences, Department of Child and Youth Studies.
    Kjällander, Susanne
    Stockholm University, Faculty of Social Sciences, Department of Child and Youth Studies.
    Palmer, Anna
    Stockholm University, Faculty of Social Sciences, Department of Child and Youth Studies.
    Lenz Taguchi, Hillevi
    Stockholm University, Faculty of Social Sciences, Department of Child and Youth Studies.
    A protocol for a three-arm cluster randomized controlled superiority trial investigating the effects of two pedagogical methodologies in Swedish preschool settings on language and communication, executive functions, auditive selective attention, socioemotional skills and early maths skills (2018). In: BMC Psychology, E-ISSN 2050-7283, Vol. 6, article id 29. Article in journal (Refereed)
    Abstract [en]

    Background

    During the preschool years, children develop abilities and skills in areas crucial for later success in life. These abilities include language, executive functions, attention, and socioemotional skills. The pedagogical methods used in preschools hold the potential to enhance these abilities, but our knowledge of which pedagogical practices aid which abilities, and for which children, is limited. The aim of this paper is to describe an intervention study designed to evaluate and compare two pedagogical methodologies in terms of their effect on the above-mentioned skills in Swedish preschool children.

    Method

    The study is a randomized controlled trial (RCT) where two pedagogical methodologies were tested to evaluate how they enhanced children’s language, executive functions and attention, socioemotional skills, and early maths skills during an intensive 6-week intervention. Eighteen preschools including 28 units and 432 children were enrolled in a municipality close to Stockholm, Sweden. The children were between 4;0 and 6;0 years old and each preschool unit was randomly assigned to one of the interventions or to the control group. Background information on all children was collected via questionnaires completed by parents and preschools. Pre- and post-intervention testing consisted of a test battery including tests on language, executive functions, selective auditive attention, socioemotional skills and early maths skills. The interventions consisted of 6 weeks of intensive practice of either a socioemotional and material learning paradigm (SEMLA), for which group-based activities and interactional structures were the main focus, or an individual, digitally implemented attention and math training paradigm, which also included a set of self-regulation practices (DIL). All preschools were evaluated with the ECERS-3.

    Discussion

    If this intervention study shows evidence of a difference between group-based learning paradigms and individual training of specific skills in terms of enhancing children’s abilities in fundamental areas like language, executive functions and attention, socioemotional skills and early math, this will have a big impact on the preschool agenda in the future. The potential for different pedagogical methodologies to have different impacts on children of different ages and with different backgrounds invites a wider discussion within the field of how to develop a preschool curriculum suited for all children.

  • 2.
    Cortes, Elisabet Eir
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Simko, Juraj
    University of Helsinki.
    Articulatory Consequences of Vocal Effort Elicitation Method (2018). In: Proceedings of Interspeech 2018, Hyderabad, India: The International Speech Communication Association (ISCA), 2018, p. 1521-1525. Conference paper (Refereed)
    Abstract [en]

    Articulatory features from two datasets, Slovak and Swedish, were compared to see whether different methods of eliciting loud speech (ambient noise vs. visually presented loudness target) result in different articulatory behavior. The features studied were temporal and kinematic characteristics of lip separation within the closing and opening gestures of bilabial consonants, and of the tongue body movement from /i/ to /a/ through a bilabial consonant. The results indicate larger hyperarticulation in the speech elicited with a visually presented target. While individual articulatory strategies are evident, the speaker groups agree on increasing the kinematic features consistently within each gesture in response to the increased vocal effort. Another concerted strategy is keeping the tongue response considerably smaller than that of the lips, presumably to preserve the acoustic prerequisites necessary for adequate vowel identity. While the method of visually presented loudness target elicits a larger span of vocal effort, the two elicitation methods achieve comparable consistency across loudness conditions.

  • 3.
    Aare, Kätlin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Lippus, Pärtel
    University of Tartu.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Creak in the respiratory cycle (2018). In: Proceedings of Interspeech 2018, Hyderabad, India: The International Speech Communication Association (ISCA), 2018, p. 1408-1412. Conference paper (Refereed)
    Abstract [en]

    Creakiness is a well-known turn-taking cue and has been observed to systematically accompany phrase and turn ends in several languages. In Estonian, creaky voice is frequently used by all speakers without any obvious evidence for its systematic use as a turn-taking cue. Rather, it signals a lack of prominence and is favored by lengthening and later timing in phrases. In this paper, we analyze the occurrence of creak with respect to properties of the respiratory cycle. We show that creak is more likely to accompany longer exhalations. Furthermore, the results suggest there is little difference in lung volume values regardless of the presence of creak, indicating that creaky voice might be employed to preserve air over the course of longer utterances. We discuss the results in connection to processes of speech planning in spontaneous speech.

  • 4.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wagner, Petra
    Bielefeld University.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Deep throat as a source of information (2018). In: Proceedings Fonetik 2018, Gothenburg, 2018, p. 33-38. Conference paper (Other academic)
    Abstract [en]

    In this pilot study we explore the signal from an accelerometer placed on the tracheal wall (below the glottis) for obtaining robust voice quality estimates. We investigate cepstral peak prominence smooth, H1-H2 and alpha ratio for distinguishing between breathy, modal and pressed phonation across six (sustained) vowel qualities produced by four speakers and including a systematic variation of pitch. We show that throat signal spectra are unaffected by vocal tract resonances, F0 and speaker variation while retaining sensitivity to voice quality dynamics. We conclude that the throat signal is a promising tool for studying communicative functions of voice prosody in speech communication.

  • 5.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Exhalatory markers of turn completion (2018). In: Proceedings Speech Prosody 2018, Poznań, Poland: The International Speech Communication Association (ISCA), 2018, p. 334-338. Conference paper (Refereed)
    Abstract [en]

    This paper is a study of kinematic features of the exhalation which signal that the speaker is done speaking and wants to yield the turn. We demonstrate that the single most prominent feature is the presence of inhalation directly following the exhalation. However, several features of the exhalation itself are also found to significantly distinguish between turn holds and yields, such as a slower exhalation rate and a higher lung level at exhalation onset. The results complement the existing body of evidence on respiratory turn-taking cues, which has so far involved mainly inhalatory features. We also show that respiration allows discovering pause interruptions, thus providing access to unrealised turn-taking intentions.

  • 6.
    Traunmüller, Hartmut
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Towards a More Well-Founded Cosmology (2018). In: Zeitschrift für Naturforschung A - A Journal of Physical Sciences, ISSN 0932-0784, E-ISSN 1865-7109, Vol. 73, no 11, p. 1005-1023. Article in journal (Refereed)
    Abstract [en]

    First, this paper broaches the definition of science and the epistemic yield of tenets and approaches: phenomenological (descriptive only), well founded (solid first principles, conducive to deep understanding), provisional (falsifiable if universal, verifiable if existential), and imaginary (fictitious entities or processes, conducive to empirically unsupported beliefs). The Big Bang paradigm and the ΛCDM ‘concordance model’ involve such beliefs: the emanation of the universe out of a non-physical stage, cosmic inflation (hardly testable), Λ (fictitious energy), and ‘exotic’ dark matter. They fail in the confidence check that empirical science requires. They also face a problem in delimiting what expands from what does not. In the more well-founded cosmology that emerges, energy is conserved, the universe is persistent (not transient), and the ‘perfect cosmological principle’ holds. Waves and other field perturbations that propagate at c (the escape velocity of the universe) expand exponentially with distance. This results from gravitation. The galaxy web does not expand. Potential Φ varies as −H/(cz) instead of −1/r. Inertial forces reflect gradients present in comoving frames of accelerated bodies (interaction with the rest of the universe – not with space). They are increased where the universe appears blue-shifted and decreased more than proportionately at very low accelerations. A cut-off acceleration a0 = 0.168 cH is deduced. This explains the successful description of galaxy rotation curves by “Modified Newtonian Dynamics”. A fully elaborated physical theory is still pending. The recycling of energy via a cosmic ocean filled with photons (the cosmic microwave background), neutrinos and gravitons, and the wider implications for science are briefly discussed.

  • 7.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Lacerda, Francisco
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Using rotated speech to approximate the acoustic mismatch negativity response to speech (2018). In: Brain and Language, ISSN 0093-934X, E-ISSN 1090-2155, Vol. 176, p. 26-35. Article in journal (Refereed)
    Abstract [en]

    The mismatch negativity (MMN) response is influenced by the magnitude of the acoustic difference between standard and deviant, and the response is typically larger to linguistically relevant changes than to linguistically irrelevant changes. Linguistically relevant changes between standard and deviant typically co-occur with differences between the two acoustic signals. It is therefore not straightforward to determine the contribution of each of those two factors to the MMN response. This study investigated whether spectrally rotated speech can be used to determine the impact of the acoustic difference on the MMN response to a combined linguistic and acoustic change between standard and deviant. Changes between rotated vowels elicited an MMN of comparable amplitude to the one elicited by a within-category vowel change, whereas the between-category vowel change resulted in an MMN amplitude of greater magnitude. A change between rotated vowels resulted in an MMN amplitude more similar to that of a within-vowel change than a complex tone change did. This suggests that the MMN amplitude reflecting the acoustic difference between two speech sounds can be well approximated by the MMN amplitude elicited in response to their rotated counterparts, in turn making it possible to estimate the part of the response specific to the linguistic difference.
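
    A minimal sketch, assuming epoched EEG data stored as NumPy arrays, of how an MMN amplitude of the kind compared above is typically quantified (the deviant-minus-standard difference wave averaged over a latency window; variable names and the window are illustrative, not taken from the study):

    import numpy as np

    def mmn_amplitude(standard_epochs, deviant_epochs, times, window=(0.15, 0.25)):
        """MMN quantified as the mean deviant-minus-standard difference
        within a post-stimulus latency window (in seconds).

        standard_epochs, deviant_epochs: (n_trials, n_samples) arrays
        times: sample times in seconds, shape (n_samples,)
        """
        erp_standard = standard_epochs.mean(axis=0)   # average ERP per condition
        erp_deviant = deviant_epochs.mean(axis=0)
        difference = erp_deviant - erp_standard       # difference wave
        mask = (times >= window[0]) & (times <= window[1])
        return difference[mask].mean()

    # Simulated example: 100 trials, 1 s epochs sampled at 250 Hz
    rng = np.random.default_rng(0)
    times = np.linspace(0, 1, 250)
    standard = rng.normal(0, 1, (100, 250))
    deviant = rng.normal(0, 1, (100, 250)) - 0.5      # deviants slightly more negative
    print(mmn_amplitude(standard, deviant, times))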

  • 8.
    Ćwiek, Aleksandra
    et al.
    Bielefeld University.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wagner, Petra
    Bielefeld University.
    Acoustics and discourse function of two types of breathing signals (2017). In: Nordic Prosody: Proceedings of the XIIth Conference, Trondheim 2016 / [ed] Eggesbø Abrahamsen, Jardar Koreman, Jacques van Dommelen, Wim A., Frankfurt am Main: Peter Lang Publishing Group, 2017, p. 83-91. Conference paper (Refereed)
    Abstract [en]

    Breathing is fundamental for living and speech, and it has been a subject of linguistic research for years. Recently, there has been a renewed interest in tackling the question of possible communicative functions of breathing (e.g. Rochet-Capellan & Fuchs, 2014; Aare, Włodarczak & Heldner, 2014; Włodarczak & Heldner, 2015; Włodarczak, Heldner, & Edlund, 2015). The present study set out to determine acoustic markedness and communicative functions of pauses accompanied and not accompanied by breathing. We hypothesised that an articulatory reset occurring in breathing pauses and an articulatory freeze in non-breathing pauses differentiates between the two types. A production experiment was conducted and some evidence in favour of such a phenomenon was found. Namely, in the case of non-breathing pauses, we observed more coarticulation, evidenced by a more frequent omission of plosive releases. Our findings thus give some evidence in favour of the communicative function of breathing.

  • 9.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Capturing respiratory sounds with throat microphones (2017). In: Nordic Prosody: Proceedings of the XIIth Conference, Trondheim 2016 / [ed] Eggesbø Abrahamsen, Jardar Koreman, Jacques van Dommelen, Wim A., Frankfurt am Main: Peter Lang Publishing Group, 2017, p. 181-190. Conference paper (Refereed)
    Abstract [en]

    This paper presents the results of a pilot study using throat microphones for recording respiratory sounds. We demonstrate that inhalation noises are louder before longer stretches of speech than before shorter utterances (< 1 s) and in silent breathing. We thus replicate the results from our earlier study which used close-talking head-mounted microphones, without the associated data loss due to cross-talk. We also show that inhalations are louder within than before a speaking turn. Hence, the study provides another piece of evidence in favour of communicative functions of respiratory noises serving as potential turn-taking (for instance, turn-holding) cues. 

  • 10.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Pagmar, David
    Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics.
    Gerholm, Tove
    Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics.
    Gustavsson, Lisa
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Computational simulations of temporal vocalization behavior in adult-child interaction (2017). In: Proceedings of Interspeech 2017, 2017, p. 2208-2212. Conference paper (Refereed)
    Abstract [en]

    The purpose of the present study was to introduce a computational simulation of timing in child-adult interaction. The simulation uses temporal information from real adult-child interactions as default temporal behavior of two simulated agents. Dependencies between the agents’ behavior are added, and how the simulated interactions compare to real interaction data as a result is investigated. In the present study, the real data consisted of transcriptions of a mother interacting with her 12-month-old child, and the simulated data consisted of vocalizations. The first experiment shows that although the two agents generate vocalizations according to the temporal characteristics of the interlocutors in the real data, simulated interaction with no contingencies between the two agents’ behavior differs from real interaction data. In the second experiment, a contingency was introduced to the simulation: the likelihood that the adult agent initiated a vocalization if the child agent was already vocalizing. Overall, the simulated data is more similar to the real interaction data when the adult agent is less likely to start speaking while the child agent vocalizes. The results are in line with previous studies on turn-taking in parent-child interaction at comparable ages. This illustrates that computational simulations are useful tools when investigating parent-child interactions.
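
    The contingency manipulation described above can be illustrated with a toy two-agent simulation; the probabilities, durations and time step below are illustrative assumptions, not the parameters estimated from the real mother-child data:

    import random

    def simulate(minutes=10, step=0.1, p_child=0.02, p_adult=0.02,
                 mean_child=1.0, mean_adult=1.5, p_adult_while_child=0.005):
        """Toy two-agent vocalization simulation. Each time step an idle
        agent starts vocalizing with a fixed probability; the adult's
        probability is reduced while the child is already vocalizing
        (the contingency manipulated in the study). Returns the
        proportion of time steps in which both agents vocalize."""
        n = int(minutes * 60 / step)
        child_end = adult_end = -1.0
        overlap = 0
        for i in range(n):
            t = i * step
            if t >= child_end and random.random() < p_child:
                child_end = t + random.expovariate(1 / mean_child)
            p = p_adult_while_child if t < child_end else p_adult
            if t >= adult_end and random.random() < p:
                adult_end = t + random.expovariate(1 / mean_adult)
            if t < child_end and t < adult_end:
                overlap += 1
        return overlap / n

    random.seed(1)
    print(simulate(p_adult_while_child=0.02))   # no contingency
    print(simulate(p_adult_while_child=0.002))  # adult rarely starts over the child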

  • 11.
    Schwarz, Iris-Corinna
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Marklund, Ulrika
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Contingency differences in parent-infant turn-taking between primary and secondary caregivers in relation to turn-taking experience (2017). In: Many Paths to Language (MPaL), 2017, p. 59-60. Conference paper (Refereed)
    Abstract [en]

    Contingent turn-taking between parents and infants is positively correlated with child language outcome (Tamis-LeMonda, Bornstein & Baumwell, 2001; Marklund, Marklund, Lacerda & Schwarz, 2015). Many studies focus exclusively on mothers (e.g., Sung, Fausto-Sterling, Garcia Coll & Seifer, 2013). However, infants in Western countries acquire language with input from both mothers and fathers to varying degrees, depending on how the family chooses to organize their parental leave. Sweden is an ideal country to study both mothers and fathers as caregivers for infants.

    Parental contingency is often reported as response frequency within a time window after infant vocalizations (e.g., Johnson, Caskey, Rand, Tucker & Vohr, 2014). In this study, turn-taking contingency is measured by the duration of parent-child and child-parent switching pauses around infant vocalization with potential communicative intent. Fourteen (7 girls) infants and their primary and secondary caregivers were recorded in the family home when the infant was six months (M = 5 months 29 days, range: 5 months 3 days – 6 months 16 days). The audio recordings were collected two different days and lasted approximately ten minutes each. One of the days was a typical weekday on which the primary caregiver – in all cases the mother – was at home with the infant. The other day was a typical weekend day on which also the secondary caregiver – in all cases the father – was at home and spent time with the infant. On each of these days, a daylong LENA recording was also made to estimate the amount of exposure to female and male speech input on a typical day. Using Wavesurfer 1.8.5 (Sjölander & Beskow, 2010), on- and offset of all infant vocalizations were tagged as well as on- and offset for the surrounding switching pauses. If parent utterance and infant vocalization overlapped, switching pause duration received a negative value.

    Two repeated measures ANOVAs were used to determine the effects of caregiver type (primary/secondary) and infant sex (girl/boy) on pause duration in infant-parent and parent-infant switching pauses. A main effect was found for caregiver type in infant-parent switching pauses (F(12,1) = 5.214; p = .041), as primary caregivers responded on average about 500 ms faster to infant vocalizations than secondary caregivers, with no effect of or interaction with infant sex. In parent-infant switching pauses, the main effect for caregiver type was almost significant (F(12,1) = 4.574; p = .054), with no effect of or interaction with infant sex. It is therefore fair to say that turn-taking between primary caregivers and 6-month-olds is more contingent than turn-taking between secondary caregivers and 6-month-olds.

    Four linear regressions were then used to predict parent-infant and infant-parent switching pause duration from the average duration of female speech exposure and the average duration of male speech exposure across the two days, with the assumption that female speech duration equals speech input from the primary caregiver and male speech duration the secondary caregiver. None of the regression analyses turned out to be significant. However, it is likely that the greater contingency between primary caregivers and the infant is a function of greater turn-taking experience, that is, conversational turns rather than mere exposure to speech. Therefore, we will look next at the number of conversational turns for each caregiver separately and investigate whether they predict parental response contingency.

    The present study shows that vocal turn-taking is more contingent between infants and primary caregivers than with secondary caregivers. Primary caregivers respond significantly faster to infant vocalizations than secondary caregivers and in turn, infants have a tendency to respond faster to primary caregivers. It is likely that this relationship is mediated by turn-taking experience, although this could not be shown with regression analyses using LENA estimates of total duration of speech exposure to primary and secondary caregiver.
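
    A small sketch of the switching-pause measure described above, assuming vocalizations and utterances are available as (onset, offset) intervals in seconds; the interval data and the response-matching rule are simplified assumptions for illustration:

    def infant_to_parent_pauses(infant, parent):
        """Infant-to-parent switching pauses. Both arguments are sorted
        lists of (onset, offset) tuples in seconds. For each infant
        vocalization, take the first parent utterance starting after the
        vocalization's onset; the pause is parent onset minus infant
        offset, so an overlapping response yields a negative value."""
        pauses = []
        for i_on, i_off in infant:
            responses = [p_on for p_on, _ in parent if p_on > i_on]
            if responses:
                pauses.append(responses[0] - i_off)
        return pauses

    infant = [(1.0, 1.8), (5.0, 5.6)]
    parent = [(2.3, 3.1), (5.4, 6.0)]   # the second response overlaps the vocalization
    print(infant_to_parent_pauses(infant, parent))   # approximately [0.5, -0.2]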

  • 12.
    Šimko, Juraj
    et al.
    University of Helsinki.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Suni, Antti
    University of Helsinki.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Vainio, Martti
    University of Helsinki.
    Coordination between f0, intensity and breathing signals (2017). In: Nordic Prosody: Proceedings of the XIIth Conference, Trondheim 2016 / [ed] Eggesbø Abrahamsen, Jardar Koreman, Jacques van Dommelen, Wim A., Frankfurt am Main: Peter Lang Publishing Group, 2017, p. 147-156. Conference paper (Refereed)
    Abstract [en]

    The present paper presents preliminary results on temporal coordination of breathing, intensity and fundamental frequency signals using continuous wavelet transform. We have found tendencies towards phase-locking at time scales corresponding to several prosodic units such as vowel-to-vowel intervals and prosodic words. The proposed method should be applicable to a wide range of problems in which the goal is finding a stable phase relationship in a pair of hierarchically organised signals.

  • 13. Lam-Cassettari, Christa
    et al.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Daddy counts: Australian and Swedish fathers' early speech input reflects infants' receptive vocabulary at 12 months (2017). Conference paper (Other academic)
    Abstract [en]

    Parental input is known to predict language development. This study uses the LENA input duration estimates for female and male voices in two infant language environments, Australian English and Swedish, to predict receptive vocabulary size at 12 months. The Australian English learning infants were 6 months (N = 18, 8 girls), the Swedish learning infants were 8 months (N = 12, 6 girls). Their language environment was recorded on two days: one weekday in the primary care of the mother, and one weekend day when also the father spent time with the family. At 12 months, parents filled in a CDI form, the OZI for Australian English and the SECDI-I for Swedish. In multiple regressions across languages, only male speech input duration predicted vocabulary scores significantly (β = .56, p = .01). Analysing boys and girls separately, male speech input predicts only boys’ vocabulary (β = .79, p = .01). Analysing languages separately for boys, the Australian English results are similar (β = .74, p = .02). Discussed in terms of differences in infant age, sample size, sex distribution and language, these findings can still contribute to the growing list of benefits of talker variability for early language acquisition.

  • 14.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Laskowski, Kornel
    Carnegie Mellon University.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Aare, Kätlin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Improving Prediction of Speech Activity Using Multi-Participant Respiratory State (2017). In: Proceedings of the 18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017) / [ed] Włodarczak, Marcin, Stockholm: The International Speech Communication Association (ISCA), 2017, p. 1666-1670. Conference paper (Refereed)
    Abstract [en]

    One consequence of situated face-to-face conversation is the co-observability of participants’ respiratory movements and sounds. We explore whether this information can be exploited in predicting incipient speech activity. Using a methodology called stochastic turn-taking modeling, we compare the performance of a model trained on speech activity alone to one additionally trained on static and dynamic lung volume features. The methodology permits automatic discovery of temporal dependencies across participants and feature types. Our experiments show that respiratory information substantially lowers cross-entropy rates, and that this generalizes to unseen data.
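
    A rough sketch of the kind of comparison reported above: predicting frame-level speech activity from lagged speech activity alone versus lagged speech activity plus a respiratory feature, scored with cross-entropy on held-out frames. The data are synthetic and the logistic-regression model is a simplified stand-in for the stochastic turn-taking model used in the paper:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import log_loss

    rng = np.random.default_rng(0)
    T = 2000
    speech = (rng.random(T) < 0.3).astype(int)        # binary speech activity per frame
    lung = np.cumsum(rng.normal(0, 0.1, T))           # stand-in for a lung volume trace

    def lagged(x, lags):
        """Stack the previous `lags` frames of x as predictor columns."""
        return np.column_stack([np.roll(x, k) for k in range(1, lags + 1)])[lags:]

    lags = 5
    y = speech[lags:]
    X_speech = lagged(speech, lags)
    X_both = np.hstack([X_speech, lagged(lung, lags)])

    # Train on the first part, report cross-entropy (log loss) on held-out frames
    for name, X in [("speech only", X_speech), ("speech + respiration", X_both)]:
        model = LogisticRegression(max_iter=1000).fit(X[:1500], y[:1500])
        ce = log_loss(y[1500:], model.predict_proba(X[1500:])[:, 1])
        print(name, round(ce, 3))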

  • 15.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Cortes, Elísabet Eir
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Sjons, Johan
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    MMN responses in adults after exposure to bimodal and unimodal frequency distributions of rotated speech (2017). In: Proceedings of Interspeech 2017, The International Speech Communication Association (ISCA), 2017, p. 1804-1808. Conference paper (Refereed)
    Abstract [en]

    The aim of the present study is to further the understanding of the relationship between perceptual categorization and exposure to different frequency distributions of sounds. Previous studies have shown that speech sound discrimination proficiency is influenced by exposure to different distributions of speech sound continua varying along one or several acoustic dimensions, both in adults and in infants. In the current study, adults were presented with either a bimodal or a unimodal frequency distribution of spectrally rotated sounds along a continuum (a vowel continuum before rotation). Categorization of the sounds, quantified as amplitude of the event-related potential (ERP) component mismatch negativity (MMN) in response to two of the sounds, was measured before and after exposure. It was expected that the bimodal group would have a larger MMN amplitude after exposure whereas the unimodal group would have a smaller MMN amplitude after exposure. Contrary to expectations, the MMN amplitude was smaller overall after exposure, and no difference was found between groups. This suggests that either the previously reported sensitivity to frequency distributions of speech sounds is not present for non-speech sounds, or the MMN amplitude is not a sensitive enough measure of categorization to detect an influence from passive exposure, or both.

  • 16.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    MMR categorization effect at 8 months is related to receptive vocabulary size at 12 to 14 months (2017). In: Many Paths to Language (MPaL), 2017, p. 91-92. Conference paper (Refereed)
  • 17.
    Wlodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory Constraints in Verbal and Non-verbal Communication (2017). In: Frontiers in Psychology, ISSN 1664-1078, E-ISSN 1664-1078, Vol. 8, article id 708. Article in journal (Refereed)
    Abstract [en]

    In the present paper we address the old question of respiratory planning in speech production. We recast the problem in terms of speakers' communicative goals and propose that speakers try to minimize respiratory effort in line with the H&H theory. We analyze respiratory cycles coinciding with no speech (i.e., silence), short verbal feedback expressions (SFEs) as well as longer vocalizations in terms of parameters of the respiratory cycle and find little evidence for respiratory planning in feedback production. We also investigate timing of speech and SFEs in the exhalation and contrast it with nods. We find that while speech is strongly tied to the exhalation onset, SFEs are distributed much more uniformly throughout the exhalation and are often produced on residual air. Given that nods, which do not have any respiratory constraints, tend to be more frequent toward the end of an exhalation, we propose a mechanism whereby respiratory patterns are determined by the trade-off between speakers' communicative goals and respiratory constraints.

  • 18.
    Schwarz, Iris-Corinna
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Botros, Noor
    Karolinska Institutet.
    Lord, Alekzandra
    Karolinska Institutet.
    Marcusson, Amelie
    Karolinska Institutet.
    Tidelius, Henrik
    Karolinska Institutet.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    The LENA™ system applied to Swedish: Reliability of the Adult Word Count estimate (2017). In: Proceedings of the 18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017) / [ed] Marcin Wlodarczak, Stockholm: The International Speech Communication Association (ISCA), 2017, p. 2088-2092, article id 1287. Conference paper (Refereed)
    Abstract [en]

    The Language Environment Analysis system LENA™ is used to capture day-long recordings of children’s natural audio environment. The system performs automated segmentation of the recordings and provides estimates for various measures. One of those measures is Adult Word Count (AWC), an approximation of the number of words spoken by adults in close proximity to the child. The LENA system was developed for and trained on American English, but it has also been evaluated on its performance when applied to Spanish, Mandarin and French. The present study is the first evaluation of the LENA system applied to Swedish, and focuses on the AWC estimate. Twelve five-minute segments were selected at random from each of four day-long recordings of 30-month-old children. Each of these 48 segments was transcribed by two transcribers, and both number of words and number of vowels were calculated (inter-transcriber reliability for words: r = .95, vowels: r = .93). Both counts correlated with the LENA system’s AWC estimate for the same segments (words: r = .67, vowels: r = .66). The reliability of the AWC as estimated by the LENA system when applied to Swedish is therefore comparable to its reliability for Spanish, Mandarin and French.
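
    The reliability figures above are per-segment Pearson correlations between human word counts and the LENA AWC estimate; a minimal sketch with hypothetical counts (the numbers are illustrative, not from the study):

    import numpy as np

    # Hypothetical per-segment word counts: two transcribers and the LENA AWC estimate
    transcriber_a = np.array([120, 80, 200, 45, 160, 90])
    transcriber_b = np.array([115, 85, 190, 50, 150, 95])
    lena_awc      = np.array([100, 70, 170, 60, 140, 110])

    human_mean = (transcriber_a + transcriber_b) / 2

    r_transcribers = np.corrcoef(transcriber_a, transcriber_b)[0, 1]   # inter-transcriber reliability
    r_lena = np.corrcoef(human_mean, lena_awc)[0, 1]                   # AWC validity
    print(round(r_transcribers, 2), round(r_lena, 2))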

  • 19.
    Marklund, Ellen
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Lacerda, Francisco
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Vowel categorization correlates with speech exposure in 8-month-olds (2017). Conference paper (Refereed)
    Abstract [en]

    During the first year of life, infants' ability to discriminate non-native speech contrasts attenuates, whereas their ability to discriminate native contrasts improves. This transition reflects the development of speech sound categorization, and is hypothesized to be modulated by exposure to spoken language. The ERP mismatch response has been used to quantify discrimination ability in infants, and its amplitude has been shown to be sensitive to amount of speech exposure on group level (Rivera-Gaxiola et al., 2011). In the present ERP-study, the difference in mismatch response amplitudes for spoken vowels and for spectrally rotated vowels quantifies categorization in 8-month-old infants (N=15, 7 girls). This categorization measure was tested for correlation with infants' daily exposure to male speech, female speech, and the sum of male and female speech, as measured by all-day home recordings and analyzed using LENA software. A positive correlation was found between the categorization measure and total amount of daily speech exposure (r = .526, p = .044). The present study is the first to report a relation between speech exposure and speech sound categorization in infants on subject level, and the first to compensate for the acoustic part of the mismatch response in this context.

  • 20.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Riad, Tomas
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Lexical Specification of Prosodic Information in Swedish: Evidence from Mismatch Negativity (2016). In: Frontiers in Neuroscience, ISSN 1662-4548, E-ISSN 1662-453X, Vol. 10, article id 533. Article in journal (Refereed)
    Abstract [en]

    Like that of many other Germanic languages, the stress system of Swedish has mainly undergone phonological analysis. Recently, however, researchers have begun to recognize the central role of morphology in these systems. Similar to the lexical specification of tonal accent, the Swedish stress system is claimed to be morphologically determined and morphemes are thus categorized as prosodically specified and prosodically unspecified. Prosodically specified morphemes bear stress information as part of their lexical representations and are classified as tonic (i.e., lexically stressed), pretonic and posttonic, whereas prosodically unspecified morphemes receive stress through a phonological rule that is right-edge oriented, but is sensitive to prosodic specification at that edge. The presence of prosodic specification is inferred from vowel quality and vowel quantity; if stress moves elsewhere, vowel quality and quantity change radically in phonologically stressed morphemes, whereas traces of stress remain in lexically stressed morphemes. The present study is the first to investigate whether stress is a lexical property of Swedish morphemes by comparing mismatch negativity (MMN) responses to vowel quality and quantity changes in phonologically stressed and lexically stressed words. In a passive oddball paradigm, 15 native speakers of Swedish were presented with standards and deviants, which differed from the standards in formant frequency and duration. Given that vowel quality and quantity changes are associated with morphological derivations only in phonologically stressed words, MMN responses are expected to be greater in phonologically stressed words than in lexically stressed words that lack such an association. The results indicated that the processing differences between phonologically and lexically stressed words were reflected in the amplitude and topography of MMN responses. Confirming the expectation, MMN amplitude was greater for the phonologically stressed word than for the lexically stressed word and showed a more widespread topographic distribution. The brain did not only detect vowel quality and quantity changes but also used them to activate memory traces associated with derivations. The present study therefore implies that morphology is directly involved in the Swedish stress system and that changes in phonological shape due to stress shift cue upcoming stress and potential addition of a morpheme.

  • 21.
    Wirén, Mats
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Nilsson Björkenstam, Kristina
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Grigonytė, Gintarė
    Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics.
    Cortes, Elisabet Eir
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Longitudinal Studies of Variation Sets in Child-directed Speech (2016). In: The 54th Annual Meeting of the Association for Computational Linguistics: Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning, Stroudsburg, PA, USA: Association for Computational Linguistics, 2016, p. 44-52. Conference paper (Refereed)
    Abstract [en]

    One of the characteristics of child-directed speech is its high degree of repetitiousness. Sequences of repetitious utterances with a constant intention, variation sets, have been shown to be correlated with children’s language acquisition. To obtain a baseline for the occurrences of variation sets in Swedish, we annotate 18 parent–child dyads using a generalised definition according to which the varying form may pertain not just to the wording but also to prosody and/or non-verbal cues. To facilitate further empirical investigation, we introduce a surface algorithm for automatic extraction of variation sets which is easily replicable and language-independent. We evaluate the algorithm on the Swedish gold standard, and use it for extracting variation sets in Croatian, English and Russian. We show that the proportion of variation sets in child-directed speech decreases consistently as a function of children's age across Swedish, Croatian, English and Russian.
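
    One plausible reading of a surface algorithm for variation-set extraction is a greedy grouping of consecutive utterances that share a word form; the heuristic and the example utterances below are illustrative simplifications, not necessarily the exact criteria used in the paper:

    def extract_variation_sets(utterances):
        """Greedy surface heuristic: consecutive utterances sharing at
        least one word form are grouped together; only groups with more
        than one utterance count as variation sets."""
        sets, current = [], []
        for utt in utterances:
            words = set(utt.lower().split())
            prev = set(current[-1].lower().split()) if current else set()
            if words & prev:
                current.append(utt)
            else:
                if len(current) > 1:
                    sets.append(current)
                current = [utt]
        if len(current) > 1:
            sets.append(current)
        return sets

    talk = ["titta en boll", "en röd boll", "boll till dig", "var är nallen?"]
    print(extract_variation_sets(talk))
    # [['titta en boll', 'en röd boll', 'boll till dig']]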

  • 22.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Perceptual correlates of Turkish word stress and their contribution to automatic lexical access: Evidence from early ERP components (2016). In: Frontiers in Neuroscience, ISSN 1662-4548, E-ISSN 1662-453X, Vol. 10, article id 7. Article in journal (Refereed)
    Abstract [en]

    Perceptual correlates of Turkish word stress and their contribution to lexical access were studied using the mismatch negativity (MMN) component in event-related potentials (ERPs). The MMN was expected to indicate if segmentally identical Turkish words were distinguished on the sole basis of prosodic features such as fundamental frequency (f0), spectral emphasis (SE) and duration. The salience of these features in lexical access was expected to be reflected in the amplitude of MMN responses. In a multi-deviant oddball paradigm, neural responses to changes in f0, SE, and duration individually, as well as to all three features combined, were recorded for words and pseudowords presented to 14 native speakers of Turkish. The word and pseudoword contrast was used to differentiate language-related effects from acoustic-change effects on the neural responses. First and in line with previous findings, the overall MMN was maximal over frontal and central scalp locations. Second, changes in prosodic features elicited neural responses both in words and pseudowords, confirming the brain’s automatic response to any change in auditory input. However, there were processing differences between the prosodic features, most significantly in f0: While f0 manipulation elicited a slightly right-lateralized frontally-maximal MMN in words, it elicited a frontal P3a in pseudowords. Considering that P3a is associated with involuntary allocation of attention to salient changes, the manipulations of f0 in the absence of lexical processing lead to an intentional evaluation of pitch change. f0 is therefore claimed to be lexically specified in Turkish. Rather than combined features, individual prosodic features differentiate language-related effects from acoustic-change effects. The present study confirms that segmentally identical words can be distinguished on the basis of prosodic information alone, and establishes the salience of f0 in lexical access.

  • 23.
    Schwarz, Iris-Corinna
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Dybäck, Matilda
    Royal Institute of Technology.
    Wallgren, Johanna
    Royal Institute of Technology.
    Uhlén, Inger
    Karolinska University Hospital.
    Pupil dilation indicates auditory signal detection: towards an objective hearing test based on eye-tracking (2016). Conference paper (Refereed)
    Abstract [en]

    Purpose: The long-term objective of this project is to develop an objective hearing threshold test that can be used in early infancy, using pupil dilation as an indicator of hearing. The study purposes are 1) to identify relevant time-windows for analysis of pupillary responses to various auditory stimuli in adults, and 2) to evaluate a trial-minus-baseline approach to deal with unrelated pupillary responses in adults. Method: Participants’ pupil size is recorded using a Tobii T120 Eye-tracker. In the first test, participants fixate on a blank screen while sound stimuli are presented. From this data, typical pupillary responses and the relevant analysis time-window are determined and used in future tests. In the second test, participants watch movie clips while sound stimuli are presented. Visually identical sound and no-sound trials will be compared in order to isolate the pupillary changes tied to hearing sound from those related to changes in brightness in the visual stimuli. Results and conclusion: Data is currently being collected. Results from the pilot study indicate that the pupillary response related to sound detection occurs at around 900 ms after stimulus onset, and that a trial-minus-baseline approach is a viable option to eliminate unrelated pupillary responses.
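
    A small sketch of the trial-minus-baseline idea described above: averaging visually identical sound and no-sound trials and subtracting them so that luminance-driven pupil changes cancel, leaving the sound-related component. All signals below are simulated for illustration:

    import numpy as np

    def sound_related_dilation(sound_trials, silent_trials):
        """Average visually identical sound and no-sound trials and
        subtract them so that luminance-driven pupil changes cancel,
        leaving the sound-related component. Both inputs are
        (n_trials, n_samples) arrays time-locked to (potential) sound onset."""
        return sound_trials.mean(axis=0) - silent_trials.mean(axis=0)

    rng = np.random.default_rng(0)
    t = np.linspace(0, 2, 120)                               # 2 s at 60 Hz
    luminance = 0.2 * np.sin(2 * np.pi * 0.5 * t)            # shared stimulus-driven change
    sound_effect = 0.05 * np.exp(-((t - 0.9) ** 2) / 0.05)   # peak around 900 ms
    sound = luminance + sound_effect + rng.normal(0, 0.01, (30, t.size))
    silent = luminance + rng.normal(0, 0.01, (30, t.size))

    diff = sound_related_dilation(sound, silent)
    print(t[diff.argmax()])   # latency of the recovered sound-related peak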

  • 24.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory belts and whistles: A preliminary study of breathing acoustics for turn-taking (2016). In: Proceedings Interspeech 2016, International Speech Communication Association, 2016, p. 510-514. Conference paper (Refereed)
    Abstract [en]

    This paper presents first results on using acoustic intensity of inhalations as a cue to speech initiation in spontaneous multiparty conversations. We demonstrate that inhalation intensity significantly differentiates between cycles coinciding with no speech activity, shorter (< 1 s) and longer stretches of speech. While the model fit is relatively weak, it is comparable to the fit of a model using kinematic features collected with Respiratory Inductance Plethysmography. We also show that incorporating both kinematic and acoustic features further improves the model. Given the ease of capturing breath acoustics, we consider the results to be a promising first step towards studying communicative functions of respiratory sounds. We discuss possible extensions to the data collection procedure with a view to improving predictive power of the model.

  • 25.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory turn-taking cues (2016). In: Proceedings Interspeech 2016: International Speech Communication Association, 2016 / [ed] ISCA, ISCA, 2016, p. 1275-1279. Conference paper (Refereed)
    Abstract [en]

    This paper investigates to what extent breathing can be used as a cue to turn-taking behaviour. The paper improves on existing accounts by considering all possible transitions between speaker states (silent, speaking, backchanneling) and by not relying on global speaker models. Instead, all features (including breathing range and resting expiratory level) are estimated in an incremental fashion using the left-hand context. We identify several inhalatory features relevant to turn-management, and assess the fit of models with these features as predictors of turn-taking behaviour.

  • 26.
    Kallioinen, Petter
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics. Lund University, Sweden.
    Olofsson, Jonas
    Stockholm University, Faculty of Social Sciences, Department of Psychology, Perception and psychophysics.
    Nakeva von Mentzer, Cecilia
    Lindgren, Magnus
    Ors, Marianne
    Sahlén, Birgitta S.
    Lyxell, Björn
    Engström, Elisabet
    Uhlén, Inger
    Semantic Processing in Deaf and Hard-of-Hearing Children: Large N400 Mismatch Effects in Brain Responses, Despite Poor Semantic Ability (2016). In: Frontiers in Psychology, ISSN 1664-1078, E-ISSN 1664-1078, Vol. 7, article id 1146. Article in journal (Refereed)
    Abstract [en]

    Difficulties in auditory and phonological processing affect semantic processing in speech comprehension for deaf and hard-of-hearing (DHH) children. However, little is known about brain responses related to semantic processing in this group. We investigated event-related potentials (ERPs) in DHH children with cochlear implants (CIs) and/or hearing aids (HAs), and in normally hearing controls (NH). We used a semantic priming task with spoken word primes followed by picture targets. In both DHH children and controls, cortical response differences between matching and mismatching targets revealed a typical N400 effect associated with semantic processing. Children with CI had the largest mismatch response despite poor semantic abilities overall; children with CI also had the largest ERP differentiation between mismatch types, with small effects in within-category mismatch trials (target from same category as prime) and large effects in between-category mismatch trials (where target is from a different category than prime), compared to matching trials. Children with NH and HA had similar responses to both mismatch types. While the large and differentiated ERP responses in the CI group were unexpected and should be interpreted with caution, the results could reflect less precision in semantic processing among children with CI, or a stronger reliance on predictive processing.

  • 27.
    Eriksson, Anders
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Bertinetto, Pier Marco
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Nodari, Rosalba
    Lenoci, Giovanna
    The Acoustics of Lexical Stress in Italian as a Function of Stress Level and Speaking Style (2016). In: Proceedings Interspeech 2016, International Speech Communication Association, 2016, p. 1059-1063. Conference paper (Refereed)
    Abstract [en]

    The study is part of a series of studies describing the acoustics of lexical stress in a way that should be applicable to any language. The present database of recordings includes Brazilian Portuguese, English, Estonian, German, French, Italian and Swedish. The acoustic parameters examined are F0-level, F0-variation, Duration, and Spectral Emphasis. Values for these parameters, computed for all vowels (a little over 24000 vowels for Italian), are the data upon which the analyses are based. All parameters are examined with respect to their correlation with Stress (primary, secondary, unstressed) and speaking Style (wordlist reading, phrase reading, spontaneous speech) and Sex of the speaker (female, male). For Italian, Duration was found to be the dominant factor by a wide margin, in agreement with previous studies. Spectral Emphasis was the second most important factor. Spectral Emphasis has not been studied previously for Italian but intensity, a related parameter, has been shown to correlate with stress. F0-level was also significantly correlated but not to the same degree. Speaker Sex turned out to be significant in many comparisons. The differences were, however, mainly a function of the degree to which a given parameter was used, not how it was used to signal lexical stress contrasts.

  • 28.
    Gerholm, Tove
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics. Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Pagmar, David
    Stockholm University, Faculty of Humanities, Department of Linguistics, General Linguistics.
    The MINT-project: Modeling infant language acquisition from parent-child interaction (2016). Conference paper (Refereed)
  • 29.
    Gerholm, Tove
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    The relation between modalities in spoken language acquisition: Preliminary results from the Swedish MINT-project (2016). Conference paper (Other academic)
  • 30.
    Forssén Renner, Lena
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    The surprised pupil: New perspectives in semantic processing research (2016). In: ISSBD 2016, 2016. Conference paper (Refereed)
    Abstract [en]

    In the research on semantic processing and brain activity, the N400-paradigm has long been known to reflect a reaction to unexpected events, for instance the incongruence between visual and verbal information when subjects are presented with a picture and a mismatching word. In the present study, we investigate whether an N400-like reaction to unexpected events can be captured with pupillometry. While earlier research has firmly established a connection between changes in pupil diameter and arousal, the findings have so far not been extended to the domain of semantic processing. Consequently, we measured pupil size change in reaction to a match or a mismatch between a picture and an auditorily presented word. We presented 120 trials to ten native speakers of Swedish. In each trial a picture was displayed for six seconds, and 2.5 seconds into the trial the word was played through loudspeakers. The picture and the word were matching in half of the trials, and all stimuli were common high-frequency monosyllabic Swedish words. For the analysis, the baseline pupil size at the sound playback onset was compared against the maximum pupil size in the following time window of 3.5 seconds. The results show a statistically significant difference (t(746) = -2.8, p < 0.01) between the conditions. In line with the hypothesis, the pupil was observed to dilate more in the incongruent condition (on average by 0.03 mm). While the results are preliminary, they suggest that pupillometry could be a viable alternative to existing methods in the field of language processing, for instance across different ages and clinical groups. In the future, we intend to validate the results on a larger sample of participants as well as expand the analysis with a functional analysis accounting for temporal changes in the data. This will allow locating temporal regions of greatest differences between the conditions.
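
    A minimal sketch of the reported analysis: per-trial baseline pupil size at word onset versus maximum pupil size in the following 3.5 s window, compared between conditions. The values below are simulated, and the independent-samples t-test is an assumed stand-in for the pooled-trial comparison implied by t(746):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    # Hypothetical per-trial values: pupil size at word onset (baseline) and
    # maximum pupil size within the following 3.5 s window, in millimetres
    baseline = rng.normal(3.0, 0.2, 120)
    max_congruent = baseline[:60] + rng.normal(0.05, 0.03, 60)
    max_incongruent = baseline[60:] + rng.normal(0.08, 0.03, 60)

    # Dilation relative to baseline, compared between conditions across trials
    d_congruent = max_congruent - baseline[:60]
    d_incongruent = max_incongruent - baseline[60:]
    t_value, p_value = stats.ttest_ind(d_incongruent, d_congruent)
    print(round(t_value, 2), round(p_value, 4))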

  • 31.
    Renner, Lena
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Kallioinen, Petter
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Markelius, Marie
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Sundberg, Ulla
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Brain responses to typical mispronunciations among toddlers (2015). Conference paper (Refereed)
    Abstract [en]

    In first language acquisition research, investigations on the semantics and lexicon of the child are often conducted by measuring brain activity at the surface of the scalp (EEG). Such EEG studies have shown different brain reactions to matching and mismatching pairs of pictures and words from 19-month-olds (Friedrich & Friederici, 2005). Similarly, results from 20-month-olds exposed to auditory stimuli only indicated different brain reactions to correct pronunciations and mispronunciations (Mills et al., 2004). However, these studies do not take the typical production patterns in that specific age into account.

    In the present study, we measured brain reactions of thirteen 24-month-olds exposed to pairs of pictures and words in four different conditions: correctly pronounced words, two different kinds of mispronounced words, and novel words. The first type of mispronunciations (M1) consisted of minor mispronunciations consistent with typical production patterns in first language acquisition, e.g. ‘ko’ instead of ‘sko’ (shoe). The second type (M2) was characterized by phonological changes that are not expected at 24 months, e.g. ‘fo’ instead of ‘sko’ (shoe). The novel words consisted of phonotactically possible Swedish non-words.

    A principal component analysis (PCA) decomposition of the EEG data showed two patterns of posterior negativity typical of lexical-semantic processing: one for novel words in comparison to the other conditions, and the other for novel and M2 word forms compared to M1 and correct word forms. These results indicate that M1 forms are processed similarly to correct word forms, and that M2 and novel words are processed alike. However, while these patterns were visually salient in successive components, the results were not statistically significant. We suspect that the non-significant results were due to the small dataset. Nevertheless, this study contributes to the discussion on the relationship between perception and production in first language acquisition.
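    As a rough illustration of the kind of decomposition mentioned above, a temporal PCA over averaged ERP waveforms can be run with standard tools. The array layout, component count, and the use of scikit-learn below are assumptions; the study's actual pipeline may differ.

```python
# Rough sketch of a temporal PCA over subject-averaged ERP waveforms.
# Assumed layout: one waveform per condition x channel, shaped (observations, time points).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_obs, n_times = 4 * 32, 300                # e.g. 4 conditions x 32 channels, 300 samples
erps = rng.normal(size=(n_obs, n_times))    # placeholder for real averaged waveforms

pca = PCA(n_components=5)
scores = pca.fit_transform(erps)            # component scores per condition/channel
loadings = pca.components_                  # temporal loadings, one waveform per component

print("explained variance ratios:", pca.explained_variance_ratio_.round(3))
# Condition effects (e.g. novel vs. M1/M2/correct word forms) would then be tested on `scores`.
```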

  • 32.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Edlund, Jens
    Breathing in Conversation: An Unwritten History2015In: Proceedings of the 2nd European and the 5th Nordic Symposium on Multimodal Communication / [ed] Kristiina Jokinen, Martin Vels, Linköping, 2015, p. 107-112Conference paper (Refereed)
    Abstract [en]

    This paper attempts to draw the attention of the multimodal communication research community to what we consider a long overdue topic, namely respiratory activity in conversation. We submit that a turn towards spontaneous interaction is a natural extension of the recent interest in speech breathing, and is likely to offer valuable insights into the mechanisms underlying the organisation of interaction and collaborative human action in general, as well as to advance existing speech technology applications. Particular focus is placed on the role of breathing as a perceptually and interactionally salient turn-taking cue. We also present the recording setup developed in the Phonetics Laboratory at Stockholm University with the aim of studying communicative functions of physiological and audio-visual breathing correlates in spontaneous multiparty interactions.

  • 33.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Edlund, Jens
    Communicative needs and respiratory constraints2015In: 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015): Speech Beyond Speech Towards a Better Understanding of the Most Important Biosignal, 2015, p. 3051-3055Conference paper (Refereed)
    Abstract [en]

    This study investigates the timing of communicative behaviour with respect to the speaker’s respiratory cycle. The data are drawn from a corpus of multiparty conversations in Swedish. We find that while longer utterances (> 1 s) are tied, predictably, primarily to exhalation onset, shorter vocalisations are spread more uniformly across the respiratory cycle. In addition, nods, which are free from any respiratory constraints, are most frequently found around exhalation offsets, where respiratory requirements for even a short utterance are not satisfied. We interpret the results as reflecting the economy principle in speech production, whereby respiratory effort, associated primarily with starting a new respiratory cycle, is minimised within the scope of the speaker’s communicative goals.
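    The central measurement here, where in the respiratory cycle an event such as a vocalisation onset or a nod falls, can be sketched as a relative-phase computation. The cycle representation below (one cycle onset to the next) and the event format are assumptions made for illustration, not the study's procedure.

```python
# Sketch of placing an event within its respiratory cycle as a relative position
# between 0 (at the assumed cycle start) and 1 (at the start of the next cycle).
from bisect import bisect_right

def cycle_position(event_time: float, cycle_onsets: list[float]) -> float | None:
    """Relative position of an event within its respiratory cycle, or None if outside the data."""
    i = bisect_right(cycle_onsets, event_time) - 1
    if i < 0 or i + 1 >= len(cycle_onsets):
        return None
    start, end = cycle_onsets[i], cycle_onsets[i + 1]
    return (event_time - start) / (end - start)

# Example: cycle onsets in seconds and an event occurring late in the second cycle.
onsets = [0.0, 3.8, 7.5, 11.2]
print(round(cycle_position(7.0, onsets), 2))  # 0.86, i.e. near the end of that cycle
```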

  • 34. Lim, Sung-Joo
    et al.
    Lacerda, Francisco
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Holt, Lori L.
    Discovering Functional Units in Continuous Speech2015In: Journal of Experimental Psychology: Human Perception and Performance, ISSN 0096-1523, E-ISSN 1939-1277, Vol. 41, no 4, p. 1139-1152Article in journal (Refereed)
    Abstract [en]

    Language learning requires that listeners discover acoustically variable functional units like phonetic categories and words from an unfamiliar, continuous acoustic stream. Although many category learning studies have examined how listeners learn to generalize across the acoustic variability inherent in the signals that convey the functional units of language, these studies have tended to focus upon category learning across isolated sound exemplars. However, continuous input presents many additional learning challenges that may impact category learning. Listeners may not know the timescale of the functional unit, its relative position in the continuous input, or its relationship to other evolving input regularities. Moving laboratory-based studies of isolated category exemplars toward more natural input is important to modeling language learning, but very little is known about how listeners discover categories embedded in continuous sound. In 3 experiments, adult participants heard acoustically variable sound category instances embedded in acoustically variable and unfamiliar sound streams within a video game task. This task was inherently rich in multisensory regularities with the to-be-learned categories and likely to engage procedural learning without requiring explicit categorization, segmentation, or even attention to the sounds. After 100 min of game play, participants categorized familiar sound streams in which target words were embedded and generalized this learning to novel streams as well as isolated instances of the target words. The findings demonstrate that even without a priori knowledge, listeners can discover input regularities that have the best predictive control over the environment for both non-native speech and nonspeech signals, emphasizing the generality of the learning.

  • 35. Eyben, Florian
    et al.
    Salomão, Gláucia Laís
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics. KTH (Royal Institute of Technology), Sweden.
    Sundberg, Johan
    Scherer, Klaus R.
    Schuller, Björn W.
    Emotion in the singing voice—a deeper look at acoustic features in the light of automatic classification2015In: EURASIP Journal on Audio, Speech, and Music Processing, ISSN 1687-4714, E-ISSN 1687-4722, article id 19Article in journal (Refereed)
    Abstract [en]

    We investigate the automatic recognition of emotions in the singing voice and study the worth and role of a variety of relevant acoustic parameters. The data set contains phrases and vocalises sung by eight renowned professional opera singers in ten different emotions and a neutral state. The states are mapped to ternary arousal and valence labels. We propose a small set of relevant acoustic features based on our previous findings on the same data and compare it with a large-scale state-of-the-art feature set for paralinguistics recognition, the baseline feature set of the Interspeech 2013 Computational Paralinguistics ChallengE (ComParE). A feature importance analysis with respect to classification accuracy and correlation of features with the targets is provided in the paper. Results show that the classification performance with both feature sets is similar for arousal, while the ComParE set is superior for valence. Intra-singer feature ranking criteria further improve the classification accuracy significantly in a leave-one-singer-out cross-validation.
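    Leave-one-singer-out cross-validation, as mentioned above, is a grouped cross-validation in which each fold holds out all material from one singer. The sketch below uses scikit-learn's LeaveOneGroupOut with a generic linear-SVM pipeline; the classifier, feature dimensions, and placeholder data are assumptions, not the paper's actual setup.

```python
# Sketch of leave-one-singer-out cross-validation with a generic classifier.
# X, y and the singer grouping are placeholders for real acoustic features and labels.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 20))               # placeholder acoustic features
y = rng.integers(0, 3, size=80)             # ternary arousal (or valence) labels
singers = np.repeat(np.arange(8), 10)       # 8 singers, 10 samples each

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(clf, X, y, groups=singers, cv=LeaveOneGroupOut())
print("per-singer accuracies:", scores.round(2), "mean:", round(scores.mean(), 2))
```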

  • 36.
    Thöny, Luzius
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics. Universität Zürich, Schweiz.
    Etymologisches Wörterbuch des Althochdeutschen, unter der Leitung von Rosemarie Lühr erarbeitet von Harald Bichlmeier, Maria Kozianka und Roland Schuhmann mit Beiträgen von Albert L. Lloyd unter Mitarbeit von Karen K. Purdy, Bd. V: iba – luzzilo, Göttingen 20142015In: Zeitschrift für deutsches Altertum und deutsche Literatur, ISSN 0044-2518, Vol. 144, no 3, p. 384-389Article, book review (Other academic)
  • 37.
    Aare, Kätlin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Inhalation amplitude and turn-taking in spontaneous Estonian conversations2015In: Proceedings from Fonetik 2015 Lund, June 8-10, 2015 / [ed] Malin Svensson Lundmark, Gilbert Ambrazaitis, Joost van de Weijer, Lund: Lund University , 2015, p. 1-5Conference paper (Other academic)
    Abstract [en]

    This study explores the relationship between inhalation amplitude and turn management in four approximately 20-minute-long spontaneous multiparty conversations in Estonian. The main focus of interest is whether inhalation amplitude is greater before turn onset than in the following inhalations within the same speaking turn. The results show that inhalations directly before turn onset are greater in amplitude than those later in the turn. The difference seems to be realized by ending the inhalation at a greater lung volume, whereas the initial lung volume before inhalation onset remains roughly the same across a single turn. The findings suggest that the increased inhalation amplitude could function as a cue for claiming the conversational floor.
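    The amplitude comparison described above amounts to measuring the lung-volume rise of each inhalation and splitting inhalations by their position in the turn. The sketch below assumes a table with per-inhalation onset and offset lung volumes and a turn-initial flag, and uses a one-sided Mann-Whitney test; none of these choices are taken from the study.

```python
# Sketch of comparing inhalation amplitude (offset minus onset lung volume) between
# inhalations directly before turn onset and later inhalations within the same turn.
import pandas as pd
from scipy import stats

def amplitude_by_position(inhalations: pd.DataFrame):
    """inhalations: columns 'onset_volume', 'offset_volume', 'turn_initial' (bool)."""
    df = inhalations.assign(amplitude=lambda d: d["offset_volume"] - d["onset_volume"])
    initial = df.loc[df["turn_initial"], "amplitude"]
    medial = df.loc[~df["turn_initial"], "amplitude"]
    # One-sided test of the hypothesis that turn-initial inhalations are larger.
    return stats.mannwhitneyu(initial, medial, alternative="greater")
```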

  • 38.
    Traunmüller, Hartmut
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    La plej vaste uzataj vortoj2015Conference paper (Other academic)
  • 39.
    Zora, Hatice
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Neural correlates of lexical stress: mismatch negativity reflects fundamental frequency and intensity2015In: NeuroReport, ISSN 0959-4965, E-ISSN 1473-558X, Vol. 26, no 13, p. 791-796Article in journal (Refereed)
    Abstract [en]

    Neural correlates of lexical stress were studied using the mismatch negativity (MMN) component in event-related potentials. The MMN responses were expected to reveal the encoding of stress information into long-term memory and the contributions of prosodic features such as fundamental frequency (F0) and intensity toward lexical access. In a passive oddball paradigm, neural responses to changes in F0, intensity, and in both features together were recorded for words and pseudowords. The findings showed significant differences not only between words and pseudowords but also between prosodic features. Early processing of prosodic information in words was indexed by an intensity-related MMN and an F0-related P200. These effects were stable at right-anterior and mid-anterior regions. At a later latency, MMN responses were recorded for both words and pseudowords at the mid-anterior and posterior regions. The P200 effect observed for F0 at the early latency for words developed into an MMN response. Intensity elicited smaller MMN for pseudowords than for words. Moreover, a larger brain area was recruited for the processing of words than for the processing of pseudowords. These findings suggest earlier and higher sensitivity to prosodic changes in words than in pseudowords, reflecting a language-related process. The present study, therefore, not only establishes neural correlates of lexical stress but also confirms the presence of long-term memory traces for prosodic information in the brain.
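    As a generic illustration of how an MMN is commonly quantified, the sketch below computes a deviant-minus-standard difference wave and averages it over a latency window. The sampling rate, window, and synthetic waveforms are illustrative assumptions, not the parameters of the study above.

```python
# Generic sketch: MMN amplitude as the mean of the deviant-minus-standard
# difference wave within a latency window (single channel, synthetic data).
import numpy as np

def mmn_amplitude(standard_erp: np.ndarray, deviant_erp: np.ndarray,
                  times_s: np.ndarray, window=(0.15, 0.25)) -> float:
    """Mean difference-wave amplitude (deviant - standard) in the given latency window."""
    diff = deviant_erp - standard_erp
    mask = (times_s >= window[0]) & (times_s <= window[1])
    return float(diff[mask].mean())

times = np.arange(-0.1, 0.5, 0.002)                                 # 500 Hz sampling
standard = np.zeros_like(times)
deviant = -1.5 * np.exp(-((times - 0.2) ** 2) / (2 * 0.03 ** 2))    # negativity around 200 ms
print(round(mmn_amplitude(standard, deviant, times), 2), "microvolts in the 150-250 ms window")
```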

  • 40.
    Marklund, Ulrika
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Marklund, Ellen
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Lacerda, Francisco
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Schwarz, Iris-Corinna
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Pause and utterance duration in child-directed speech in relation to child vocabulary size2015In: Journal of Child Language, ISSN 0305-0009, E-ISSN 1469-7602, Vol. 42, no 5, p. 1158-1171Article in journal (Refereed)
    Abstract [en]

    This study compares parental pause and utterance duration in conversations with Swedish-speaking children at age 1;6 who have either a large, typical, or small expressive vocabulary, as measured by the Swedish version of the MacArthur-Bates CDI. The adjustments that parents make when they speak to children are similar across all three vocabulary groups; they use longer utterances than when speaking to adults, and respond faster to children than they do to other adults. However, overall pause duration varies with the vocabulary size of the children, and as a result durational aspects of the language environment to which the children are exposed differ between groups. Parents of children in the large vocabulary size group respond faster to child utterances than do parents of children in the typical vocabulary size group, who in turn respond faster to child utterances than do parents of children in the small vocabulary size group.

  • 41.
    Renner, Lena
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Strandberg, Andrea
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Phonological templates in Swedish 18-month-old children in relation to vocabulary size2015Conference paper (Refereed)
    Abstract [en]

    The relationship between phonology and lexicon in first language acquisition has been of interest to many researchers in recent years (for a review, see [1]). Both perception and production studies have been conducted to investigate each of these areas. Among the speech production studies, phonological templates have been proposed as an account of how children acquire words. Phonological templates are child-specific word form patterns, such as consonant harmony, which children frequently use. In projecting the phonological template onto adult word forms, the child adapts new words to fit his or her own preferred production pattern [2].

    In the present study, we investigate phonological templates in spontaneous speech from 12 Swedish 18-month-old children. The phonological templates are also related to each child’s vocabulary size, based on the Swedish version of the MacArthur-Bates Communicative Development Inventory (CDI) [3]. The participants included four children with a vocabulary size above 100 words, three with a vocabulary size between 50 and 100 words, and five children with a vocabulary size below 50 words. The tentative findings indicate that only those children with a vocabulary size above 100 words show phonological templates, pointing to a relationship between lexical and phonological development in speech production. The results are discussed in relation to the existence of phonological templates in general and to the increased probability of the occurrence of phonological templates in a specific window of vocabulary size.


  • 42.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Pitch Slope and End Point as Turn-Taking Cues in Swedish2015In: Proceedings of the 18th International Congress of Phonetic Sciences / [ed] Maria Wolters, Judy Livingstone, Bernie Beattie, Rachel Smith, Mike MacMahon, Jane Stuart-Smith, Jim Scobbie, Glasgow: University of Glasgow , 2015Conference paper (Refereed)
    Abstract [en]

    This paper examines the relevance of parameters related to slope and end-point of pitch segments for indicating turn-taking intentions in Swedish. Perceptually motivated stylization in Prosogram was used to characterize the last pitch segment in talkspurts involved in floor-keeping and turn-yielding events. The results suggest a limited contribution of pitch pattern direction and position of its endpoint in the speaker’s pitch range to signaling turn-taking intentions in Swedish.

  • 43. Šimko, Juraj
    et al.
    Aalto, Daniel
    Lippus, Pärtel
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Vainio, Martti
    Pitch, perceived duration and auditory biases: Comparison among languages2015In: Proceedings of the 18th International Congress of Phonetic Sciences / [ed] Maria Wolters, Judy Livingstone, Bernie Beattie, Rachel Smith, Mike MacMahon, Jane Stuart-Smith, Jim Scobbie, Glasgow: University of Glasgow , 2015Conference paper (Refereed)
    Abstract [en]

    In addition to fundamental frequency height, fundamental frequency movement is also generally assumed to lengthen the perceived duration of syllable-like sounds. The lengthening effect has been observed for some languages (US English, French, Swiss German, Japanese) but reported to be absent for others (Thai, Latin American Spanish, German). In this work, native speakers of Estonian, Finnish, Mandarin and Swedish performed a two-alternative forced choice duration discrimination experiment with pairs of complex tones varying in several acoustic dimensions. According to a logistic regression analysis, the duration judgements are affected by intensity, f0 level, and f0 movement in all languages, but the strength of these influences varies across languages, and the pattern of relative strengths correlates with phonological properties of the languages. The findings are discussed in the light of current hypotheses on the origin of the pitch modulation of perceived duration.
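    The kind of analysis described above can be illustrated with a small simulation: a logistic regression of two-alternative forced choice responses on the acoustic differences between the paired tones. The predictor names, effect sizes, and the use of scikit-learn below are illustrative assumptions and do not reproduce the paper's data or model.

```python
# Sketch of a logistic regression of 2AFC duration judgements on differences in
# duration, intensity, f0 level and f0 movement between the two tones (simulated data).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 400
X = pd.DataFrame({
    "dur_diff": rng.normal(0, 20, n),         # ms, second minus first tone
    "intensity_diff": rng.normal(0, 3, n),    # dB
    "f0_level_diff": rng.normal(0, 5, n),     # Hz
    "f0_move_diff": rng.integers(-1, 2, n),   # -1 / 0 / 1 movement contrast
})
# Simulated listener: duration drives the judgement; intensity and f0 bias it.
logit = (0.15 * X["dur_diff"] + 0.3 * X["intensity_diff"]
         + 0.2 * X["f0_level_diff"] + 0.5 * X["f0_move_diff"])
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # 1 = "second tone judged longer"

clf = LogisticRegression().fit(X, y)
print(dict(zip(X.columns, clf.coef_[0].round(3))))  # positive coefficients favour "longer"
```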

  • 44.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory Properties of Backchannels in Spontaneous Multiparty Conversation2015In: Proceedings of the 18th International Congress of Phonetic Sciences / [ed] Maria Wolters, Judy Livingstone, Bernie Beattie, Rachel Smith, Mike MacMahon, Jane Stuart-Smith, Jim Scobbie, Glasgow: University of Glasgow , 2015Conference paper (Refereed)
    Abstract [en]

    In this paper we report the first results of a newly started project focussing on interactional functions of breathing in spontaneous multiparty conversation. Specifically, we investigate respiratory patterns associated with backchannels (short feedback expressions), and compare them with breathing cycles observed during longer stretches of speech or while listening to the interlocutor’s speech. Overall, inhalations preceding backchannels were found to resemble those in quiet breathing to a large degree. The results are discussed in terms of temporal organisation and respiratory planning in these utterances.

  • 45. Von Mentzer, Cecilia Nakeva
    et al.
    Lyxell, Björn
    Sahlén, Birgitta
    Dahlström, Örjan
    Lindgren, Magnus
    Ors, Marianne
    Kallioinen, Petter
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Engström, Elisabet
    Uhlén, Inger
    Segmental and suprasegmental properties in nonword repetition - An explorative study of the associations with nonword decoding in children with normal hearing and children with bilateral cochlear implants2015In: Clinical Linguistics & Phonetics, ISSN 0269-9206, E-ISSN 1464-5076, Vol. 29, no 3, p. 216-235Article in journal (Refereed)
    Abstract [en]

    This study explored nonword repetition (NWR) and nonword decoding in normal-hearing (NH) children and in children with bilateral cochlear implants (CI). Participants were 11 children with CI, aged 5;0-7;11 years (M = 6.5 years), and 11 NH children, individually age-matched to the children with CI. This study fills an important gap in research, since it thoroughly describes detailed aspects of NWR and nonword decoding and their possible associations. All children were assessed after having practiced with a computer-assisted reading intervention with a phonics approach during four weeks. Results showed that NH children outperformed children with CI on the majority of aspects of NWR. The analysis of syllable number in NWR revealed that children with CI made more syllable omissions than did the NH children, and predominantly in prestressed positions. In addition, the consonant cluster analysis in NWR showed significantly more consonant omissions and substitutions in children with CI, suggesting that reaching fine-grained levels of phonological processing was particularly difficult for these children. No significant difference was found for nonword-decoding accuracy between the groups, as measured by whole words correct and phonemes correct, but differences were observed regarding error patterns. In children with CI, phoneme deletions occurred significantly more often than in children with NH. The correlation analysis revealed that the ability to repeat consonant clusters in NWR had the strongest associations with nonword decoding in both groups. The absence of equally frequent significant associations between NWR and nonword decoding in children with CI compared to children with NH suggests that these children partly use other decoding strategies to compensate for less precise phonological knowledge, for example lexicalizations in nonword decoding, that is, turning a nonword into a real word.

  • 46.
    Eriksson, Anders
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Syllable prominence: An experimental study2015In: Lingue e linguaggio, ISSN 1720-9331, Vol. XIV, no 1, p. 43-60Article in journal (Refereed)
    Abstract [en]

    There are many studies of word stress (or lexical stress) in different languages. One problem when comparing the acoustics of word stress across languages is that the studies are often designed in such a way that the results are not immediately comparable. One goal of the project described here is to develop a framework for analysing the acoustics of word stress that can be applied in the same way to any language. A second goal is to examine the perception of syllable prominence as a cue to lexical stress perception. The acoustic properties are obviously a factor to be considered, but we have reason to believe, based on results from a previous experiment (Eriksson et al. 2002), that the native language of the listener may also influence perceived prominence and thus lexical stress perception. The languages included in the study so far are Brazilian Portuguese, English, Estonian, French, German, Italian and Swedish. At present, only the Swedish material has been analysed using the complete set of recordings. In this paper I will therefore give a full presentation of the Swedish results only. Results based on subsets of the data from the other languages (usually 10 speakers) will be referred to as “preliminary results”. Some of these results have been presented in more detail in conference proceedings (see references).

  • 47. Hammarsten, Jonna
    et al.
    Harris, Roxanne
    Henriksson, Nilla
    Pano, Isabelle
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Temporal aspects of breathing and turn-taking in Swedish multiparty conversations2015In: Proceedings from Fonetik 2015 / [ed] Malin Svensson Lundmark, Gilbert Ambrazaitis, Joost van de Weijer, Lund: Centre for Languages and Literature, 2015, p. 47-50Conference paper (Other academic)
    Abstract [en]

    Interlocutors use various signals to make conversations flow smoothly. Recent research has shown that respiration is one of the signals used to indicate the intention to start speaking. In this study, we investigate whether inhalation duration and speech onset delay within one’s own turn differ from those when a new turn is initiated. Respiratory activity was recorded in two three-party conversations using Respiratory Inductance Plethysmography. Inhalations were categorised depending on whether they coincided with within-speaker silences or with between-speaker silences. Results showed that within-turn inhalation durations were shorter than inhalations preceding new turns. Similarly, speech onset delays were shorter within turns than before new turns. Both these results suggest that speakers ‘speed up’ preparation for speech inside turns, probably to indicate that they intend to continue.
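    Categorising inhalations by the kind of silence they coincide with is essentially an interval-overlap check. The interval representation and the overlap criterion in the sketch below are assumptions for illustration, not the procedure used in the study.

```python
# Sketch of sorting inhalations by whether they coincide with within-speaker or
# between-speaker silences, and collecting their durations for comparison.
def overlaps(a_start, a_end, b_start, b_end):
    """True if the two time intervals overlap."""
    return a_start < b_end and b_start < a_end

def categorise(inhalations, within_silences, between_silences):
    """Each argument is a list of (start, end) times in seconds; returns two duration lists."""
    within_durs, between_durs = [], []
    for start, end in inhalations:
        if any(overlaps(start, end, s, e) for s, e in within_silences):
            within_durs.append(end - start)
        elif any(overlaps(start, end, s, e) for s, e in between_silences):
            between_durs.append(end - start)
    return within_durs, between_durs

# Toy example: one inhalation inside the speaker's own turn, one before a new turn.
within, between = categorise([(1.0, 1.5), (5.0, 6.0)], [(0.9, 1.6)], [(4.8, 6.2)])
print(within, between)  # [0.5] [1.0] -> the within-turn inhalation is shorter
```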

  • 48.
    Eriksson, Anders
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    The acoustics of word stress in English as a function of stress level and speaking style2015In: 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015): Speech Beyond Speech Towards a Better Understanding of the Most Important Biosignal, 2015, p. 41-45Conference paper (Refereed)
    Abstract [en]

    This study of lexical stress in English is part of a series of studies, the goal of which is to describe the acoustics of lexical stress for a number of typologically different languages. When fully developed, the methodology should be applicable to any language. The database of recordings so far includes Brazilian Portuguese, English (U.K.), Estonian, German, French, Italian and Swedish. The acoustic parameters examined are f0-level, f0-variation, Duration, and Spectral Emphasis. Values for these parameters, computed for all vowels, are the data upon which the analyses are based. All parameters are tested with respect to their correlation with stress level (primary, secondary, unstressed) and speaking style (wordlist reading, phrase reading, spontaneous speech). For the English data, the most robust results concerning stress level are found for Duration and Spectral Emphasis. f0-level is also significantly correlated, but not quite to the same degree. The acoustic effect of phonological secondary stress was significantly different from primary stress only for Duration. In the statistical tests, speaker sex turned out to be significant in most cases. Detailed examination showed, however, that the difference was mainly in the degree to which a given parameter was used, not in how it was used to signal lexical stress contrasts.
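    The per-parameter testing against stress level described above can be sketched as a simple grouped comparison over a per-vowel table. The column names and the use of a one-way ANOVA below are assumptions; the study's statistical models may well differ.

```python
# Sketch: testing one acoustic parameter at a time across stress levels
# (primary / secondary / unstressed), given a per-vowel table of measurements.
import pandas as pd
from scipy import stats

def test_parameter(vowels: pd.DataFrame, parameter: str):
    """One-way ANOVA of a parameter across the levels of the 'stress_level' column."""
    groups = [g[parameter].to_numpy() for _, g in vowels.groupby("stress_level")]
    return stats.f_oneway(*groups)

# Expected columns: 'stress_level', 'duration_ms', 'f0_level', 'f0_variation',
# 'spectral_emphasis'; one row per measured vowel.
# for p in ["duration_ms", "f0_level", "f0_variation", "spectral_emphasis"]:
#     print(p, test_parameter(vowels, p))
```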

  • 49.
    Gerholm, Tove