Publications (10 of 95)
Włodarczak, M., Ludusan, B., Sundberg, J. & Heldner, M. (2025). Classification of voice quality using neck-surface acceleration: Comparison with glottal flow and radiated sound. Journal of Voice, 39(1), 10-24
2025 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 39, no 1, p. 10-24. Article in journal (Refereed). Published
Abstract [en]

Objectives: The aim of the present study is to investigate the usefulness of features extracted from miniature accelerometers attached to the speaker's tracheal wall below the glottis for classification of phonation type. The performance of the accelerometer features is evaluated relative to features obtained from inverse filtered and radiated sound. While the former is a good proxy for the voice source, obtaining robust voice source features from the latter is considered difficult since it also contains information about the vocal tract filter. By contrast, the accelerometer signal is largely unaffected by the vocal tract and, although it is shaped by subglottal resonances and the transfer properties of the neck tissue, these properties remain constant within a speaker. For this reason, we expect it to provide a better approximation of the voice source than the raw audio. We also investigate which aspects of the voice source are derivable from the accelerometer and microphone signals.

Methods: Five trained singers (two females and three males) were recorded producing the syllable [pæ:] in three voice qualities (neutral, breathy and pressed) and at three pitch levels as determined by the participants’ personal preference. Features extracted from the three signals were used for classification of phonation type using a random forest classifier. In addition, accelerometer and microphone features with the highest correlation with the voice source features were identified.

Results: The three signals showed comparable classification error rates, with considerable differences across speakers both with respect to the overall performance and the importance of individual features. The speaker-specific differences notwithstanding, variation of phonation type had consistent effects on the voice source, accelerometer and audio signals. With regard to the voice source, AQ, NAQ, L1L2 and CQ all showed a monotonic variation along the breathy – neutral – pressed continuum. Several features were also found to vary systematically in the accelerometer and audio signals: HRF, L1L2 and CPPS (both the accelerometer and the audio), as well as the sound level (for the audio). The random forest analysis revealed that all of these features were also among the most important for the classification of voice quality.

Conclusion: Both the accelerometer and the audio signals were found to discriminate between phonation types with an accuracy approaching that of the voice source. Thus, the accelerometer signal, which is largely uncontaminated by vocal tract resonances, offered no advantage over the signal collected with a normal microphone.

Keywords
accelerometer, audio, phonation type classification, voice source
National Category
Natural Language Processing
Identifiers
urn:nbn:se:su:diva-212725 (URN); 10.1016/j.jvoice.2022.06.034 (DOI); 001414592600001 (); 36028369 (PubMedID); 2-s2.0-85136510333 (Scopus ID)
Available from: 2023-01-11 Created: 2023-01-11 Last updated: 2025-02-20 Bibliographically approved
Wikse Barrow, C., Strömbergsson, S., Włodarczak, M. & Heldner, M. (2024). Individual variation in the realisation and contrast of Swedish children’s word-initial voiceless fricatives. Journal of Phonetics, 106, Article ID 101351.
2024 (English). In: Journal of Phonetics, ISSN 0095-4470, E-ISSN 1095-8576, Vol. 106, article id 101351. Article in journal (Refereed). Published
Abstract [en]

In this study, we explore individual variation and contrast in Swedish children’s voiceless fricatives. Thirty-one children between three and eight years of age participated in a picture-prompted word repetition task, wherein they repeated fricative-initial words in a variety of vowel contexts. The fricatives were transcribed and acoustically analysed, using spectral moments 1–4, spectral peak and spectral balance measures. Random forests were used to estimate the relative importance of each spectral feature in the classification of correct fricative productions, as well as to measure robustness of the late-emerging contrast between sibilants [s] and [ɕ] in individual children. Transcription analysis revealed that substitutions involving a more anterior place of articulation were common. Acoustic analysis showed individual differences in variability and contrast in the children’s fricative systems across and within age groups. Cue weighting of spectral characteristics in classification was similar in all age groups for correct productions, while the magnitude of the acoustic contrast between sibilants increased with age. This paper provides a description of individual variation in Swedish children’s acquisition of fricatives which can inform future large-scale speech-acquisition research.

 

Keywords
Speech acquisition, Fricatives, Acoustic analysis, Speech-language development, Phonological development, Swedish
National Category
General Language Studies and Linguistics
Identifiers
urn:nbn:se:su:diva-231894 (URN); 10.1016/j.wocn.2024.101351 (DOI); 001288755000001 (); 2-s2.0-85200234851 (Scopus ID)
Available from: 2024-07-03 Created: 2024-07-03 Last updated: 2025-01-15 Bibliographically approved
Zora, H., Bowin, H., Heldner, M., Riad, T. & Hagoort, P. (2024). The role of pitch accent in discourse comprehension and the markedness of Accent 2 in Central Swedish. In: Proceedings Speech Prosody 2024. Paper presented at Speech Prosody 2024, Leiden, The Netherlands, July 2-5, 2024 (pp. 921-925). Leiden: The International Speech Communication Association (ISCA)
2024 (English). In: Proceedings Speech Prosody 2024, Leiden: The International Speech Communication Association (ISCA), 2024, p. 921-925. Conference paper, Published paper (Refereed)
Abstract [en]

In Swedish, words are associated with either of two pitch contours known as Accent 1 and Accent 2. Using a psychometric test, we investigated how listeners judge pitch accent violations while interpreting discourse. Forty native speakers of Central Swedish were presented with auditory dialogues, where test words were appropriately or inappropriately accented in a given context, and asked to judge the correctness of sentences containing the test words. Data indicated a statistically significant effect of wrong accent pattern on the correctness judgment. Both Accent 1 and Accent 2 violations interfered with the coherent interpretation of discourse and were judged as incorrect by the listeners. Moreover, there was a statistically significant difference in the perceived correctness between the accent patterns. Accent 2 violations led to a lower correctness score compared to Accent 1 violations, indicating that the listeners were more sensitive to pitch accent violations in Accent 2 words than in Accent 1 words. This result is in line with the notion that Accent 2 is marked and lexically represented in Central Swedish. Taken together, these findings indicate that listeners use both Accent 1 and Accent 2 to arrive at the correct interpretation of the linguistic input, while assigning varying degrees of relevance to them depending on their markedness.

Place, publisher, year, edition, pages
Leiden: The International Speech Communication Association (ISCA), 2024
National Category
Comparative Language Studies and Linguistics
Research subject
Phonetics
Identifiers
urn:nbn:se:su:diva-243404 (URN); 10.21437/SpeechProsody.2024-186 (DOI); 2-s2.0-105008057303 (Scopus ID)
Conference
Speech Prosody 2024, Leiden, The Netherlands, July 2-5, 2024
Available from: 2025-05-21 Created: 2025-05-21 Last updated: 2025-08-11 Bibliographically approved
Włodarczak, M. & Heldner, M. (2022). Contribution of voice quality to prediction of turn-taking events. In: S. Frota, M. Cruz, & M. Vigário (Eds.), Proceedings of Speech Prosody 2022. Paper presented at Speech Prosody 2022, Lisbon, Portugal (pp. 485-489).
2022 (English). In: Proceedings of Speech Prosody 2022 / [ed] S. Frota, M. Cruz, & M. Vigário, 2022, p. 485-489. Conference paper, Published paper (Refereed)
Abstract [en]

This paper evaluates the contribution of acoustic voice quality measures to prediction of upcoming floor change and retention. In order to minimize the influence of vocal tract resonances, the measures were calculated from miniature accelerometers attached to the tracheal wall. Overall, speaker changes accompanied by silence were characterized by lower periodicity and steeper spectral slope than turn-holds and speaker changes involving overlapping speech. When used on their own, voice quality features contributed to prediction of turn-taking category; this was particularly true of smoothed cepstral peak prominence (CPPS). At the same time, their importance was limited when used in combination with fundamental frequency and intensity, especially compared to the joint effect of these two predictors.

Keywords
spontaneous conversation, turn-taking, voice quality, accelerometer
National Category
General Language Studies and Linguistics
Research subject
Phonetics
Identifiers
urn:nbn:se:su:diva-205271 (URN); 10.21437/SpeechProsody.2022-99 (DOI)
Conference
Speech Prosody 2022, Lisbon, Portugal
Projects
Prosodic functions of voice quality dynamics
Funder
Swedish Research Council, 2019-02932
Available from: 2022-05-31 Created: 2022-05-31 Last updated: 2022-06-15 Bibliographically approved
Wikse Barrow, C., Włodarczak, M., Thörn, L. & Heldner, M. (2022). Static and dynamic spectral characteristics of Swedish voiceless fricatives. Journal of the Acoustical Society of America, 152(5), 2588-2600
2022 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 152, no 5, p. 2588-2600. Article in journal (Refereed). Published
Abstract [en]

Descriptions of the acoustic characteristics of Swedish voiceless fricatives are scarce and are limited to static measures derived from the speech of a small number of speakers. The current study provides an updated acoustic description of the static (spectral, temporal, and intensity) characteristics of word-initial voiceless fricatives in Central Standard Swedish. In addition, temporal variation of spectral centre of gravity is modelled using a generalized additive mixed model. Results show that fricatives were differentiated in terms of spectral properties, duration, and intensity level, such that sibilant fricatives were generally longer and more intense than non-sibilant fricatives. Spectral centre of gravity differentiated between all places of articulation apart from labio-dental /f/. Gender differences were found for centre of gravity in /s/ but overall, sex/gender differences were small. Dynamic analyses revealed differences in curvature as well as overall level of spectral centre of gravity across the duration of the fricative, associated with place of articulation and mediated by vowel context, fricative duration, and speaker specific patterns. The results from the present study are valuable for future cross-linguistic research, and as reference for investigations concerning children's acquisition of Swedish voiceless fricatives.

Keywords
fricatives, Swedish, acoustic analysis
National Category
General Language Studies and Linguistics
Research subject
Phonetics
Identifiers
urn:nbn:se:su:diva-213790 (URN); 10.1121/10.0014947 (DOI); 36456287 (PubMedID); 2-s2.0-85143184693 (Scopus ID)
Note

For erratum, see: Wikse Barrow, C., Włodarczak, M., Thörn, L., and Heldner, M. Erratum: Static and dynamic spectral characteristics of Swedish voiceless fricatives. J. Acoust. Soc. Am. 153, 1933 (2023). https://doi.org/10.1121/10.0017651

Available from: 2023-01-17 Created: 2023-01-17 Last updated: 2024-12-06 Bibliographically approved
Heldner, M., Riad, T., Sundberg, J., Włodarczak, M. & Zora, H. (2021). Pride and prominence. In: Working Papers in Linguistics: Proceedings of Fonetik 2021. Paper presented at Fonetik 2021, Lund, Sweden, June 8-9, 2021 (pp. 1-6).
2021 (English). In: Working Papers in Linguistics: Proceedings of Fonetik 2021, 2021, p. 1-6. Conference paper, Published paper (Other academic)
Abstract [en]

Given the importance of the entire voice source in prominence expression, this paper aims to explore whether the word accent distinction can be defined by voice quality dynamics, moving beyond the tonal movements. To this end, a list of word accent pairs in Central Swedish was recorded and analysed based on a set of acoustic features extracted from the accelerometer signal. The results indicate that the tonal movements are indeed accompanied by voice quality dynamics such as intensity, periodicity, harmonic richness and spectral tilt, and suggest that these parameters might contribute to the perception of one vs. two peaks associated with the word accent distinction in this regional variant of Swedish. These results, although based on limited data, are of crucial importance for the designation of voice quality variation as a prosodic feature per se.

Series
Working Papers in Linguistics, ISSN 0280-526X ; 56
National Category
General Language Studies and Linguistics
Research subject
Phonetics
Identifiers
urn:nbn:se:su:diva-204847 (URN)
Conference
Fonetik 2021, Lund, Sweden, June 8-9, 2021
Funder
Swedish Research Council, 2019-02932
Available from: 2022-05-20 Created: 2022-05-20 Last updated: 2022-05-24 Bibliographically approved
Włodarczak, M. & Heldner, M. (2021). Turn-taking in conversation from the larynx down. In: Barthel, Mathias (Ed.), The Role of the Current Speaker in Conversational Turn Taking – Theoretical, Experimental, and Corpus Linguistic Perspectives on Speaker Contributions to Aligned Turn-Timing. Paper presented at The Role of the Current Speaker in Conversational Turn Taking – Theoretical, Experimental, and Corpus Linguistic Perspectives on Speaker Contributions to Aligned Turn-Timing, 2021.
2021 (English). In: The Role of the Current Speaker in Conversational Turn Taking – Theoretical, Experimental, and Corpus Linguistic Perspectives on Speaker Contributions to Aligned Turn-Timing / [ed] Barthel, Mathias, 2021. Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

In this talk, we will give an overview of some of our results, both old and new, about respiratory and phonatory turn-taking cues. Both of these aspects of turn coordination are rarely addressed in the literature, which focuses primarily on its articulatory and prosodic characteristics.

In the respiratory part of the presentation, we will discuss a new categorisation of turn-taking events which combines the criterion of speaker change with whether the original speaker inhales before producing the next talkspurt. We will demonstrate that the latter criterion could be potentially used as a proxy for pragmatic completeness of the previous utterance (and, by extension, of the interruptive character of the incoming speech). Specifically, respiratory properties of silences accompanied by speaker change in which the original speaker continues talking without breathing in are similar to those in within-speaker, turn-holding silences. We will also present evidence that the likelihood of speaker change is higher during pauses accompanied by a respiratory hold, suggesting that breath holds are used in reaction to incoming talk rather than as a turn-holding cue. In addition to analysing dimensions which are routinely omitted in studies of interactional functions of breathing (exhalations, presence of overlapping speech, breath holds), we will analyse patterns of breath holds in silent breathing and show that breath holds are sometimes produced towards the beginning (and towards the top) of silent exhalations, potentially indicating an abandoned intention to take the turn. We claim that the breathing signal can thus be successfully used for uncovering hidden turn-taking events, which are otherwise obscured by silence-based representations of interaction.

Moving up from the lungs to the larynx, in the second part of the talk we will focus on our ongoing work on voice quality variation in spontaneous interactions, a topic which has received little attention so far, not least because of the technical difficulties associated with recording phonation in continuous speech. In order to circumvent these problems, we are using miniature accelerometers attached to the skin of the tracheal wall below the glottis (“throat microphones”). The method, which has been used for some time in ambulatory postoperative voice monitoring, provides a good approximation of the voice source without the need for glottal inverse-filtering. We will demonstrate that the accelerometer signal can be successfully used to differentiate between voice qualities in isolated vowels while being unaffected by vocal tract resonances, fo and speaker variation. We will also present some preliminary results comparing several voice quality measures in speech intervals preceding silences accompanied by speaker change or followed by more speech from the same person. We demonstrate that utterances ending in speaker changes are characterised by lower periodicity and higher rates of creaky voice. The findings are thus consistent with the “trailing-off” character of these silences, as suggested in the literature.

National Category
General Language Studies and Linguistics
Research subject
Phonetics
Identifiers
urn:nbn:se:su:diva-204845 (URN)
Conference
The Role of the Current Speaker in Conversational Turn Taking – Theoretical, Experimental, and Corpus Linguistic Perspectives on Speaker Contributions to Aligned Turn-Timing, 2021
Funder
Swedish Research Council, 2019-02932
Note

Prosodic functions of voice quality dynamics

Available from: 2022-05-20 Created: 2022-05-20 Last updated: 2022-05-24 Bibliographically approved
Aare, K., Gilmartin, E., Włodarczak, M., Lippus, P. & Heldner, M. (2020). Breath holds in chat and chunk phases of multiparty casual conversation. In: Proceedings of Speech Prosody 2020. Paper presented at Speech Prosody 2020, Tokyo, Japan, 25-28 May, 2020 (pp. 779-783).
2020 (English). In: Proceedings of Speech Prosody 2020, 2020, p. 779-783. Conference paper, Published paper (Refereed)
National Category
General Language Studies and Linguistics
Research subject
Phonetics
Identifiers
urn:nbn:se:su:diva-194793 (URN); 10.21437/SpeechProsody.2020-159 (DOI)
Conference
Speech Prosody 2020, Tokyo, Japan, 25-28 May, 2020
Projects
Breathing in conversation (VR 2014-1072); Hidden events in turn-taking (MAW 2017.0034)
Funder
Swedish Research Council, 2014-1072
Available from: 2021-07-05 Created: 2021-07-05 Last updated: 2022-02-02 Bibliographically approved
Wlodarczak, M. & Heldner, M. (2020). Breathing in Conversation. Frontiers in Psychology, 11, Article ID 575566.
2020 (English). In: Frontiers in Psychology, E-ISSN 1664-1078, Vol. 11, article id 575566. Article in journal (Refereed). Published
Abstract [en]

This work revisits the problem of breathing cues used for management of speaking turns in multiparty casual conversation. We propose a new categorization of turn-taking events which combines the criterion of speaker change with whether the original speaker inhales before producing the next talkspurt. We demonstrate that the latter criterion could be potentially used as a good proxy for pragmatic completeness of the previous utterance (and, by extension, of the interruptive character of the incoming speech). We also present evidence that breath holds are used in reaction to incoming talk rather than as a turn-holding cue. In addition to analysing dimensions which are routinely omitted in studies of interactional functions of breathing (exhalations, presence of overlapping speech, breath holds), the present study also looks at patterns of breath holds in silent breathing and shows that breath holds are sometimes produced toward the beginning (and toward the top) of silent exhalations, potentially indicating an abandoned intention to take the turn. We claim that the breathing signal can thus be successfully used for uncovering hidden turn-taking events, which are otherwise obscured by silence-based representations of interaction.

Keywords
turn-taking, multiparty casual conversation, respiratory inductance plethysmography, breathing, interaction chronography
National Category
Psychology
Identifiers
urn:nbn:se:su:diva-188243 (URN); 10.3389/fpsyg.2020.575566 (DOI); 000584388700001 (); 33162915 (PubMedID)
Available from: 2020-12-28 Created: 2020-12-28 Last updated: 2022-02-25 Bibliographically approved
Wikse Barrow, C., Strömbergsson, S. & Heldner, M. (2019). A multidimensional investigation of covert contrast in Swedish acquiring children's speech - a project description. In: Mattias Heldner (Ed.), Proceedings from FONETIK 2019 Stockholm, June 10–12, 2019. Paper presented at FONETIK 2019, Stockholm, Sweden, June 10-12, 2019 (pp. 79-83). Stockholm: Stockholm University
2019 (English). In: Proceedings from FONETIK 2019 Stockholm, June 10–12, 2019 / [ed] Mattias Heldner, Stockholm: Stockholm University, 2019, p. 79-83. Conference paper, Published paper (Other (popular science, discussion, etc.))
Abstract [en]

This paper provides a description of a current PhD project in phonetics at the Department of Linguistics at Stockholm University. A short background is provided, the intended experiments are presented and the potential contributions of the results are outlined.

Place, publisher, year, edition, pages
Stockholm: Stockholm University, 2019
Series
PERILUS, ISSN 0282-6690 ; XXVII
National Category
General Language Studies and Linguistics
Research subject
Phonetics
Identifiers
urn:nbn:se:su:diva-185082 (URN); 10.5281/zenodo.3246006 (DOI); 9789177979845 (ISBN)
Conference
FONETIK 2019, Stockholm, Sweden, June 10-12, 2019
Available from: 2020-09-16 Created: 2020-09-16 Last updated: 2022-02-25 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-0034-0924
