SpiN 2022 :: programme

P13 The effect of voice familiarity via training on voice cue sensitivity and listening effort during voice discrimination of vocoder degraded speech

Ada Biçer, Thomas Koelewijn, Deniz Başkent
University Medical Center Groningen, University of Groningen, Netherlands

Understanding speech in real life can be challenging, for example in multi-talker listening conditions. Voice cues such as fundamental frequency (F0) and vocal-tract length (VTL) can help listeners segregate talkers, enhancing speech perception in adverse listening conditions. Previous research showed that the degradations of cochlear implant (CI) hearing reduce sensitivity to F0+VTL voice cues compared to normal hearing (NH), and that in some listening situations familiarity with a talker can provide an advantage. In this study, we investigated how voice familiarity affects perceptual discrimination of voice cues, as well as listening effort, with and without vocoder degradation.

To establish voice familiarity, we implemented implicit short-term voice training. Participants listened to a recording of a book segment for approximately 30 minutes and, to ensure engagement, answered questions about the text. Following voice training, just-noticeable differences (JNDs) for F0+VTL were measured with an odd-one-out task implemented as a three-alternative forced-choice (3AFC) adaptive paradigm. During the procedure, the reference voice was either the trained voice or an unfamiliar voice, each presented in both unprocessed and vocoder-degraded (12-band with low spread of excitation) versions. We analyzed the effects of voice familiarity (trained or untrained voice), vocoding (non-vocoded or vocoded) and item variability (fixed or variable consonant-vowel triplets presented across three items) on voice cue sensitivity (F0+VTL JNDs) and listening effort (pupillometry measurements).
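
The adaptive 3AFC procedure described above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the abstract does not specify the tracking rule, step size, or stopping criterion, so the 2-down-1-up rule, the multiplicative step, the reversal count, and the simulated listener model below are all hypothetical choices made for illustration, as is the function name `run_staircase`.

```python
import math
import random


def run_staircase(true_jnd, start=12.0, step_factor=2 ** 0.5,
                  n_reversals=8, max_trials=400, seed=0):
    """Simulate an adaptive track for a 3AFC odd-one-out task.

    ASSUMPTIONS (not from the abstract): 2-down-1-up rule, multiplicative
    steps, stop after `n_reversals` reversals, and a simulated listener.
    `diff` is the voice difference in arbitrary units.
    """
    rng = random.Random(seed)
    diff = start
    correct_streak = 0
    last_direction = 0  # -1 = last change made the task harder, +1 = easier
    reversals = []

    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        # Hypothetical listener: chance level is 1/3 with three alternatives,
        # and performance approaches 1 as the difference grows.
        p_correct = 1 / 3 + (2 / 3) * (1 - 2 ** (-((diff / true_jnd) ** 2)))
        correct = rng.random() < p_correct

        if correct:
            correct_streak += 1
            direction = -1 if correct_streak == 2 else 0
            if direction:
                correct_streak = 0
        else:
            correct_streak = 0
            direction = 1

        if direction:
            # A reversal is a change in the direction of the track.
            if last_direction and direction != last_direction:
                reversals.append(diff)
            last_direction = direction
            diff = diff / step_factor if direction < 0 else diff * step_factor

    if not reversals:
        return diff, reversals
    tail = reversals[-6:]
    # Geometric mean of the last reversals as the JND estimate.
    estimate = math.exp(sum(math.log(r) for r in tail) / len(tail))
    return estimate, reversals


if __name__ == "__main__":
    jnd, revs = run_staircase(true_jnd=2.0, seed=1)
    print(f"estimated JND: {jnd:.2f} (from {len(revs)} reversals)")
```

A 2-down-1-up track converges near the 70.7%-correct point of the psychometric function, which is one common way to define a threshold in such tasks; other rules would target other performance levels.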

Results showed that F0+VTL JNDs were significantly larger for vocoded than for non-vocoded conditions, and significantly larger with variable-item than with fixed-item presentations. Contrary to our expectations, voice training did not have a significant effect on voice cue discrimination. Peak pupil dilation was significantly larger for vocoded than for non-vocoded conditions. Over the time course of the pupil dilation response, analyzed with generalized additive mixed models (GAMM), there was a significant difference between untrained and trained voices while listening to vocoded speech. Specifically, pupil dilation during voice discrimination was significantly larger while listening to unfamiliar, vocoded voices than while listening to trained, vocoded voices. However, voice training had no significant effect on pupil dilation while listening to non-vocoded voices. These findings imply that, even in the absence of a clear benefit in the behavioral JND measures, voice discrimination among vocoded voices was less effortful after short-term voice training.
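
For readers unfamiliar with the pupillometry measure, peak pupil dilation is commonly computed as the maximum of a baseline-corrected pupil trace. The sketch below assumes a subtractive baseline taken from the samples preceding stimulus onset; the abstract does not report the preprocessing used in this study, so the function name `peak_pupil_dilation`, the baseline window, and the example values are hypothetical.

```python
def peak_pupil_dilation(trace, sample_rate_hz, baseline_s):
    """Return (peak dilation, latency in seconds after baseline end).

    ASSUMPTION (not from the abstract): the baseline is the mean pupil
    size over the first `baseline_s` seconds of the trace and is
    subtracted from the samples that follow.
    """
    n_base = int(round(baseline_s * sample_rate_hz))
    if n_base <= 0 or n_base >= len(trace):
        raise ValueError("baseline window must cover part of the trace")

    baseline = sum(trace[:n_base]) / n_base
    # Baseline-corrected response after the baseline window.
    response = [sample - baseline for sample in trace[n_base:]]

    peak = max(response)
    latency_s = response.index(peak) / sample_rate_hz
    return peak, latency_s


if __name__ == "__main__":
    # Made-up pupil sizes (mm) sampled at 1 Hz, with a 2 s baseline.
    example = [4.0, 4.0, 4.2, 4.5, 4.3]
    print(peak_pupil_dilation(example, sample_rate_hz=1, baseline_s=2))
```

A single peak value summarizes each trial, whereas a time-course analysis such as GAMM models the whole baseline-corrected trace, which is why the two measures can lead to different conclusions.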

Funding: VICI grant 918-17-603 from the Netherlands Organization for Scientific Research (NWO) and the Netherlands Organization for Health Research and Development (ZonMw), the Heinsius Houbolt Foundation, and a Rosalind Franklin Fellowship.

Last modified 2022-01-24 16:11:02