P68 The effects of lexical content, sentence context, and vocoding on voice cue perception
Speech perception in a cocktail-party-like situation can be challenging, especially for cochlear-implant (CI) users. Perceiving differences in voice cues, such as fundamental frequency (F0) and vocal-tract length (VTL), can greatly facilitate speech communication in difficult listening conditions. In a recent study, we showed an effect of lexical content on just-noticeable differences (JNDs) in F0 and VTL perception. Specifically, under high acoustic and linguistic variability, participants showed smaller VTL JNDs for words than for time-reversed words, and this observation did not change when vocoding was applied. For F0 JNDs, a lexical content benefit (words vs. time-reversed words) was found only in the non-vocoded condition with low variability. These outcomes inspired two follow-up studies.
The first study expanded on the lexical content benefit for VTL perception by comparing words, time-reversed words, and non-words. The purpose was to investigate whether the lexical content benefit is related to lexical (words) and/or phonemic (non-words) content. In the second study, we investigated the effect of additional acoustic speech information and/or semantic context on F0 and VTL perception by comparing words and sentences. In both experiments, non-vocoded and vocoded auditory stimuli were presented, and participants performed an adaptive three-alternative forced-choice (3AFC) task to determine the voice cue JNDs.
The first study replicated the detrimental effect of time-reversed words on VTL perception. Additionally, VTL JNDs for non-words did not significantly differ from those for words, suggesting that linguistic content benefits VTL perception at the phonemic level. Although there was a main effect of vocoding, it did not interact with item type. The second study showed a benefit of full sentences over single words for both F0 and VTL perception, suggesting that the additional acoustic speech information and/or semantic context increased voice cue sensitivity. There was a main effect of vocoding and an interaction between voice cue and vocoding, indicating a stronger negative effect of vocoding on F0 than on VTL perception.
Extending previous findings of a lexical advantage, the current results show more specifically that phonemic content, available in both words and non-words, improves VTL perception. Both F0 and VTL perception benefit from the richer content of sentences (possibly lexical or semantic) compared to words. These results may improve our understanding of speech and voice perception processes and inform rehabilitation tools for populations with limited access to voice information, such as CI listeners.