13th Speech in Noise Workshop, 20-21 January 2022, Virtual Conference

P15 Children’s use of target/masker F0 contour differences during speech-in-speech recognition

Mary Flaherty
University of Illinois at Urbana-Champaign, US

(a) Presenting

This study investigated the extent to which children (n = 98, ages 5-17 yrs) can take advantage of differences in fundamental frequency (F0) contour to improve speech-in-speech recognition. F0 contour refers to the natural variation, or rise and fall, of F0 within an utterance. While talker differences in mean F0 can improve speech-in-speech recognition for adults, children’s ability to use mean F0 differences remains immature into adolescence. One explanation for this age effect is that children may rely more than adults on dynamic, time-varying acoustic cues that contain redundant information. Examining F0 benefit in children, Flaherty et al. [2019, Ear Hear. 40(4):927; 2021, JSLHR 64(1):206] carefully controlled voice characteristics to isolate effects of mean F0, leaving F0 contours of the utterances unaltered. In natural speech, F0 co-varies with duration and intensity. Children’s ability to segregate target from masker speech was expected to improve as a function of the magnitude of time-varying differences in F0. In the present study, sentence recognition was measured adaptively in a two-female-talker speech masker. Both target and masker sentences were recorded with either neutral, flat, or exaggerated F0 contours. Adults (n = 30) were also tested as a measure of mature performance. The results revealed that children’s sentence recognition was impacted by differences in F0 contour depth between competing talkers, but the pattern of results differed between children and adults. While both children and adults benefitted to a similar degree when the target speaking style was flat, age effects were observed in conditions with neutral and exaggerated speaking styles. Contrary to our hypothesis, children did not show a consistent benefit when the speaking style was exaggerated, but instead often showed a decrement in performance.
This may be because sentences with an exaggerated F0 contour have less predictable F0 trajectories, which increases stimulus uncertainty and makes speech-in-speech recognition more difficult for children. Overall, the observed age effects in the current study do not appear to be due to limitations in children’s ability to use F0 contour differences in general, but are likely related to the perceptual salience of the target contour relative to the masker. This suggests that the use of F0 contour depth differences as a segregation cue during speech recognition develops relatively early compared to the use of mean F0 differences between target and masker speech (Flaherty et al., 2019; 2021). However, the results indicate that the magnitude of benefit for children depends on the predictability of the F0 contours in question.

Last modified 2022-01-24 16:11:02