SpiN 2022 :: programme

P50 Encoding of naturalistic speech with simulated hearing loss in fMRI responses

Arkan Al-Zubaidi, Leo Michalke, Jochem Rieger
Applied Neurocognitive Psychology Lab and Cluster of Excellence Hearing4all, Oldenburg University, Oldenburg, Germany

(a) Presenting

In neuroimaging, voxel-wise encoding models are popular tools for characterizing and predicting brain activity from a given stimulus representation. In the auditory domain, listening to a naturalistic stimulus leads to changes in brain activity that subserve sensory processing. Yet little is known about how the cortex processes naturalistic speech when it is clear versus degraded. We used the soundtrack of a German audio description of the movie "Forrest Gump" as the clear naturalistic stimulus (CS). The soundtrack was additionally processed with hearing loss simulators to create two types of degraded naturalistic stimuli: one with low degradation (S2, steep sloping) and one with high degradation (N4, moderately sloping) at higher frequencies. This study uses a data-driven approach to investigate how the acoustic information in the CS, S2 and N4 stimuli is processed in the auditory cortex. We recorded fMRI data from 10 normal-hearing participants in three sessions each. In each session, participants listened to the full movie, divided into eight segments (i.e. eight runs) presented in chronological order. Each session began (run 1) and ended (run 8) with the CS stimulus, while runs 2 to 7 were presented as CS, S2, or N4 stimuli. The order of degradations was randomized over scan sessions, but the number of presentations was balanced across degradation levels. After every movie segment, participants rated their speech perception and answered two questions on the content of the preceding segment.

After preprocessing the fMRI data, we fitted voxel-wise encoding models to predict the BOLD activity elicited by the sound envelope of the audio movie. We estimated encoding models separately for the sound envelopes of the CS, S2, and N4 stimuli and predicted left-out runs in cross-validation schemes to test how well the models generalize. Rated speech perception was best for CS, intermediate for S2, and worst for N4 stimuli. In contrast, the encoding models predicted the BOLD responses better in the N4 condition than in the CS and S2 conditions. In the N4 condition in particular, the highest correlations between predicted and observed BOLD responses were concentrated in the core auditory areas on the superior temporal cortex (Heschl's gyrus), whereas high correlations were less focal in the CS and S2 conditions. Our results are thus consistent with the assumption that degraded stimuli demand more attention, which enhances activation in early auditory areas. Alternatively, the increased activation may indicate an improved prediction error in the processing of degraded stimuli.
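
The analysis logic described above (fit an encoding model on some runs, predict a left-out run, score each voxel by the correlation between predicted and observed BOLD) can be illustrated with a minimal sketch. The abstract does not state the regression method, the number of time lags, or the regularization, so ridge regression, the lagged envelope design, all function names and the simulated data below are assumptions for illustration only, not the authors' pipeline.

```python
import numpy as np

def lagged_design(envelope, n_lags):
    """Stack delayed copies of the envelope (one column per lag, in TRs)."""
    n = len(envelope)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = envelope[:n - lag]
    return X

def fit_ridge(X, Y, alpha):
    """Closed-form ridge solution, one weight vector per voxel (assumed model)."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ Y)

def columnwise_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    return (A * B).sum(axis=0) / np.sqrt((A ** 2).sum(axis=0) * (B ** 2).sum(axis=0))

def leave_one_run_out(envelopes, bold_runs, n_lags=8, alpha=1.0):
    """Train on all runs but one, predict the held-out run, average scores."""
    designs = [lagged_design(e, n_lags) for e in envelopes]
    scores = []
    for test in range(len(designs)):
        X_train = np.vstack([d for i, d in enumerate(designs) if i != test])
        Y_train = np.vstack([b for i, b in enumerate(bold_runs) if i != test])
        W = fit_ridge(X_train, Y_train, alpha)
        scores.append(columnwise_corr(designs[test] @ W, bold_runs[test]))
    return np.mean(scores, axis=0)

# Simulated data (hypothetical): 4 runs, 200 TRs, 5 voxels.
# Voxels 0-2 follow the envelope through a fixed response kernel; 3-4 are noise.
rng = np.random.default_rng(0)
kernel = np.array([0.0, 0.2, 0.6, 1.0, 0.7, 0.4, 0.2, 0.1])
envelopes, bold_runs = [], []
for _ in range(4):
    env = rng.standard_normal(200)          # stands in for a z-scored envelope
    signal = lagged_design(env, len(kernel)) @ kernel
    bold = rng.standard_normal((200, 5))    # measurement noise in every voxel
    bold[:, :3] += signal[:, None]          # add the response to voxels 0-2
    envelopes.append(env)
    bold_runs.append(bold)

scores = leave_one_run_out(envelopes, bold_runs)
print(np.round(scores, 2))  # per-voxel prediction accuracy (correlation)
```

In this sketch, fitting the function once per condition (CS, S2, N4) on that condition's runs would give one accuracy map per condition, which is the kind of comparison the results above report.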

Last modified 2022-01-24 16:11:02