13th Speech in Noise Workshop, 20-21 January 2022, Virtual Conference

P55 A data-driven distance metric for evaluating the effects of dynamic range compression in adverse conditions

Niels Overby, Torsten Dau, Tobias May
Technical University of Denmark

(a) Presenting

Dynamic range compression is an essential building block in modern hearing aids and aims to restore audibility for hearing-impaired listeners. However, the choice of suitable compression parameters, such as the time constants associated with the level estimation stage, depends on the acoustic conditions, and the perceptual benefit of different parameter configurations is still controversial. Listening tests can provide an accurate assessment of the perceptual effects of compression in a limited set of acoustic conditions, but they are time-consuming and therefore cannot be used to optimize the various compression parameters across experimental conditions. While several studies have attempted to link the perceptual outcomes of dynamic range compression to a set of objective metrics, there is no agreement on how to objectively quantify the effects of compression. In the current study, a data-driven distance metric based on objective metrics was developed to analyze different compression systems. This analysis included slow-acting, fast-acting, and ‘scene-aware’ compression that adaptively switched between fast- and slow-acting compression depending on the target source activity. In addition, a hypothesized ‘ideal’ system, termed ‘source-independent compression’, was used as a reference that had access to the individual signals and applied fast-acting compression to the target speech signal and slow-acting compression to the noise and reverberation. A comprehensive list of objective metrics was considered to evaluate the effect of the compression systems in a wide variety of acoustic conditions, including both interfering noise and room reverberation. Sparse principal component analysis was then applied to derive a compact set of interpretable features that explained the effects of compression as linear combinations of sparsely selected objective metrics.
The reduced set of features corresponded to the amount of distortion of the noise and reverberation, the amount of compression of the target speech signal, and the relative amount of amplification of the target speech compared to the noise and reverberation. The Euclidean distance within the reduced-dimensionality representation was used to quantify the similarity between the compression systems. In this comparison, the adaptive ‘scene-aware’ compression system was consistently more similar to the ‘source-independent’ system than fast- and slow-acting compression were, for speech signals in both noise and reverberation. This newly developed distance metric enables systematic analysis and optimization of the parameters of dynamic range compression systems by minimizing the Euclidean distance with respect to the source-independent compression system.
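
As a rough illustration of the distance computation described above, the following Python sketch projects a vector of objective metrics onto a small set of sparse components and then measures the Euclidean distance of each compression system to the source-independent reference. It is only a sketch under stated assumptions: the metric names, the loading matrix LOADINGS, the values in SYSTEMS, and all function names are hypothetical placeholders chosen for illustration. They are not the metrics, loadings, or results of the study, and the sparse principal component analysis that would produce the loadings is not reproduced here.

```python
import math

# Hypothetical sparse loadings (one row per reduced feature). Each row uses
# only a few of the objective metrics, mimicking the sparsity of sparse PCA.
# Assumed metric order (illustrative names, not from the study):
# [noise_distortion, reverb_distortion, speech_compression, speech_gain]
LOADINGS = [
    [0.7, 0.7, 0.0, 0.0],  # distortion of noise and reverberation
    [0.0, 0.0, 1.0, 0.0],  # compression of the target speech
    [0.0, 0.0, 0.0, 1.0],  # relative amplification of the target speech
]

# Placeholder standardized metric values per system (invented numbers).
SYSTEMS = {
    "source_independent": [0.0, 0.0, 1.0, 1.0],
    "fast": [1.0, 1.0, 1.0, 0.0],
    "slow": [0.0, 0.0, 0.0, 0.0],
    "scene_aware": [0.5, 0.5, 1.0, 0.5],
}


def project(metrics, loadings=LOADINGS):
    """Map a metric vector to the reduced feature space (linear combination)."""
    return [sum(w * m for w, m in zip(row, metrics)) for row in loadings]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distances_to_reference(systems, reference):
    """Distance of every non-reference system to the reference system."""
    ref = project(systems[reference])
    return {
        name: euclidean(project(values), ref)
        for name, values in systems.items()
        if name != reference
    }


if __name__ == "__main__":
    for name, dist in distances_to_reference(SYSTEMS, "source_independent").items():
        print(f"{name}: {dist:.3f}")
```

In an optimization setting, the value returned for a given system would serve as the quantity to minimize over that system's compression parameters.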
