P54 The first Clarity Enhancement Challenge for hearing aid processing
In 2021, the Clarity project ran the first-ever open machine learning challenge for hearing aids (https://claritychallenge.github.io/clarity_CEC1_doc/), aimed at improving the processing of speech in noise. This paper briefly outlines this Enhancement Challenge. Competitors were tasked with improving speech in noise for scenes with one target speaker, one noise interferer, a room with low reverberation, and a listener whose audiogram was known. The particular difficulties of running challenges for hearing aid processing with remote listening panels will be discussed. Innovations such as listening-test subjects repeating aloud what they heard, which was then scored using automatic speech recognition (ASR), will be discussed and compared with the more traditional approach of human transcription.

Entrants were scored in two ways: (i) an objective evaluation, in which the enhanced speech was passed through a hearing loss model and then assessed with the objective intelligibility metric MBSTOI (Modified Binaural Short-Time Objective Intelligibility); and (ii) listening tests of speech intelligibility using a panel of people with a hearing loss. Thirteen systems were assessed in the objective evaluation (i), and ten in the listening tests (ii). The results from both evaluations will be presented; they revealed weaknesses in the objective evaluation. Consequently, a Perception Challenge (https://claritychallenge.github.io/clarity_CPC1_doc/) is currently running to improve the prediction of speech intelligibility, especially for people with a hearing loss listening through a hearing aid.

For the Enhancement Challenge, a mixture of approaches was seen among the entrants. The static scenes and the limited range of target-talker positions made beamforming a good approach; it was used by six teams. The second Enhancement Challenge, starting in Spring 2022, will add head movements to make the scenarios more realistic and more challenging for beamformers.
A variety of approaches were used for (i) noise removal using Deep Neural Networks (DNNs) and (ii) hearing loss compensation. The highest-scoring systems were those that were robust across a wider range of signal-to-noise ratios.
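The shape of the objective evaluation pipeline described above can be sketched in miniature. This is only an illustrative stand-in, not the challenge's actual code: the real evaluation used a full hearing loss model and the MBSTOI metric (with one-third-octave band analysis and binaural processing), whereas here hearing loss is crudely simulated by audiogram-based spectral attenuation, intelligibility is approximated by an envelope-correlation score, and the audiogram values and signals are invented for the example.

```python
import numpy as np

def simulate_hearing_loss(signal, fs, audiogram_freqs, audiogram_levels_db):
    """Crude hearing loss simulation: attenuate each FFT bin by the
    audiogram threshold interpolated to that bin's frequency.
    (A stand-in for the challenge's full hearing loss model.)"""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    loss_db = np.interp(freqs, audiogram_freqs, audiogram_levels_db)
    gain = 10.0 ** (-loss_db / 20.0)  # dB loss -> linear attenuation
    return np.fft.irfft(spectrum * gain, n=len(signal))

def envelope_correlation(clean, processed, frame=256):
    """Stand-in intelligibility score: correlation of short-time RMS
    envelopes. MBSTOI itself is far more elaborate and is omitted here."""
    n = min(len(clean), len(processed)) // frame * frame
    e1 = np.sqrt(np.mean(clean[:n].reshape(-1, frame) ** 2, axis=1))
    e2 = np.sqrt(np.mean(processed[:n].reshape(-1, frame) ** 2, axis=1))
    return float(np.corrcoef(e1, e2)[0, 1])

# Invented example inputs: an amplitude-modulated tone as "clean" speech,
# a lightly noisy version as the entrant's "enhanced" output, and an
# illustrative mild-to-moderate sloping audiogram.
fs = 16000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t) * (1.0 + np.sin(2 * np.pi * 4 * t))
enhanced = clean + 0.05 * np.random.default_rng(0).standard_normal(len(t))
audiogram_freqs = [250, 500, 1000, 2000, 4000, 8000]   # Hz
audiogram_levels = [10, 10, 20, 35, 50, 65]            # dB HL (illustrative)

# Stage 1: pass enhanced speech through the hearing loss simulation.
heard = simulate_hearing_loss(enhanced, fs, audiogram_freqs, audiogram_levels)
# Stage 2: score what was "heard" against the clean reference.
score = envelope_correlation(clean, heard)
```

The point of the two-stage design is that systems are rewarded for intelligibility as experienced by the listener with the given audiogram, not for raw signal fidelity.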