Odette Scharenborg (Radboud University Nijmegen, The Netherlands)
Martin Cooke (University of Sheffield, UK)
Sponsored by: The PASCAL Network
Listeners outperform automatic speech recognition systems at every level of speech recognition, including the very basic level of consonant recognition. What is not clear is where the human advantage originates. Does the fault lie in the acoustic representations of speech, in the recogniser architecture, or in a lack of compatibility between the two? There have been relatively few studies comparing human and automatic speech recognition on the same task, and in these, overall identification performance is the dominant metric. However, a far more detailed comparison might yield many further insights.
The purpose of this Special Session is to promote focused human-computer comparisons on a task involving consonant identification in noise, with all participants using the same training and test data. The organisers will provide the training and test data, together with native listener and baseline recogniser results, but participants are also encouraged to contribute their own listener responses.
Contributions are sought in (but not limited to) the following areas:
The results of the Challenge will be presented at a Special Session of Interspeech’08 in Brisbane, Australia.
Although the Interspeech 2008 deadline has passed, the Consonant Challenge remains open and we are happy to host new results. Please send us an e-mail if you have any contributions you would like posted on this website.
NEW: Papers related to the Consonant Challenge can be found and downloaded here.