CSL-EMG_Array: An Open Access Corpus for EMG-to-Speech Conversion
by Lorenz Diener, Mehrdad Roustay Vishkasougheh, Tanja Schultz
Abstract:
We present a new open access corpus for the training and evaluation of EMG-to-Speech conversion systems based on array electromyographic recordings. The corpus was recorded with a paradigm that closely mirrors realistic EMG-to-Speech usage scenarios, and it includes evaluation data recorded from both audible and silent speech. The corpus consists of 9.5 hours of data, split into 12 sessions recorded from 8 speakers. Based on this corpus, we present initial benchmark results for a realistic online EMG-to-Speech conversion use case, for both the audible and silent speech subsets. We also present a method for drastically improving EMG-to-Speech system stability and performance in the presence of time-related artifacts.
Reference:
CSL-EMG_Array: An Open Access Corpus for EMG-to-Speech Conversion (Lorenz Diener, Mehrdad Roustay Vishkasougheh, Tanja Schultz), In INTERSPEECH 2020 - 21st Annual Conference of the International Speech Communication Association, 2020 (to appear).
Bibtex Entry:
@inproceedings{diener2020cslemgarray,
    title={{CSL-EMG\_Array: An Open Access Corpus for EMG-to-Speech Conversion}},
    author={Diener, Lorenz and Roustay Vishkasougheh, Mehrdad and Schultz, Tanja},
    booktitle={{INTERSPEECH} 2020 - 21st Annual Conference of the International Speech Communication Association},
    year={2020},
    note={to appear},
    url={https://www.csl.uni-bremen.de/cms/images/documents/publications/Diener_IS2020_CSLEMGArray.pdf},
    abstract={We present a new open access corpus for the training and evaluation of EMG-to-Speech conversion systems based on array electromyographic recordings. The corpus is recorded with a recording paradigm closely mirroring realistic EMG-to-Speech usage scenarios, and includes evaluation data recorded from both audible as well as silent speech. The corpus consists of 9.5 hours of data, split into 12 sessions recorded from 8 speakers. Based on this corpus, we present initial benchmark results with a realistic online EMG-to-Speech conversion use case, both for the audible and silent speech subsets. We also present a method for drastically improving EMG-to-Speech system stability and performance in the presence of time-related artifacts.},
}