Enhancement of EMG-based Thai Number Words Classification using Frame-based Time Domain Feature with Stacking Filter
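by Niyawadee Jib Srisuwan, Chusak Limsakul, Pornchai Phukpattaranont, Tanja Schultz, Michael Wand, Matthias Janke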
Abstract:
To overcome problems in classical automatic speech recognition (e.g., ambient noise and loss of privacy), electromyography (EMG) signals from the speech production muscles are used in place of the acoustic speech signal. We investigate EMG-based speech recognition for the Thai language. In earlier work, we used five EMG channels from the facial and neck muscles to classify 11 Thai number words with a neural network classifier, using 15 time-domain and frequency-domain features for feature extraction. We obtained average accuracy rates of 89.45% for audible speech and 78.55% for silent speech, which leaves room for improvement. This paper proposes a method to improve the accuracy of EMG-based Thai number word classification. Ten subjects uttered 11 words in both audible and silent speech while five channels of EMG signal were captured. Frame-based time-domain features with a stacking filter were used in the feature extraction stage. Linear discriminant analysis (LDA) was then used to reduce the dimension of the feature vector, and a hidden Markov model (HMM) was employed in the classification stage. The results show that these techniques for feature extraction, dimensionality reduction, and classification improve the average accuracy rate by 3% absolute for audible speech compared with the earlier work. We achieved average classification rates of 92.45% and 75.73% for audible and silent speech, respectively.
Reference:
Enhancement of EMG-based Thai Number Words Classification using Frame-based Time Domain Feature with Stacking Filter (Niyawadee Jib Srisuwan, Chusak Limsakul, Pornchai Phukpattaranont, Tanja Schultz, Michael Wand, Matthias Janke), In Proceedings of 2014 APSIPA Annual Summit and Conference, 2014.
Bibtex Entry:
@inproceedings{srisuwan2014enhancement,
  title={Enhancement of {EMG}-based {Thai} Number Words Classification using Frame-based Time Domain Feature with Stacking Filter},
  year={2014},
  booktitle={Proceedings of 2014 APSIPA Annual Summit and Conference},
  abstract={To overcome problems in classical automatic speech recognition (e.g., ambient noise and loss of privacy), electromyography (EMG) signals from the speech production muscles are used in place of the acoustic speech signal. We investigate EMG-based speech recognition for the Thai language. In earlier work, we used five EMG channels from the facial and neck muscles to classify 11 Thai number words with a neural network classifier, using 15 time-domain and frequency-domain features for feature extraction. We obtained average accuracy rates of 89.45% for audible speech and 78.55% for silent speech, which leaves room for improvement. This paper proposes a method to improve the accuracy of EMG-based Thai number word classification. Ten subjects uttered 11 words in both audible and silent speech while five channels of EMG signal were captured. Frame-based time-domain features with a stacking filter were used in the feature extraction stage. Linear discriminant analysis (LDA) was then used to reduce the dimension of the feature vector, and a hidden Markov model (HMM) was employed in the classification stage. The results show that these techniques for feature extraction, dimensionality reduction, and classification improve the average accuracy rate by 3% absolute for audible speech compared with the earlier work. We achieved average classification rates of 92.45% and 75.73% for audible and silent speech, respectively.},
  author={Srisuwan, Niyawadee Jib and Limsakul, Chusak and Phukpattaranont, Pornchai and Schultz, Tanja and Wand, Michael and Janke, Matthias}
}