Estimation of Fundamental Frequency from Surface Electromyographic Data
by Keigo Nakamura, Matthias Janke, Michael Wand, Tanja Schultz
Abstract:
In this paper, we present our recent work on $F_0$ estimation from surface electromyographic (EMG) data using a Gaussian mixture model (GMM)-based voice conversion (VC) technique, referred to as EMG-to-$F_0$. In our approach, a support vector machine classifies individual frames as unvoiced or voiced (U/V), and the $F_0$ contours of voiced frames are estimated by the trained GMM under the minimum mean-square error criterion. EMG-to-$F_0$ is experimentally evaluated using three data sets from different speakers. Each data set includes almost 500 utterances. Objective experiments demonstrate that we achieve a correlation coefficient of up to 0.49 between estimated and target $F_0$ contours with more than 84% U/V decision accuracy, although the results vary widely.
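The two-stage idea in the abstract (a U/V decision followed by an MMSE mapping through a GMM) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar EMG feature per frame, a joint GMM over (feature, $F_0$) whose parameters are already trained, and precomputed voiced flags standing in for the support vector machine. All function names and the component keys (`w`, `mu_x`, `mu_y`, `var_x`, `cov_xy`) are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mmse_f0(x, components):
    """MMSE estimate E[y | x] under a joint GMM over (x, y).

    Assumption: each component is a dict with keys
    w (weight), mu_x, mu_y, var_x, cov_xy.
    """
    # Weighted likelihood of x under each component's marginal over x.
    likes = [c["w"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components]
    total = sum(likes)
    # Posterior-weighted sum of the per-component conditional means of y.
    return sum(
        (like / total) * (c["mu_y"] + c["cov_xy"] / c["var_x"] * (x - c["mu_x"]))
        for like, c in zip(likes, components)
    )


def estimate_contour(features, voiced_flags, components):
    """Map voiced frames to F0; unvoiced frames are marked with 0.0."""
    return [
        mmse_f0(x, components) if voiced else 0.0
        for x, voiced in zip(features, voiced_flags)
    ]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    norm_a = math.sqrt(sum((p - mean_a) ** 2 for p in a))
    norm_b = math.sqrt(sum((q - mean_b) ** 2 for q in b))
    return cov / (norm_a * norm_b)
```

In this sketch the U/V decision acts as a gate: only frames flagged as voiced are passed to the GMM mapping, and `pearson` could then be applied to the estimated and target $F_0$ values of those frames.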
Reference:
Estimation of Fundamental Frequency from Surface Electromyographic Data (Keigo Nakamura, Matthias Janke, Michael Wand, Tanja Schultz), In IEEE International Conference on Acoustics, Speech and Signal Processing, 2011.
Bibtex Entry:
@inproceedings{nakamura2011estimation,
  year={2011},
  title={Estimation of Fundamental Frequency from Surface Electromyographic Data},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing},
  url={https://www.csl.uni-bremen.de/cms/images/documents/publications/NakamuraSchultz_ICASSP2011.pdf},
  abstract={In this paper, we present our recent work on $F_0$ estimation from surface electromyographic (EMG) data using a Gaussian mixture model (GMM)-based voice conversion (VC) technique, referred to as EMG-to-$F_0$. In our approach, a support vector machine classifies individual frames as unvoiced or voiced (U/V), and the $F_0$ contours of voiced frames are estimated by the trained GMM under the minimum mean-square error criterion. EMG-to-$F_0$ is experimentally evaluated using three data sets from different speakers. Each data set includes almost 500 utterances. Objective experiments demonstrate that we achieve a correlation coefficient of up to 0.49 between estimated and target $F_0$ contours with more than 84% U/V decision accuracy, although the results vary widely.},
  keywords={Electromyography, Voice conversion, Fundamental frequency, Feature estimation},
  author={Nakamura, Keigo and Janke, Matthias and Wand, Michael and Schultz, Tanja}
}