An algorithm combining mel-frequency cepstral coefficients (MFCCs) and support vector machines (SVMs) is presented for speaker gender recognition. For each signal, the mean vector of the MFCC matrix is used as the input vector to the SVM. A sample of 246 signals, comprising 124 female voices and 122 male voices, is analyzed with this algorithm. With only the first 13 MFCCs, the average prediction error is as low as 7% in a cross-validation of size 500. The error drops below 1% as the number of MFCCs increases to 27. The RBF kernel is also compared with the polynomial kernel and is considered the better kernel function for this gender recognition task.
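The feature construction and the two kernels named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy MFCC values, the function names, and the kernel parameters (gamma, degree, coef0) are assumptions made for illustration, and the polynomial kernel is written in one common form, (x·y + c)^d, which the abstract does not specify.

```python
import math


def mean_mfcc_vector(mfcc_frames):
    """Average an MFCC matrix (one row per frame) into a single vector per signal."""
    n_frames = len(mfcc_frames)
    n_coeffs = len(mfcc_frames[0])
    return [sum(frame[k] for frame in mfcc_frames) / n_frames
            for k in range(n_coeffs)]


def rbf_kernel(x, y, gamma=0.1):
    """RBF kernel exp(-gamma * ||x - y||^2); gamma=0.1 is an assumed value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0)^degree; degree and coef0 are assumed values."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


# Toy MFCC matrices with made-up numbers: 3 frames x 2 coefficients each.
# A real signal would have many more frames and 13 to 27 coefficients.
signal_a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
signal_b = [[2.0, 2.0], [2.0, 4.0], [2.0, 6.0]]

x_a = mean_mfcc_vector(signal_a)  # one input vector per signal
x_b = mean_mfcc_vector(signal_b)

print(x_a, x_b)
print(rbf_kernel(x_a, x_b), polynomial_kernel(x_a, x_b))
```

Averaging over frames gives every signal a vector of the same length regardless of its duration, which is what allows signals to be compared through a kernel and passed to an SVM.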
Department, Program, or Center
The John D. Hromi Center for Quality and Applied Statistics (KGCOE)
Fokoue, Ernest and Ma, Zichen, "Speaker Gender Recognition via MFCCs and SVMs" (2013). Accessed from
RIT – Main Campus
Note: imported from RIT's Digital Media Library running on DSpace to RIT Scholar Works in April 2014.