Abstract

An algorithm based on Mel-frequency cepstral coefficients (MFCCs) and support vector machines (SVMs) is presented for speaker gender recognition. For each signal, the mean vector of its MFCC matrix is used as the input vector to the SVM. A sample of 246 signals, 124 female voices and 122 male voices, is analyzed with this algorithm. With only the first 13 MFCCs, the average prediction error is as low as 7% in a cross-validation of size 500. This error drops below 1% as the number of MFCCs increases to 27. The RBF kernel is also compared with the polynomial kernel and found to be the better kernel function for this gender recognition task.
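
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the report's implementation. It assumes scikit-learn and NumPy, assumes each MFCC matrix is laid out as frames by coefficients (so the mean is taken over frames), and replaces real MFCC extraction with a hypothetical `synthetic_mfcc_matrix` helper that generates random data. The feature scaling step, the default kernel parameters, and the 20 random 75/25 splits are also assumptions chosen to keep the sketch short and fast; the abstract does not state these details.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def mean_mfcc_vector(mfcc_matrix):
    """Collapse an (n_frames, n_coeffs) MFCC matrix into one mean vector."""
    return np.asarray(mfcc_matrix).mean(axis=0)


def synthetic_mfcc_matrix(rng, label, n_coeffs=13):
    """Hypothetical stand-in for real MFCC extraction (not the report's data)."""
    n_frames = int(rng.integers(50, 100))
    center = np.full(n_coeffs, 1.0 if label == 1 else -1.0)
    return center + rng.normal(scale=1.0, size=(n_frames, n_coeffs))


rng = np.random.default_rng(0)

# Class sizes follow the abstract: 124 female (1) and 122 male (0) signals.
labels = np.array([1] * 124 + [0] * 122)

# One 13-dimensional mean vector per signal is the SVM input.
features = np.vstack(
    [mean_mfcc_vector(synthetic_mfcc_matrix(rng, y)) for y in labels]
)

# Repeated random train/test splits (assumed scheme), one run per kernel.
splitter = ShuffleSplit(n_splits=20, test_size=0.25, random_state=0)
errors = {}
for kernel in ("rbf", "poly"):
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    accuracy = cross_val_score(model, features, labels, cv=splitter)
    errors[kernel] = 1.0 - accuracy.mean()

print(errors)
```

With real recordings, the synthetic helper would be replaced by an MFCC extractor of the reader's choice, and `n_coeffs` could be raised toward 27 to mirror the comparison the abstract reports. The error values printed here reflect only the synthetic data and say nothing about the report's results.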

Publication Date

2013

Comments

Note: imported from RIT's Digital Media Library running on DSpace to RIT Scholar Works in April 2014.

Document Type

Technical Report

Department, Program, or Center

The John D. Hromi Center for Quality and Applied Statistics (KGCOE)

Campus

RIT – Main Campus
