Abstract

Speech compression has become an integral component of all modern telecommunications networks. Numerous codecs have been developed and deployed to transmit voice signals efficiently while maintaining high perceptual quality. Because different carriers and networks use a diversity of speech codecs, the ability to distinguish between codecs lends itself to a wide variety of practical applications, including determining call provenance, enhancing network diagnostic metrics, and improving automated speaker recognition. However, few research efforts have attempted to provide a methodology for identifying which speech codec has been applied to an audio signal. In this research, we demonstrate a novel approach for accurately determining the presence of several contemporary speech codecs in a non-intrusive manner. The methodology analyzes an audio signal such that the subtle noise components introduced by codec processing are accentuated while most of the original speech content is eliminated. Using these techniques, an audio signal may be profiled to gather a set of values that effectively characterize the codec present in the signal. This procedure is first applied to a large data set of audio signals from known codecs to develop a set of trained profiles. Thereafter, signals from unknown codecs may be similarly profiled, and each resulting profile compared to the known training profiles to decide which codec best matches the unknown signal. Overall, the proposed strategy identifies the codec correctly in nearly 95% of all test signals. In addition, the profiling process is shown to require less than 4 seconds of audio to achieve these results. Both the identification rate and the short analysis window represent dramatic improvements over previous efforts in speech codec identification.
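The train-then-match stage described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not state which values make up a profile or how profiles are compared, so this sketch assumes each profile is a plain numeric vector, that training averages the vectors per codec, and that matching uses Euclidean distance. The function names `train_profiles` and `identify_codec`, the labels `codec_a` and `codec_b`, and all numbers are hypothetical.

```python
import math
from collections import defaultdict


def train_profiles(labeled_profiles):
    """Average the profile vectors seen for each known codec.

    Assumption: a profile is a fixed-length list of numbers. The real
    feature set used in the thesis is not given in the abstract.
    """
    grouped = defaultdict(list)
    for codec, vector in labeled_profiles:
        grouped[codec].append(vector)
    return {
        codec: [sum(column) / len(vectors) for column in zip(*vectors)]
        for codec, vectors in grouped.items()
    }


def identify_codec(unknown_profile, trained):
    """Return the codec whose trained profile is nearest to the unknown one.

    Assumption: Euclidean distance stands in for whatever comparison
    the thesis actually uses.
    """
    return min(
        trained,
        key=lambda codec: math.dist(unknown_profile, trained[codec]),
    )


# Toy, made-up profile values for two hypothetical codecs.
training_set = [
    ("codec_a", [0.10, 0.80, 0.30]),
    ("codec_a", [0.14, 0.76, 0.34]),
    ("codec_b", [0.60, 0.20, 0.90]),
    ("codec_b", [0.64, 0.24, 0.86]),
]
trained = train_profiles(training_set)

# Profile of a signal whose codec is unknown (also made up).
print(identify_codec([0.58, 0.25, 0.88], trained))
```

In this sketch the decision is a nearest-profile lookup, so adding another known codec only requires more labeled training profiles. How the profile values are extracted from audio is deliberately left out, since the abstract gives no detail on that step.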

Library of Congress Subject Headings

Sound--Recording and reproducing--Digital techniques; Signal processing--Digital techniques; Telecommunication systems--Data processing

Publication Date

11-1-2011

Document Type

Thesis

Department, Program, or Center

Computer Engineering (KGCOE)

Advisor

Kwasinski, Andres

Comments

Note: imported from RIT’s Digital Media Library running on DSpace to RIT Scholar Works in December 2013.

Recommended Citation

Jenner, Frank, "Non-intrusive identification of speech codecs in digital audio signals" (2011). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/3183

Campus

RIT – Main Campus

Plan Codes

CMPE-MS
