Music is a means of reflecting and expressing emotion. Personal preferences in music vary between individuals, influenced by situational and environmental factors. Inspired by attempts to develop alternative feature extraction methods for audio signals, this research analyzes the use of deep network structures for extracting features from musical audio data represented in the frequency domain. Because image-based network models are designed to be robust and accurate learners of image features, this research adapts ImageNet deep network models to learn features from music audio spectrograms. It also explores the use of an audio source separation tool to preprocess the musical audio before training the network models. Source separation allows the network model to learn features that highlight the individual contributions to the audio track and to use those features to improve classification results.
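The frequency-domain representation described above is a spectrogram: the audio signal is cut into short overlapping frames and each frame is transformed to the frequency domain. The sketch below is a minimal numpy-only illustration of that idea, not the thesis's actual preprocessing pipeline; the frame length, hop size, and window choice are assumptions for the example.

```python
import numpy as np

def spectrogram(signal, frame_len=1024, hop=512):
    """Magnitude spectrogram: slide a Hann window over the signal and
    take the FFT of each frame (illustrative parameters, not the thesis's)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Keep only the non-negative frequency bins of each frame's FFT.
    return np.abs(np.fft.rfft(frames, axis=1))

# Example: one second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (n_frames, frame_len // 2 + 1) -> (30, 513)
```

The resulting time-frequency matrix can be rendered as a 2-D image, which is what makes image-oriented ImageNet architectures applicable to audio.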
The extracted features highlight characteristics of the audio tracks and are used to train classifiers that categorize the music for genre and auto-tag classification. The results obtained from each model are compared against state-of-the-art methods of genre classification and tag prediction for musical tracks. Deeper networks with input source separation are shown to yield the best results.
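To make the feature-then-classifier split concrete: once each track is reduced to a fixed-length feature vector (e.g., from a deep network's penultimate layer), any standard classifier can be trained on those vectors. The sketch below uses a nearest-centroid classifier on synthetic feature vectors purely as an illustration; the thesis's actual classifiers and feature dimensions are not specified here, and all names and data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_centroids(features, labels):
    """Nearest-centroid classifier: average the feature vectors of each genre."""
    return {g: features[labels == g].mean(axis=0) for g in np.unique(labels)}

def predict(centroids, x):
    """Assign the genre whose centroid is closest in Euclidean distance."""
    genres = sorted(centroids)
    dists = [np.linalg.norm(x - centroids[g]) for g in genres]
    return genres[int(np.argmin(dists))]

# Toy data: two well-separated genre clusters in a 4-D feature space.
feats = np.vstack([rng.normal(0, 1, (10, 4)), rng.normal(5, 1, (10, 4))])
labels = np.array([0] * 10 + [1] * 10)
model = fit_centroids(feats, labels)
print(predict(model, np.full(4, 5.0)))  # -> 1 (nearest to the second cluster)
```

The same pattern applies whether the features come from a spectrogram of the full mix or from source-separated stems; better-separated feature clusters translate directly into better classification.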
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Sonia Lopez Alarcon
Daigneau, Madeleine, "Advanced Music Audio Feature Learning with Deep Networks" (2017). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus