Abstract

Image classification and object recognition are among the most prominent problems in computer vision. Finding objects regardless of pose and occlusion is difficult and requires substantial compute resources. Recent advances have made great strides toward solving this problem; in particular, deep learning has revolutionized the field in the last few years.

The classification of large datasets, such as the popular ImageNet dataset, requires a network with millions of weights. Learning each of these weights using backpropagation requires a compute-intensive training phase with many training samples. Recent compute technology has proven adept at classifying 1000 classes, but it is not clear whether computers will be able to differentiate and classify the more than 40,000 classes that humans can distinguish. The goal of this thesis is to train computers to attain human-like performance on large-class datasets. Specifically, we introduce two types of hierarchical architectures: Late Fusion and Early Fusion. These architectures will be used to classify datasets with up to 1000 object classes, while simultaneously reducing both the number of computations and the training time. These hierarchical architectures maintain discriminative relationships among the networks within each layer, as well as an abstract relationship from one layer to the next. The resulting framework reduces the individual network sizes, and thus the total number of parameters that need to be learned. The smaller number of parameters results in decreased training time.

Library of Congress Subject Headings

Classification--Data processing; Machine learning; Image processing--Digital techniques; Optical pattern recognition

Publication Date

5-2016

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Engineering (MS)

Department, Program, or Center

Computer Engineering (KGCOE)

Advisor

Raymond Ptucha

Advisor/Committee Member

Andreas Savakis

Advisor/Committee Member

John Kerekes

Comments

Physical copy available from RIT's Wallace Library at TA1650 .N66 2016

Campus

RIT – Main Campus
