Supervised classification methods often assume that the training and test data distributions are the same and that all classes in the test set are present in the training set. However, deployed classifiers must be able to recognize inputs from outside the training set as unknowns and update representations in near real-time to account for novel concepts unknown during offline training. This problem has been studied under multiple paradigms, including out-of-distribution detection and open set recognition; for convolutional neural networks, however, there have been two major approaches: 1) inference methods to separate known inputs from unknown inputs and 2) feature space regularization strategies to improve model robustness to novel inputs. In this dissertation, we explore the relationship between the two approaches and directly compare their performance on large-scale datasets that have more than a few dozen categories. Using the ImageNet large-scale classification dataset, we identify novel combinations of regularization and specialized inference methods that perform best across multiple open set classification problems of increasing difficulty. We find that input perturbation and temperature scaling yield significantly better performance on large-scale datasets than the other inference methods tested, regardless of the feature space regularization strategy. Conversely, we also find that advanced regularization schemes applied during training yield better performance when baseline inference techniques are used; however, this often requires supplementing the training data with additional background samples, which is difficult in large-scale problems.
To overcome this problem, we further propose a simple regularization technique that can be easily applied to existing convolutional neural network architectures and that improves open set robustness without requiring a background dataset. Our novel method achieves state-of-the-art results on open set classification baselines and scales easily to large-scale problems.
Finally, we explore the intersection of open set and continual learning to establish, for the first time, baselines for novelty detection while learning from online data streams. To accomplish this, we establish a novel dataset for evaluating the image open set classification capabilities of streaming learning algorithms. Using our new baselines, we draw conclusions about the most computationally efficient means of detecting novelty in pre-trained models and about the properties that an efficient open set learning algorithm operating in the streaming paradigm should possess.
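As a rough illustration of the inference-side approach named in the abstract, the following minimal sketch shows temperature scaling combined with a confidence threshold for rejecting unknown inputs. It is not the dissertation's implementation: the function names, the temperature of 2.0, the threshold of 0.5, and the use of -1 as the "unknown" label are all assumptions chosen for illustration. Input perturbation is omitted because it requires gradients from the trained network.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def classify_open_set(logits, temperature=2.0, threshold=0.5):
    """Return the predicted class index, or -1 for "unknown".

    The temperature and threshold defaults are hypothetical; in practice
    they would be tuned on held-out data.
    """
    probs = softmax(logits, temperature)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Reject the input as unknown when the top probability is too low.
    return best if probs[best] >= threshold else -1


# A confident prediction is accepted; a near-uniform one is rejected.
known = classify_open_set([10.0, 0.0, 0.0])
unknown = classify_open_set([1.0, 1.1, 0.9])
```

Here `known` is the index of the dominant logit, while `unknown` is -1 because no class reaches the assumed threshold.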
Library of Congress Subject Headings
Machine learning; Convolutions (Mathematics); Neural networks (Computer science); Automatic classification; Artificial intelligence
Imaging Science (Ph.D.)
Department, Program, or Center
Chester F. Carlson Center for Imaging Science (COS)
Roady, Ryne, "Open Set Classification for Deep Learning in Large-Scale and Continual Learning Models" (2020). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus