In this thesis, we discuss the importance of data normalization in deep learning and its relationship with generalization. Normalization is a staple of deep learning architectures and has been shown to improve the stability and generalizability of deep learning models, yet why these techniques work is still not well understood and remains an active area of research. Motivated by this uncertainty, we explore how different normalization techniques perform across different deep learning architectures, and we examine generalization and its associated metrics in conjunction with our investigation of normalization. The goal of our experiments was to determine whether any identifiable trends exist for the different normalization methods across an array of training schemes with respect to the metrics employed. We found that class similarity was seemingly the strongest predictor of train accuracy, test accuracy, and generalization ratio across all employed metrics. Overall, BatchNorm and EvoNormB0 generally performed best on measures of train and test accuracy, while InstanceNorm and Plain performed worst.
Library of Congress Subject Headings
Deep learning (Machine learning); Data structures (Computer science); Computational learning theory
Applied and Computational Mathematics (MS)
Department, Program, or Center
School of Mathematical Sciences (COS)
Hurt, Griffin, "Normalization and Generalization in Deep Learning" (2023). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus