Convolutional neural networks excel at extracting features from signals. These features can be used for many downstream tasks, including object recognition, object detection, depth estimation, and pixel-level semantic segmentation. Such tasks support applications like autonomous driving, where images captured by a camera can give a detailed understanding of the scene. While these models are impressive, they can fail to generalize to new environments. This forces the cumbersome process of collecting images from many different environments and annotating them by hand. Annotating thousands or millions of images is both expensive and time-consuming. One can use transfer learning to carry knowledge over from a different dataset, using weights learned there as the starting point for the same model when it is trained on the target dataset. This method requires that another dataset has already been annotated and that it contains salient information that helps the model on the target dataset. Another method, which does not rely on human-generated annotations, is self-supervised learning, in which annotations can be computer generated using tasks that force the model to learn image representations, such as predicting the rotation of an image. In this thesis, self-supervised methods are evaluated specifically for improving semantic segmentation as the primary downstream task. The effect of data augmentation during pretraining is observed in the context of downstream performance. Knowledge from multiple self-supervised tasks is combined to create a starting point for training on a target dataset that outperforms either method individually.
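The rotation-prediction task mentioned in the abstract can be sketched as follows. This is a minimal illustration in Python with NumPy, not the implementation used in the thesis: the function name `make_rotation_batch`, the choice of four rotations in 90-degree steps, and the restriction to square images are all assumptions made here for the example. It shows only how labels can be generated by the computer rather than by a human annotator.

```python
import numpy as np


def make_rotation_batch(images, rng):
    """Build a self-supervised batch for rotation prediction.

    images: array of shape (N, H, W, C) with H == W (assumed square so
            that rotated images keep the same shape).
    rng:    a numpy.random.Generator used to draw the rotations.

    Returns (rotated_images, labels), where labels[i] in {0, 1, 2, 3}
    is the number of 90-degree turns applied to images[i].
    """
    # Draw one rotation class per image; this is the "free" annotation.
    labels = rng.integers(0, 4, size=len(images))
    # Rotate each image in its spatial plane (height and width axes).
    rotated = np.stack(
        [np.rot90(img, k=int(k), axes=(0, 1)) for img, k in zip(images, labels)]
    )
    return rotated, labels


# Hypothetical usage with random data standing in for real images.
rng = np.random.default_rng(0)
images = rng.random((8, 32, 32, 3))
rotated, labels = make_rotation_batch(images, rng)
```

A network trained to predict `labels` from `rotated` needs no hand annotation, and its learned weights could then serve as an initialization for a downstream model such as a segmentation network.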
Library of Congress Subject Headings
Computer vision; Signal processing--Digital techniques; Pattern recognition systems; Neural networks (Computer science); Machine learning; Convolutions (Mathematics)
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Alexopoulos, Kenneth, "Self-Supervision Initialization for Semantic Segmentation Networks" (2020). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus