Support Vector Machines (SVMs) have demonstrated accuracy and efficiency in a variety of binary classification applications including indoor/outdoor scene categorization of consumer photographs and distinguishing unsolicited commercial electronic mail from legitimate personal communications. This thesis examines a parallel implementation of the Sequential Minimal Optimization (SMO) method of training SVMs resulting in multiprocessor speedup subject to a decrease in accuracy dependent on the data distribution and number of processors. Subsequently the SVM classification system was applied to the image labeling and e-mail classification problems. A parallel implementation of the image classification system's color histogram, color coherence, and edge histogram feature extractors increased performance when using both noncaching and caching data distribution methods. The electronic mail classification application produced an accuracy of 96.69% with a user-generated dictionary. An implementation of the electronic mail classifier as a Microsoft Outlook add-in provides immediate mail filtering capabilities to the average desktop user. While the parallel implementation of the SVM trainer was not supported for the classification applications, the parallel feature extractor improved image classification performance.
Library of Congress Subject Headings
Machine learning; Computer algorithms; Images, Photographic--Classification; Electronic mail messages--Classification
Department, Program, or Center
Computer Engineering (KGCOE)
Woitaszek, Matthew, "Support vector machines for image and electronic mail classification" (2002). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus