We have constructed a linear discriminator for hand-printed character recognition that uses a binary vector of 1,500 features based on an equidistributed collection of products of pixel pairs. This classifier is competitive with other techniques but is faster to train and to run for classification. However, the 1,500-member feature set clearly contains many redundant (overlapping or useless) members, and a significantly smaller set would be very desirable (e.g., for faster training, a faster and smaller application program, and a smaller system suitable for hardware implementation). A system using the smaller set of features should also generalize better, since fewer features are less likely to allow a system to "memorize noise in the training data." We tried several approaches to using a genetic algorithm to search for effective small subsets of features, and we successfully derived a 300-element feature set and built a classifier whose performance on our training and testing sets is as good as that of the system using the full set.
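As a rough illustration of the idea in the abstract, the sketch below computes binary pixel-pair product features and runs a small genetic algorithm over fixed-size subsets of feature indices. The abstract does not specify the encoding, operators, or fitness measure, so everything here is an assumption: the names `pair_product_features` and `select_features`, the tournament selection, the union-based crossover, the single-index mutation, and the elitism are illustrative choices, and `fitness` is a hypothetical stand-in for training the linear discriminator on a candidate subset and scoring its accuracy.

```python
import random


def pair_product_features(pixels, pairs):
    """Binary feature vector: each feature is the product of two binary pixels."""
    return [pixels[i] * pixels[j] for i, j in pairs]


def select_features(n_features, subset_size, fitness,
                    generations=50, pop_size=30, seed=0):
    """Search for a high-fitness subset of feature indices with a simple GA.

    Assumes subset_size < n_features. `fitness` maps a frozenset of indices
    to a number (higher is better); it is supplied by the caller.
    """
    rng = random.Random(seed)
    all_idx = range(n_features)
    # Initial population: random subsets of the requested size.
    pop = [frozenset(rng.sample(all_idx, subset_size)) for _ in range(pop_size)]

    def tournament():
        # Pick two individuals at random and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism: carry the current best subset over unchanged.
        children = [max(pop, key=fitness)]
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            # Crossover: draw the child from the union of both parents.
            pool = sorted(p1 | p2)
            child = set(rng.sample(pool, subset_size))
            # Mutation: occasionally swap one index for one not in the child.
            if rng.random() < 0.3:
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice([i for i in all_idx if i not in child]))
            children.append(frozenset(child))
        pop = children
    return max(pop, key=fitness)
```

In a setting like the one described, `n_features` would be 1,500 and `subset_size` 300, with `fitness` wrapping the classifier's training and evaluation; because each fitness call would then retrain a classifier, caching fitness values per subset would likely be worthwhile.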
Department, Program, or Center
Chester F. Carlson Center for Imaging Science (COS)
Gaborski R.S., Anderson P.G., Asbury C.T., Tilley D.G. (1993) Genetic Algorithm Selection of Features for Hand-printed Character Identification. In: Albrecht R.F., Reeves C.R., Steele N.C. (eds) Artificial Neural Nets and Genetic Algorithms. Springer, Vienna
RIT – Main Campus