Detecting text in images presents the unique challenge of finding both in-scene and superimposed text of various sizes, fonts, colors, and textures against complex backgrounds. The goal of this system is not to recognize specific letters or words but only to determine whether each pixel is text or not. This pixel-level decision is made by applying a set of weighted classifiers, built from a set of high-pass filters, together with a series of image processing techniques. We assert that the learned weighted combination of frequency filters, used in conjunction with image processing techniques, may achieve better pixel-level text detection performance, in terms of precision, recall, and f-metric, than any of the components achieves individually. Qualitatively, our algorithm performs well and shows promising results. The quantitative results are lower than desired but not unreasonable: for the complete ensemble, the f-metric was 0.36.
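To make the pixel-level voting and scoring described in the abstract concrete, the following is a minimal sketch, not the thesis implementation. It assumes each component classifier has already produced a binary mask (1 for text, 0 for non-text), that the weights are given rather than learned, and that the f-metric is the usual harmonic mean of precision and recall. The function names `weighted_majority_vote` and `precision_recall_f` and the example masks and weights are hypothetical.

```python
from typing import List, Sequence, Tuple

Mask = List[List[int]]  # binary mask: 1 = text pixel, 0 = non-text pixel


def weighted_majority_vote(masks: Sequence[Mask], weights: Sequence[float]) -> Mask:
    """Mark a pixel as text when the weight voting 'text' exceeds half the total weight."""
    if not masks or len(masks) != len(weights):
        raise ValueError("need one weight per mask and at least one mask")
    half_total = sum(weights) / 2
    rows, cols = len(masks[0]), len(masks[0][0])
    combined: Mask = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Sum the weights of the classifiers that voted 'text' at this pixel.
            score = sum(w * m[r][c] for m, w in zip(masks, weights))
            row.append(1 if score > half_total else 0)
        combined.append(row)
    return combined


def precision_recall_f(pred: Mask, truth: Mask) -> Tuple[float, float, float]:
    """Pixel-level precision, recall, and F-measure (assumed here to be F1)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f


# Hypothetical 2x2 masks from three component classifiers, with hand-picked weights.
masks = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 1]],
    [[0, 1], [0, 1]],
]
weights = [2.0, 1.0, 1.0]
ground_truth = [[1, 0], [0, 1]]

ensemble = weighted_majority_vote(masks, weights)
precision, recall, f_measure = precision_recall_f(ensemble, ground_truth)
```

In this sketch a tie (exactly half of the total weight) is resolved as non-text, which is an arbitrary choice. In a learned setting, the weights would be fitted on labeled training images rather than fixed by hand.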
Library of Congress Subject Headings
Video recordings--Data processing; Optical pattern recognition; Image analysis; Image processing--Digital techniques; CAPTCHA (Challenge-response test)--Data processing
Department, Program, or Center
Chester F. Carlson Center for Imaging Science (COS)
Snyder, Dave, "Text detection in natural scenes through weighted majority voting of DCT high pass filters, line removal, and color consistency filtering" (2011). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus