Abstract

The goal of scene classification is to automatically assign a scene image to a semantic category (e.g. "building" or "river") by analyzing the visual content of the image. This is a challenging problem because scene images are variable and ambiguous and are captured under a wide range of illumination and scale conditions. At the same time, it is a fundamental problem in computer vision: by providing contextual information, scene classification can guide other processes such as image browsing, content-based image retrieval, and object recognition. This thesis implements two scene classification systems: one based on Spatial Pyramid Matching (SPM) and the other on Hierarchical Dirichlet Processes (HDP). Both approaches build on the popular "bag-of-words" representation, which is a histogram of quantized visual features. SPM represents an image as a "spatial pyramid", produced by computing histograms of local features at multiple levels of spatial resolution. Spatial pyramid matching is then used to estimate the overall perceptual similarity between images, and this similarity can be used as a support vector machine (SVM) kernel. In the second approach, HDP is used to model the "bag-of-words" representations of images: each image is described as a mixture of latent themes and each theme is described as a mixture of words. The number of themes is inferred automatically from the data. The themes are shared by images not only within one scene category but also across all categories. Both systems are tested on three popular datasets from the field and their performance is compared. In addition, the two approaches are combined, which improves performance over either system alone.
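The spatial pyramid described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes local features have already been quantized into visual-word indices and that their positions are normalized to [0, 1). The function names (`spatial_pyramid`, `intersection_kernel`), the parameters, and the per-level weights are illustrative assumptions. The weights follow the commonly cited spatial pyramid matching formulation, which the abstract itself does not spell out.

```python
import numpy as np


def spatial_pyramid(words, xs, ys, vocab_size, levels=2):
    """Concatenate weighted visual-word histograms over a pyramid of grids.

    words : visual-word index of each local feature (hypothetical input)
    xs, ys: feature coordinates, assumed normalized to [0, 1)
    """
    words = np.asarray(words, dtype=int)
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level  # the grid at this level is cells x cells
        # Assumed weighting: coarser levels count less than finer ones.
        if level == 0:
            weight = 1.0 / 2 ** levels
        else:
            weight = 1.0 / 2 ** (levels - level + 1)
        # Grid cell of each feature, clamped to the last cell at the border.
        cx = np.minimum((xs * cells).astype(int), cells - 1)
        cy = np.minimum((ys * cells).astype(int), cells - 1)
        # One histogram bin per (cell, visual word) pair.
        flat = (cy * cells + cx) * vocab_size + words
        hist = np.bincount(flat, minlength=cells * cells * vocab_size)
        parts.append(weight * hist)
    return np.concatenate(parts)


def intersection_kernel(a, b):
    """Histogram intersection of two weighted pyramid vectors."""
    return float(np.minimum(a, b).sum())
```

With the pyramids precomputed for every image, the pairwise kernel values could be collected into a Gram matrix and passed to any SVM implementation that accepts a precomputed kernel. Which SVM software the thesis used is not stated here.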

Library of Congress Subject Headings

Image processing--Digital techniques; Images, Photographic--Classification; Image analysis; Dirichlet forms

Publication Date

2010

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Science (MS)

Department, Program, or Center

Computer Science (GCCIS)

Advisor

Gaborski, Roger

Comments

Note: imported from RIT’s Digital Media Library running on DSpace to RIT Scholar Works in December 2013.

Recommended Citation

Yin, Haohui, "Scene classification using spatial pyramid matching and hierarchical Dirichlet processes" (2010). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/248

Campus

RIT – Main Campus

Plan Codes

COMPSCI-MS
