Abstract

Feature selection (FS) is the process of finding an ideal set of features for a prediction model from a set of candidate features. A key step in designing a prediction model is reducing the size of the input feature set while increasing its usefulness. Doing so reduces the complexity of the model, which lets it run more quickly and makes the usefulness of each individual feature easier to explain. Despite the desire to determine an ideal feature set, FS can be time-consuming and yield mixed results. FS is often partially automated with algorithms, but their quality varies, and many require long run times to produce mixed results. Few FS algorithms offer an intuitive method of exploring a feature space; most require the user to specify a finite list of features before the algorithm begins. To address these shortcomings, Kaizen Programming with Enhanced Feature Discovery (KP-EFD) has been developed. KP-EFD is an evolutionary tool that combines a Genetic Programming (GP) framework with concepts of Continuous Improvement from Kaizen, a Japanese methodology, to intuitively expand and search a feature space for an ideal feature set. KP-EFD was tested for use with continuous or binary variables for the purpose of interpolation or extrapolation. The method performed well for some datasets and model types while falling short of acceptable performance for others; however, with additional improvements, KP-EFD has the potential to become very versatile, saving time and frustration when working with any type of data and prediction algorithm.

Library of Congress Subject Headings

Database management; Data mining; Genetic algorithms; Business planning--Data processing; Industrial management--Data processing; Time-series analysis

Publication Date

4-23-2020

Document Type

Thesis

Student Type

Graduate

Degree Name

Industrial and Systems Engineering (MS)

Department, Program, or Center

Industrial and Systems Engineering (KGCOE)

Advisor

Katie McConky

Advisor/Committee Member

Nasibeh Azadeh Fard

Comments

This thesis has been embargoed. The full text will be available on or around 10/27/2020.

Campus

RIT – Main Campus

Plan Codes

ISEE-MS

Available for download on Monday, November 23, 2020
