Abstract

In contemporary software development, developers commonly conduct regression testing to ensure that code changes do not degrade software quality. Performance regression testing is an emerging research area within the regression testing domain of software engineering; it aims to maintain the system's performance. Conducting performance regression testing is known to be expensive. It is also complex, given the growing volume of committed code and the number of team members developing simultaneously. Many automated regression testing techniques have been proposed in prior research, yet challenges in locating and resolving performance regressions in practice remain. Directing regression testing to the commit level helps locate the root cause, but it hinders the development process. This thesis outlines motivations and solutions for locating the root causes of performance regressions. First, we challenge a deterministic state-of-the-art approach by expanding the testing data to find areas for improvement. The deterministic approach was found to be limited in searching for the best regression-locating rule. Thus, we present two stochastic approaches that develop models able to learn from historical commits. The first stochastic approach views the research problem as a search-based optimization problem that seeks the highest detection rate. We apply different multi-objective evolutionary algorithms and compare them. This thesis also investigates whether simplifying the search space by combining objectives achieves comparable results. The second stochastic approach addresses the severe class imbalance any system can exhibit, since code changes that introduce regressions are rare but costly. We formulate the identification of problematic commits that introduce performance regressions as a binary classification problem that handles class imbalance. Further, the thesis provides an exploratory study of the challenges developers face in resolving performance regressions. The study is based on questions posted on a technical forum that concern performance regression. We collected around 2,000 questions discussing regressions in software execution time, and all were manually analyzed. The study resulted in a categorization of the challenges. We also discuss the difficulty level of performance regression issues within the development community. This study provides insights that help developers avoid the causes of regression during software design and implementation.

Library of Congress Subject Headings

Computer software--Testing; Prediction theory; Classification

Publication Date

11-2022

Document Type

Dissertation

Student Type

Graduate

Degree Name

Computing and Information Sciences (Ph.D.)

Department, Program, or Center

Computer Science (GCCIS)

Advisor

Mohamed Wiem Mkaouer

Advisor/Committee Member

Ali Ouni

Advisor/Committee Member

Travis Desell

Campus

RIT – Main Campus

Plan Codes

COMPIS-PHD
