Statistical machine learning uses data to model the relationship between a set of explanatory variables and a response variable. The adaptive boosting (AdaBoost) algorithm is a machine learning method for modeling classification data. It combines the outputs of a weak base learner, one that predicts the correct response class only slightly better than random guessing, into an accurate classifier. Because the base learner satisfies this weak-learnability condition, adaptive boosting yields an exponentially decreasing empirical error, and an empirical error bound can be derived for the algorithm. This empirical error bound motivates the question of whether a generalization error bound exists and what form it takes. Evidence from boosting several real datasets shows that the generalization error follows the same shape as the empirical error, suggesting that a shift of the empirical error bound can produce a generalization error bound. By simulating random datasets and varying their characteristics according to the criteria that appear to affect the shift, we can boost them and derive a function by which to shift the empirical error bound. We record the test error of the boosted simulated datasets and build a regression model with that error as the response and the varied dataset characteristics as the explanatory variables. The final regression model predicts the difference between the generalization error and the empirical error, enabling us to derive the suggested generalization error bound.
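The classical result behind the exponentially decreasing empirical error (Freund and Schapire) states that after T rounds the training error of AdaBoost is at most exp(-2 * sum of gamma_t^2), where gamma_t is round t's edge over random guessing. A minimal sketch of the comparison the abstract describes, using scikit-learn's AdaBoost implementation on a simulated dataset (the specific sample size and feature counts below are illustrative choices, not the thesis's actual simulation design):

```python
# Sketch: boost a simulated dataset with a weak base learner, then compare
# the empirical (training) error with the test error that estimates the
# generalization error. The "shift" is the quantity a regression model on
# varying dataset characteristics would predict.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Simulate a two-class dataset; parameters such as sample size and the
# number of informative features stand in for the dataset characteristics
# varied in the simulations.
X, y = make_classification(n_samples=2000, n_features=10,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# AdaBoost with its default weak base learner (a depth-1 decision stump).
clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

empirical_error = 1 - clf.score(X_train, y_train)  # training error
test_error = 1 - clf.score(X_test, y_test)         # generalization estimate
shift = test_error - empirical_error               # gap the regression models
print(empirical_error, test_error, shift)
```

Repeating this over many simulated datasets, with the varied characteristics as explanatory variables and `shift` (or `test_error`) as the response, gives the data for the regression model described above.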
Library of Congress Subject Headings
Machine learning--Mathematical models
Applied Statistics (MS)
Department, Program, or Center
School of Mathematical Sciences (COS)
Houston, Paige, "An Empirical Demonstration of the Probabilistic Upper Bound of the Adaptive Boosting Test Error" (2016). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus