Abstract

A model is proposed for two-way tables of measurement data containing outliers. The two independent variables are categorical and error free, and the table has neither missing values nor replication. The model is the sum of a customary additive part, which can be fit by least squares, and a part composed of outliers. Methods are recommended for identifying the cells that contain outliers and for fitting the model. A graph of the observations is used to locate the outliers. Replacement values for all cells containing an outlier are then determined simultaneously using a classical missing-data tool; the result is called the adjusted table. The inserted values are such that, when a mean-based fit of the adjusted table is performed, the residuals in those cells are zero. The outlying portion of the observation in each of those cells is the difference between the observation and its replacement value. In this way, outliers are removed from further analyses of the adjusted table, which is particularly helpful because outliers can greatly contaminate computations and alter conclusions. Subsequently, the causes of the outliers might be determined, and statistical estimation and testing can be carried out on the adjusted table.
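The adjustment step can be illustrated with a minimal sketch. The abstract does not name the missing-data tool, so the code below assumes a simple iterative fill-in scheme: each flagged cell is repeatedly replaced by its fitted value from a mean-based additive fit until the values stop changing. The function names `additive_fit` and `adjust_table`, the tolerance, and the small example table are hypothetical and chosen only for illustration. The outlier locations are taken as given, since the abstract identifies them graphically.

```python
import numpy as np

def additive_fit(table):
    """Mean-based additive fit: grand mean + row effect + column effect."""
    grand = table.mean()
    row_effect = table.mean(axis=1, keepdims=True) - grand
    col_effect = table.mean(axis=0, keepdims=True) - grand
    return grand + row_effect + col_effect

def adjust_table(table, outlier_cells, tol=1e-10, max_iter=10_000):
    """Replace flagged cells so their residuals in the additive fit are ~0.

    Assumed scheme (not confirmed by the abstract): iterative fill-in,
    updating all flagged cells simultaneously on each pass.
    """
    adjusted = np.asarray(table, dtype=float).copy()
    rows = np.array([r for r, _ in outlier_cells])
    cols = np.array([c for _, c in outlier_cells])
    for _ in range(max_iter):
        fitted = additive_fit(adjusted)
        change = np.max(np.abs(fitted[rows, cols] - adjusted[rows, cols]))
        adjusted[rows, cols] = fitted[rows, cols]
        if change < tol:
            break
    return adjusted

# Hypothetical 4 x 3 table that is exactly additive, plus one injected outlier.
row_part = np.array([0.0, 1.0, 2.0, 3.0])[:, None]
col_part = np.array([0.0, 10.0, 20.0])[None, :]
additive = 5.0 + row_part + col_part
observed = additive.copy()
observed[1, 2] += 50.0          # contaminate one cell

adjusted = adjust_table(observed, [(1, 2)])
outlying = observed - adjusted  # nonzero only in the flagged cell
```

In this sketch the outlying portion of a flagged cell is its observation minus its replacement value, and every other cell is left unchanged, so later estimation and testing would use `adjusted` rather than `observed`.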

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publication Date

2-2023

Document Type

Article

Department, Program, or Center

School of Mathematical Sciences (COS)

Campus

RIT – Main Campus
