Historically, it has been difficult to measure deviation in the notion of a concept. Several schemes have been proposed to attack this challenging problem [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. The central idea of all these efforts is to detect the change point at which the data mining model deviates significantly from the characteristics of the data on which it was trained or built. This phenomenon is known as concept drift, and the process of finding such change points is concept drift detection. Current state-of-the-art algorithms assume attribute independence, treat the problem as a supervised learning problem, and require labeled data. The proposed algorithm makes no assumption of attribute independence and uses a covariance summary to detect concept drift in an unsupervised setting. The algorithm proposed in this thesis monitors the underlying characteristics of the input data, maintains data summaries of snapshots taken at successive points in time, and uses distance metrics between those summaries to determine when the concept drifts. The technique was evaluated on synthetic and real data sets.
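The general idea of comparing covariance summaries of successive snapshots can be illustrated with a minimal sketch. It is not the thesis's algorithm: the abstract does not specify the distance metric, the windowing scheme, or how the Gaussian mixture is fitted, so the sketch assumes a single Gaussian summary (mean and covariance) per fixed-size window and a simple hypothetical distance. The names `summarize`, `summary_distance`, `detect_drift`, `window_size`, and `threshold` are all illustrative.

```python
import numpy as np


def summarize(window):
    """Summarize a snapshot (rows = samples) by its mean vector and covariance matrix."""
    window = np.asarray(window, dtype=float)
    return window.mean(axis=0), np.cov(window, rowvar=False)


def summary_distance(ref, cur):
    """Hypothetical distance between two summaries (an assumption, not the thesis's metric):
    Euclidean shift of the mean plus Frobenius norm of the covariance difference."""
    (mean_ref, cov_ref), (mean_cur, cov_cur) = ref, cur
    mean_shift = np.linalg.norm(mean_cur - mean_ref)
    cov_shift = np.linalg.norm(cov_cur - cov_ref, ord="fro")
    return float(mean_shift + cov_shift)


def detect_drift(stream, window_size, threshold):
    """Return the start index of every window whose summary lies farther than
    `threshold` from the current reference summary. No labels are used."""
    stream = np.asarray(stream, dtype=float)
    reference = summarize(stream[:window_size])
    drift_points = []
    for start in range(window_size, len(stream) - window_size + 1, window_size):
        current = summarize(stream[start:start + window_size])
        if summary_distance(reference, current) > threshold:
            drift_points.append(start)
            reference = current  # re-anchor on the new concept
    return drift_points
```

Because the covariance matrix is compared directly, a change in the correlation between attributes is flagged even when each attribute's marginal mean and variance stay the same, which is the case an attribute-independence assumption would miss. The threshold must be tuned to the window size, since sampling noise in the covariance estimate shrinks as windows grow.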
Library of Congress Subject Headings
Data mining; Concepts; Machine learning
Degree Name
Computer Science (MS)
Department, Program, or Center
Computer Science (GCCIS)
Chakravorty, Mamidi Sree Kalyan, "Gaussian Mixture Approach to Detect Drift" (2006). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus