Traditional clustering algorithms are inapplicable to many real-world problems where limited knowledge from domain experts is available. Incorporating this domain knowledge can guide a clustering algorithm and consequently improve the quality of clustering. In this paper, we propose SS-NMF: a Semi-Supervised Non-negative Matrix Factorization framework for data clustering. In SS-NMF, users are able to provide supervision for clustering in terms of pairwise constraints on a few data objects, specifying whether they "must" or "cannot" be clustered together. Through an iterative algorithm, we perform symmetric tri-factorization of the data similarity matrix to infer the clusters. Theoretically, we show the correctness and convergence of SS-NMF. Moreover, we show that SS-NMF provides a general framework for semi-supervised clustering, and that existing approaches can be considered special cases of it. Through extensive experiments conducted on publicly available datasets, we demonstrate the superior performance of SS-NMF for clustering.
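The abstract does not give the update rules, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes that pairwise constraints are folded into the similarity matrix by adding a reward for "must" pairs and subtracting a penalty for "cannot" pairs, and that the symmetric tri-factorization A ≈ G S Gᵀ is fitted with a commonly used multiplicative-update heuristic. The function name `ss_nmf_sketch`, the weights `alpha` and `beta`, and the clipping step are all illustrative assumptions.

```python
import numpy as np

def ss_nmf_sketch(A, must_link, cannot_link, k,
                  alpha=1.0, beta=1.0, n_iter=200, seed=0):
    """Illustrative semi-supervised symmetric tri-factorization A ~ G S G^T.

    A           : (n, n) symmetric, non-negative similarity matrix
    must_link   : iterable of (i, j) pairs that should share a cluster
    cannot_link : iterable of (i, j) pairs that should not share a cluster
    k           : number of clusters
    alpha, beta : hypothetical reward / penalty weights (assumptions)
    """
    rng = np.random.default_rng(seed)
    A_tilde = np.array(A, dtype=float)

    # Assumption: constraints adjust the similarity of the constrained pairs.
    for i, j in must_link:
        A_tilde[i, j] += alpha
        A_tilde[j, i] += alpha
    for i, j in cannot_link:
        A_tilde[i, j] -= beta
        A_tilde[j, i] -= beta
    # Assumption: clip so the multiplicative updates stay non-negative.
    A_tilde = np.maximum(A_tilde, 0.0)

    n = A_tilde.shape[0]
    G = rng.random((n, k)) + 1e-3          # cluster indicator factor
    S = rng.random((k, k)) + 1e-3          # cluster-to-cluster factor
    S = (S + S.T) / 2.0                    # start from a symmetric S
    eps = 1e-12                            # guards against division by zero

    for _ in range(n_iter):
        GtG = G.T @ G
        # Ratio of the negative to the positive part of the gradient in S.
        S *= np.sqrt((G.T @ A_tilde @ G) / (GtG @ S @ GtG + eps))
        # Same construction for G, using the updated S.
        AGS = A_tilde @ G @ S
        G *= np.sqrt(AGS / (G @ S @ (G.T @ G) @ S + eps))

    labels = G.argmax(axis=1)              # hard assignment: largest entry per row
    return labels, G, S
```

Calling `ss_nmf_sketch(A, [(0, 1)], [(0, 5)], k=2)` on a small similarity matrix returns one cluster label per object together with the two factors. Multiplicative updates of this kind can settle in local optima, so in practice several random seeds would typically be tried.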

Publication Date



The original publication is available at www.springerlink.com. Note: imported from RIT’s Digital Media Library running on DSpace to RIT Scholar Works in February 2014.

Document Type


Department, Program, or Center

Center for Advancing the Study of CyberInfrastructure


RIT – Main Campus