Secrets of GrabCut and Kernel K-means

Meng Tang, Ismail Ben Ayed, Dmitrii Marin, Yuri Boykov

In International Conference on Computer Vision (ICCV), Santiago, Chile, December, 2015.


The log-likelihood energy term in popular model-fitting segmentation methods, e.g. Zhu-Yuille 1996, Chan-Vese 2001, GrabCut 2004, Delong et al. 2012, is presented as a generalized probabilistic K-means energy for color space clustering. This interpretation reveals some limitations, e.g. over-fitting. We propose an alternative approach to color clustering using kernel K-means energy with well-known properties such as non-linear separation and scalability to higher-dimensional feature spaces. Our bound formulation for kernel K-means allows combining general pairwise feature clustering methods with image grid regularization using graph cuts, similarly to standard color model fitting techniques for segmentation. Unlike histogram or GMM fitting, our approach is closely related to average association and normalized cut. But, in contrast to previous pairwise clustering algorithms, our approach can incorporate any standard geometric regularization in the image domain. We analyze extreme cases for kernel bandwidth (e.g. Gini bias) and demonstrate the effectiveness of KNN-based adaptive bandwidth strategies. Our kernel K-means approach to segmentation benefits from higher-dimensional features where standard model-fitting fails.
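For readers unfamiliar with the kernel K-means energy mentioned in the abstract, the following is a minimal sketch of the plain kernel K-means iteration on a precomputed kernel matrix. It is not the paper's bound optimization and includes no graph-cut regularization or KNN-based adaptive bandwidth. The function names `gaussian_kernel` and `kernel_kmeans`, the fixed bandwidth `sigma`, and the choice of a Gaussian kernel are assumptions made for illustration only.

```python
import numpy as np


def gaussian_kernel(X, sigma):
    """Gaussian kernel matrix with a fixed bandwidth sigma (illustrative choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def kernel_kmeans(K, labels, n_clusters, n_iter=50):
    """Plain kernel K-means: reassign points to the nearest implicit cluster mean.

    K is an (n, n) kernel matrix and labels is an initial integer labeling.
    """
    n = K.shape[0]
    labels = labels.copy()
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for k in range(n_clusters):
            mask = labels == k
            size = mask.sum()
            if size == 0:
                continue  # an empty cluster stays at infinite distance
            # Squared feature-space distance to the mean of cluster S_k:
            # K_ii - (2/|S_k|) sum_j K_ij + (1/|S_k|^2) sum_{j,l} K_jl
            dist[:, k] = (
                diag
                - 2.0 * K[:, mask].sum(axis=1) / size
                + K[np.ix_(mask, mask)].sum() / size ** 2
            )
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # labeling is stable
        labels = new_labels
    return labels
```

The distance is computed from kernel values alone, which is what lets the method separate clusters non-linearly without ever forming the feature-space means explicitly.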

WHOLE PAPER: pdf (8.4Mb)
TECH.REPORT: arXiv Sept 2016 (extended version) - also IJCV submission