Superpixels and Supervoxels

Many methods for object recognition, segmentation, etc., rely on tessellation of an image into ``superpixels". A superpixel is an image patch which is better aligned with intensity edges than a rectangular patch. Superpixels can be extracted with any segmentation algorithm, however, most of them produce highly irregular superpixels, with widely varying sizes and shapes. A more regular space tessellation may be desired. We formulate the superpixel partitioning problem in an energy minimization framework, and optimize with graph cuts. Our energy function explicitly encourages regular superpixels. We explore variations of the basic energy, which allow a trade-off between a less regular tessellation but more accurate boundaries or better efficiency. Our advantage over previous work is computational efficiency, principled optimization, and applicability to 3D ``supervoxel" segmentation. We achieve high boundary recall on 2D images and spatial coherence on video. We also show that compact superpixels improve accuracy on a simple application of salient object segmentation.

Superpixel Segmentation:

An image is covered with overlapping square patches of fixed size, Figure 1, left. Each pixel is covered by several patches, and the task is to assign a pixel to one of them. If two neighboring pixels are assigned to the same patch, there is no penalty. If they belong to different patches, then there is a stitching penalty that is inversely proportional to the intensity difference between the pixels. Intuitively, we are stitching patches so that the seams are encouraged to align with intensity edges. The stitching result is in Figure 1, middle, and superpixel boundaries are in Figure 1, right. Boundaries are regularized due to the stitching energy function. A superpixel cannot be too large, not larger than a patch size. Small superpixels are discouraged because they contribute a higher cost to the stitching energy.

Figure 1

Left: the original image overlayed with square patches. Each patch corresponds to a label. This label can be assigned only to pixels covered by the patch. Middle: result of patch stitching. Right: superpixel boundaries.

Superpixel results

Code For Windows

Code For Unix (thanks to Alvaro Collet)

Supervoxel segmentation

Our approach naturally extends to segmenting ``supervoxels" in 3D space. Segmentation of volumes into supervoxels can be useful, potentially, for medical image and for video processing. First we create a 3D volume by stacking the frames together, Figure 2. Analogously to the 2D case, we cover the 3D volume by overlapping 3D blocks. For clarity, in Figure 2 we show only a few non-overlapping blocks. The depth of a block can be different from its width and height. The larger the depth, the more temporal coherency is encouraged. Each block corresponds to a label.

Figure 2

Supervoxel results

We show the original ``tennis" sequence and the result of supervoxel segmentation. For visualization, we compute the average intensity of each supervoxel and repaint the video with the average supervoxel intensity. To appreciate the degree of temporal coherence in the supervoxel segmentation, we also perform superpixel segmentation on each frame of the ``tennis" sequence separately, using the basic superpixel algorithm. We display the results by painting superpixels with their average intensity.

Results without temporal coherence
Results with temporal coherence


O. Veksler, Y. Boykov, P. Mehrani, "Superpixels and Supervoxels in an Energy Optimization Framework", in European Conference on Computer Vision (ECCV), 14 pages, 2010.    PDF