This homework has three parts covering standard image segmentation techniques based on graph algorithms (shortest paths, minimum graph cuts) and basic clustering methods (K-means). The first two are used for segmentation boundary regularization, while the last can be used for color quantization or superpixels. Generalization of these methods are routinely used in more advanced applications of computer vision. Thus, hands-on experience that you obtain while working on this assignment should aid in better understanding some later material in this course.
The students are strongly encouraged to work in python notebook environment (Jupiter/Anaconda/Python 2.7), which was already demonstrated in class. This environment is convenient for prototyping techniques for image analysis due to many visualization and plotting options, wide support for array handling and image processing libraries. Installation instructions for Anaconda (Python 2.7) as well as PyMaxflow library needed for part II of the assignment are given
NOTE: we tested the interface functions used in this assignment
under Anaconda versions 4.2.0 (the latest) and 2.4.1 (older version provided in the instructions). But relatively recent Anaconda 4.1.1 has some bugs. In case you already installed Anaconda, check its specific version. If it is 4.1.1 you should uninstall it (from the system/apps menu) and install either the latest Anaconda 4.2.0 or older version 2.4.1. In any case, use Anaconda - Python 2.7.
Besides the libraries above, the students should use several files provided specifically for this homework inside hw1.zip. It includes "starter" notebooks for Jupiter/Anaconda giving good initial point for each part of the assignemnt. These notebooks come integrated with the necessary GUI interfaces handling low-level evens (mouse-down, mouse-move, button-down, etc) for interactive segmentation (e.g. entering seeds) as well as visulaizization of images and the results (e.g. contours or segmentation masks). Such GUI functions are implemented in a separate file (asg1.py) containing the code for the "presenters" objects (e.g. LiveWirePresenter, GraphCutsPresenter). This file should be kept in the same directory where you have your notebook files. We also provide a sample notebook (MyRegionGrowing.ipynb) containing fully implemented Region Growing segmentation tool. This notebook can be used as an example. It also uses a specialized "RegionGrowingPresenter" implemented in "asg1.py".
It is expected that in each notebook (for parts I, II, or III below) you will use/create multiple "cells" corresponding to different specific tasks/experiemnts. For example, one "code" cell may contain a class implementation. Separate "code" cells can create and use object(s) of this and other classes to demonstrate experimental results on different images or with different values of parameters. The corresponding results should be in the figures below each "code" cell. If needed/required, the visual results can be discussed in separate "markdown" cells right below each figure. Keep the experiemnts in your notebook well-organizied. Presentatation clarity and good organization of your notebooks will be evaluated/marked.
When preparing your notebooks for submission, you should test each notebook after restarting the python kernel in Jupiter (Kernel->Restart). Then, run the cells consequtively and make sure that they work correctly in this order. When saving your notebook before submission, the cell numbers on the left margin of the notebook should be 1,2,3,4,... from top to bottom. When testing your code I can re-run the cells of your notebook only in this specific order. Also, save your notebook after each (interactive) figure cell is in the state showing the exact result that you want to demonstrate (e.g. for specific seeds) and that you might be discussing in a markdown cell right below the figure. Besides these tested notebook files, you should also save and submit their "html" versions (File->Download as->HTML), which also contain the "current" state of your figures.
Note that the assignment mentions some "optional" tasks. As all other required tasks, the optional tasks should be implemented in separate cells. Conclusive figures/experiments with specific points discussed in "markdown" cells (below) can yield bonus points.
Part I (Intelligent scissors). Implement "intelligent scissors" or "live-wire" method for interactive object boundry extraction. The starter notebook for this part is
"MyLiveWire.ipynb". It implements standard mouse clicks for "live-wire" but the paths are extracted from BFS tree on a non-weighted pixel grid (just as an example of a path). Instead, you should correct this cell's code to have proper live-wire (shortest paths) on image-contrast weighted grid graph. There could be a deduction for inefficient implementation of Dijkstra. HINT: use "heapq" module if you need a binary heap.
Compare the results on 4-connected and 8 connected grids (use a separate notebook cell for each).
Optional: you can compare different edge weight functions.
Optional: you can compare the speed of implementations with different data structures for "active" front.
Part II (basic Graph Cuts). Implement "intelligent paint" interactive segmentation tool using graph cuts algorithm on a weighted image grid.
The standard seed-interactive GUI ("GraphCutsPresenter") is available in "asg1.py". The starter notebook for this part is "MyGraphCuts.ipynb". It implements regional seeds from mouse interactions (left and right mouse clicks for object and background seeds) and displays these seeds over the image being segmented. However, instead of proper graph cut segmentation the provided code displays only some fixed binary mask (just as an example of a mask). This "fixed" mask should be replaced by the output of an interactive image segmentation method based on minimum graph cuts respecting the regional hard constraints marked by the user-seeds. You can use an existing library for computing minimum s/t cuts on arbitrary graphs (see PyMaxflow here). You should use this library to build a properly weighted graph based on selected image and user-entered seeds.
You need to implement only a basic (boundary-only) version of graph cut method (covered in topic 5) using undirected "contrast-weighted" neighborhood edges or "n-links" and terminal edges or "t-links" representing user seeds (regional hard constraints). NOTE: in my empirical experience, the provided max-flow/min-cut library is more efficient when using integer n-links in a relatively small range (e.g. 0,1,2,...,10). Thus, you can use integer-weighted graph where edge weights are discretized (truncated) values of your edge-weighting function.
First, you should carefully select appropriate parameters for exponential "edge weighting" function for n-links (see graph cut section of topic 5) using intensity differences/contrast between two pixels. Run your code with different values of parameter "sigma" (denominator inside the exponent). Show 3-4 representative results (in different cells). Use markdown cell to discuss your experimental observations.
Optional: recommend a strategy for automatically selecting a "good" value of parameter sigma (it may depend on specific image). Your stategy should be technically justified/explained (be brief/specific).
In practice, you have to use "large-enough" finite cost t-links in order to enforce hard-constraints (regional user seeds) since "infinity" is not a valid parameter value. Use a markdown cell to explain your strategy for selecting a finite weight for a t-link to pixel "p" guaranteing that "p" is not disconnected from the corresponding terminal by the minimum cut. COMMENT: later we will also learn how to use t-links to enforce various forms of "appearance consistency" for segments (instead of hard constraints / seeds).
Compare the results on 4 and 8 connected grids. Explain.
Briefly summarize the main similarities and differences for "intelligent sccissors" and "intelligent paint" approaches to interactive segmentation (use a "markdown" cell).
Part III (K-means). Implement K-means algorithm for clustering various pixel features. The starter notebook for this part is "MyKmeans.ipynb".
This part of the assignment corresponds to unsupervised segmentation, so no interactions are needed. The provided "starter" notebook only computes random pixel segments. You need to write code producing correct clusters and correct "means".
To achive this you need only to write around 15-20 lines of code in functions "compute means" and "compute_labels", which correspond to the main two steps in each iteration of the standard K-means algorithm (see "compute_k_means_clusters"). Fully implemented "KmeansPresenter" visulaizes the segmentation results (cluster labels mask) where each cluster is highlighted by some unique color (either random or by the "average" segment color).
Implement functions "compute means" and "compute_labels" in MyKmeansApp object.
Your implementation of the main two steps of K-means algorithm should use RGBXY features. Relative contribution of "squared errors" from XY features must be set by parameter "weightXY" (or self.w in MyKmeansApp object), so that the squared error between RGBXY feture Fp=[Rp,Gp,Bp,Xp,Yp] at any pixel p and any given cluster mean m=[Rm,Gm,Bm,Xm,Ym] is
K-means is commonly used for color quantization (when weightXY is zero), or superpixels (e.g SLIC superpixels, see the figure on the right). Compare your results for different values of weightXY. Show representative results in your notebook, discuss their properties, and explain the differences.
Experiment with different values of parameter K (in the range 2-80). Show 3-4 representative results in your notebook, discuss their properties, and explain the differences.
Compare representative values of optimal SSE for smaller and larger K and explain your observation.
Evaluate sensitivity of K-means to local minima. Show 2-3 different solutions for different random initial means and display the corresponding values of the K-means energy.
What to submit, where to submit, and when to submit
Submit one file hw1_your_name.zip containing three notebooks (MyLiveWire.ipynb, MyGraphCuts.ipynb, MyKmeans.ipynb)
and their html copies generated as explained above (MyLiveWire.html, MyGraphCuts.html, MyKmeans.html). Your notebooks should include cells with your python code, figures, and explanations/discussions. Your code cells should include "code comments" (e.g. description of function interfaces/parameters, etc). All files with images
used in your notebook should be placed in "images" subdirectory next to your notebook files (such images should be loaded to your notebooks using paths like "images/file_name.bmp"). When ready to submit your work, zip the notebook files, their html versions, and the subdirectory "images" into your submission file hw1_your_name.zip.
The corresponding file (.zip) should be submitted electronically via OWL. The notebook (code and comments) should be your independent effort.
All files should be submitted to OWL before 11:55pm on the due date (Oct.18). Do not wait until the last moment. A rush just before 11:55pm will not be accepted as a valid excuse.