### Assignment 4, due December 15, 5% of class score

Instructions
• Problem 1 (25%):

I give you complete code for setting up the MatConvNet library, reading the data, normalizing the data, defining the neural network structure, training the network, and, finally, applying it to the test data. The main file is "p1.m". Before running "p1.m", edit the file "setup.m" so that the directories for the MatConvNet library and the data point to the correct locations.

When you run "p1.m", it outputs:

• The training/validation error for each mini-batch and each epoch
• A plot of the training/validation energies vs. the number of training epochs
• A plot of the training/validation errors vs. the number of epochs
• The test error, printed as the last line of output

The plots are useful for understanding whether you are underfitting or overfitting the training data.

What to do for this problem:

1. Run the code with the parameters unchanged. Observe the plots and the test error.
2. Change the number of epochs to 50. Observe the plots and the test error, and compare with step 1. If you set trainOpts.continue = true; the results of the epochs already completed are reused.
3. Go back to the parameters of step 1 and now change the learning rate to 0.1. Observe the plots and the test error, and compare with step 1.
4. Now change the learning rate to 0.00001. Observe the plots, the test error, and the validation error, and compare with step 1.
5. Go back to the parameters of step 1 and change the batch size to 1. Observe the plots, the test error, and the validation error, and discuss the difference from step 1.
6. Change the batch size to 200. Observe the plots, the test error, and the validation error, and discuss the difference from step 1.
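The experiments above only require changing a few training options. Assuming the options in "p1.m" follow the usual MatConvNet cnn_train naming (the exact variable names and baseline values in the provided script may differ), the changes look like:

```matlab
% Baseline options for step 1 (the values shown are illustrative).
trainOpts.batchSize    = 100;     % step 5: set to 1;   step 6: set to 200
trainOpts.numEpochs    = 30;      % step 2: set to 50
trainOpts.learningRate = 0.001;   % step 3: set to 0.1; step 4: set to 0.00001
trainOpts.continue     = false;   % set to true to reuse epochs already saved to disk

% Training is then launched as in the provided script, e.g.:
% [net, info] = cnn_train(net, imdb, @getBatch, trainOpts);
```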

• Problem 2 (25%):

Open the file "initializeCNN.m". This file initializes the structure of the convolutional network. There are two stages, each consisting of a convolutional layer, a ReLU layer (rectified linear unit), a spatial normalization layer, and a pooling layer. After the two stages, there is a fully connected layer and, finally, the softmax layer.

Try to get better performance by experimenting with the network structure. You can change the number of layers (add/remove), change the filter sizes, etc. To keep the network consistent, note the following:

• For a convolutional layer, the filters are of size w x h x d, where d = 3 for the first layer, and d = the number of channels in the previous layer otherwise.
• For any layer, the number of biases must be equal to the number of filters.
• The last layer before the softmax layer must have the number of units equal to the number of classes (10 in our case).
• More comments on building a consistent DNN are in file "initializeCNN.m".
• If you are not sure how a convolutional layer changes the size of the current layer x, use the Matlab command y = vl_nnconv(x, w, [], 'stride', s), where w is initialized with the convolution filter size you are using. For example, suppose you use 5 x 5 filters, the current layer depth (the third dimension of x) is 10, you want to create 20 new filters, and you use stride 2. Then first set w = randn(5,5,10,20,'single') and run y = vl_nnconv(x, w, [], 'stride', 2). Now size(y) tells you the size of the new layer.
• Similarly, to see the size of the new layer after a pooling layer, use the command y = vl_nnpool(x, [p p], 'stride', s). For example, for pooling over a 3 x 3 region with a stride of 3, use y = vl_nnpool(x, [3 3], 'stride', 3). The command size(y) now tells you the new layer size.
• After you run "initializeCNN.m", you can use the command vl_simplenn_display(net) to display information about the network.
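The shape checks described above can be chained to trace the layer sizes through a whole stage. A minimal sketch (the 200 x 200 x 3 input size matches our data; the filter counts, filter sizes, and strides are just example values):

```matlab
% Trace layer sizes through one conv/pool stage with MatConvNet's
% vl_nnconv and vl_nnpool (run after setup.m so the library is on the path).
x = randn(200, 200, 3, 'single');       % dummy input image

w = randn(5, 5, 3, 20, 'single');       % 20 filters of size 5 x 5 x 3
y = vl_nnconv(x, w, [], 'stride', 2);   % convolution with stride 2
size(y)                                 % size after the convolutional layer

z = vl_nnpool(y, [3 3], 'stride', 3);   % 3 x 3 pooling with stride 3
size(z)                                 % size after the pooling layer
```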

• Problem 3 (25%):

Ideally, a DNN should be trained on a large amount of data. Our dataset is quite small. In this problem, you will experiment with increasing the training data by a factor of 10. For each training image of size R x C, take 4 different random subcrops of size, say, (R-M) x (C-M), where M is a fraction of the image width (for example, M = (1/8)*C). This gives you five images in total. Then flip each of them left to right (fliplr in Matlab), which brings the total to 10 images. They should be resized to the size expected by our network, namely 200 x 200. This increases the amount of training data tenfold. Choose the best-performing network from Problems 1 and 2, and retrain it on this larger dataset. Report the energy/training error/validation error plots as well as the test error. Compare with training on the smaller dataset.

In order to do this problem, you will need to add code to matlab script readDataForCNN.m. This script creates a dataset of images to be used for training and validation, namely, imdb.images.data (size 200 x 200 x 3 x 800). There are 800 images, and each of size 200 x 200 x 3. In addition, it creates imdb.images.labels (size 1 x 800) that stores the labels (correct classes, their range is 1, 2,...,10) for imdb.images.data. It also creates imdb.images.set (size 1 x 800) that stores "1" for a sample to be used for training, and "2" for a sample to be used for validation.

The new versions of imdb.images.data, imdb.images.labels, and imdb.images.set should be ten times larger along the sample dimension. That is, the new imdb.images.data should be of size 200 x 200 x 3 x 8000, the new imdb.images.labels of size 1 x 8000, and the new imdb.images.set of size 1 x 8000.

When you create a new image, its "label" and "set" values should remain the same as those of the original image. To avoid confusion, call the old dataset imdb_Old and the new one imdb. Suppose you are creating a new image from sample number 20 in the old dataset, and the index of this sample in the new dataset is 200. Then you should set imdb.images.set(200) = imdb_Old.images.set(20) and imdb.images.labels(200) = imdb_Old.images.labels(20).

The requirement on the label is obvious: the new image should be of the same class as the image it was derived from. The requirement on the set value ensures that if the original image was a validation image, then all images derived from it are also validation images. This guarantees that the validation images are not too similar to the training images; otherwise, our validation error would be too optimistic.
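The augmentation described above can be sketched as follows. This assumes imdb_Old has already been filled with the fields described earlier; the crop margin M and the use of imresize are one possible choice, not the only correct one:

```matlab
% Augment imdb_Old (800 samples) into imdb (8000 samples):
% for each image, keep the original plus 4 random subcrops, each also flipped.
[R, C, ~, N] = size(imdb_Old.images.data);   % here R = C = 200, N = 800
M = round(C/8);                               % crop margin (M = (1/8)*C)
imdb.images.data   = zeros(R, C, 3, 10*N, 'single');
imdb.images.labels = zeros(1, 10*N);
imdb.images.set    = zeros(1, 10*N);

k = 0;
for i = 1:N
    img   = imdb_Old.images.data(:, :, :, i);
    crops = {img};                            % the original image
    for j = 1:4                               % 4 random subcrops of size (R-M) x (C-M)
        r = randi(M); c = randi(M);           % random top-left corner
        crops{end+1} = img(r:r+R-M-1, c:c+C-M-1, :);
    end
    for j = 1:numel(crops)
        for flipped = [false true]            % each crop and its mirror image
            k  = k + 1;
            im = crops{j};
            if flipped, im = fliplr(im); end
            imdb.images.data(:, :, :, k) = imresize(im, [R C]);  % back to 200 x 200
            imdb.images.labels(k) = imdb_Old.images.labels(i);   % same class
            imdb.images.set(k)    = imdb_Old.images.set(i);      % same train/val split
        end
    end
end
```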

• Problem 4 (25%):

Deep neural networks have to be trained on large amounts of data. We do not have nearly enough data. However, we can use a network trained on a massive amount of data, but for a different task. Recall that a neural network can be viewed as learning features. Visual features learned for one recognition task are likely to be useful for another task. In this problem, we will take a network trained for the ImageNet competition (1000 classes), extract features from it, and use them for our task.

• Extract the "imagenet-vgg" DNN from the provided file "imagenet-vgg-f.mat". You can load it with the command