Contents
  1. Dropout and Data Augmentation
  2. Ensemble method
  3. Transfer learning
  4. Visualize Saliency Maps (very useful)
  5. Fooling images for ConvNets (see the IPython notebook for the detailed code)

Assignment Website
References

  1. K. Simonyan, A. Vedaldi, A. Zisserman, “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps”, ICLR Workshop 2014.
  2. C. Szegedy et al., “Intriguing properties of neural networks”, arXiv preprint, 2013.
  3. A. Nguyen, J. Yosinski, J. Clune, “Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images”, arXiv preprint, 2014.

Dropout and Data Augmentation

Networks with dropout usually take a bit longer to train, so we will use more training epochs this time.

Dropout is usually applied to fully connected layers rather than convolutional layers.

Code for dropout:

    import numpy as np


    def dropout_forward(x, dropout_param):
        """
        Performs the forward pass for (inverted) dropout.

        Inputs:
        - x: Input data, of any shape
        - dropout_param: A dictionary with the following keys:
          - p: Dropout parameter. We keep each neuron output with probability p.
          - mode: 'test' or 'train'. If the mode is train, then perform dropout;
            if the mode is test, then just return the input.
          - seed: Seed for the random number generator. Passing seed makes this
            function deterministic, which is needed for gradient checking but not
            in real networks.

        Outputs:
        - out: Array of the same shape as x.
        - cache: A tuple (dropout_param, mask). In training mode, mask is the
          dropout mask that was used to multiply the input; in test mode, mask
          is None.
        """
        p, mode = dropout_param['p'], dropout_param['mode']
        if 'seed' in dropout_param:
            np.random.seed(dropout_param['seed'])

        mask = None
        out = None

        if mode == 'train':
            # Inverted dropout: keep each unit with probability p and scale the
            # survivors by 1/p so the expected activation is unchanged.
            mask = (np.random.rand(*x.shape) < p) / p
            out = x * mask
        elif mode == 'test':
            # No masking or rescaling is needed at test time.
            out = x
        else:
            raise ValueError("Invalid dropout mode '%s'" % mode)

        cache = (dropout_param, mask)
        out = out.astype(x.dtype, copy=False)

        return out, cache
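
As a quick sanity check of the inverted-dropout scaling, the sketch below applies the same mask-and-rescale step to a constant input and compares the mean activation before and after. The helper name `inverted_dropout_train` is hypothetical and only mirrors the training branch of `dropout_forward`; as there, `p` is assumed to be the probability of keeping a unit.

```python
import numpy as np


def inverted_dropout_train(x, p, rng):
    """Training-mode inverted dropout; p is the probability of keeping a unit."""
    # Kept units get the value 1/p in the mask, dropped units get 0.
    mask = (rng.random(x.shape) < p) / p
    return x * mask, mask


rng = np.random.default_rng(0)
x = np.ones((500, 500))
out, mask = inverted_dropout_train(x, p=0.5, rng=rng)

# About a fraction p of the units survive, and the 1/p scaling keeps the
# mean activation close to the input mean, so test time needs no rescaling.
print("fraction kept:", (out > 0).mean())
print("mean in / out:", x.mean(), out.mean())
```

Because the rescaling happens during training, the test-time branch can simply return its input, which is what `dropout_forward` does.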

    def dropout_backward(dout, cache):
        """
        Perform the backward pass for (inverted) dropout.

        Inputs:
        - dout: Upstream derivatives, of any shape
        - cache: (dropout_param, mask) from dropout_forward.

        Returns:
        - dx: Gradient with respect to x, of the same shape as dout.
        """
        dropout_param, mask = cache
        mode = dropout_param['mode']
        if mode == 'train':
            # The forward pass computed out = x * mask, so the upstream
            # gradient is gated and scaled by the same mask.
            dx = mask * dout
        elif mode == 'test':
            dx = dout
        else:
            raise ValueError("Invalid dropout mode '%s'" % mode)
        return dx

Code for data augmentation:

    import numpy as np


    def random_flips(X):
        """
        Take random horizontal flips of images.

        Input:
        - X: (N, C, H, W) array of image data.

        Output:
        - An array of the same shape as X, containing a copy of the data in X,
          but with each example flipped along the horizontal direction with
          probability 0.5.
        """
        N, C, H, W = X.shape
        out = np.zeros_like(X)  # keep the dtype of X
        # A step of -1 reverses the width axis (horizontal flip); +1 keeps it.
        step = np.where(np.random.rand(N) < 0.5, 1, -1)
        for i in range(N):
            out[i] = X[i, :, :, ::step[i]]
        return out


    def random_crops(X, crop_shape):
        """
        Take random crops of images. For each input image we will generate a
        random crop of that image of the specified size.

        Input:
        - X: (N, C, H, W) array of image data
        - crop_shape: Tuple (HH, WW) to which each image will be cropped.

        Output:
        - Array of shape (N, C, HH, WW)
        """
        N, C, H, W = X.shape
        HH, WW = crop_shape
        assert HH < H and WW < W

        out = np.zeros((N, C, HH, WW), dtype=X.dtype)
        # Integer top-left corners, uniform over all valid positions.
        # randint excludes the upper bound, hence the + 1.
        start_h = np.random.randint(0, H - HH + 1, size=N)
        start_w = np.random.randint(0, W - WW + 1, size=N)
        for i in range(N):
            out[i] = X[i, :, start_h[i]:start_h[i] + HH,
                       start_w[i]:start_w[i] + WW]
        return out


    def random_contrast(X, scale=(0.8, 1.2)):
        """
        Randomly adjust the contrast of images. For each input image, choose a
        number uniformly at random from the range given by the scale parameter,
        and multiply each pixel of the image by that number.

        Inputs:
        - X: (N, C, H, W) array of image data
        - scale: Tuple (low, high). For each image we sample a scalar in the
          range (low, high) and multiply the image by that scalar.

        Output:
        - Rescaled array out of shape (N, C, H, W) where out[i] is a contrast
          adjusted version of X[i].
        """
        low, high = scale
        N = X.shape[0]
        out = np.zeros_like(X)  # same dtype as X, so X should be floating point

        # One contrast factor per image, drawn uniformly from [low, high).
        contrast = low + (high - low) * np.random.rand(N)
        for i in range(N):
            out[i] = contrast[i] * X[i]
        return out


    def random_tint(X, scale=(-10, 10)):
        """
        Randomly tint images. For each input image, choose a random color whose
        red, green, and blue components are each drawn uniformly at random from
        the range given by scale. Add that color to each pixel of the image.

        Inputs:
        - X: (N, C, H, W) array of image data
        - scale: A tuple (low, high) giving the bounds for the random color that
          will be generated for each image.

        Output:
        - Tinted array out of shape (N, C, H, W) where out[i] is a tinted version
          of X[i].
        """
        low, high = scale
        N, C = X.shape[:2]
        out = np.zeros_like(X)  # same dtype as X, so X should be floating point

        # One random offset per image and per channel.
        tint_color = low + np.random.rand(N, C) * (high - low)
        for i in range(N):
            # Reshape to (C, 1, 1) so the offset broadcasts over H and W.
            out[i] = X[i] + tint_color[i].reshape((C, 1, 1))
        return out


    def fixed_crops(X, crop_shape, crop_type):
        """
        Take center or corner crops of images.

        Inputs:
        - X: Input data, of shape (N, C, H, W)
        - crop_shape: Tuple of integers (HH, WW) giving the size to which each
          image will be cropped.
        - crop_type: One of the following strings, giving the type of crop to
          compute:
          'center': Center crop
          'ul': Upper left corner
          'ur': Upper right corner
          'bl': Bottom left corner
          'br': Bottom right corner

        Returns:
        Array of cropped data of shape (N, C, HH, WW)
        """
        N, C, H, W = X.shape
        HH, WW = crop_shape

        # Integer division keeps the slice bounds valid indices.
        x0 = (W - WW) // 2
        y0 = (H - HH) // 2
        x1 = x0 + WW
        y1 = y0 + HH

        if crop_type == 'center':
            return X[:, :, y0:y1, x0:x1]
        elif crop_type == 'ul':
            return X[:, :, :HH, :WW]
        elif crop_type == 'ur':
            return X[:, :, :HH, -WW:]
        elif crop_type == 'bl':
            return X[:, :, -HH:, :WW]
        elif crop_type == 'br':
            return X[:, :, -HH:, -WW:]
        else:
            raise ValueError('Unrecognized crop type %s' % crop_type)
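
The per-image loops in `random_contrast` and `random_tint` can also be written with NumPy broadcasting. The sketch below is one possible variant, not the assignment's reference solution; the function names `random_contrast_vec` and `random_tint_vec` and the optional `rng` argument are assumptions made for illustration. Unlike the loop versions, the result is always a float array because it is not written into a buffer with the dtype of `X`.

```python
import numpy as np


def random_contrast_vec(X, scale=(0.8, 1.2), rng=None):
    """Multiply each image by one scalar drawn uniformly from scale."""
    rng = np.random.default_rng() if rng is None else rng
    low, high = scale
    # Shape (N, 1, 1, 1) broadcasts one factor over C, H and W of each image.
    factor = rng.uniform(low, high, size=(X.shape[0], 1, 1, 1))
    return X * factor


def random_tint_vec(X, scale=(-10, 10), rng=None):
    """Add one random offset per image and per channel."""
    rng = np.random.default_rng() if rng is None else rng
    low, high = scale
    N, C = X.shape[:2]
    # Shape (N, C, 1, 1) broadcasts one color over H and W of each image.
    color = rng.uniform(low, high, size=(N, C, 1, 1))
    return X + color


rng = np.random.default_rng(0)
X = np.ones((4, 3, 8, 8))
Xc = random_contrast_vec(X, rng=rng)
Xt = random_tint_vec(X, rng=rng)
print(Xc.shape, Xt.shape)
```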

Ensemble method

Ensembling usually improves prediction accuracy.

A simple way to implement an ensemble of models is to average the predicted probabilities of the models in the ensemble.

More concretely, suppose we have $k$ models $m_1,\ldots,m_k$ and we want to combine them into an ensemble. If $p(x=y_i \mid m_j)$ is the probability that the input $x$ is classified as $y_i$ under model $m_j$, then the ensemble predicts

$$p(x=y_i \mid \{m_1,\ldots,m_k\}) = \frac{1}{k}\sum_{j=1}^{k} p(x=y_i \mid m_j)$$

In this example, we have 10 pretrained models with identical architectures. Each of these models was trained for 25 epochs over the TinyImageNet-100-A training data with a batch size of 50 and with dropout on the hidden affine layer. Each model was trained using slightly different values for the learning rate, regularization, and dropout probability.

We can use all of them or only a subset, and plot the validation accuracy as a function of the number of models in the ensemble.
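
The averaging rule and the accuracy-versus-ensemble-size curve can be sketched as follows. This is a minimal illustration on made-up probabilities for three hypothetical models, not the pretrained TinyImageNet models; the function names `ensemble_probs` and `accuracy_vs_ensemble_size` are assumptions.

```python
import numpy as np


def ensemble_probs(probs):
    """Average class probabilities over models.

    probs has shape (k, N, C): k models, N examples, C classes.
    """
    return np.mean(probs, axis=0)


def accuracy_vs_ensemble_size(probs, y):
    """Accuracy of the ensemble made of the first 1, 2, ..., k models."""
    accs = []
    for k in range(1, probs.shape[0] + 1):
        y_pred = ensemble_probs(probs[:k]).argmax(axis=1)
        accs.append(float((y_pred == y).mean()))
    return accs


# Toy predictions: 3 models, 4 examples, 2 classes (made-up numbers).
probs = np.array([
    [[0.90, 0.10], [0.40, 0.60], [0.30, 0.70], [0.20, 0.80]],
    [[0.80, 0.20], [0.55, 0.45], [0.80, 0.20], [0.30, 0.70]],
    [[0.30, 0.70], [0.10, 0.90], [0.70, 0.30], [0.10, 0.90]],
])
y = np.array([0, 1, 0, 1])

accs = accuracy_vs_ensemble_size(probs, y)
print(accs)
```

In this toy case each model alone misclassifies one example, while the averaged predictions are all correct; the list `accs` is what one would plot against the number of models.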

Transfer learning

  1. Run the pretrained network on the new dataset, take the activations of the last fully connected layer (after the ReLU) as features, and train a kNN, SVM, or softmax classifier on them.
  2. Use the pretrained network's parameters as the initialization and fine-tune the network on the new dataset.
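
The first option can be sketched in a few lines. Everything here is a stand-in: the "pretrained network" is a frozen random affine layer followed by a ReLU, the data is synthetic, and the names `extract_features` and `knn_predict` are hypothetical. The point is only the workflow of extracting fixed features and fitting a simple classifier on top; fine-tuning (the second option) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained network: one frozen affine layer + ReLU.
W_frozen = rng.standard_normal((20, 64))
b_frozen = np.zeros(64)


def extract_features(X):
    """Activations of the (stand-in) last fully connected layer, after ReLU."""
    return np.maximum(0, X @ W_frozen + b_frozen)


def knn_predict(F_train, y_train, F_test, k=1):
    """Majority vote among the k nearest training features (squared L2)."""
    diff = F_test[:, None, :] - F_train[None, :, :]
    dists = (diff ** 2).sum(axis=2)              # shape (N_test, N_train)
    nearest = np.argsort(dists, axis=1)[:, :k]   # indices of the k nearest
    return np.array([np.bincount(y_train[idx]).argmax() for idx in nearest])


# Synthetic "new dataset": two classes with different means.
y_train = np.repeat([0, 1], 20)
X_train = rng.standard_normal((40, 20)) + 3.0 * y_train[:, None]
y_test = np.repeat([0, 1], 5)
X_test = rng.standard_normal((10, 20)) + 3.0 * y_test[:, None]

F_train, F_test = extract_features(X_train), extract_features(X_test)
y_pred = knn_predict(F_train, y_train, F_test, k=3)
print("test accuracy:", (y_pred == y_test).mean())
```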

Visualize Saliency Maps (very useful)

Following Simonyan et al. [1], we can understand which part of an image is important for classification by visualizing the gradient of the correct class score with respect to the input image. Recall that if a region of the image has a high data gradient, then this indicates that the output of the ConvNet is sensitive to perturbations in that region of the input image.

We will do something similar, instead visualizing the gradient of the data loss with respect to the input image; this gives similar results and is cleaner to implement using our codebase.

The computation is relatively easy: use conv_relu_pool_backward to backpropagate the gradient to the input image.
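
Once the gradient with respect to the input is available, turning it into a saliency map is a one-liner: following [1], take the absolute value and the maximum over color channels. The sketch below uses a linear softmax classifier as a stand-in for the ConvNet so that it stays self-contained; the names `input_gradient` and `saliency_map` are assumptions, and in the assignment the gradient would come from the network's own backward pass.

```python
import numpy as np


def input_gradient(X, y, W, b):
    """Gradient of the mean softmax loss w.r.t. X for a linear classifier."""
    N = X.shape[0]
    scores = X.reshape(N, -1) @ W + b
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    dscores = probs
    dscores[np.arange(N), y] -= 1.0
    dscores /= N
    # scores = X_flat @ W + b, so dL/dX_flat = dscores @ W.T
    return (dscores @ W.T).reshape(X.shape)


def saliency_map(dX):
    """Collapse an (N, C, H, W) gradient to (N, H, W): max over channels of |dX|."""
    return np.abs(dX).max(axis=1)


rng = np.random.default_rng(0)
N, C, H, W_img, num_classes = 2, 3, 4, 4, 5
X = rng.standard_normal((N, C, H, W_img))
y = np.array([1, 3])
W = 0.1 * rng.standard_normal((C * H * W_img, num_classes))
b = np.zeros(num_classes)

dX = input_gradient(X, y, W, b)
sal = saliency_map(dX)
print(sal.shape)  # (2, 4, 4)
```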

Fooling images for ConvNets (see the IPython notebook for the detailed code)

Two other papers [2, 3] showed that, given a trained ConvNet, an input image, and a desired label, we can add a small amount of noise to the input image to force the ConvNet to classify it as having the desired label.

Suppose that $L(x, y, m)$ is the data loss under model $m$, where we tell the network that the input $x$ should be classified as having label $y$. Given a starting image $x_0$, a desired label $y$, and a pretrained model $m$, we will create a fooling image $x_f$ by solving the following optimization problem:

$$x_f = \arg\min_x \left(L(x, y, m) + \frac{\lambda}{2} \|x - x_0\|_2^2\right)$$

The term $\|x - x_0\|_2^2$ is $L_2$ regularization in image space, which encourages the fooling image to look similar to the starting image, and the constant $\lambda$ is the strength of this regularization. We will use gradient descent to solve this optimization problem.

In the past, when using gradient descent we have stopped after a fixed number of iterations. Here we will use a different stopping criterion. Suppose that $p(x=y \mid m)$ is the probability that the input $x$ is assigned the label $y$ under the model $m$. We will specify a desired confidence threshold $t$ for the fooling image, and we will stop our optimization when we have $p(x_f=y\mid m) \geq t$.
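
The optimization loop with this stopping rule can be sketched as below. As an assumption made to keep the example self-contained, the model $m$ is a linear softmax classifier rather than a ConvNet, and the names `softmax_probs` and `make_fooling_image` as well as the values of `reg` (playing the role of $\lambda$), `lr`, and `t` are illustrative choices, not the notebook's settings.

```python
import numpy as np


def softmax_probs(x, W, b):
    """Class probabilities of a linear softmax classifier for one image."""
    scores = x.ravel() @ W + b
    scores = scores - scores.max()  # numerical stability
    e = np.exp(scores)
    return e / e.sum()


def make_fooling_image(x0, y, W, b, reg=1e-4, lr=0.5, t=0.9, max_iters=2000):
    """Gradient descent on L(x, y, m) + reg/2 * ||x - x0||^2, stopping at p >= t."""
    x = x0.copy()
    for _ in range(max_iters):
        probs = softmax_probs(x, W, b)
        if probs[y] >= t:       # confidence threshold reached
            break
        dscores = probs.copy()
        dscores[y] -= 1.0       # gradient of -log p(y) w.r.t. the scores
        # Data-loss gradient plus the gradient of the L2 penalty.
        dx = (W @ dscores).reshape(x.shape) + reg * (x - x0)
        x -= lr * dx
    return x


rng = np.random.default_rng(0)
x0 = rng.standard_normal((3, 8, 8))
W = 0.1 * rng.standard_normal((x0.size, 5))
b = np.zeros(5)

# Pick the class the model currently finds least likely as the desired label.
target = int(np.argmin(softmax_probs(x0, W, b)))
x_f = make_fooling_image(x0, target, W, b)

print("p(target) before:", softmax_probs(x0, W, b)[target])
print("p(target) after: ", softmax_probs(x_f, W, b)[target])
print("relative change: ", np.linalg.norm(x_f - x0) / np.linalg.norm(x0))
```

`max_iters` is only a safety cap; the loop is meant to end through the confidence test.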

  1. Fooling images from correctly classified images
  2. Fooling images from random noise