CS231N Assignment 1: Image Features

Goal

  • So far we wrote a linear classifier and applied it directly to the raw pixels of the input images.
  • In this part we first extract image features from the raw data and then classify based on those features.

The setup and data loading at the beginning are the same as before.

Extract Features

  • For each image, compute a HOG descriptor and a histogram over the hue channel in HSV color space (these are two separate features).
  • HOG captures the texture of an image while ignoring color, whereas the color histogram captures color while ignoring texture; each feature is flattened into a vector before classification (see the sketch after these bullets).
  • Combining the two should give better results than either alone.

  • The code for this part already provides the two functions that compute the HOG feature and the color histogram; extract_features applies them to every image and stacks the results into one feature matrix.

  • The features are then preprocessed: subtract the mean, divide by the standard deviation (so every feature is on roughly the same scale), and finally append a bias dimension.
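Conceptually, the color-histogram feature just bins the hue channel of each image. A minimal sketch of the idea (the name hue_histogram and the normalization details are my own; the assignment's color_histogram_hsv may differ):

import numpy as np
from matplotlib.colors import rgb_to_hsv

def hue_histogram(img, nbin=10):
    # img: H x W x 3 RGB array with values in [0, 255]
    hsv = rgb_to_hsv(img / 255.0)            # hue channel ends up in [0, 1]
    hist, _ = np.histogram(hsv[..., 0], bins=nbin, range=(0.0, 1.0), density=True)
    return hist

In the assignment itself, hog_feature and color_histogram_hsv play this role, and extract_features stitches them together: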
from cs231n.features import *

num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_val_feats = extract_features(X_val, feature_fns)
X_test_feats = extract_features(X_test, feature_fns)

# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_val_feats -= mean_feat
X_test_feats -= mean_feat

# Preprocessing: Divide by standard deviation. This ensures that each feature
# has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_val_feats /= std_feat
X_test_feats /= std_feat

# Preprocessing: Add a bias dimension
X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])
X_val_feats = np.hstack([X_val_feats, np.ones((X_val_feats.shape[0], 1))])
X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])

Train an SVM on the features

Use the multiclass SVM to classify these features; the result should be better than classifying the raw pixels directly, with validation accuracy around 0.44. Note that a grid search over the hyperparameters is used here rather than a random search.
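For contrast, a random search would sample the hyperparameters log-uniformly instead of sweeping a fixed grid. A minimal sketch, assuming the same LinearSVM interface used below (not what the notebook actually runs):

import numpy as np

num_trials = 9
for _ in range(num_trials):
    lr = 10 ** np.random.uniform(-9, -7)    # roughly the same range as the grid
    rs = 10 ** np.random.uniform(4.5, 6.5)
    svm = LinearSVM()
    svm.train(X_train_feats, y_train, learning_rate=lr, reg=rs, num_iters=1000)
    val_acc = np.mean(svm.predict(X_val_feats) == y_val)

The grid search actually used in the notebook follows: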

# Use the validation set to tune the learning rate and regularization strength

from cs231n.classifiers.linear_classifier import LinearSVM

learning_rates = [1e-9, 1e-8, 1e-7]
regularization_strengths = [5e4, 5e5, 5e6]

results = {}
best_val = -1
best_svm = None

################################################################################
# TODO: #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the SVM; save #
# the best trained classifer in best_svm. You might also want to play #
# with different numbers of bins in the color histogram. If you are careful #
# you should be able to get accuracy of near 0.44 on the validation set. #
################################################################################

for lr in learning_rates:
    for rs in regularization_strengths:
        svm = LinearSVM()
        svm.train(X_train_feats, y_train, learning_rate=lr, reg=rs, num_iters=1000, verbose=True)
        y_pred_val = svm.predict(X_val_feats)
        y_pred_train = svm.predict(X_train_feats)

        train_acc = np.mean(y_pred_train == y_train)
        val_acc = np.mean(y_pred_val == y_val)

        results[(lr, rs)] = (train_acc, val_acc)

        if val_acc > best_val:
            best_val = val_acc
            best_svm = svm
################################################################################
# END OF YOUR CODE #
################################################################################

# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (
        lr, reg, train_accuracy, val_accuracy))

print('best validation accuracy achieved during cross-validation: %f' % best_val)
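The best SVM is then evaluated on the test set, roughly as below; this also defines the y_test_pred used by the visualization that follows (a short sketch, assuming best_svm from the search above):

y_test_pred = best_svm.predict(X_test_feats)
test_accuracy = np.mean(y_test == y_test_pred)
print('test accuracy: %f' % test_accuracy)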

We also visualize misclassified samples, i.e. images whose true class is not a given class but which were predicted as that class:

# An important way to gain intuition about how an algorithm works is to
# visualize the mistakes that it makes. In this visualization, we show examples
# of images that are misclassified by our current system. The first column
# shows images that our system labeled as "plane" but whose true label is
# something other than "plane".

examples_per_class = 8
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
for cls, cls_name in enumerate(classes):
    idxs = np.where((y_test != cls) & (y_test_pred == cls))[0]
    idxs = np.random.choice(idxs, examples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt.subplot(examples_per_class, len(classes), i * len(classes) + cls + 1)
        plt.imshow(X_test[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls_name)
plt.show()

[Example figure: grid of misclassified images for each class]
(It feels like I trained a fool.)

Trying a two-layer neural net

  • First remove the bias dimension added above (see the sketch below this list).
  • Then cross-validate to find the best hyperparameters.
  • The main reason the loss refused to go down for a long time in this part was that the learning rate was chosen too small.
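Removing the bias column just means dropping the last feature dimension that np.hstack appended during preprocessing. A minimal sketch (run it only once):

# Drop the bias column appended earlier.
X_train_feats = X_train_feats[:, :-1]
X_val_feats = X_val_feats[:, :-1]
X_test_feats = X_test_feats[:, :-1]

After that, the cross-validation over hidden size, learning rate, and regularization strength is: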
from cs231n.classifiers.neural_net import TwoLayerNet

input_dim = X_train_feats.shape[1]
hidden_dim = 500
num_classes = 10
hidden_size = [300,400,500,600]
learning_rate = [1,1e-1,1e-2]
reg = [1e-4,1e-3,1e-2]


# net = TwoLayerNet(input_dim, hidden_dim, num_classes)
best_net = None
best_acc = -1
result = {}

################################################################################
# TODO: Train a two-layer neural network on image features. You may want to #
# cross-validate various parameters as in previous sections. Store your best #
# model in the best_net variable. #
################################################################################

for lr in learning_rate:
    for hidd in hidden_size:
        for rs in reg:
            net = TwoLayerNet(input_dim, hidd, num_classes)
            stats = net.train(X_train_feats, y_train, X_val_feats, y_val,
                              num_iters=1200, batch_size=400,
                              learning_rate=lr, learning_rate_decay=0.95, reg=rs, verbose=True)
            val_acc = (net.predict(X_val_feats) == y_val).mean()

            result[(lr, rs, hidd)] = val_acc
            if val_acc > best_acc:
                best_acc = val_acc
                best_net = net

# print(result)
# for lr, rs, hidd in sorted(result):
#     val_accuracy = result[(lr, rs, hidd)]
#     print('lr %e reg %e hidden_units %e val accuracy: %f' % (
#         lr, rs, hidd, val_accuracy))
print('best validation accuracy achieved during cross-validation: %f' % best_acc)
print('best parameter is :', list(result.keys())[list(result.values()).index(best_acc)])
################################################################################
# END OF YOUR CODE #
################################################################################

best validation accuracy achieved during cross-validation: 0.605000
best parameter is : (1, 0.0001, 500)

Some takeaways

  • If the learning rate is too small, increasing the number of iterations doesn't change much later on.
  • The rough range of the learning rate should be pinned down first.
  • After tweaking the parameters, I ended up with about 60% validation accuracy.
  • Test accuracy is around 55.8% (computed roughly as in the snippet below).
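The test number comes from evaluating the best network on the test features, a short sketch assuming best_net from the search above:

# Assumed final evaluation of the best two-layer net on the test set.
test_acc = (best_net.predict(X_test_feats) == y_test).mean()
print('Test accuracy:', test_acc)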