CS231N Assignment 1: Image Features

Goal

  • So far we wrote a linear classifier and applied it directly to the raw pixels of the input images.
  • In this part we first extract image features from the raw data and then classify based on those features.

The setup and data loading at the beginning are the same as before.

Extract Features

  • For each image, compute a HOG descriptor and a histogram over the hue channel in HSV color space (these are two separate features).
  • HOG captures the texture of an image while ignoring color, whereas the color histogram captures color while ignoring texture; each feature is flattened into a vector before classification (see the sketch after these bullets).
  • Combining the two should give better results than either alone.

  • The code for this part already provides the two functions that compute the HOG feature and the color histogram; extract_features applies them to every image and stacks the results into one feature matrix.

  • The features are then preprocessed: subtract the mean, divide by the standard deviation (so every feature is on roughly the same scale), and finally append a bias dimension.
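Conceptually, the color-histogram feature just bins the hue channel of each image. A minimal sketch of the idea (the name hue_histogram and the normalization details are my own; the assignment's color_histogram_hsv may differ):

import numpy as np
from matplotlib.colors import rgb_to_hsv

def hue_histogram(img, nbin=10):
    # img: H x W x 3 RGB array with values in [0, 255]
    hsv = rgb_to_hsv(img / 255.0)            # hue channel ends up in [0, 1]
    hist, _ = np.histogram(hsv[..., 0], bins=nbin, range=(0.0, 1.0), density=True)
    return hist

In the assignment itself, hog_feature and color_histogram_hsv play this role, and extract_features stitches them together: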
from cs231n.features import *

num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_val_feats = extract_features(X_val, feature_fns)
X_test_feats = extract_features(X_test, feature_fns)

# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_val_feats -= mean_feat
X_test_feats -= mean_feat

# Preprocessing: Divide by standard deviation. This ensures that each feature
# has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_val_feats /= std_feat
X_test_feats /= std_feat

# Preprocessing: Add a bias dimension
X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])
X_val_feats = np.hstack([X_val_feats, np.ones((X_val_feats.shape[0], 1))])
X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])

Train an SVM on the features

Use the multiclass SVM to classify these features; the result should be better than classifying the raw pixels directly, with validation accuracy around 0.44. Note that a grid search over the hyperparameters is used here rather than a random search.
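For contrast, a random search would sample the hyperparameters log-uniformly instead of sweeping a fixed grid. A minimal sketch, assuming the same LinearSVM interface used below (not what the notebook actually runs):

import numpy as np

num_trials = 9
for _ in range(num_trials):
    lr = 10 ** np.random.uniform(-9, -7)    # roughly the same range as the grid
    rs = 10 ** np.random.uniform(4.5, 6.5)
    svm = LinearSVM()
    svm.train(X_train_feats, y_train, learning_rate=lr, reg=rs, num_iters=1000)
    val_acc = np.mean(svm.predict(X_val_feats) == y_val)

The grid search actually used in the notebook follows: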

# Use the validation set to tune the learning rate and regularization strength

from cs231n.classifiers.linear_classifier import LinearSVM

learning_rates = [1e-9, 1e-8, 1e-7]
regularization_strengths = [5e4, 5e5, 5e6]

results = {}
best_val = -1
best_svm = None

################################################################################
# TODO: #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the SVM; save #
# the best trained classifer in best_svm. You might also want to play #
# with different numbers of bins in the color histogram. If you are careful #
# you should be able to get accuracy of near 0.44 on the validation set. #
################################################################################

for lr in learning_rates:
    for rs in regularization_strengths:
        svm = LinearSVM()
        svm.train(X_train_feats, y_train, learning_rate=lr, reg=rs, num_iters=1000, verbose=True)
        y_pred_val = svm.predict(X_val_feats)
        y_pred_train = svm.predict(X_train_feats)

        train_acc = np.mean(y_pred_train == y_train)
        val_acc = np.mean(y_pred_val == y_val)

        results[(lr, rs)] = (train_acc, val_acc)

        if val_acc > best_val:
            best_val = val_acc
            best_svm = svm
################################################################################
# END OF YOUR CODE #
################################################################################

# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (
        lr, reg, train_accuracy, val_accuracy))

print('best validation accuracy achieved during cross-validation: %f' % best_val)
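The best SVM is then evaluated on the test set, roughly as below; this also defines the y_test_pred used by the visualization that follows (a short sketch, assuming best_svm from the search above):

y_test_pred = best_svm.predict(X_test_feats)
test_accuracy = np.mean(y_test == y_test_pred)
print('test accuracy: %f' % test_accuracy)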

We also visualize misclassified samples, i.e. images whose true class is not a given class but which were predicted as that class:

# An important way to gain intuition about how an algorithm works is to
# visualize the mistakes that it makes. In this visualization, we show examples
# of images that are misclassified by our current system. The first column
# shows images that our system labeled as "plane" but whose true label is
# something other than "plane".

examples_per_class = 8
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
for cls, cls_name in enumerate(classes):
    idxs = np.where((y_test != cls) & (y_test_pred == cls))[0]
    idxs = np.random.choice(idxs, examples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt.subplot(examples_per_class, len(classes), i * len(classes) + cls + 1)
        plt.imshow(X_test[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls_name)
plt.show()

[Example figure: grid of misclassified images for each class]
(It feels like I trained a fool.)

Trying a two-layer neural net

  • First remove the bias dimension added above (see the sketch below this list).
  • Then cross-validate to find the best hyperparameters.
  • The main reason the loss refused to go down for a long time in this part was that the learning rate was chosen too small.
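Removing the bias column just means dropping the last feature dimension that np.hstack appended during preprocessing. A minimal sketch (run it only once):

# Drop the bias column appended earlier.
X_train_feats = X_train_feats[:, :-1]
X_val_feats = X_val_feats[:, :-1]
X_test_feats = X_test_feats[:, :-1]

After that, the cross-validation over hidden size, learning rate, and regularization strength is: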
from cs231n.classifiers.neural_net import TwoLayerNet

input_dim = X_train_feats.shape[1]
hidden_dim = 500
num_classes = 10
hidden_size = [300,400,500,600]
learning_rate = [1,1e-1,1e-2]
reg = [1e-4,1e-3,1e-2]


# net = TwoLayerNet(input_dim, hidden_dim, num_classes)
best_net = None
best_acc = -1
result = {}

################################################################################
# TODO: Train a two-layer neural network on image features. You may want to #
# cross-validate various parameters as in previous sections. Store your best #
# model in the best_net variable. #
################################################################################

for lr in learning_rate:
    for hidd in hidden_size:
        for rs in reg:
            net = TwoLayerNet(input_dim, hidd, num_classes)
            stats = net.train(X_train_feats, y_train, X_val_feats, y_val,
                              num_iters=1200, batch_size=400,
                              learning_rate=lr, learning_rate_decay=0.95, reg=rs, verbose=True)
            val_acc = (net.predict(X_val_feats) == y_val).mean()

            result[(lr, rs, hidd)] = val_acc
            if val_acc > best_acc:
                best_acc = val_acc
                best_net = net

# print(result)
# for lr, rs, hidd in sorted(result):
#     val_accuracy = result[(lr, rs, hidd)]
#     print('lr %e reg %e hidden_units %e val accuracy: %f' % (
#         lr, rs, hidd, val_accuracy))
print('best validation accuracy achieved during cross-validation: %f' % best_acc)
print('best parameter is :', list(result.keys())[list(result.values()).index(best_acc)])
################################################################################
# END OF YOUR CODE #
################################################################################

best validation accuracy achieved during cross-validation: 0.605000
best parameter is : (1, 0.0001, 500)

Some takeaways

  • If the learning rate is too small, increasing the number of iterations doesn't change much later on.
  • The rough range of the learning rate should be pinned down first.
  • After tweaking the parameters, I ended up with about 60% validation accuracy.
  • Test accuracy is around 55.8% (computed roughly as in the snippet below).
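The test number comes from evaluating the best network on the test features, a short sketch assuming best_net from the search above:

# Assumed final evaluation of the best two-layer net on the test set.
test_acc = (best_net.predict(X_test_feats) == y_test).mean()
print('Test accuracy:', test_acc)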