This part is from the 2018 assignment:
Stanford CS231n Assignment 2
Goals
- We already implemented a two-layer fully connected net, but its loss and gradients were computed with a single monolithic analytic formula.
- That kind of computation is feasible for a two-layer network, but it becomes far too difficult to carry out for deeper networks.
- So here we split the computation into a forward pass and a backward pass.
During the forward pass, a layer receives its inputs, weights, and any other parameters, and returns an output together with a cache (which stores whatever the backward pass will need):
```python
def layer_forward(x, w):
    """Receive inputs x and weights w."""
    # Do some computations ...
    z = ...    # ... some intermediate value
    # Do some more computations ...
    out = ...  # the output
    cache = (x, w, z, out)  # values we need to compute gradients
    return out, cache
```
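To make the pattern concrete, here is a minimal sketch of the forward half for an affine (fully connected) layer, in the spirit of the assignment's `affine_forward`. The bias argument `b` and the flattening of `x` are assumptions of this sketch, not code from the assignment itself:

```python
def affine_forward(x, w, b):
    """Forward pass for an affine (fully connected) layer (sketch).

    x: inputs of shape (N, d_1, ..., d_k), flattened to (N, D)
    w: weights of shape (D, M);  b: biases of shape (M,)
    Returns out of shape (N, M) and the cache for the backward pass.
    """
    N = x.shape[0]
    out = x.reshape(N, -1).dot(w) + b  # flatten each example, then x @ w + b
    cache = (x, w, b)                  # raw inputs are all the backward pass needs
    return out, cache
```

Note that the cache stores the raw inputs rather than derived quantities, since the gradients below can be computed directly from `x`, `w`, and `b`.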
During the backward pass, the layer receives dout (the derivative of the loss with respect to its outputs) together with the cache stored earlier, and computes the final gradients:

```python
def layer_backward(dout, cache):
    """
    Receive dout (derivative of loss with respect to outputs) and cache,
    and compute derivative with respect to inputs.
    """
    # Unpack cache values
    x, w, z, out = cache
    # Use values in cache to compute derivatives
    dx = ...  # derivative of loss with respect to x
    dw = ...  # derivative of loss with respect to w
    return dx, dw
```

With these two pieces we can compose layers into whatever final network we need, no matter how deep it is.
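And here is the matching backward half for the affine sketch above, under the same assumptions (shapes as in `affine_forward`):

```python
import numpy as np  # for the demo below

def affine_backward(dout, cache):
    """Backward pass matching the affine_forward sketch above.

    dout: upstream derivative, shape (N, M).
    Returns dx, dw, db with the same shapes as x, w, b.
    """
    x, w, b = cache
    N = x.shape[0]
    dx = dout.dot(w.T).reshape(x.shape)  # back through the matmul, then unflatten
    dw = x.reshape(N, -1).T.dot(dout)    # (D, N) @ (N, M) -> (D, M)
    db = dout.sum(axis=0)                # bias gradient: sum over the batch
    return dx, dw, db

# Tiny demo: shapes survive a forward/backward round trip.
x, w, b = np.random.randn(4, 5, 6), np.random.randn(30, 7), np.zeros(7)
out, cache = affine_forward(x, w, b)
dx, dw, db = affine_backward(np.random.randn(4, 7), cache)
assert dx.shape == x.shape and dw.shape == w.shape and db.shape == b.shape
```

A good habit is to verify each such pair by comparing dx, dw, db against numerical gradients, which is what the assignment's gradient-check utilities are for.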
- We also need some parts that help optimization and regularization, including Dropout and Batch/Layer Normalization (a sketch of inverted dropout follows below).
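As a taste of those parts, here is a minimal sketch of inverted dropout, the variant CS231n uses: units are randomly zeroed at train time and the survivors are rescaled by the keep probability, so test time is a plain identity. The parameter names `p` and `train` are my own illustrative choices, not the assignment's exact API:

```python
import numpy as np

def dropout_forward(x, p=0.5, train=True):
    """Inverted dropout (sketch): keep each unit with probability p."""
    if train:
        mask = (np.random.rand(*x.shape) < p) / p  # scale kept units by 1/p
        out = x * mask
    else:
        mask = None
        out = x  # test time is the identity, thanks to the 1/p scaling
    return out, mask

def dropout_backward(dout, mask):
    """Gradients flow only through the units that were kept."""
    return dout * mask if mask is not None else dout
```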