a) Backpropagation is used to optimize the parameters of a neural network (i.e., the weights and bias of each neuron) so as to minimize the value of the loss function. Backpropagation consists of calculating the derivative of the loss with respect to each parameter of each neuron.
w = w - learning_rate * (derivative of loss with respect to w)
The above equation is used to update each parameter; the value in brackets is computed during backpropagation.
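As a minimal sketch of this update rule (assuming a NumPy array w and a gradient grad_w that backpropagation has already produced; the names and values here are hypothetical):

    import numpy as np

    def sgd_update(w, grad_w, learning_rate=0.01):
        # Move w a small step against the gradient to reduce the loss
        return w - learning_rate * grad_w

    # Hypothetical example values
    w = np.array([0.5, -1.2])
    grad_w = np.array([0.1, -0.3])   # dLoss/dw, as computed by backpropagation
    w = sgd_update(w, grad_w)

The same update is applied to every parameter in the network, each with its own gradient.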
b) First we calculate the activations of each layer during the forward pass, and then we move backward through the layers in reverse order. This works just like the chain rule in differentiation: if we have y = f(x) and z = g(y), then to calculate z we must first calculate y and only then z. Similarly, to calculate the derivative of z with respect to x, we calculate the derivative of z with respect to y and the derivative of y with respect to x: (dz/dx) = (dz/dy) * (dy/dx).
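To make the chain rule concrete, here is a small sketch using hypothetical functions f(x) = x^2 and g(y) = 3y, so z = g(f(x)):

    def f(x):          # y = f(x) = x^2, so dy/dx = 2x
        return x ** 2

    def g(y):          # z = g(y) = 3y, so dz/dy = 3
        return 3 * y

    x = 2.0
    y = f(x)                   # forward pass: compute y first
    z = g(y)                   # forward pass: then compute z
    dz_dy = 3.0                # backward pass: local derivative of g
    dy_dx = 2 * x              # backward pass: local derivative of f
    dz_dx = dz_dy * dy_dx      # chain rule: dz/dx = (dz/dy)*(dy/dx) = 12.0

A neural network is just a longer chain of such functions, so the same idea applies layer by layer.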
For batch mode, the steps are:
for i = 1 to number_of_iterations
    // assuming a vectorized implementation of inputs and outputs,
    // so each step processes all inputs at the same time
    Forward propagation to calculate the activations of each layer (forward direction)
    Calculate the loss
    Back propagation to compute the derivatives of the loss with respect to the parameters (reverse direction)
    Update the parameters
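Below is a minimal NumPy sketch of these four steps for a single linear layer with mean-squared-error loss; the data X and Y, the layer size, the learning rate, and the iteration count are all hypothetical choices made just for illustration:

    import numpy as np

    # Hypothetical batch of inputs and targets (vectorized: all examples at once)
    X = np.random.randn(100, 3)            # 100 examples, 3 features
    Y = np.random.randn(100, 1)            # 100 targets

    W = np.random.randn(3, 1) * 0.01       # parameters of one linear layer
    b = np.zeros((1, 1))
    learning_rate = 0.1
    number_of_iterations = 200

    for i in range(number_of_iterations):
        # Forward propagation: activations for the whole batch at once
        A = X @ W + b                      # shape (100, 1)

        # Calculate loss (mean squared error over the batch)
        loss = np.mean((A - Y) ** 2)

        # Back propagation: derivatives of the loss w.r.t. the parameters
        dA = 2 * (A - Y) / len(X)          # dLoss/dA
        dW = X.T @ dA                      # dLoss/dW
        db = dA.sum(axis=0, keepdims=True) # dLoss/db

        # Update parameters
        W -= learning_rate * dW
        b -= learning_rate * db

A real network repeats the forward and backward computations for every layer, but each iteration still follows exactly these four steps.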