In: Computer Science
Can I use a linear gain function in any of the layers of a multilayer perceptron? Explain.
Yes, you can certainly use one, but it is as good as having no hidden layer at all: a linear activation function collapses the network's equations into the form of a linear regression with redundant parameters. To make the point clearer, let's look at an example.
Consider a linear function f(x) = ax + b. If we take another linear function g(z) = cz + d and apply g(f(x)) (the equivalent of feeding the output of one layer with a linear activation into the next), we get g(f(x)) = c(ax + b) + d = acx + cb + d = (ac)x + (cb + d), which is itself just another linear function with redundant parameters: the value (ac) can simply be written as A and (cb + d) as B, giving Ax + B.
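As a quick sanity check, here is a minimal Python sketch that verifies the composition numerically (the coefficient values and the range of x are arbitrary, chosen only for illustration):

```python
import numpy as np

a, b = 2.0, 1.0   # first linear layer:  f(x) = a*x + b
c, d = 3.0, -4.0  # second linear layer: g(z) = c*z + d

def f(x):
    return a * x + b

def g(z):
    return c * z + d

# Composed coefficients predicted by the algebra above:
A = a * c        # slope of g(f(x))
B = c * b + d    # intercept of g(f(x))

x = np.linspace(-5, 5, 11)
assert np.allclose(g(f(x)), A * x + B)  # identical for every x
print(f"g(f(x)) == {A}*x + {B}")
```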
Therefore, using a linear activation function doesn't really add any value to an MLP. In fact, the main purpose of non-linear activation functions like sigmoid, ReLU, or tanh is to let the MLP learn the non-linear, higher-order features needed for more accurate predictions.
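The same collapse holds with full weight matrices (two linear layers merge into one with W = W2·W1 and b = W2·b1 + b2), and inserting a non-linearity such as ReLU between the layers is exactly what prevents it. A minimal NumPy sketch, where the layer sizes and random weights are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer shapes: 3 inputs -> 4 hidden units -> 2 outputs.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)  # hidden layer
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)  # output layer

x = rng.normal(size=3)

# Two layers with linear (identity) activation...
two_layer = W2 @ (W1 @ x + b1) + b2

# ...equal a single layer with merged parameters.
W, b = W2 @ W1, W2 @ b1 + b2
assert np.allclose(two_layer, W @ x + b)

# With a ReLU in between, no single linear layer reproduces the output:
relu = lambda z: np.maximum(z, 0.0)
nonlinear = W2 @ relu(W1 @ x + b1) + b2
print(np.allclose(nonlinear, W @ x + b))  # almost surely False
```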