In: Computer Science
For Machine Learning, one question:
1) Discuss the relation between feature scaling and the gradient descent algorithm. To be precise, how does feature scaling affect the gradient descent algorithm?
To understand the effect of feature scaling on gradient descent, we should first understand what feature scaling and gradient descent are.
Feature scaling is a technique that standardizes the independent features in the data to a fixed range. It is performed during data pre-processing to handle features with highly varying magnitudes, values, or units. If feature scaling is not done, a machine learning algorithm tends to treat features with larger values as more important and features with smaller values as less important, regardless of the units in which those values are expressed.
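As a minimal sketch of what scaling does, the snippet below standardizes two made-up features with very different magnitudes (house size in square feet and number of bedrooms; the data values are illustrative, not from the original text):

```python
import numpy as np

# Two features with very different magnitudes:
# column 0 = house size in square feet, column 1 = number of bedrooms.
X = np.array([[2100.0, 3.0],
              [1600.0, 2.0],
              [2400.0, 4.0],
              [1400.0, 2.0]])

# Standardization: subtract the column mean and divide by the
# column standard deviation, feature by feature.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# After scaling, every feature has mean 0 and unit variance,
# so no feature dominates simply because of its units.
print(X_std.mean(axis=0))
print(X_std.std(axis=0))
```

After this transformation both columns live on the same scale, so the learning algorithm no longer sees square feet as thousands of times "larger" than bedroom counts.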
Gradient descent is an optimization algorithm used to minimize the cost function in many machine learning algorithms. It iteratively updates the parameters of the learning model in order to reduce the cost and improve the model's accuracy. For linear regression with cost J(θ), the update rule is:

θ_j := θ_j − α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)

where α is the learning rate, m is the number of training examples, and x_j^(i) is the value of feature j for example i.
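The update rule above can be sketched as a single batch gradient-descent step for linear regression (the function name and the tiny dataset are illustrative assumptions, not from the original text):

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha):
    """One batch gradient-descent update for linear regression with
    mean-squared-error cost J(theta) = (1/(2m)) * ||X @ theta - y||^2."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m   # partial derivatives dJ/dtheta_j
    return theta - alpha * grad        # theta_j := theta_j - alpha * dJ/dtheta_j

# Tiny example: fit y = x using an intercept column of ones.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
theta = np.zeros(2)
for _ in range(2000):
    theta = gradient_descent_step(theta, X, y, alpha=0.1)
print(theta)  # converges to approximately [0, 1]
```

Note how each parameter's update is a sum of residuals weighted by that feature's values; this is exactly where the scale of the feature enters.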
The presence of the feature value x_j in the update rule means the range of each feature directly affects the size of that parameter's step. Features with very different ranges therefore produce very different step sizes: the cost surface becomes elongated, forcing a small learning rate and a slow, zig-zagging descent. To ensure that gradient descent moves smoothly towards the minimum and that the parameters for all features are updated at comparable rates, we scale the data before feeding it to the model. The model then trains smoothly, updating its parameters to decrease the cost function and improve accuracy.
So, we can say that having features on a similar scale (which can be achieved using feature scaling) helps gradient descent converge more quickly towards the minimum.
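This speed-up can be seen empirically. The sketch below (synthetic data and learning rates are assumptions chosen for illustration) counts how many iterations gradient descent needs before the gradient norm drops below a tolerance, on raw versus standardized features:

```python
import numpy as np

def run_gd(X, y, alpha, iters=5000, tol=1e-6):
    """Run batch gradient descent; return the number of iterations until
    the gradient norm drops below tol (or iters if it never does)."""
    m, n = X.shape
    theta = np.zeros(n)
    for i in range(iters):
        grad = X.T @ (X @ theta - y) / m
        if np.linalg.norm(grad) < tol:
            return i
        theta -= alpha * grad
    return iters

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(1000, 3000, 50),   # large-scale feature
                     rng.uniform(1, 5, 50)])        # small-scale feature
y = 0.002 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.01, 50)

X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# The unscaled data forces a tiny learning rate to stay stable, so
# convergence is very slow; the scaled data tolerates a much larger
# learning rate and converges in far fewer iterations.
print("unscaled:", run_gd(X, y, alpha=1e-7))
print("scaled:  ", run_gd(X_scaled, y, alpha=0.1))
```

The unscaled run typically exhausts its iteration budget, while the scaled run converges in a few hundred steps, which is the claim above made concrete.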