What is the gradient of the least squares loss with regularization? This classifier is actually called ridge regression. Hint: Recall that the least squares loss for a given point $x_i$ is $(y_i - (w^\top x_i + w_0))^2$. The regularized loss would be $(y_i - (w^\top x_i + w_0))^2 + \|w\|^2$. Now write this out for three-coordinate data, where $w = (w_1, w_2, w_3)$ and $x_i = (x_{i1}, x_{i2}, x_{i3})$, and solve $df/dw_1$ without and with regularization.
The least squares loss for a point $x_i$ is given by

$$\big(y_i - (w^\top x_i + w_0)\big)^2,$$

and the regularized loss is given by

$$\big(y_i - (w^\top x_i + w_0)\big)^2 + \|w\|^2.$$
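For reference, in vector form the corresponding gradients (which the coordinate-wise calculation below recovers one component at a time) are
$$\nabla_w \big(y_i - (w^\top x_i + w_0)\big)^2 = -2\big(y_i - (w^\top x_i + w_0)\big)\,x_i,$$
and, with the $\|w\|^2$ penalty, the same expression plus an extra $2w$.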
Writing this in three-coordinate form, with $w = (w_1, w_2, w_3)$ and $x_i = (x_{i1}, x_{i2}, x_{i3})$, we have

$$f = \big(y_i - (w_1 x_{i1} + w_2 x_{i2} + w_3 x_{i3} + w_0)\big)^2 + (w_1^2 + w_2^2 + w_3^2).$$

Now, without the regularization term, the chain rule gives

$$\frac{df}{dw_1} = 2\big(y_i - (w_1 x_{i1} + w_2 x_{i2} + w_3 x_{i3} + w_0)\big)\cdot(-x_{i1}).$$

Therefore,

$$\frac{df}{dw_1} = -2 x_{i1}\big(y_i - (w_1 x_{i1} + w_2 x_{i2} + w_3 x_{i3} + w_0)\big).$$

Similarly, with regularization the penalty contributes an extra $2 w_1$, so

$$\frac{df}{dw_1} = -2 x_{i1}\big(y_i - (w_1 x_{i1} + w_2 x_{i2} + w_3 x_{i3} + w_0)\big) + 2 w_1.$$
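As a quick check, here is a small SymPy sketch (the symbol names are my own, not from the question) that reproduces both partial derivatives:

```python
import sympy as sp

# weights and bias
w1, w2, w3, w0 = sp.symbols('w1 w2 w3 w0')
# one data point (xi1, xi2, xi3) and its target yi
xi1, xi2, xi3, yi = sp.symbols('xi1 xi2 xi3 yi')

residual = yi - (w1*xi1 + w2*xi2 + w3*xi3 + w0)

loss = residual**2                                # plain least squares loss
loss_reg = residual**2 + (w1**2 + w2**2 + w3**2)  # ridge: add ||w||^2

# df/dw1 without regularization:
# equals -2*xi1*(yi - (w1*xi1 + w2*xi2 + w3*xi3 + w0))
print(sp.factor(sp.diff(loss, w1)))

# Regularization only adds the 2*w1 term to the derivative:
print(sp.simplify(sp.diff(loss_reg, w1) - sp.diff(loss, w1)))  # -> 2*w1
```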
------------------------
Consider the equation

Treating the remaining terms as constants, we can rewrite this as

1. With Regularization
In this case, this is a first-order linear differential equation of the form

with

The integrating factor (IF) is

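Recall the general recipe: for a first-order linear equation $\frac{dy}{dx} + P(x)\,y = Q(x)$, the integrating factor is $e^{\int P\,dx}$; multiplying through gives $\frac{d}{dx}\big(y\,e^{\int P\,dx}\big) = Q(x)\,e^{\int P\,dx}$, so that $y\,e^{\int P\,dx} = \int Q(x)\,e^{\int P\,dx}\,dx + C$.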
Let

Then

Now, let us consider the integral,

Integrating by parts, we get

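(Here the general integration-by-parts identity $\int u\,dv = u\,v - \int v\,du$ is being used.)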
After splitting the integral into the two cases

and

we get

Therefore,

where, 
2. Without Regularization

with

Therefore, the integrating factor is


To integrate this, we separately consider the cases when

and

and combine the results to get the value


where,