What is the gradient of the least squares loss with regularization? This classifier is actually called ridge regression. Hint: Recall that the least squares loss for a given point x_i is (y_i - (w^T x_i + w_0))^2. The regularized loss would be (y_i - (w^T x_i + w_0))^2 + ||w||^2. Now write this out for three-coordinate data, where w = (w_1, w_2, w_3) and x_i = (x_{i1}, x_{i2}, x_{i3}), and solve df/dw_1 without and with regularization.
The least squares loss for a point $x_i$ is given by
$$f(w, w_0) = \bigl(y_i - (w^T x_i + w_0)\bigr)^2,$$
and the regularized loss is given by
$$f(w, w_0) = \bigl(y_i - (w^T x_i + w_0)\bigr)^2 + \lVert w \rVert^2.$$

Writing this in 3-coordinate form, with $w = (w_1, w_2, w_3)$ and $x_i = (x_{i1}, x_{i2}, x_{i3})$, we have
$$f = \bigl(y_i - (w_1 x_{i1} + w_2 x_{i2} + w_3 x_{i3} + w_0)\bigr)^2 + (w_1^2 + w_2^2 + w_3^2).$$

Now, without regularization,
$$\frac{\partial f}{\partial w_1} = -2 x_{i1}\bigl(y_i - (w_1 x_{i1} + w_2 x_{i2} + w_3 x_{i3} + w_0)\bigr).$$
Therefore, with regularization,
$$\frac{\partial f}{\partial w_1} = -2 x_{i1}\bigl(y_i - (w_1 x_{i1} + w_2 x_{i2} + w_3 x_{i3} + w_0)\bigr) + 2 w_1.$$

Similarly, $\partial f/\partial w_2$ and $\partial f/\partial w_3$ have the same form, with $x_{i1}$ replaced by $x_{i2}$ and $x_{i3}$ and with regularization terms $2 w_2$ and $2 w_3$, respectively.
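As a quick sanity check on the derivation, here is a small numerical sketch (the function names and random test values are illustrative, not part of the original answer): it compares the analytic $\partial f/\partial w_1$ above against a central finite difference, both with and without the $\lVert w \rVert^2$ term.

```python
import numpy as np

def loss(w, w0, xi, yi, regularize=True):
    """Squared loss for a single point, optionally with the ||w||^2 penalty."""
    residual = yi - (w @ xi + w0)
    penalty = np.sum(w ** 2) if regularize else 0.0
    return residual ** 2 + penalty

def grad_w1(w, w0, xi, yi, regularize=True):
    """Analytic partial derivative df/dw1, as derived above."""
    residual = yi - (w @ xi + w0)
    grad = -2.0 * xi[0] * residual
    if regularize:
        grad += 2.0 * w[0]  # extra term contributed by the regularizer
    return grad

# Compare the analytic derivative against a central finite difference in w1.
rng = np.random.default_rng(0)
w, w0 = rng.normal(size=3), 0.5
xi, yi = rng.normal(size=3), 1.0
eps = 1e-6
e1 = np.array([eps, 0.0, 0.0])
for reg in (False, True):
    numeric = (loss(w + e1, w0, xi, yi, reg) - loss(w - e1, w0, xi, yi, reg)) / (2 * eps)
    print(f"regularize={reg}: analytic={grad_w1(w, w0, xi, yi, reg):.6f}, numeric={numeric:.6f}")
```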
------------------------
Consider the equation

assuming
as constants, we can rewrite this as

1. With Regularization
In this case, this is a linear first-order differential equation of the form
with
The integrating factor (IF) is

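For reference, the general integrating-factor recipe for a first-order linear ODE is recalled below. This is a standard textbook identity written with generic symbols $y$, $p$, $q$, $t$; it is not the specific equation of this problem.

```latex
% Standard integrating-factor method for a first-order linear ODE
% (generic symbols y, p, q, t -- not the specific quantities above).
\begin{align*}
  \frac{dy}{dt} + p(t)\,y &= q(t) \\
  \text{IF}(t) &= e^{\int p(t)\,dt} \\
  \frac{d}{dt}\bigl(\text{IF}(t)\,y\bigr) &= \text{IF}(t)\,q(t) \\
  y(t) &= \frac{1}{\text{IF}(t)}\left(\int \text{IF}(t)\,q(t)\,dt + C\right)
\end{align*}
```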
Let
. Then

Now, let us consider the integral,

Integrating by parts, we get

After splitting the integral into the two cases of
and
we get

Therefore,

where,
2. Without Regularization

with
Therefore, the integrating factor is


To integrate this, we separately consider the cases when
and
and combine the results to get the value


where,
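To illustrate the with/without comparison on a toy example, here is a short symbolic sketch using SymPy. The constants a and b are generic stand-ins chosen for illustration, not the actual coefficients of this problem (which are not reproduced in the text above); the point is only to show how the extra linear term, of the same form as the $+2w_1$ contribution from the regularizer in the first part, changes the solution of a first-order linear ODE.

```python
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')
a, b = sp.symbols('a b', positive=True)  # generic stand-in constants (hypothetical)

# First-order linear ODE with an extra linear term in y (regularization-like)...
ode_with = sp.Eq(y(t).diff(t) + a * y(t), b)
# ...and the same ODE without that term.
ode_without = sp.Eq(y(t).diff(t), b)

# dsolve applies the integrating-factor method internally.
print(sp.dsolve(ode_with, y(t)))     # e.g. y(t) = C1*exp(-a*t) + b/a
print(sp.dsolve(ode_without, y(t)))  # e.g. y(t) = C1 + b*t
```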