Linear regression is used to predict a real-valued output $y \in \mathbb{R}$ for a given input data point $x \in \mathbb{R}^d$. Linear regression establishes a relationship between the dependent variable and the features of the input data, under the assumption that the expected value of the output (the dependent variable) is a linear function of the input: $E[y \mid x] = w^T x$.

Let's assume our training dataset is $D = \{(x_1, y_1), \dots, (x_n, y_n)\}$, where $n$ is the number of data points and $d$ is the number of dimensions, i.e. the number of features, in our dataset. From now on we will write our dataset as a feature matrix $X \in \mathbb{R}^{n \times d}$ and a label vector $y \in \mathbb{R}^n$, where each $x_i \in \mathbb{R}^d$ for $i = 1, \dots, n$ is a column vector.
We can write the output for each data point as: $y_i = w^T x_i + \epsilon_i$, or we can write it compactly in matrix form as: $y = Xw + \epsilon$, where $\epsilon$ is a noise term.
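To make the model concrete, here is a minimal sketch (assuming NumPy; the weights and noise scale below are made-up values, not from this post) that generates data exactly according to $y = Xw + \epsilon$:

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 100, 3                        # number of data points, number of features
X = rng.normal(size=(n, d))          # feature matrix, one row per data point
w_true = np.array([2.0, -1.0, 0.5])  # hypothetical "true" weight vector
eps = rng.normal(scale=0.1, size=n)  # noise term epsilon
y = X @ w_true + eps                 # y = Xw + epsilon
```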
Before computing the final weights for this equation, we need to figure out which degree of polynomial features to use. We usually select the degree that gives the lowest mean squared error (MSE) on held-out data.
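As a sketch of this selection step (assuming a one-dimensional input, NumPy, and an arbitrary train/validation split; none of these choices come from the post itself):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=200)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.1, size=200)  # degree-2 ground truth

x_tr, y_tr = x[:150], y[:150]      # training split
x_va, y_va = x[150:], y[150:]      # validation split

for k in range(1, 6):              # candidate degrees
    A_tr = np.vander(x_tr, k + 1)  # polynomial features [x^k, ..., x, 1]
    w, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)  # least-squares fit
    mse = np.mean((np.vander(x_va, k + 1) @ w - y_va) ** 2)
    print(f"degree {k}: validation MSE = {mse:.4f}")
```

The validation MSE should drop sharply at degree 2 and then flatten, pointing to degree 2 as the right choice for this data.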
The most common form of linear regression is the degree-1 form, where the raw features enter linearly: $y = w^T x + \epsilon$.
There are two ways by which we can estimate the parameters:
- Normal equation: the weight vector is estimated by matrix multiplication of the pseudo-inverse of the feature matrix and the label vector
- Gradient descent method: the weight vector is found by iteratively minimizing the loss
Taking the squared-error loss $L(w) = \|y - Xw\|^2$, the gradient of the loss w.r.t. the weight vector is:

$\nabla_w L(w) = 2(X^T X w - X^T y)$

Setting this gradient to zero, we get the ML estimate for $w$:

$\hat{w}_{ML} = (X^T X)^{-1} X^T y$
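The sketch below (assuming NumPy; the learning rate and iteration count are arbitrary choices) computes the estimate both ways, via the pseudo-inverse and by descending the gradient above, and the two results should agree:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.5, -2.0, 0.7]) + rng.normal(scale=0.1, size=n)

# Normal equation: w = (X^T X)^{-1} X^T y, computed via the pseudo-inverse
w_normal = np.linalg.pinv(X) @ y

# Gradient descent on L(w) = ||y - Xw||^2 using grad = 2(X^T X w - X^T y)
w = np.zeros(d)
lr = 0.001                              # step size (arbitrary choice)
for _ in range(5000):
    grad = 2 * (X.T @ X @ w - X.T @ y)
    w -= lr * grad

print(w_normal)  # both prints should be close to [1.5, -2.0, 0.7]
print(w)
```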
In the case of Ridge regression, the loss function becomes:

$L(w) = \|y - Xw\|^2 + \lambda \|w\|^2$

Now,

$\nabla_w L(w) = 2(X^T X w - X^T y + \lambda w)$

and setting the gradient to zero gives

$\hat{w}_{ridge} = (X^T X + \lambda I)^{-1} X^T y$
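A minimal sketch of the closed-form Ridge estimate (assuming NumPy and an arbitrary value for $\lambda$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 200, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.5, -2.0, 0.7]) + rng.normal(scale=0.1, size=n)

lam = 1.0  # regularization strength lambda (arbitrary choice)
# Ridge estimate: w = (X^T X + lambda * I)^{-1} X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
print(w_ridge)  # slightly shrunk toward zero relative to the ML estimate
```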