# Solved – Assumptions of linear regression and gradient descent

I have been reading about linear regression (from Andrew Ng's lectures and ISLR) and about estimating the coefficients using gradient descent. This is what I've understood of gradient descent:

• Include a dummy variable whose value is one for every sample point, so that we can estimate the intercept.
• Assign random initial weights to the variables and make predictions according to those weights.
• Using a cost function (squared error, for example), compute the loss and the derivative of the cost function with respect to each weight (the gradient).
• Adjust each weight by subtracting its gradient (times the learning rate) from it.
• Iterate until the change in the weights is insignificant or some maximum number of iterations has been reached.
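The steps above can be sketched as follows; this is a minimal illustration with NumPy, using mean squared error on made-up toy data (all variable names and the learning rate are illustrative assumptions, not from the lectures or ISLR):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 3 + 2*x plus a little noise
X = rng.uniform(0, 1, size=(100, 1))
y = 3 + 2 * X[:, 0] + rng.normal(scale=0.1, size=100)

# Step 1: prepend a dummy column of ones so the first weight is the intercept
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])

# Step 2: assign random initial weights
w = rng.normal(size=X_aug.shape[1])

learning_rate = 0.1
for _ in range(5000):
    # Steps 3-4: predictions, gradient of the mean squared error, weight update
    residuals = X_aug @ w - y
    grad = 2 * X_aug.T @ residuals / len(y)
    w_new = w - learning_rate * grad
    # Step 5: stop once the change in the weights is insignificant
    if np.max(np.abs(w_new - w)) < 1e-8:
        w = w_new
        break
    w = w_new

print(w)  # should end up close to the true coefficients [3, 2]
```

Note that nothing in this procedure looks at the distribution of the residuals; it only minimises the squared-error cost, which is the point of the question below.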

Now, coming to the assumptions of linear regression: nowhere in this whole process have we had to assume anything like normality of the errors, constant variance of the errors, absence of auto-correlation, or independence of the features. I agree that if the response variable is linearly related to the predictor variables the model fit will be better, but what about the other assumptions? My question is: where do these assumptions stem from? What is the basis/justification for making them?
