Regularization
- In simple terms, regularization means adding a penalty term to the cost function to avoid overfitting.
- Regularization is generally used in regression models to avoid overfitting.
- When the cost function of a regression model is minimized without regularization, the model can overfit the data. To avoid this, we add a penalty term to the cost function, which helps reduce overfitting.
- MSE (Mean Squared Error) is a common evaluation metric for regression models.
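As a quick illustration of MSE, here is a minimal sketch, assuming scikit-learn is installed (the data values are made up for the example):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical actual and predicted values for a regression model.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 7.5, 9.0])

# MSE is the mean of the squared residuals (y_true - y_pred)².
mse = mean_squared_error(y_true, y_pred)
print(mse)  # → 0.125
```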
- Ridge and Lasso are two regularization techniques.
Ridge:
- Ridge is also known as L2 regularization. In Ridge the cost function is calculated as
**Cost function = MSE + λ(m)²**
- where MSE = (1/n) Σ (yᵢ − ŷᵢ)², m is the slope, and λ is the regularization parameter, which must be positive.
- So if the slope of the best-fit line is high, the penalty makes the cost function high, and the best-fit line is recomputed to reduce the cost. This shrinks the slope, so overfitting is reduced and we get a more generalized model.
- Note: λ is kept small, because a very large λ shrinks the slope too much and causes underfitting.
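The shrinking effect described above can be seen in code. This is a minimal sketch, assuming scikit-learn is installed; the synthetic data and the choice `alpha=10.0` (scikit-learn's name for λ) are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Toy data: only the first of five features actually drives y.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=50)

ols = LinearRegression().fit(X, y)      # no penalty
ridge = Ridge(alpha=10.0).fit(X, y)     # alpha plays the role of λ

# Ridge shrinks the coefficients (slopes) compared with plain OLS.
print(np.abs(ols.coef_).sum())
print(np.abs(ridge.coef_).sum())
```

Larger `alpha` values shrink the slopes further, which is exactly the underfitting risk the note above warns about.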
Lasso:
- Lasso is also known as L1 regularization. It is used for feature selection and to avoid overfitting.
- The Lasso cost function is
**Cost function = MSE + λ|m|**
- In Lasso we take the absolute value of the slope. If a particular feature does not affect the best-fit line much, i.e. it has a low slope, Lasso can shrink its coefficient all the way to zero, so such a feature is effectively neglected.