Ridge and Lasso Regression

What is Regularization?
In a general sense, regularization means making something regular or acceptable, and that is essentially why the term is used in applied machine learning. Within machine learning, regularization is the process of preventing overfitting by discouraging the model from becoming overly complex or flexible, which in turn regularizes, or shrinks, the coefficients towards zero. The essential idea is to penalize complex models: a complexity term is added to the loss function in such a way that more complex models incur a larger loss.
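The idea of adding a complexity term to the loss can be sketched in a few lines. The function names below (`rss_loss`, `penalized_loss`) are illustrative, not from any library:

```python
import numpy as np

def rss_loss(y_true, y_pred):
    """Residual sum of squares -- the unregularized fit term."""
    return np.sum((y_true - y_pred) ** 2)

def penalized_loss(y_true, y_pred, coefs, lam, norm="l2"):
    """RSS plus a complexity term: larger coefficients mean a larger loss."""
    if norm == "l1":
        penalty = lam * np.sum(np.abs(coefs))   # Lasso-style (L1) penalty
    else:
        penalty = lam * np.sum(coefs ** 2)      # Ridge-style (L2) penalty
    return rss_loss(y_true, y_pred) + penalty
```

With `lam = 0`, this reduces to plain RSS; raising `lam` increasingly punishes large coefficients, which is exactly the shrinkage effect described above.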

Lasso Regression (L1 Regularization)
This technique performs L1 regularization. Like Ridge Regression, it modifies the RSS by adding a penalty (shrinkage quantity), in this case the sum of the absolute values of the coefficients.

Much like Ridge Regression, Lasso (Least Absolute Shrinkage and Selection Operator) penalizes the absolute size of the regression coefficients. In addition, it is quite capable of reducing the variability and improving the accuracy of linear regression models.
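A distinctive effect of the L1 penalty is that it can drive some coefficients exactly to zero, effectively performing feature selection. A minimal sketch using scikit-learn's `Lasso` on a synthetic dataset (the data and the `alpha` value here are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: 10 features, only 3 of which actually carry signal.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

lasso = Lasso(alpha=5.0)
lasso.fit(X, y)

# With only 3 informative features, the L1 penalty shrinks several of the
# remaining coefficients to exactly zero.
print(lasso.coef_)
```

The zeroed coefficients correspond to features the model has dropped entirely, which is why Lasso is often used for sparse models.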

Ridge Regression (L2 Regularization)
This technique performs L2 regularization. The main idea is to modify the RSS by adding a penalty equal to the square of the magnitude of the coefficients. Ridge regression is typically used when the data suffer from multicollinearity (independent variables that are highly correlated). Under multicollinearity, even though the ordinary least squares (OLS) estimates are unbiased, their variances are large, which pushes the observed values away from the true values. By adding a degree of bias to the regression estimates, ridge regression reduces the standard errors. It tends to resolve the multicollinearity problem through the shrinkage parameter λ.
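The effect on multicollinear data can be sketched as follows: with two nearly identical features, the OLS coefficients become unstable and large, while Ridge keeps them small. The data and `alpha` value below are illustrative, not from the original article:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)   # almost a copy of x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.1, size=100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

# OLS splits the true coefficient of 3 unstably between the two collinear
# columns; Ridge's L2 penalty shrinks them toward a stable, shared value.
print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge.coef_)
```

Both fits predict about equally well here; the difference is in how trustworthy the individual coefficient estimates are.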

Now, let’s see whether ridge or lasso regression works better. For ridge regression, we introduce GridSearchCV. This lets us automatically perform 5-fold cross-validation over a range of regularization parameters in order to find the optimal value of alpha. You should see that the optimal value of alpha is 100, with a negative MSE of -29.90570, a small improvement over the basic multiple linear regression.

The code looks like this:
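The original dataset is not shown, so a synthetic one stands in below; the best alpha and score you get will therefore differ from the values quoted above. The search itself follows the description: 5-fold cross-validation over an alpha grid, scored by negative mean squared error:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Stand-in data; replace with your own feature matrix X and target y.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0,
                       random_state=0)

ridge = Ridge()
params = {"alpha": [1e-3, 1e-2, 1e-1, 1, 10, 100, 1000]}

# 5-fold cross-validation over the alpha grid, scored by negative MSE
# (scikit-learn maximizes scores, hence the negated metric).
grid = GridSearchCV(ridge, params,
                    scoring="neg_mean_squared_error", cv=5)
grid.fit(X, y)

print("best alpha:", grid.best_params_["alpha"])
print("best score (neg MSE):", grid.best_score_)
```

`grid.best_estimator_` then holds the refitted Ridge model with the winning alpha, ready for prediction.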