Hyperparameter Tuning in Linear Regression

Last Updated : 23 Jul, 2025

Linear regression is one of the simplest and most widely used algorithms in machine learning. Despite its simplicity, it can be quite powerful, especially when combined with proper hyperparameter tuning. Hyperparameter tuning is the process of choosing the settings that are fixed before training (the hyperparameters, as opposed to the parameters the model learns from data) so that the model achieves the best results.

This article will delve into the intricacies of hyperparameter tuning in linear regression, exploring various techniques and their applications.

Table of Content

Understanding Hyperparameters in Linear Regression
Common Methods of Hyperparameter Tuning

1. Random Search
2. Grid Search

Steps for Hyperparameter Tuning in Linear Regression
Building a Model With Hyperparameter Tuning in Linear Regression

1. Fine Tuning Linear Regression Model Using RandomizedSearchCV
2. Fine Tuning Linear Regression Model Using GridSearchCV

Understanding Hyperparameters in Linear Regression

Linear regression is a statistical method that models the relationship between a dependent variable and one or more independent variables using a linear equation.

In machine learning, hyperparameters are the parameters that are set before the learning process begins. They control the behavior of the training algorithm and the structure of the model. Unlike model parameters, which are learned during training, hyperparameters are specified by the practitioner.

For standard linear regression, there are no hyperparameters to tune. However, when we extend linear regression with regularization techniques such as Ridge Regression, Lasso Regression, and Elastic Net, hyperparameters become crucial. Hyperparameters are not learned by the estimator; instead, they are passed as arguments to the estimator class (for example, Ridge or Lasso in scikit-learn). Tuning them helps to control the learning process. For regularized linear regression, common hyperparameters and tuning-related choices include:

Regularization Strength (Alpha): Controls the trade-off between model complexity and generalizability. Higher alpha values penalize large coefficients more strongly, leading to simpler models that are less prone to overfitting; values that are too high cause underfitting.
Learning Rate (for Gradient Descent-based Optimization): Dictates the step size during optimization, for example in scikit-learn's SGDRegressor. A well-chosen learning rate allows convergence while avoiding overshooting the optimal solution.
Normalization/Standardization: A preprocessing choice rather than a model argument. Scaling features can improve the stability and convergence of gradient descent algorithms, particularly when features have varying scales, and it makes the regularization penalty treat all coefficients comparably.
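To make the effect of the regularization strength concrete, here is a minimal sketch. It assumes synthetic data from make_regression (not the dataset used later in this article), and the alpha values are arbitrary illustrations. It fits Ridge with increasing alpha and prints the size of the coefficient vector, which shrinks as alpha grows.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# Synthetic data: an assumption for illustration, not the article's dataset
X_demo, y_demo = make_regression(
    n_samples=200, n_features=5, noise=10.0, random_state=0)

alphas = [0.01, 1.0, 100.0, 10000.0]  # illustrative values only
coef_norms = {}
for alpha in alphas:
    ridge = Ridge(alpha=alpha).fit(X_demo, y_demo)
    # Euclidean norm of the learned coefficients
    coef_norms[alpha] = float(np.linalg.norm(ridge.coef_))
    print(f"alpha={alpha:>8}: coefficient norm = {coef_norms[alpha]:.3f}")
```

The coefficients (coef_) are the learned parameters, while alpha is the hyperparameter fixed before fitting.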

Common Methods of Hyperparameter Tuning

Two of the most widely used methods for hyperparameter tuning are random search and grid search:

1. Random Search

Random search samples hyperparameter values from a specified distribution or list. It is usually more efficient than grid search because it evaluates only a fixed number of sampled combinations (n_iter) instead of every combination. Usage: Random search is ideal for exploring large or continuous search spaces for combinations that might give a better score. One drawback is that, because it does not try every combination, it can miss the best one, and its results vary with the random seed.

The syntax for Random Search:

from sklearn.model_selection import RandomizedSearchCV
from sklearn.linear_model import Lasso

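# Candidate values for the regularization strength
param_dist = {'alpha': [0.1, 1.0, 10.0, 100.0]}
lasso = Lasso()
# Only 4 candidates exist, so sample 3 of them; a larger n_iter triggers a warning
random_search = RandomizedSearchCV(lasso, param_dist, n_iter=3, cv=5, random_state=0)
# X_train and y_train are the training features and targets
random_search.fit(X_train, y_train)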
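Random search is most useful when values are drawn from a continuous distribution rather than a short list. The following is a minimal sketch under stated assumptions: the data is synthetic (make_regression), and the log-uniform range of 0.001 to 100 and n_iter=20 are arbitrary choices for illustration.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic data: an assumption for illustration only
X_syn, y_syn = make_regression(
    n_samples=300, n_features=10, n_informative=4, noise=15.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=0)

# Sample alpha on a log scale between 1e-3 and 1e2
param_dist = {"alpha": loguniform(1e-3, 1e2)}

search = RandomizedSearchCV(
    Lasso(max_iter=10000), param_dist,
    n_iter=20, cv=5, scoring="r2", random_state=0)
search.fit(X_tr, y_tr)

print("Best alpha:", search.best_params_["alpha"])
print("Mean cross-validated R2:", search.best_score_)
print("Test-set R2:", search.score(X_te, y_te))
```

Setting random_state makes the sampled candidates reproducible between runs.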

2. Grid Search

Grid search is an exhaustive search method in which we specify a set of candidate values for each hyperparameter and try every possible combination. It is simple and explores the specified grid thoroughly, but it becomes computationally expensive as the number of hyperparameters and candidate values grows. The syntax for grid search is:

from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import Ridge

param_grid = {'alpha': [0.1, 1.0, 10.0, 100.0]}
ridge = Ridge()
grid_search = GridSearchCV(ridge, param_grid, cv=5)
grid_search.fit(X_train, y_train)
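The cost of grid search can be estimated before running it. As a small sketch, the snippet below uses scikit-learn's ParameterGrid to count the candidates in a grid; the extra fit_intercept entry is a hypothetical addition to show how the count multiplies.

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical grid: 4 alpha values x 2 fit_intercept values
grid = ParameterGrid({
    "alpha": [0.1, 1.0, 10.0, 100.0],
    "fit_intercept": [True, False],
})

n_candidates = len(grid)   # number of hyperparameter combinations
cv_folds = 5
total_fits = n_candidates * cv_folds  # model fits during cross-validation

print(f"{n_candidates} candidates, {total_fits} fits with {cv_folds}-fold CV")
```

With the default refit=True, GridSearchCV performs one additional fit of the best candidate on the full training set.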

More advanced methods include Bayesian Optimization and Gradient-Based Optimization, which can offer more efficient searches.

Bayesian Optimization: Bayesian optimization builds a probabilistic model of the objective function and uses it to select the most promising hyperparameters to evaluate next. It often needs fewer evaluations than grid or random search, which helps when each model fit is expensive.
Gradient-Based Optimization: Gradient-based optimization methods, such as gradient descent, can also be used for hyperparameter tuning, especially in neural networks. However, they are less common for linear regression.

Steps for Hyperparameter Tuning in Linear Regression

To search for the best combination of hyperparameters, follow these steps:

Initialize an estimator using a linear regression model.
Specify a parameter space based on the hyperparameter values that can be adjusted for that model.
Choose a hyperparameter tuning method such as GridSearchCV or RandomizedSearchCV to search for the best combination of hyperparameters.
Provide the linear regression model, the parameter space, and a cross-validation scheme to the hyperparameter tuning method and train the model.
Check the best hyperparameters and the best score.
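The steps above can be sketched as follows. This is one possible arrangement under stated assumptions, not the only way to do it: the data is synthetic, Ridge stands in for the linear model, and the step names "scale" and "ridge" and the alpha grid are illustrative choices.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: an assumption for illustration only
X_syn, y_syn = make_regression(
    n_samples=300, n_features=8, noise=20.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=0)

# Step 1: initialize the estimator (scaling happens inside each CV fold)
pipe = Pipeline([("scale", StandardScaler()), ("ridge", Ridge())])

# Step 2: parameter space; "ridge__alpha" targets the alpha of the ridge step
param_space = {"ridge__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

# Steps 3 and 4: choose the tuning method, pass model, space and CV scheme, then fit
tuner = GridSearchCV(pipe, param_space, cv=5, scoring="r2")
tuner.fit(X_tr, y_tr)

# Step 5: inspect the best hyperparameters and the best cross-validated score
print("Best hyperparameters:", tuner.best_params_)
print("Best mean CV R2:", tuner.best_score_)
```

Placing the scaler inside the pipeline keeps the validation folds from leaking into the scaling statistics.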

Building a Model With Hyperparameter Tuning in Linear Regression

Let's take a simple cars dataset (its URL appears in the code below), build a linear regression model for it, and check the R² score before and after hyperparameter tuning.

Python

import pandas as pd 
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score 
from sklearn.model_selection import train_test_split

dataset_url = ("https://media.geeksforgeeks.org/"
               "wp-content/uploads/20240617221743/cars.csv")
df_cars = pd.read_csv(dataset_url)

# Independent variables
X = df_cars[["Year", "Kilometers_Driven", "Mileage", 
             "Engine", "Power", "Seats"]]


# Dependent variable
Y = df_cars["Price"]

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0)

model = LinearRegression()
model.fit(X_train, y_train)


predictions = model.predict(X_test)

# R² score of the model on the test set, as a percentage
print(r2_score(y_test, predictions)*100)

Output:

37.0619194663312

The model's R² score is only about 37% (0.37), which is quite low, so we will now try hyperparameter tuning and check whether we get better results.

For hyperparameter tuning, we first need to know which parameters the model accepts. The following command returns a dictionary whose keys are the parameter names and whose values are the current settings of those parameters.

Python

model.get_params()

Output:

{'copy_X': True, 'fit_intercept': True, 'n_jobs': None, 'positive': False}

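Using the above output, we can create a parameter space that lists several candidate values for each parameter, depending on the type of value it accepts. Note that copy_X and n_jobs only affect memory handling and parallelism, so only fit_intercept and positive can change the fitted model. The parameter space is given below: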

Python

param_space = {'copy_X': [True,False], 
               'fit_intercept': [True,False], 
               'n_jobs': [1,5,10,15,None], 
               'positive': [True,False]}
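As a quick check of which of these settings can change the fit, here is a minimal sketch on synthetic data (an assumption for illustration; the bias of 50 is an arbitrary choice that makes the intercept matter). It compares a default model with one that only changes copy_X and n_jobs, and with one that disables the intercept.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Synthetic data with a non-zero intercept: an assumption for illustration
X_demo, y_demo = make_regression(
    n_samples=200, n_features=6, noise=5.0, bias=50.0, random_state=0)

baseline = LinearRegression().fit(X_demo, y_demo)

# copy_X=False may overwrite its input, so pass a copy of the data
variant = LinearRegression(copy_X=False, n_jobs=5).fit(X_demo.copy(), y_demo)

# Same coefficients and intercept: these two settings do not alter the fit
same_fit = bool(np.allclose(baseline.coef_, variant.coef_)
                and np.isclose(baseline.intercept_, variant.intercept_))
print("copy_X / n_jobs leave the fit unchanged:", same_fit)

# fit_intercept=False forces the line through the origin, which changes the fit
no_intercept = LinearRegression(fit_intercept=False).fit(X_demo, y_demo)
print("Training R2 with intercept:   ", baseline.score(X_demo, y_demo))
print("Training R2 without intercept:", no_intercept.score(X_demo, y_demo))
```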

Now we will perform random search or grid search to find the best combination of hyperparameter values.

1. Fine Tuning Linear Regression Model Using RandomizedSearchCV

Let's make use of Scikit-Learn's RandomizedSearchCV to search for the best combination of hyperparameter values.

Python

from sklearn.model_selection import RandomizedSearchCV
import pandas as pd 
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score 
from sklearn.model_selection import train_test_split

dataset_url = ("https://media.geeksforgeeks.org/"
               "wp-content/uploads/20240617221743/cars.csv")
df_cars = pd.read_csv(dataset_url)

# Independent variables
X = df_cars[["Year", "Kilometers_Driven", "Mileage", 
             "Engine", "Power", "Seats"]]

# Dependent variable
Y = df_cars["Price"]

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0)

model = LinearRegression()

param_space = {'copy_X': [True,False], 
               'fit_intercept': [True,False], 
               'n_jobs': [1,5,10,15,None], 
               'positive': [True,False]}

# The space has only 40 combinations, so n_iter=100 is capped at 40 (with a warning)
random_search = RandomizedSearchCV(model, param_space, n_iter=100, cv=5)
random_search.fit(X_train, y_train)

# Parameter which gives the best results
print(f"Best Hyperparameters: {random_search.best_params_}")

# Mean cross-validated R² score of the best parameter combination
print(f"Best Score: {random_search.best_score_}")

Output:

Best Hyperparameters: {'positive': False, 'n_jobs': 1, 'fit_intercept': True, 'copy_X': True} 
Best Score: 0.7510045389329963

2. Fine Tuning Linear Regression Model Using GridSearchCV

Let's make use of Scikit-Learn's GridSearchCV to search for the best combination of hyperparameter values.

Python

from sklearn.model_selection import GridSearchCV
import pandas as pd 
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score 
from sklearn.model_selection import train_test_split

dataset_url = ("https://media.geeksforgeeks.org/"
               "wp-content/uploads/20240617221743/cars.csv")
df_cars = pd.read_csv(dataset_url)

# Independent variables
X = df_cars[["Year", "Kilometers_Driven", "Mileage", 
             "Engine", "Power", "Seats"]]

# Dependent variable
Y = df_cars["Price"]

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0)

model = LinearRegression()


param_space = {'copy_X': [True,False], 'fit_intercept': [True,False], 
               'n_jobs': [1,5,10,15,None], 'positive': [True,False]}
grid_search = GridSearchCV(model, param_space, cv=5)

grid_search.fit(X_train, y_train)

# Parameter which gives the best results
print(f"Best Hyperparameters: {grid_search.best_params_}")

# Mean cross-validated R² score of the best parameter combination
print(f"Best Score: {grid_search.best_score_}")

Output:

Best Hyperparameters: {'copy_X': True, 'fit_intercept': True, 'n_jobs': 1, 'positive': False} 
Best Score: 0.7510045389329963

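In this way, RandomizedSearchCV and GridSearchCV can be used to search for the best hyperparameters of a linear regression model. Note, however, that the best score of about 0.751 is the mean cross-validated R² on the training folds, so it is not directly comparable with the 37.06 (R² × 100) measured on the held-out test set earlier. The selected hyperparameters also match the LinearRegression defaults in every setting that affects the fit, so the tuned model makes the same predictions as the baseline. A real gain requires tuning hyperparameters that change the model, such as the alpha of Ridge, Lasso, or ElasticNet.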
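For a like-for-like comparison, the baseline and the tuned model should be scored on the same held-out test set. The following is a minimal sketch under stated assumptions: it uses synthetic data rather than the cars dataset, tunes the alpha of Ridge over an illustrative grid, and makes no claim about which model scores higher on any particular dataset.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data with many features relative to samples (illustration only)
X_syn, y_syn = make_regression(
    n_samples=120, n_features=40, n_informative=10, noise=25.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.25, random_state=0)

# Baseline: plain linear regression scored on the test set
baseline = LinearRegression().fit(X_tr, y_tr)
baseline_r2 = r2_score(y_te, baseline.predict(X_te))

# Tuned model: cross-validated search over alpha, using the training set only
search = GridSearchCV(
    Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}, cv=5, scoring="r2")
search.fit(X_tr, y_tr)

# best_estimator_ is refit on the full training set; score it on the same test set
tuned_r2 = r2_score(y_te, search.best_estimator_.predict(X_te))

print("Best alpha:", search.best_params_["alpha"])
print("Baseline test R2:", baseline_r2)
print("Tuned test R2:   ", tuned_r2)
```

Here best_score_ would still be the cross-validated estimate, while the two printed test scores are directly comparable with each other.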

Conclusion

Hyperparameter tuning is a vital step in optimizing linear regression models. Techniques such as grid search, random search, and Bayesian optimization can help find the best hyperparameters to improve model performance. Regularization methods like Ridge, Lasso, and ElasticNet are crucial for controlling model complexity and preventing overfitting. By carefully tuning hyperparameters and evaluating the model, one can achieve better predictive performance and more robust models.

shraman08
