Joint Feature Selection with multi-task Lasso in Scikit Learn
This article introduces the concepts of Lasso and multi-task Lasso regression and demonstrates how to implement these methods in Python using the scikit-learn library.
It covers the differences between Lasso and multi-task Lasso, offers guidance on which method to prefer in different situations, shows how to implement both with scikit-learn, and finally demonstrates how to perform joint feature selection with multi-task Lasso.
What is Joint feature selection?
Joint feature selection is a method for selecting a subset of features to use as input to a machine learning model. The goal of joint feature selection is to select a set of features that are relevant to the prediction task and that work well together to improve the performance of the model.
Joint feature selection can be an important step in the machine learning process, as it can help to improve the performance of the model by selecting the most relevant and informative features from the dataset. It can also help to reduce the complexity of the model and make it more interpretable by eliminating unnecessary or redundant features.
What is Lasso?
Lasso is a type of linear regression that uses L1 regularization, which is a method for reducing overfitting and improving the generalization of the model by adding a penalty term to the objective function. L1 regularization involves adding a term to the objective function that is proportional to the absolute value of the coefficients of the features.
Lasso regression is particularly useful for feature selection, as the L1 regularization term encourages the coefficients of the less important features to be reduced to zero, effectively eliminating those features from the model. This results in sparse solutions, meaning that the final model will only include the most important features. Lasso regression can be used to improve the performance of a linear regression model by reducing overfitting and increasing the generalization of the model.
In general, Lasso regression is a type of shrinkage method that can be used to improve the prediction performance of a linear regression model. It is particularly well-suited for situations where the number of features is larger than the number of observations, or when you want to select a subset of the most important features from a larger dataset.
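As a quick illustration (a minimal sketch on synthetic data, separate from the walkthrough later in this article), you can see the sparsity Lasso induces by fitting it to a dataset where only a few features are informative:
Python3
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: only 3 of 10 features carry signal
X, y = make_regression(n_samples=100, n_features=10,
                       n_informative=3, noise=5.0,
                       random_state=0)

lasso = Lasso(alpha=1.0).fit(X, y)

# The L1 penalty drives many coefficients exactly to zero
print("Coefficients:", np.round(lasso.coef_, 2))
print("Kept features:", np.flatnonzero(lasso.coef_))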
What is Multi-task Lasso?
Multitask Lasso is a variant of Lasso regression that is designed to handle multiple tasks simultaneously. In multitask Lasso regression, the model is trained to predict multiple target variables at once, rather than just one. The regularization term in multitask Lasso regression is shared across all tasks, which means that the model can learn relationships between the tasks and use that information to improve the prediction performance.
Multitask Lasso regression is often used in multi-output or multi-task learning scenarios, where the goal is to predict multiple related target variables at once. It can be useful for situations where there are multiple tasks that are related and where it is possible to learn relationships between the tasks that can improve the prediction performance.
Like Lasso regression, multi-task Lasso uses sparsity-inducing regularization to reduce overfitting and improve the generalization of the model. However, instead of applying an L1 penalty to each task separately, it applies a mixed L1/L2 penalty to the coefficient matrix: an L2 norm is taken over each feature's coefficients across all tasks, and these norms are summed (L1) over features. This drives entire rows of the coefficient matrix to zero, so the same subset of features is selected for every task, which lets the model exploit relationships between the tasks to improve prediction performance.
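A minimal sketch on synthetic data (again separate from the walkthrough below, with arbitrary sizes and an illustrative alpha) makes the joint selection visible: with MultiTaskLasso, whole features are zeroed for all tasks at once:
Python3
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.RandomState(0)
n_samples, n_features, n_tasks = 100, 8, 3

# Only the first 3 features carry signal, shared by all tasks
W = np.zeros((n_features, n_tasks))
W[:3] = rng.randn(3, n_tasks)
X = rng.randn(n_samples, n_features)
Y = X @ W + 0.1 * rng.randn(n_samples, n_tasks)

model = MultiTaskLasso(alpha=0.5).fit(X, Y)

# coef_ has shape (n_tasks, n_features); the same
# features (columns) are zeroed for every task
print(np.round(model.coef_, 2))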
Difference between Lasso and multi-task Lasso using a toy dataset from scikit-learn
To compare the two approaches, we use the MultiTaskLasso model with the load_diabetes dataset from sklearn.datasets and perform feature selection using both recursive feature elimination (RFE) and Lasso regularization. First, import the required libraries.
Python3
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import RFE
from sklearn.linear_model import MultiTaskLasso, Lasso
from sklearn.model_selection import train_test_split
Load the load_diabetes dataset and split it into training and test sets.
Python3
# Load the diabetes dataset
X, y = load_diabetes(return_X_y=True)
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
Reshape the target into a two-dimensional array, since MultiTaskLasso expects one column per task, and create a MultiTaskLasso model with an alpha value of 0.1.
Python3
# Reshape the targets to shape (n_samples, n_tasks):
# MultiTaskLasso expects one column per task
y_train = y_train[:, np.newaxis]
y_test = y_test[:, np.newaxis]

# Create a multi-task Lasso
# model with an alpha value of 0.1
model = MultiTaskLasso(alpha=0.1)
The MultiTaskLasso model is created with an alpha value of 0.1, which is a hyperparameter that controls the strength of the Lasso regularization term in the optimization objective. A smaller alpha value means that the model is more likely to select more features, while a larger alpha value means that the model is more likely to select fewer features.
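To see this effect, here is a quick optional sketch (reusing the training split created above); the exact counts are data-dependent and only meant to show the trend:
Python3
# Optional: see how alpha controls sparsity
# (counts are illustrative and data-dependent)
for alpha in [0.01, 0.1, 1.0]:
    m = MultiTaskLasso(alpha=alpha).fit(X_train, y_train)
    print("alpha =", alpha, "->",
          np.count_nonzero(m.coef_), "non-zero coefficients")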
Create an RFE object with the model and the desired number of features to select and fit the RFE object to the training data.
Python3
# Create an RFE object with the multi-task
# Lasso model and the desired number of
# features to select
rfe = RFE(model, n_features_to_select=3)
# Fit the RFE object to the training data
rfe.fit(X_train, y_train)
The RFE object is fit to the training data using the fit method. This will train the MultiTaskLasso model on the training data and perform recursive feature elimination to select the 3 most important features.
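If you want to inspect the elimination result directly, the fitted RFE object exposes a boolean support mask and a per-feature ranking (an optional check, not needed for the rest of the walkthrough):
Python3
# Optional: inspect the RFE result.
# support_ is True for the selected features;
# ranking_ is 1 for selected features, with higher
# numbers for features eliminated earlier
print("Support mask:", rfe.support_)
print("Feature ranking:", rfe.ranking_)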
Use the fitted RFE object to make predictions on the test set and get the indices of the features it selected. Then fit a separate Lasso model to the same training data.
Python3
# Use the multi-task Lasso model
# to make predictions on the test set
y_pred = rfe.predict(X_test)
# Get the indices of the selected features using RFE
selected_features_MultiTaskLasso = rfe.get_support(indices=True)
# Fit the Lasso model to the training data
model = Lasso(alpha=0.1)
model.fit(X_train, y_train)
A Lasso model is created with an alpha value of 0.1 and fit to the training data using the fit method. This trains the Lasso model and performs feature selection through the L1 penalty, which drives the coefficients of unimportant features to zero.
Get the indices of the selected features using Lasso regularization and print them.
Python3
# Get the indices of the selected
# features using Lasso regularization
selected_features_lasso = np.flatnonzero(model.coef_)
print("Selected features using MultiTaskLasso:",
selected_features_MultiTaskLasso)
print("Selected features using Lasso:",
selected_features_lasso)
Output:
Selected features using MultiTaskLasso: [2 3 8]
Selected features using Lasso: [1 2 3 4 6 8 9]
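These indices can be mapped back to the diabetes feature names (an optional lookup; load_diabetes exposes them via its feature_names attribute):
Python3
# Optional: map the selected indices to the
# diabetes feature names
feature_names = np.array(load_diabetes().feature_names)
print("MultiTaskLasso picks:",
      feature_names[selected_features_MultiTaskLasso])
print("Lasso picks:",
      feature_names[selected_features_lasso])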
Finally, compare the mean squared error (MSE) of the MultiTaskLasso and Lasso models on the test set.
Python3
# Calculate the MSE of the MultiTaskLasso model
mse_MultiTaskLasso = np.mean((y_test - y_pred)**2)
# Calculate the MSE of the Lasso model
y_pred_lasso = model.predict(X_test)
mse_lasso = np.mean((y_test - y_pred_lasso)**2)
# Print the MSE of the two models
print("MSE of MultiTaskLasso:", mse_MultiTaskLasso)
print("MSE of Lasso:", mse_lasso)
Output:
MSE of MultiTaskLasso: 2880.345311787223
MSE of Lasso: 7894.624137792652
The MultiTaskLasso model's predictions on the test set (y_pred) are used to calculate its MSE using the np.mean function and the squared error between the predictions and the true values (y_test). The Lasso model's predictions on the test set (y_pred_lasso) are similarly used to calculate its MSE. Finally, the MSE of the two models is printed using the print function.
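Equivalently, the same values can be computed with scikit-learn's built-in metric:
Python3
from sklearn.metrics import mean_squared_error

# Same computation using scikit-learn's metric
print("MSE of MultiTaskLasso:",
      mean_squared_error(y_test, y_pred))
print("MSE of Lasso:",
      mean_squared_error(y_test, y_pred_lasso))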
To visualize the MSE of the MultiTaskLasso and Lasso models on the test set in a graph, you can use the Matplotlib library:
Python3
import matplotlib.pyplot as plt
# Create a bar plot comparing the MSE of the two models
plt.bar(["MultiTaskLasso", "Lasso"],
[mse_MultiTaskLasso, mse_lasso])
plt.ylabel("MSE")
plt.show()
Output:
(Bar plot comparing the test-set MSE of the MultiTaskLasso and Lasso models)
The bar function from matplotlib.pyplot is used to create a bar plot with the names of the two models on the x-axis and their MSE on the y-axis. The ylabel function is used to add a label to the y-axis, and the show function is used to display the plot.
This will create a simple bar plot comparing the MSE of the MultiTaskLasso and Lasso models. You can customize the appearance of the plot by using additional functions from matplotlib.pyplot, such as title to add a title to the plot, xlabel to add a label to the x-axis, or ylim to set the limits of the y-axis.
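For example, a slightly more polished version of the same plot might look like this (the title, labels, and axis limit are just illustrative choices):
Python3
# A customized version of the same comparison plot
plt.bar(["MultiTaskLasso", "Lasso"],
        [mse_MultiTaskLasso, mse_lasso])
plt.title("Test-set MSE: MultiTaskLasso vs Lasso")
plt.xlabel("Model")
plt.ylabel("MSE")
plt.ylim(0, max(mse_MultiTaskLasso, mse_lasso) * 1.1)
plt.show()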