Machine Learning Python

This document summarizes four Python libraries that can be used for interpretable machine learning: Yellowbrick, ELI5, LIME, and MLxtend. Yellowbrick provides visualizations for model selection, feature importances, and model performance. ELI5 allows users to inspect feature importances and reasons for individual predictions. LIME explains predictions from classifiers using local approximations. MLxtend contains tools for stacking/voting classifiers, evaluation, and plotting decision boundaries of constituent classifiers.


8/7/2019 Python Libraries for Interpretable Machine Learning

Python Libraries for Interpretable Machine Learning
4 libraries for better visualisation, explanation and interpretation of models

Rebecca Vickery
Aug 7 · 5 min read

As concerns about bias in artificial intelligence become more prominent, it is becoming increasingly important for businesses to be able to explain both the predictions their models produce and how the models themselves work. Fortunately, a growing number of Python libraries attempt to solve this problem. In this post, I give a brief guide to four of the most established packages for interpreting and explaining machine learning models.

https://towardsdatascience.com/python-libraries-for-interpretable-machine-learning-c476a08ed2c7

The following libraries are all pip installable, come with good documentation and have
an emphasis on visual interpretation.

yellowbrick
This library is essentially an extension of the scikit-learn library and provides some
really useful and pretty looking visualisations for machine learning models. The
visualiser objects, the core interface, are scikit-learn estimators and so if you are used
to working with scikit-learn the workflow should be quite familiar.

The visualisations that can be rendered cover model selection, feature importances and
model performance analysis.

Let’s walk through a few brief examples.

The library can be installed via pip.

pip install yellowbrick

To illustrate a few features I am going to use a scikit-learn dataset called the wine
recognition set. This dataset has 13 features and 3 target classes and can be loaded
directly from the scikit-learn library. In the code below I import the dataset and
convert it to a data frame. The data can be used in a classifier without any
additional preprocessing.

import pandas as pd
from sklearn import datasets

wine_data = datasets.load_wine()
df_wine = pd.DataFrame(wine_data.data, columns=wine_data.feature_names)
df_wine['target'] = pd.Series(wine_data.target)

I am also using scikit-learn to further split the data set into test and train.


from sklearn.model_selection import train_test_split

X = df_wine.drop(['target'], axis=1)
y = df_wine['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
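
To make the effect of test_size=0.2 concrete, here is a minimal NumPy sketch of a shuffled hold-out split. It is an illustration only, not scikit-learn's implementation; the helper name shuffled_split and the fixed seed are assumptions made for this example.

```python
import math

import numpy as np


def shuffled_split(n_samples, test_size=0.2, seed=0):
    """Return (train_idx, test_idx) for a shuffled hold-out split.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)       # shuffle the row indices
    n_test = math.ceil(test_size * n_samples)  # size of the held-out set
    return indices[n_test:], indices[:n_test]


# The wine dataset has 178 rows.
train_idx, test_idx = shuffled_split(178, test_size=0.2)
print(len(train_idx), len(test_idx))
```

In the real workflow, passing random_state to train_test_split makes the split reproducible in the same way the fixed seed does here.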

Next, let’s use Yellowbrick’s Rank2D visualiser to view correlations between features in the
data set.

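from yellowbrick.features import Rank2D
import matplotlib.pyplot as plt

# Rank every pair of features by Pearson correlation
visualizer = Rank2D(algorithm="pearson", size=(1080, 720))
visualizer.fit_transform(X_train)
# show() replaces poof(), which was deprecated in Yellowbrick 1.0
visualizer.show()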
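
As a rough illustration of the quantity behind a Pearson ranking, the sketch below computes a pairwise Pearson correlation matrix with NumPy. The synthetic columns a, b and c are assumptions made for the example and have nothing to do with the wine data.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 2.0 * a + rng.normal(scale=0.1, size=200)  # built to correlate strongly with a
c = rng.normal(size=200)                       # generated independently of a and b
X_demo = np.column_stack([a, b, c])

# rowvar=False treats each column as one variable (feature)
corr = np.corrcoef(X_demo, rowvar=False)
print(np.round(corr, 2))
```

Each cell of the resulting matrix corresponds to one pair of features, which is the kind of value a pairwise correlation heat map colours.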


Let’s now fit a RandomForestClassifier and evaluate the performance with another
visualiser.

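from yellowbrick.classifier import ClassificationReport
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()
visualizer = ClassificationReport(model, size=(1080, 720))

visualizer.fit(X_train, y_train)    # fit the wrapped model on the training data
visualizer.score(X_test, y_test)    # evaluate on the held-out test data
# show() replaces poof(), which was deprecated in Yellowbrick 1.0
visualizer.show()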
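
A classification report summarises per-class precision, recall and F1 score. The following sketch computes those three metrics by hand with NumPy, to show what the numbers mean. The function name per_class_report and the toy labels are assumptions made for this example.

```python
import numpy as np


def per_class_report(y_true, y_pred, n_classes):
    """Compute precision, recall and F1 for each class (illustrative helper)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    report = {}
    for c in range(n_classes):
        tp = int(np.sum((y_pred == c) & (y_true == c)))  # true positives
        fp = int(np.sum((y_pred == c) & (y_true != c)))  # false positives
        fn = int(np.sum((y_pred != c) & (y_true == c)))  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Toy labels, not taken from the wine data
y_true_demo = [0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 1, 1, 1, 2, 0]
report_demo = per_class_report(y_true_demo, y_pred_demo, n_classes=3)
for label, scores in report_demo.items():
    print(label, scores)
```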

ELI5
ELI5 is another visualisation library that is useful for debugging machine learning
models and explaining the predictions they have produced. It works with the most
common python machine learning libraries including scikit-learn, XGBoost and Keras.

Let’s use ELI5 to inspect the feature importances for the model we trained above.


import eli5

eli5.show_weights(model, feature_names = X.columns.tolist())

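For the random forest trained above, the weights shown are the model's impurity-based
feature importances. For gradient-boosted models such as XGBoost, show_weights uses gain
by default, and you can specify other types by adding the importance_type argument.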

You can also use show_prediction to inspect the reasons for individual predictions.

from eli5 import show_prediction

show_prediction(model, X_train.iloc[1],
                feature_names=X.columns.tolist(),
                show_feature_values=True)
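
For tree-based models, per-prediction explanations of this kind are commonly computed by following the decision path: each split moves the expected score from the parent node to a child node, and that change is credited to the feature used in the split. The sketch below illustrates the idea on a hypothetical hand-built tree. The tree, its thresholds and values, and the function explain_path are all assumptions made for this example, and this is not ELI5's own code.

```python
# Hypothetical tree: every node stores the mean score ("value") of the
# training samples that reach it; internal nodes also store a split.
tree = {
    "feature": "proline", "threshold": 750.0, "value": 0.40,
    "left": {
        "feature": "color_intensity", "threshold": 3.5, "value": 0.10,
        "left": {"value": 0.00},
        "right": {"value": 0.30},
    },
    "right": {"value": 0.90},
}


def explain_path(node, sample):
    """Walk the decision path and credit each score change to the split feature."""
    contributions = {"<BIAS>": node["value"]}  # root value acts as the baseline
    while "feature" in node:
        name = node["feature"]
        child = node["left"] if sample[name] <= node["threshold"] else node["right"]
        contributions[name] = contributions.get(name, 0.0) + child["value"] - node["value"]
        node = child
    return node["value"], contributions


prediction, contributions = explain_path(
    tree, {"proline": 600.0, "color_intensity": 4.0}
)
print(prediction, contributions)
```

By construction the baseline plus the per-feature contributions add up to the leaf value, which is what makes this style of explanation easy to read.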

LIME
LIME (local interpretable model-agnostic explanations) is a package for explaining the
predictions made by machine learning algorithms. Lime supports explanations for
individual predictions from a wide range of classifiers, and support for scikit-learn is
built in.

Let’s use Lime to interpret some predictions from the model we trained earlier.


Lime can be installed via pip.

pip install lime

First, we build the explainer. This takes a training dataset as an array, the names of the
features used in the model and the names of the classes in the target variable.

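import lime.lime_tabular

# class_names should follow the column order of predict_proba (sorted class labels);
# unique() alone returns labels in order of appearance
explainer = lime.lime_tabular.LimeTabularExplainer(
    X_train.values,
    feature_names=X_train.columns.values.tolist(),
    class_names=sorted(y_train.unique()))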

Next, we create a lambda function that uses the model to predict on a sample of the
data. This is borrowed from this excellent, more in-depth, tutorial on Lime.

predict_fn = lambda x: model.predict_proba(x).astype(float)

We then use the explainer to explain the prediction on a selected example. The result is
shown below. Lime produces a visualisation showing how the features have
contributed to this particular prediction.

exp = explainer.explain_instance(X_test.values[0], predict_fn, num_features=6)
exp.show_in_notebook(show_all=False)
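
The core idea behind a local explanation of this kind can be sketched in a few lines of NumPy: perturb the instance, query the black-box model on the perturbed points, weight each point by its proximity to the instance, and fit a weighted linear model. This is a simplified illustration under stated assumptions, not the lime package's implementation, which differs in many details (for example in how it samples and selects features). The function local_linear_explanation, the exponential kernel, its width and the toy black_box model are all assumptions made for this example.

```python
import numpy as np


def local_linear_explanation(predict_fn, instance, n_samples=500,
                             scale=1.0, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate around one instance."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance with Gaussian noise
    X = instance + rng.normal(0.0, scale, size=(n_samples, instance.size))
    # 2. Query the black-box model on the perturbed points
    y = predict_fn(X)
    # 3. Weight each point by its closeness to the instance
    distances = np.linalg.norm(X - instance, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # 4. Weighted least squares on features centred at the instance
    A = np.hstack([np.ones((n_samples, 1)), X - instance])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]  # local intercept, per-feature slopes


def black_box(X):
    """Toy stand-in for model.predict_proba on a single class."""
    return 2.0 * X[:, 0] - 3.0 * X[:, 1] + 1.0


instance = np.array([1.0, 2.0])
intercept, slopes = local_linear_explanation(black_box, instance)
print(intercept, slopes)
```

Because the toy model is itself linear, the surrogate recovers its slopes; for a real classifier the slopes only describe behaviour near the chosen instance.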

MLxtend
This library contains a host of helper functions for machine learning. This covers things like stacking and voting classifiers, model evaluation, feature extraction and engineering, and plotting. In addition to the documentation, this paper is a good resource for a more detailed understanding of the package.

Let’s use MLxtend to compare the decision boundaries for a voting classifier against its
constituent classifiers.

Again it can be installed via pip.

pip install mlxtend

The imports I am using are shown below.

from mlxtend.plotting import plot_decision_regions

from mlxtend.classifier import EnsembleVoteClassifier
import matplotlib.gridspec as gridspec
import itertools

from sklearn import model_selection

from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier

The following visualisation only works with two features at a time so we will first create
an array containing the features proline and color_intensity. I have chosen these as
they had the highest weighting of all the features we inspected earlier using ELI5.

X_train_ml = X_train[['proline', 'color_intensity']].values

y_train_ml = y_train.values

Next, we create the classifiers, fit them to the training data and visualise the decision
boundaries using MLxtend. The output is shown below the code.

clf1 = LogisticRegression(random_state=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = GaussianNB()
eclf = EnsembleVoteClassifier(clfs=[clf1, clf2, clf3], weights=[1, 1, 1])

gs = gridspec.GridSpec(2, 2)
fig = plt.figure(figsize=(10, 8))

labels = ['Logistic Regression', 'Random Forest', 'Naive Bayes', 'Ensemble']

# Fit each classifier and draw its decision regions in one cell of the 2x2 grid
for clf, lab, grd in zip([clf1, clf2, clf3, eclf],
                         labels,
                         itertools.product([0, 1], repeat=2)):
    clf.fit(X_train_ml, y_train_ml)
    ax = plt.subplot(gs[grd[0], grd[1]])
    fig = plot_decision_regions(X=X_train_ml, y=y_train_ml, clf=clf)
    plt.title(lab)

plt.show()
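
To show how an ensemble combines its constituent classifiers, here is a minimal NumPy sketch of weighted hard (majority) voting. It assumes hard voting and is not MLxtend's implementation; the function weighted_hard_vote and the toy predictions are assumptions made for this example.

```python
import numpy as np


def weighted_hard_vote(predictions, weights, n_classes):
    """Combine class predictions from several classifiers by weighted majority.

    predictions has shape (n_classifiers, n_samples) and holds integer labels.
    """
    predictions = np.asarray(predictions)
    n_samples = predictions.shape[1]
    votes = np.zeros((n_samples, n_classes))
    for preds, weight in zip(predictions, weights):
        # Each classifier adds its weight to the class it predicted
        votes[np.arange(n_samples), preds] += weight
    return votes.argmax(axis=1)  # class with the largest total weight


# Toy predictions from three classifiers on three samples
toy_predictions = [
    [0, 1, 2],
    [0, 2, 2],
    [1, 1, 0],
]
ensemble_labels = weighted_hard_vote(toy_predictions, weights=[1, 1, 1], n_classes=3)
print(ensemble_labels)
```

With equal weights, as in the example above, each sample simply receives the label that most classifiers agree on.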

This is by no means an exhaustive list of libraries for interpreting, visualising and explaining machine learning models. This excellent post contains a long list of other useful libraries to try out.

Thanks for reading!
