Graph Neural Network in practice
Céline Brouard and Nathalie Vialaneix, INRAE/MIAT
WG GNN, December 17th, 2020

GNN in practice
Message passing principle and exploration of two libraries
Overview of GNN
The last layer is fed to a standard MLP for prediction (at the graph level).
Message passing layers

Message passing (MP) layers are the generalization of convolutional layers to graph data. The general concept was introduced in [Gilmer et al. 2017] (a general framework encompassing several previous GNNs).

More formally, if $G = (X, E)$ is a graph with $n$ nodes:
nodes $x \in X$ and edges $e \in E$ are given;
node features $l_x$ (for $x \in X$) and edge features $l_e$ (for $e \in E$) are associated.

The representation $h_x \in \mathbb{R}^K$ of node $x$ is learned iteratively (layers $t = 1, \dots, T$):
$h_x^{t+1} = F\big(h_x^t,\ \square_{y \in \mathcal{N}(x)}\, \phi_t(h_x^t, h_y^t, e_{xy})\big)$
with $\square$ a differentiable permutation-invariant function (mean, sum, max...).

Note: actually, [Gilmer et al. 2017] use $\square = \sum$ and layer-dependent functions $F_t$ (but no example).
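To make the formula concrete, here is a toy numpy sketch (not from the talk) of one message-passing step with $\square = \sum$ and a linear $\phi_t$; the weight matrices and the graph are purely illustrative:

import numpy as np

def mp_step(h, edges, W_self, W_msg):
    """One toy MP step: h'_x = relu(W_self h_x + sum_{y in N(x)} W_msg h_y)."""
    agg = np.zeros((h.shape[0], W_msg.shape[0]))
    for x, y in edges:                  # aggregate messages from neighbors (square = sum)
        agg[x] += W_msg @ h[y]
    return np.maximum(h @ W_self.T + agg, 0.0)   # F: linear update followed by relu

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 3))             # n = 4 nodes, K(t) = 3 features
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
h_next = mp_step(h, edges, rng.normal(size=(2, 3)), rng.normal(size=(2, 3)))  # K(t+1) = 2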
Examples of standard MP layers
(restricted to those present in both PyTorch Geometric and Spektral)

spectral Chebyshev (ChebNets) [Defferrard et al., 2016] (details below)
Gated Graph Neural Network (GGNN) [Li et al., 2016] (details below)
attention-based (GAT) [Veličković et al., 2017]
Attention-based GNN (AGNN) [Thekumparampil et al., 2018]
GraphSAGE [Hamilton et al., 2017]
Graph Convolutional Networks (GCN) [Kipf & Welling, 2017] (details below)
edge-convolution operator [Wang et al., 2018]
Graph Isomorphism Network (GIN) [Xu et al., 2019] (details below)
ARMA [Bianchi et al., 2019]
Approximate Personalized Propagation of Neural Predictions (APPNP) [Klicpera et al., 2019]
ChebNets [Defferrard et al., 2016]

Recall: $h_x^{t+1} = F\big(h_x^t,\ \square_{y \in \mathcal{N}(x)}\, \phi_t(h_x^t, h_y^t, e_{xy})\big)$

Setting: $l_e \in \mathbb{R}$ (weighted graph), $h_x^t \in \mathbb{R}^{K(t)}$ and $F(h_x^t, \cdot) = \sigma(\cdot)$.

Main idea: signal filtering based on the Laplacian eigendecomposition $(\Lambda, U)$.

$\square_{y \in \mathcal{N}(x)}\, \phi_t(h_x^t, h_y^t, e_{xy})$ is replaced by
$\Big( \sum_{k'=1}^{K(t)} g_{\theta(k,k')}(L)\, (h_{1k'}^t\ \dots\ h_{nk'}^t)^\top \Big)_{k=1,\dots,K(t+1)} \in \mathbb{R}^{n \times K(t+1)}$
(row $x$ corresponds to the new feature, i.e. $h_x^{t+1}$), with $g_{\theta(k,k')}(L) \in \mathbb{R}^{n \times n}$, $g_{\theta(k,k')}(L) = U g_{\theta(k,k')}(\Lambda) U^\top$, where $g_{\theta(k,k')}$ is a polynomial (a decomposition on the Chebyshev polynomial basis is used) with polynomial coefficients $\theta(k, k') \in \mathbb{R}^r$ learned during training.
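As an illustration (a minimal numpy sketch, not from the talk), one Chebyshev polynomial filter $g_\theta(L) = \sum_r \theta_r T_r(L)$ applied to a signal via the usual recurrence, assuming the Laplacian has already been rescaled so that its spectrum lies in [-1, 1]:

import numpy as np

def cheb_filter(L_scaled, theta, s):
    """Compute g_theta(L) s = sum_r theta_r T_r(L_scaled) s with the Chebyshev
    recurrence T_r = 2 L T_{r-1} - T_{r-2}; theta: learned coefficients."""
    t_prev, t_cur = s, L_scaled @ s                    # T_0 s and T_1 s
    out = theta[0] * t_prev + theta[1] * t_cur
    for r in range(2, len(theta)):
        t_prev, t_cur = t_cur, 2 * (L_scaled @ t_cur) - t_prev
        out = out + theta[r] * t_cur
    return out

n = 5
A = np.random.rand(n, n); A = (A + A.T) / 2            # toy symmetric weighted graph
L = np.diag(A.sum(1)) - A                              # (unnormalized) Laplacian
L_scaled = 2 * L / np.linalg.eigvalsh(L).max() - np.eye(n)
y = cheb_filter(L_scaled, np.array([0.5, 0.3, 0.2]), np.random.rand(n))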
ChebNets [Defferrard et al., 2016] (some explanations)

Recall: $h_x^{t+1} = F\big(h_x^t,\ \square_{y \in \mathcal{N}(x)}\, \phi_t(h_x^t, h_y^t, e_{xy})\big)$

Why is it message passing?

$\Big( \sum_{k'=1}^{K(t)} g_{\theta(k,k')}(L)\, (h_{1k'}^t\ \dots\ h_{nk'}^t)^\top \Big)_{k=1,\dots,K(t+1)}$
can be rewritten in the compact form $\sum_y C_{xy}^t(\theta)\, h_y^t$
with $C_{xy}^t(\theta) \in \mathbb{R}^{K(t+1) \times K(t)}$: $[C_{xy}^t]_{kk'} = [g_{\theta(k,k')}(L)]_{xy}$.

Slight difference with the general framework: MP is performed over all nodes (not just neighbors) + the Laplacian is used to provide proximity relations between nodes.
GGNN [Li et al., 2016]

Recall: $h_x^{t+1} = F\big(h_x^t,\ \square_{y \in \mathcal{N}(x)}\, \phi_t(h_x^t, h_y^t, e_{xy})\big)$

Setting: $l_e \in \{A, B, \dots\}$ discrete (potentially directed), $h_x^t \in \mathbb{R}^{K(t)}$.

Main idea: use a GRU (Gated Recurrent Unit [Cho et al., 2014]) in the original GNN [Scarselli et al., 2009].

$\square = \sum$ and $\phi_t(h_x^t, h_y^t, e_{xy}) = A_{l_{e_{xy}}} h_y^t$, where $A_{l_{e_{xy}}} \in \mathbb{R}^{K(t+1) \times K(t)}$ is a learned matrix depending on $l_{e_{xy}}$ only.

Writing $a_x^t = \sum_{y \in \mathcal{N}(x)} A_{l_{e_{xy}}} h_y^t$ for the aggregated message:
$z_x^t = \sigma(W^z a_x^t + U^z h_x^t)$ (update)
$r_x^t = \sigma(W^r a_x^t + U^r h_x^t)$ (reset)
$\tilde{h}_x^t = \tanh(W a_x^t + U (r_x^t \odot h_x^t))$
$h_x^{t+1} = (1 - z_x^t) \odot h_x^t + z_x^t \odot \tilde{h}_x^t$
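A toy numpy sketch of this GRU update for a single node (assuming, for simplicity, K(t+1) = K(t) = K, so that all weight matrices are K × K):

import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def ggnn_update(h, a, Wz, Uz, Wr, Ur, W, U):
    """GRU update of the slide: h is the current state of node x,
    a = sum_{y in N(x)} A_{l_exy} h_y is the aggregated message."""
    z = sigmoid(Wz @ a + Uz @ h)                 # update gate
    r = sigmoid(Wr @ a + Ur @ h)                 # reset gate
    h_tilde = np.tanh(W @ a + U @ (r * h))       # candidate state
    return (1 - z) * h + z * h_tilde

K = 4
rng = np.random.default_rng(1)
mats = [rng.normal(size=(K, K)) for _ in range(6)]
h_new = ggnn_update(rng.normal(size=K), rng.normal(size=K), *mats)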
GGNN [Li et al., 2016] (with some explanations)

Recall: $h_x^{t+1} = F\big(h_x^t,\ \square_{y \in \mathcal{N}(x)}\, \phi_t(h_x^t, h_y^t, e_{xy})\big)$

$z_x^t = 0$: no update (the previous state $h_x^t$ is kept);
$r_x^t = 0$: reset of $h_x^t$ in $\tilde{h}_x^t$.

These weight matrices ($W$, $U$, $W^z$, $U^z$, $W^r$, $U^r$) and the matrices $A_{l_e}$ are learned.
Graph Convolutional Networks (GCN) [Kipf & Welling, 2017]

Recall: $h_x^{t+1} = F\big(h_x^t,\ \square_{y \in \mathcal{N}(x)}\, \phi_t(h_x^t, h_y^t, e_{xy})\big)$

$h_x^t \in \mathbb{R}^{K(t)}$, $\square = \sum$, $F(h_x^t, \cdot) = \sigma(\cdot)$ and
$\phi_t(h_x^t, h_y^t, e_{xy}) = \dfrac{e_{xy}\, h_y^t}{\sqrt{(d_x + 1)(d_y + 1)}}$,
where $d_x$ and $d_y$ are the degrees of $x$ and $y$. This step encourages similar predictions among locally connected nodes.

The propagation rule over the entire graph can be expressed as:
$H^{t+1} \leftarrow \sigma\big(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^t W^t\big)$,
where $\tilde{A} = A + I$ ($A$ being the adjacency matrix of the undirected graph) and $\tilde{D}$ is the degree matrix of $\tilde{A}$.

This propagation rule is based on a first-order approximation of spectral convolutions on graphs.
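The matrix form is easy to check numerically; a minimal numpy sketch (with relu taken as an example of $\sigma$, and a toy graph):

import numpy as np

def gcn_propagate(A, H, W):
    """One GCN step: sigma(D~^{-1/2} A~ D~^{-1/2} H W) with A~ = A + I."""
    A_tilde = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    A_norm = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)           # relu as the example sigma

A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])   # toy path graph
H_next = gcn_propagate(A, np.random.rand(3, 4), np.random.rand(4, 2))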
Graph Isomorphism Network (GIN) [Xu et al., 2019]

Recall: $h_x^{t+1} = F\big(h_x^t,\ \square_{y \in \mathcal{N}(x)}\, \phi_t(h_x^t, h_y^t, e_{xy})\big)$

$h_x^t \in \mathbb{R}^{K(t)}$, $\square = \sum$, $F = \mathrm{MLP}^{t+1}$ (multi-layer perceptron):
$h_x^{t+1} = \mathrm{MLP}^{t+1}\big((1 + \epsilon^t)\, h_x^t + \sum_{y \in \mathcal{N}(x)} h_y^t\big)$

GIN-$\epsilon$: $\epsilon$ is learned by gradient descent; GIN-0: $\epsilon$ is fixed to 0.

GIN is proved to be as powerful as the WL test for distinguishing between different graph structures, while using a simple architecture (MLP).

Sum aggregation is better than mean and max aggregation in terms of distinguishing graph structures.
Pooling layers

Graph pooling: reduction of the number of nodes in a graph. It helps GNNs discard information that is superfluous for the task and keeps model complexity under control.
DiffPool (Ying et al., 2018): extracts a complex hierarchical structure by performing a clustering of the nodes after each MP layer.
Top-K (Gao & Ji, 2019; Lee et al., 2019): learns a projection vector and selects the nodes with the K highest projection values (see the sketch below).
MinCut (Bianchi et al., 2020): pooling method that uses spectral clustering and aggregates nodes belonging to the same cluster.

Global pooling: reduction of a graph to a single node.
sum
average
max
SortPool (Zhang et al., 2018): sorts the vertex features in a consistent order (based on WL colors). After sorting, the output tensor is truncated from n to k nodes in order to unify graph sizes.
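A minimal numpy sketch of the Top-K selection step (the tanh gating of the selected features follows Gao & Ji, 2019; in practice the projection vector p is learned):

import numpy as np

def topk_pool(h, p, k):
    """Keep the k nodes with the highest projection h p / ||p||;
    selected features are gated by tanh(score)."""
    score = h @ p / np.linalg.norm(p)          # one projection value per node
    idx = np.argsort(score)[-k:]               # indices of the k best nodes
    return h[idx] * np.tanh(score[idx])[:, None], idx

h = np.random.rand(6, 4)                       # 6 nodes, 4 features
h_pooled, kept = topk_pool(h, np.random.rand(4), k=3)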
The Python libraries Spektral and PyTorch Geometric
Basic overview

Spektral [Grattarola and Alippi, 2020]
based on TensorFlow (at least 2.3.1) (easy to install on Ubuntu with pip3, but installation from source is required for the latest version)
GitHub repository https://github.com/danielegrattarola/spektral and detailed documentation https://graphneural.network/ with tutorials
many datasets included: https://graphneural.network/datasets/

PyTorch Geometric [Fey and Lenssen, 2019]
based on PyTorch (a bit harder to install on Ubuntu due to dependencies)
GitHub repository https://github.com/rusty1s/pytorch_geometric and detailed documentation https://pytorch-geometric.readthedocs.io/en/latest/ with examples
many datasets included: https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html
Main available datasets in Spektral and PyTorch Geometric

Citation: Cora, CiteSeer and PubMed citation datasets (node classification)
GraphSAGE: PPI dataset and Reddit dataset containing Reddit posts belonging to different communities (node classification)
QM7, QM9: chemical datasets of molecules (graph classification)
TUDataset: benchmark datasets for graph kernels from TU Dortmund (e.g. MUTAG, ENZYMES, PROTEINS...) (graph classification)

Example in PyTorch Geometric:
dataset = torch_geometric.datasets.TUDataset(root='/tmp/MUTAG', name='MUTAG')
Example in Spektral:
dataset = spektral.datasets.TUDataset('MUTAG')
Data modes and mini-batching

Scaling to huge amounts of data: the examples in a mini-batch are grouped into a unified representation so that they can be processed efficiently in parallel.

Data modes:
single mode: only 1 graph (node classification)
disjoint mode: a set of graphs is represented as a single graph (their disjoint union), as illustrated in the sketch below
batch mode: the graphs are zero-padded so that they fit into tensors of shape [batch, N, N]
mixed mode: a single graph with different node attributes
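A hypothetical two-graph example of the disjoint mode: node features are stacked, the edge indices of the second graph are offset, and a batch index i records which graph each node belongs to:

import numpy as np

x1, x2 = np.random.rand(3, 4), np.random.rand(2, 4)   # two toy graphs (3 and 2 nodes)
e1 = np.array([[0, 1], [1, 2]])                       # edges of graph 1 (node pairs)
e2 = np.array([[0, 1]])                               # edges of graph 2

x = np.vstack([x1, x2])                               # stacked features, shape (5, 4)
e = np.vstack([e1, e2 + len(x1)])                     # graph-2 indices offset by 3
i = np.array([0, 0, 0, 1, 1])                         # batch index: node -> graph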
Data modes and mini-batching

Spektral
single mode: loader = spektral.data.SingleLoader(dataset)
disjoint mode: loader = spektral.data.DisjointLoader(dataset, batch_size=3)
batch mode: loader = spektral.data.BatchLoader(dataset, batch_size=3)

PyTorch Geometric: only uses the disjoint mode
loader = torch_geometric.data.DataLoader(dataset, batch_size=3)
MP Layers

Spektral
ChebNets: spektral.layers.ChebConv(channels, K)
GGNN: spektral.layers.GatedGraphConv(channels, n_layers)
GCN: spektral.layers.GCNConv(channels)
GIN: spektral.layers.GINConv(channels, epsilon)
(channels: number of output channels)

PyTorch Geometric
ChebNets: torch_geometric.nn.ChebConv(in_channels, out_channels, K)
GGNN: torch_geometric.nn.GatedGraphConv(out_channels, num_layers)
GCN: torch_geometric.nn.GCNConv(in_channels, out_channels)
GIN: torch_geometric.nn.GINConv(nn, eps, train_eps), where nn is a neural network (e.g. torch.nn.Sequential), as in the sketch below
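For instance, a minimal sketch of a GIN layer in PyTorch Geometric (the layer sizes are illustrative; 7 input features as in MUTAG):

import torch
from torch_geometric.nn import GINConv

mlp = torch.nn.Sequential(                 # the MLP applied after aggregation
    torch.nn.Linear(7, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 32),
)
conv = GINConv(mlp, eps=0.0, train_eps=True)   # GIN-eps: epsilon is learned
# usage: h = conv(x, edge_index) with x of shape [num_nodes, 7]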
Comparison on node classification

Example: Cora (2,708 scientific publications; edges are co-citations, features are word-occurrence descriptors, and there are seven classes)
Task: starting from an initial set of training nodes with known classes, learn the classes of the other nodes (test set)
Architecture: two MP layers, with dropout (50%) before the second layer and softmax after the second layer; the target error is categorical_crossentropy. A sketch of such a model is given below.
Learning algorithm: ADAM optimizer, 200 iterations (no early stopping), learning rates and regularization parameters (weight decays) set to the same value in both libraries (probably)
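A minimal sketch of such a model in PyTorch Geometric, assuming GCN layers are chosen (the hidden dimension of 16 and the learning rate are those of the standard PyG Cora example, not necessarily those of the talk):

import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv

dataset = Planetoid(root='/tmp/Cora', name='Cora')
data = dataset[0]

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_node_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.5, training=self.training)  # dropout before the 2nd layer
        return F.log_softmax(self.conv2(x, edge_index), dim=1)  # equivalent to softmax + crossentropy below

model = Net()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)
model.train()
for epoch in range(200):                                 # 200 iterations, no early stopping
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()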
Comparison on node classification (critical assessment)

very fast: ~4 s for PyTorch Geometric and ~13 s for Spektral on my computer
BUT: setting the different parameters (number of iterations, learning rates, dropout rates, dimensions of the hidden layers), in addition to the architecture itself, is very hard
good accuracy: ~80% at every run
BUT: the results of the two libraries are not at all the same!
Comparison on graph classification with PyG

For IMDB-binary, one-hot encodings of node degrees are used as input features.

Comparison in PyTorch Geometric of:
different MP layers: GCN, GIN-0, GIN-ε, ChebNets (K=3)
different global pooling layers: average, sum, max, SortPool (see the sketch below)

Architecture: 4 MP layers of dim 32, each one followed by relu, 1 global pooling layer, relu, and then softmax. The target error is categorical_crossentropy.
Learning algorithm: ADAM optimizer, 100 iterations. The batch size is 128. Cross-validation with 10 folds is used.
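The global pooling layers compared here all reduce the node embeddings of each graph in the (disjoint) batch to one vector per graph; a small runnable sketch with PyTorch Geometric:

import torch
from torch_geometric.nn import (global_add_pool, global_max_pool,
                                global_mean_pool, global_sort_pool)

h = torch.randn(10, 32)                  # node embeddings after the last MP layer
batch = torch.tensor([0] * 6 + [1] * 4)  # node -> graph assignment (2 graphs)
hg_sum = global_add_pool(h, batch)       # sum pooling, shape [2, 32]
hg_avg = global_mean_pool(h, batch)      # average pooling
hg_max = global_max_pool(h, batch)       # max pooling
hg_srt = global_sort_pool(h, batch, k=3) # SortPool: k sorted nodes, shape [2, 3 * 32]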
Comparison on graph classification with PyG: results
Comparison on graph classification: critical assessment

I also experimented with graph classification in Spektral; the type of the data in the loaders is different from PyTorch Geometric.

PyTorch Geometric:
data
>>> Batch(batch=[1012], edge_attr=[2244, 4], edge_index=[2, 2244], x=[1012, 7], y=[56])
x, a, e, i = data.x, data.edge_index, data.edge_attr, data.batch

Spektral:
data is a tuple: ((x, a, i), y) or ((x, a, e, i), y) if there are edge features
More difficult to handle the two cases (edge features / no edge features); see the sketch below.
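One way to handle both cases with a Spektral DisjointLoader is to test the length of the input tuple (a sketch; data is one element yielded by the loader):

inputs, y = data                  # data yielded by spektral.data.DisjointLoader
if len(inputs) == 4:
    x, a, e, i = inputs           # node features, adjacency, edge features, batch index
else:
    x, a, i = inputs              # no edge features in the dataset
    e = None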
That's all for now...
... questions?
References

Bianchi FM, Grattarola D, Livi L, Alippi C (2020) Graph neural networks with convolutional ARMA filters. Preprint arXiv: 1901.01343.
Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. Preprint arXiv: 1406.1078.
Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. Proceedings of NIPS 2016, Barcelona, Spain, 3844-3852.
Fey M, Lenssen JE (2019) Fast graph representation learning with PyTorch Geometric. Proceedings of ICLR 2019 Workshop, New Orleans, LA, USA.
Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for quantum chemistry. Proceedings of ICML 2017, Sydney, Australia, PMLR 70.
Grattarola D, Alippi C (2020) Graph neural networks in TensorFlow and Keras with Spektral. Proceedings of the ICML 2020 workshop on Graph Representation Learning and Beyond.
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. Proceedings of NIPS 2017, Long Beach, CA, USA.
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. Proceedings of ICLR 2017, Toulon, France.
Klicpera J, Bojchevski A, Günnemann S (2019) Predict then propagate: graph neural networks meet personalized PageRank. Proceedings of ICLR 2019, New Orleans, LA, USA.
References

Li Y, Zemel R, Brockschmidt M, Tarlow D (2016) Gated graph sequence neural networks. Proceedings of ICLR 2016, San Juan, Puerto Rico.
Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2009) The graph neural network model. IEEE Transactions on Neural Networks, 20(1), 61-80.
Thekumparampil KK, Wang C, Oh S, Li LJ (2018) Attention-based graph neural network for semi-supervised learning. Proceedings of ICLR 2018, Vancouver, Canada.
Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. Proceedings of ICLR 2018, Vancouver, Canada.
Wang Y, Sun Y, Liu Z, Sarma SE, Bronstein MM, Solomon JM (2018) Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics, 38(5), 146. DOI: 10.1145/3326362.
Xu K, Hu W, Leskovec J, Jegelka S (2019) How powerful are graph neural networks? Proceedings of ICLR 2019, New Orleans, LA, USA.