Graph Convolutional Networks in Apache Spark
Intelligent Workflow Automation
Scalacon, November 2021
Emiliano Martínez
BBVA Innovation Labs
About Me:
Programming in Scala for 10 years
Akka development
Functional domain models with FP libraries Cats, Scalaz
Big Data: Spark, Kafka, Cassandra, NoSQL, ...
Machine Learning: Spark ML, Sklearn, Analytics Zoo, Tensorflow, Torch
Currently, I do NLP at BBVA
Deep Learning Models
- A machine learning model based on neural networks that tries to mimic the
structure and function of the human brain.
- A supervised machine learning method.
- The goal is to approximate a function that maps an input x to a category by
adjusting the value of the Θ parameters: y = f(x; Θ)
- Automatic speech recognition. A waveform generated from a human voice is broken down into
what are called phonemes. Each phoneme is like a chain link: by analyzing them in sequence,
starting from the first phoneme, the ASR software uses statistical probability analysis to deduce
whole words and, from there, complete sentences.
- Image Recognition. Based on CNNs. Used to automatically identify objects, people, and places in
images, for guiding robots, autonomous vehicles, driver assistance systems, etc.
- Drug Discovery. Using graph convolutional networks.
- Natural Language Processing. A set of machine learning models and techniques for processing
natural language data.
Neural Networks
[Figure: a fully connected network with an input layer x ∈ ℝ^n (training samples with n
features), hidden layers of activations a_i^l parameterized by W and b, an output label
vector y ∈ ℝ^k and a prediction vector ŷ ∈ ℝ^k]
Layer Feed-Forward Equations
z^l = W^l a^(l-1) + b^l
a^l = σ(z^l)
Activation functions:
Sigmoid: g(z) = 1 / (1 + e^(-z))
ReLU: g(z) = max(0, z)
Tanh: g(z) = (e^z - e^(-z)) / (e^z + e^(-z))
Training Equations
- Layer feed-forward: H_i = σ(W_i H_(i-1) + b_i)
- Loss function (cross-entropy): CE(P, Q) = -E_(x~P)[log q(x)]
- Forward step: layer feed-forward and loss computation.
- Backward step: gradient calculation and weight updates.
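As a concrete illustration of one feed-forward step, a minimal Breeze sketch (not from the deck; shapes are assumptions, a layer of 2 units over 3 inputs):

import breeze.linalg.{DenseMatrix, DenseVector}
import breeze.numerics.sigmoid

// One feed-forward step: z = W * a_prev + b, a = sigmoid(z).
val w: DenseMatrix[Double] = DenseMatrix.rand[Double](2, 3) // W: 2x3 weight matrix
val b: DenseVector[Double] = DenseVector.zeros[Double](2)   // b: one bias per unit
val aPrev = DenseVector(0.5, -1.0, 2.0)                     // activations of the previous layer
val z = (w * aPrev) + b
val a = sigmoid(z)                                          // element-wise sigmoid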
Graphs
- Graphs are described by a set of vertices and edges.
- They represent data that cannot be embedded in a Euclidean space.
- Input training samples can be represented as nodes of a graph, defined by
their properties and by their connections to other nodes.
- They can be cyclic, acyclic, weighted, ...
G = (V, E)
Graph Examples
Caffeine molecule (image taken from Wikipedia); social network graph (image taken from Wikimedia).
Graph Neural Networks
- A type of NN that operates directly on graph structures.
- It can be used for tasks such as node classification.
- Different approaches can be used depending on the case:
a. Inductive: GraphSAGE, ...
b. Transductive: spectral graph convolutions, DeepWalk, ...
Convolutions
[Figure: 3×3 binary input patches and a 3×3 filter (1 -1 0 / 0 -1 0 / 1 1 1)]
Some properties of CNN convolutions:
- Apply filters that detect details in the
images.
- Fewer parameters than a fully
connected layer model.
- Pixel positions and neighborhoods have
semantic meaning.
- Element-wise multiplication between a
filter-sized patch of the input and the filter,
which is then summed, always resulting
in a single value.
- Translation invariance.
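A tiny sketch of that multiply-and-sum step, using the patch and filter values from the slide figure:

// Element-wise product of a 3x3 input patch with a 3x3 filter, summed to one value.
val patch  = Array(Array(0f, 1f, 1f), Array(0f, 0f, 0f), Array(1f, 1f, 1f))
val filter = Array(Array(1f, -1f, 0f), Array(0f, -1f, 0f), Array(1f, 1f, 1f))
val activation = (for (i <- 0 until 3; j <- 0 until 3)
  yield patch(i)(j) * filter(i)(j)).sum // a single output value per patch position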
Graph Convolutional Networks
[Diagram: taxonomy of GCN models and their application areas]
- Spectral-based models: representation in the Fourier domain, via the
eigendecomposition of the graph Laplacian matrix.
- Spatial-based models: classic CNN models, propagation models, general frameworks.
- Application areas: computer vision (images, videos, point clouds, meshes), NLP, and
science (physics, chemistry, social networks).
GCN Intuition
[Figure: a five-node graph and its 5×5 adjacency matrix (first row: 0 1 1 0 0)]
- X is the N x F input matrix: N nodes with F features per node.
- A is the N x N graph adjacency matrix.
- Each layer propagates features over the graph: H_i = f(H_(i-1) A); for the first
layer, H_i = f(X A).
- Architecture: Convolution 1 computes dot(A, X) into a Dense layer (H1);
Convolution 2 computes dot(A, H1) into a second Dense layer (H2), followed by a
SoftMax.
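To make the propagation rule concrete, a toy Breeze sketch (an assumed 3-node graph with 2 features per node, not from the deck) of H1 = f(dot(A, X)) with a ReLU as f:

import breeze.linalg.DenseMatrix

// A: 3x3 adjacency matrix of a toy graph; X: 3x2 node feature matrix.
val adj = DenseMatrix((0f, 1f, 1f), (1f, 0f, 0f), (1f, 0f, 0f))
val x   = DenseMatrix((1f, 2f), (3f, 4f), (5f, 6f))
// Each row of A * X aggregates the features of that node's neighbours.
val h1 = (adj * x).map(v => math.max(0f, v)) // H1 = ReLU(dot(A, X))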
Graph Analysis models
- Fast Spectral Graph Convolution. Kipf & Welling (2017):
H^(l+1) = σ(D^(-1/2) Â D^(-1/2) H^l W^l)
where Â is the adjacency matrix plus the identity matrix and D is the degree matrix.
● Semi-supervised learning method.
● Simplification of spectral graph analysis.
● 2-layer GCN.
● Two-hop neighborhood.
From spectral graph theory to the GCN layer:
1. Laplacian eigendecomposition: spectral graph convolution (Defferrard et al., 2016).
2. Truncated expansion in terms of Chebyshev polynomials (Defferrard et al., 2016).
3. First-order approximation of spectral graph convolutions (Kipf & Welling, 2016),
with K = 1, λ_max ≈ 2 and θ = θ_0 = -θ_1.
Frameworks used for implementation
- Apache Spark 3.1.2
1. GraphX to get connected components.
2. Spark ML to transform the dataframes corresponding to graphs.
3. Spark Core to create the RDD[Sample] and to partition the dataset depending on the graph's components.
- Breeze 1.0
1. Create the sparse adjacency matrix that represents node connections.
2. Convert it to a symmetric matrix.
3. Normalize the matrix according to the spectral graph approximation.
- Analytics Zoo for Spark 3.1.2
1. Build the model graph.
2. Model optimization.
Deep Learning in Spark
- Analytics Zoo
It provides a distributed deep learning framework with a Scala Keras-like API that runs on the BigDL framework.
https://arxiv.org/pdf/1804.05839.pdf
BigDL: A Distributed Deep Learning Framework for Big Data
Experiment Steps
[Diagram: experiment pipeline]
1. Read the files: edges and nodes.
2. Build the dataset (input tensors) and the adjacency matrix; convert the sparse
Breeze matrix to an Analytics Zoo sparse Tensor.
3. Split the graph into components (Graph 1, Graph 2, ...) and assign one partition
per graph with an RDD Partitioner.
4. Get Spark workers; each partition trains the model: two modules of one
convolutional and one hidden layer, with the Adam optimizer and L2 regularization.
5. All-reduce the parameters across workers.
Case implementation
- Cora dataset
The Cora dataset consists of 2708 scientific publications classified into one of seven classes. The citation network consists of 5429 links. Each
publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the
dictionary. The dictionary consists of 1433 unique words.
Documents are classified into one of these seven classes: Neural_Networks, Rule_Learning, Reinforcement_Learning,
Probabilistic_Methods, Theory, Genetic_Algorithms, Case_Based.
Two main files: content, with the input features, and cites, with the edges. Data is loaded into an
RDD with one partition per graph.
Cora.cites
35 1033
35 103482
35 103515
35 1050679
35 1103960
887 334153
906 910
Cora.content
31336 0 0 0 0 0 0 1 0 ... 0 Neural_Networks
(one row per paper: id, 1433 word flags, label)
case class Element(id: String, words: Array[Float], label: String)
RDD[Element]
1. One partition per graph.
2. Avoid shuffles.
case class Edge(orig: Int, dest: Int)
RDD[Edge]
1. Build the adjacency matrix from this representation.
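A sketch of how Cora.content could be parsed into RDD[Element] (the file path is an assumption, and sc is an in-scope SparkContext; parsing details are not taken from the repository):

import org.apache.spark.rdd.RDD

// Each line: <paper id> <1433 word flags> <label>, whitespace-separated.
val elements: RDD[Element] = sc.textFile("data/cora.content").map { line =>
  val fields = line.split("\\s+")
  Element(fields.head, fields.slice(1, fields.length - 1).map(_.toFloat), fields.last)
}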
private[gcn] def createInputDataset(rdd: RDD[Element]): RDD[Sample[Float]] = {
  rdd.map { case Element(_, words, label) =>
    Sample(
      Tensor(words, Array(1433)), // feature vector sized to the 1433-word dictionary
      Tensor(Array(labelToIndex(label)), Array(1)) // label as a Float class index;
      // labelToIndex is an assumed lookup from class name to index
    )
  }
}
Build the datasets using the Sample and Tensor[T] types from the Analytics Zoo library.
Datasets are represented as RDD[Sample].
One partition per graph: use a Spark Partitioner in case of multiple graphs.
Use GraphX to split the graph into its different components.
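A sketch (variable names are illustrative) of using GraphX connected components to key every node by its graph, so a Partitioner can co-locate each component; a dedicated Partitioner mapping component ids to partition indices would give an exact one-graph-per-partition layout:

import org.apache.spark.HashPartitioner
import org.apache.spark.graphx.Graph

// Build a GraphX graph from the edge list and label each vertex with its component id.
val graph = Graph.fromEdgeTuples(
  edgeRdd.map { case Edge(orig, dest) => (orig.toLong, dest.toLong) },
  defaultValue = 0)
val components = graph.connectedComponents().vertices // RDD[(vertexId, componentId)]
// Key nodes by component and partition so nodes of a component land together.
val partitioned = components
  .map { case (vertexId, componentId) => (componentId, vertexId) }
  .partitionBy(new HashPartitioner(numGraphs)) // numGraphs: assumed component count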
1. The adjacency matrix is built and processed using a Breeze CSCMatrix:

// One entry per directed edge, accumulated in a sparse-matrix builder.
val builder = new CSCMatrix.Builder[Float](rows, cols)
edges.foreach { case Edge(r, c) =>
  builder.add(r, c, 1.0F)
}
val sparseAdj = builder.result()

2. Transform it into a symmetric matrix (an element-wise max of the matrix and its transpose):

// Where A.t > A take the transposed entry, elsewhere keep the original one.
sparseAdj +:+ (sparseAdj.t *:* (sparseAdj.t >:> sparseAdj).map(el => if (el) 1.0F else 0.0F)) -
  (sparseAdj *:* (sparseAdj.t >:> sparseAdj).map(el => if (el) 1.0F else 0.0F))
3. Normalize the matrix according to the spectral graph convolution equation.
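A Breeze sketch of that normalization, D̂^(-1/2) Â D̂^(-1/2) with Â = A + I (dense for clarity; the actual implementation keeps the matrix sparse):

import breeze.linalg.{diag, sum, Axis, DenseMatrix}

// Renormalization trick: Â = A + I, D̂ = diag(row sums of Â), return D̂^-1/2 Â D̂^-1/2.
def normalizeAdjacency(adj: DenseMatrix[Float]): DenseMatrix[Float] = {
  val aHat = adj + DenseMatrix.eye[Float](adj.rows)
  val degrees = sum(aHat, Axis._1) // degree of every node, including the self-loop
  val dInvSqrt = diag(degrees.map(d => (1.0 / math.sqrt(d)).toFloat))
  dInvSqrt * aHat * dInvSqrt
}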
def getModel(
    dropout: Double,
    matrix: Tensor[Float],
    batchSize: Int,
    inputSize: Int,
    intermediateSize: Int,
    labelsNumber: Int): Sequential[Float] = {
  Sequential[Float]()
    .add(Dropout(dropout))
    .add(GraphConvolution[Float](matrix, batchSize, inputSize))
    .add(Linear[Float](inputSize, intermediateSize,
      wRegularizer = L2Regularizer(5e-4)).setName("layer-1"))
    .add(ReLU())
    .add(Dropout(dropout))
    .add(GraphConvolution[Float](matrix, batchSize, intermediateSize))
    .add(Linear[Float](intermediateSize, labelsNumber).setName("layer-2"))
    .add(LogSoftMax())
}
NN Model
Model sequential implementation:
Input → Drop → GraphConvLayer → LinearLayer → ReLU → Drop → GraphConvLayer → LinearLayer → SoftMax → Label Prediction
Trainable parameters live in the modules shown in red on the slide (the Linear layers)!
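For reference, a hypothetical call matching the Cora numbers in this deck (the dropout of 0.5 and the hidden size of 16 are assumptions borrowed from the Kipf & Welling setup, not from the slides):

val model = getModel(
  dropout = 0.5,                // assumption
  matrix = normalizedAdjacency, // the precomputed D^-1/2 Â D^-1/2 as a sparse Tensor
  batchSize = 2708,             // all Cora nodes in a single batch
  inputSize = 1433,             // dictionary size
  intermediateSize = 16,        // assumed hidden size
  labelsNumber = 7)             // the seven Cora classes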
Optimization Process
- We train with only 140 of the 2708 samples.
- Every mini-batch is equivalent to one epoch.
- Avoid shuffling the data when it is broadcast.
- One Spark partition for every sub-graph.
- The negative log-likelihood (NLL) criterion.
- Adam optimizer with lr = 1E-3, beta1 = 0.9, beta2 = 0.999, epsilon =
1E-8, decay = 0, wdecay = 0.
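A sketch of wiring those hyper-parameters into the BigDL Optimizer (variable names are illustrative; the slides do not show this code):

import com.intel.analytics.bigdl.nn.ClassNLLCriterion
import com.intel.analytics.bigdl.optim.{Adam, Optimizer, Trigger}

// Train on the RDD[Sample]; the NLL criterion pairs with the LogSoftMax output layer.
val optimizer = Optimizer(
  model = model,
  sampleRDD = trainSamples, // the 140 labeled samples
  criterion = ClassNLLCriterion[Float](),
  batchSize = 140) // one mini-batch per epoch, as noted above
optimizer
  .setOptimMethod(new Adam[Float](learningRate = 1e-3, beta1 = 0.9, beta2 = 0.999))
  .setEndWhen(Trigger.maxEpoch(1000))
  .optimize()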
Results
Case 1: one graph in one partition. Training for 1000 epochs with 140 labeled examples.
- Propagation function HW (a plain multilayer perceptron: the identity matrix in place
of D^(-1/2) Â D^(-1/2)): accuracy 0.531019.
- Propagation function D^(-1/2) Â D^(-1/2) HW (renormalization trick): accuracy 0.769202.
Process Visualization
- Represent the output of the second hidden layer (7 neurons).
- Dimensionality reduction applying t-SNE (t-Distributed Stochastic
Neighbor Embedding).
- A snapshot is taken every 200 epochs.
[Figure: representation of the last layer using t-SNE with no convolution]
[Figure: representation of the last layer using t-SNE with the spectral convolution]
Conclusions and future work
- Convolutions on graphs show promising results in graph analysis using deep
learning.
- We can benefit from Spark's processing power to perform distributed NN
training.
- The Scala ecosystem helps to develop for, and integrate with, the big data world.
- Future work: a Scala 3 graph neural network library on top of Spark.
Implementation:
https://github.com/emartinezs44/SparkGCN