Representation Learning in Large Attributed Graphs

[Figure: example networks. Social network; human disease network [Barabasi 2007]; food web [2007]; terrorist network [Krebs 2002]; Internet (AS) [2005]; gene regulatory network [Decourty 2008]; protein interactions [breast cancer]; political blogs; power grid.]

[Figure: machine learning pipeline on graphs. Input → feature engineering → features → learning algorithm → model → prediction task (link prediction, classification, anomaly detection).]

[Figure: machine learning pipeline on graphs, with automatic feature learning replacing manual feature engineering. Input → features → learning algorithm → model → prediction task (link prediction, classification, anomaly detection).]

§ Goal: Learn a representation (features) for a set of graph
elements (nodes, edges, etc.)
§ Key intuition: Map the graph elements (e.g., nodes) to a
d-dimensional space while preserving node similarity
§ Use the features for any downstream prediction task

Recent work: Map nodes based on their proximity in the
input graph (nearby nodes are close together)
DeepWalk model [Perozzi et al. KDD 2014]

How to get nearby nodes?
[Perozzi et al. KDD 2014; Grover et al. KDD 2016]

§ A (conditional) walk/path is a finite sequence of adjacent
vertices in the graph
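
The walk definition above can be sketched as a uniform random walk, in the spirit of DeepWalk. This is a minimal illustration, not the authors' implementation: the function name `random_walk` and the toy edge set are assumptions, since the slides do not give the graph's edges.

```python
import random

def random_walk(adj, start, length, rng):
    """Return a walk of up to `length` vertices starting at `start`.

    Each step moves to a uniformly chosen neighbor of the current vertex.
    """
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk

# Hypothetical undirected toy graph stored as adjacency lists
# (the edges are an assumption made for illustration).
adj = {
    "V1": ["V2", "V3"],
    "V2": ["V1", "V3", "V4", "V5"],
    "V3": ["V1", "V2"],
    "V4": ["V2", "V5"],
    "V5": ["V2", "V4"],
}

rng = random.Random(0)  # seeded so the walk is reproducible
walk = random_walk(adj, "V1", 6, rng)
print(walk)
```

Every consecutive pair in the returned list is an edge of the graph, which is what makes the sequence a walk.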

[Figure: example graph with nodes V1, V2, V3, V4, V5]
The random walk has traversed the link (V1, V2) and is
evaluating its next step at node V2
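
One way to evaluate the next step at V2 is the second-order biased walk of node2vec (Grover et al. KDD 2016), where the weight of each candidate depends on the previous node. The sketch below assumes a hypothetical edge set and hypothetical values for the return parameter `p` and in-out parameter `q`; the helper name `next_step_probs` is also an assumption.

```python
def next_step_probs(adj, prev, curr, p, q):
    """node2vec-style transition probabilities out of `curr`, given `prev`.

    Unnormalized weight of a neighbor x of `curr`:
      1/p if x == prev            (return to the previous node)
      1   if x is adjacent to prev (stay close to the previous node)
      1/q otherwise               (move further away)
    """
    weights = {}
    for x in adj[curr]:
        if x == prev:
            weights[x] = 1.0 / p
        elif x in adj[prev]:
            weights[x] = 1.0
        else:
            weights[x] = 1.0 / q
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}

# Hypothetical undirected toy graph (edges are an assumption).
adj = {
    "V1": ["V2", "V3"],
    "V2": ["V1", "V3", "V4", "V5"],
    "V3": ["V1", "V2"],
    "V4": ["V2", "V5"],
    "V5": ["V2", "V4"],
}

# The walk came from V1 and now sits at V2.
probs = next_step_probs(adj, prev="V1", curr="V2", p=2.0, q=0.5)
for node, prob in sorted(probs.items()):
    print(node, round(prob, 3))
```

With `q < 1` the walk favors V4 and V5, which are not adjacent to V1, so it explores outward; with `p = q = 1` it reduces to a uniform walk.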

[Figure: model with a focus vertex. Mikolov et al. ICLR 2013; Perozzi et al. KDD 2014]
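
Assuming the figure refers to the Skip-gram objective of Mikolov et al., as used by DeepWalk, each vertex in a walk acts in turn as the focus vertex and is paired with the vertices inside a context window around it. The sketch below only builds these training pairs; the function name `skipgram_pairs`, the walk, and the window size are hypothetical.

```python
def skipgram_pairs(walk, window):
    """Return (focus, context) pairs for every vertex in the walk.

    The context of position i is every other position within
    `window` steps of i.
    """
    pairs = []
    for i, focus in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((focus, walk[j]))
    return pairs

# Hypothetical walk over the example nodes.
walk = ["V1", "V2", "V4", "V5"]
pairs = skipgram_pairs(walk, window=1)
print(pairs)
```

An embedding model would then be trained to predict the context vertex from the focus vertex for each pair.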

§ No support for inductive/transfer learning
• features are learned for node identities
• features do not generalize beyond the input graph
§ Map nodes based on their proximity only
§ No notion of attributes
§ No notion of structural similarity

Communities: cohesive subsets of nodes
Roles: represent structural patterns
- Two nodes belong to the same role if they have similar structural patterns
[Figure labels: Ci, Cj, Ck]
[Rossi & Ahmed TKDE 2015; Ahmed et al. AAAI 2017]

Goal: Find a mapping of nodes to d dimensions that preserves
proximity and node similarity,
using structure + attributes (if any)

A (conditional) attributed walk is a finite sequence of adjacent
node types (words) in the graph [Ahmed et al. 2017]
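
A minimal sketch of an attributed walk, assuming each node's type is derived from its attributes by a simple mapping. The mapping `node_type`, the use of degree as the only attribute, and the toy edges are illustrative assumptions, not the mapping function used by the authors.

```python
import random

def node_type(attrs):
    """Hypothetical type mapping: serialize the attribute dict into a label."""
    return "|".join(f"{k}={v}" for k, v in sorted(attrs.items()))

def attributed_walk(adj, attrs, start, length, rng):
    """Walk the graph but record node types instead of node identities."""
    node = start
    types = [node_type(attrs[node])]
    while len(types) < length and adj[node]:
        node = rng.choice(adj[node])
        types.append(node_type(attrs[node]))
    return types

# Hypothetical undirected toy graph (edges are an assumption).
adj = {
    "V1": ["V2", "V3"],
    "V2": ["V1", "V3", "V4", "V5"],
    "V3": ["V1", "V2"],
    "V4": ["V2", "V5"],
    "V5": ["V2", "V4"],
}
# Use degree as the single node attribute for this illustration.
attrs = {v: {"degree": len(ns)} for v, ns in adj.items()}

rng = random.Random(0)
types = attributed_walk(adj, attrs, "V1", 5, rng)
print(types)
```

Because the sequence contains types rather than node identities, the same vocabulary of "words" can appear in walks over a different graph, which is what allows the learned features to transfer.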

The random walk has traversed a link and is
evaluating its next step at node V2

[Figure: graphlets G1 to G9, with positions labeled 1 to 15]
Network Motifs: Simple Building Blocks of Complex Networks [Milo et al. Science 2002]
The Structure and Function of Complex Networks [Newman, SIAM Review 2003]
Applied to food, biological, genetic, neural, web, and other networks
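
As a small illustration of motif-based structural features, the sketch below counts, for each node, the triangles it participates in. The triangle is the simplest connected motif beyond an edge. The edge set is a hypothetical toy graph, and this brute-force count is not the efficient graphlet counting method cited in the references.

```python
from itertools import combinations

def triangle_counts(adj):
    """Return, for each node, the number of triangles that contain it."""
    neighbor_sets = {v: set(ns) for v, ns in adj.items()}
    counts = {}
    for v, ns in neighbor_sets.items():
        # A triangle at v is a pair of v's neighbors that are themselves adjacent.
        counts[v] = sum(
            1 for a, b in combinations(sorted(ns), 2) if b in neighbor_sets[a]
        )
    return counts

# Hypothetical undirected toy graph (edges are an assumption).
adj = {
    "V1": ["V2", "V3"],
    "V2": ["V1", "V3", "V4", "V5"],
    "V3": ["V1", "V2"],
    "V4": ["V2", "V5"],
    "V5": ["V2", "V4"],
}

counts = triangle_counts(adj)
print(counts)
# Each triangle is counted once at each of its three nodes.
print("total triangles:", sum(counts.values()) // 3)
```

In this toy graph V1 and V5 have the same degree and the same triangle count even though they are not adjacent, which is the kind of structural similarity that proximity alone does not capture.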

§ Predict which pairs of nodes are likely to connect
§ Applications: social network analysis, biological networks,
terrorist networks, etc.
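
One simple way to use learned node features for link prediction is to score each candidate pair by the dot product of its two embeddings and rank pairs by score. This is a sketch under assumptions: the slides do not state which scoring function is used, and the embedding values below are made up for illustration.

```python
from itertools import combinations

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def rank_pairs(embeddings):
    """Rank all node pairs by embedding dot product, highest first."""
    pairs = combinations(sorted(embeddings), 2)
    scored = [(dot(embeddings[a], embeddings[b]), a, b) for a, b in pairs]
    scored.sort(key=lambda t: -t[0])
    return [(a, b) for _, a, b in scored]

# Hypothetical 2-dimensional embeddings (illustrative values only).
embeddings = {
    "V1": [1.0, 0.0],
    "V2": [0.8, 0.6],
    "V3": [0.0, 1.0],
}

ranked = rank_pairs(embeddings)
print(ranked)
```

In practice the candidate set would be restricted to pairs that are not already connected, and a trained classifier over pair features is a common alternative to a raw dot product.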

DeepWalk (DW): Perozzi et al. KDD 2014
node2vec (N2V): Grover et al. KDD 2016
LINE: Tang et al. WWW 2015

[Figure: strong scaling results for motif counting, two panels. Speedup (0 to 16) vs. number of processing units (1, 2, 4, 8, 12, 16) on socfb-MIT, bio-dmela, soc-gowalla, tech-RL-caida, and web-wikipedia09. Using an Intel Xeon E5-2687W server, 16 cores.]

§ We propose a generic framework for learning representations
in large attributed graphs
§ Maps nodes based on structural similarity + proximity +
attributes (if any)
§ Learns universal features that can generalize across
networks/graphs
§ Useful for inductive/transfer learning
§ Scalable for large graphs

§ Generalizing other deep graph models
§ Theoretical analysis
§ Choice of mapping functions
§ Impact of sampling strategy
§ Evaluation on other ML tasks

§ Efficient Estimation of Word Representations in Vector Space. ICLR 2013 [Mikolov et al.]
§ A Framework for Generalizing Graph-based Representation Learning Methods. arXiv:1709.04596, 2017 [Ahmed et al.]
§ Role Discovery in Networks. TKDE 2015 [Rossi, Ahmed]
§ A Higher-order Latent Space Network Model. AAAI 2017 [Ahmed, Rossi, Willke, Zhou]
§ node2vec: Scalable Feature Learning for Networks. KDD 2016 [Grover, Leskovec]
§ DeepWalk: Online Learning of Social Representations. KDD 2014 [Perozzi, Al-Rfou, Skiena]
§ Efficient Graphlet Counting for Large Networks. ICDM 2015 [Ahmed et al.]
§ Graphlet Decomposition: Framework, Algorithms, and Applications. J. Know. & Info. 2016 [Ahmed et al.]
§ Network Motifs: Simple Building Blocks of Complex Networks. Science 2002 [Milo et al.]
§ Uncovering Biological Network Function via Graphlet Degree Signatures. Cancer Informatics 2008 [Milenković, Pržulj]
§ Graph Kernels. JMLR 2010 [Vishwanathan et al.]
§ The Structure and Function of Complex Networks. SIAM Review 2003 [Newman]
§ Biological Network Comparison Using Graphlet Degree Distribution. Bioinformatics 2007 [Pržulj]
§ Efficient Graphlet Kernels for Large Graph Comparison. AISTATS 2009 [Shervashidze et al.]
§ Local Structure in Social Networks. Sociological Methodology 1976 [Holland, Leinhardt]
§ The Strength of Weak Ties: A Network Theory Revisited. Sociological Theory 1983 [Granovetter]

Thank you!
Questions?
nesreen.k.ahmed@intel.com
http://nesreenahmed.com
