Indian Institute of Technology, Patna
Graph Coloring Algorithms on Pregel Model using Hadoop
Supervisor
Dr. Rajiv Misra
Candidate
Nishant M Gandhi
Roll No: 1311CS05
March 29, 2015
Contents
• Introduction
• Related Work
• Pregel Graph Coloring Algorithms
◦ Algorithms
◦ Analysis/Result
• Conclusion & Future Work
• References
2 of 38
Nishant M Gandhi, Roll No: 1311CS05 -
Graph Coloring Algorithms on Pregel Model using Hadoop
Introduction
• Challenge:
◦ Graph Coloring (Total Vertex Coloring) of Large Scale Graphs on top
of Hadoop
• Graph Coloring:
◦ G = (V, E) is an undirected graph
◦ V is the set of vertices and E is the set of edges
◦ The problem of graph coloring is to assign a color to each vertex
such that for all (i, j) ∈ E, i and j do not get the same color.
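The definition above can be checked mechanically. A minimal sketch in Python, assuming the graph is given as an edge list and the coloring as a dict from vertex to color number; the function name and the 4-cycle example are illustrative and not from the slides:

```python
def is_proper_coloring(edges, color):
    """Return True if no edge (i, j) has both endpoints in the same color."""
    return all(color[i] != color[j] for i, j in edges)


# Hypothetical example: the 4-cycle 0-1-2-3-0.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
good = {0: 1, 1: 2, 2: 1, 3: 2}   # alternating colors
bad = {0: 1, 1: 1, 2: 2, 3: 2}    # edge (0, 1) is monochromatic

print(is_proper_coloring(edges, good))
print(is_proper_coloring(edges, bad))
```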
Introduction: Applications
• Finding substructure in social network [Cha11]
• Frequency Assignment [RPM05]
• Content Delivery Network
• Distributed Resource Directory Service [Ko06]
Introduction
• Motivation:
◦ The MapReduce model is not well suited to iterative graph computations
such as Graph Coloring; Pregel is better suited to them.
◦ Existing work on Graph Coloring on Pregel mainly demonstrates that
Graph Coloring can be implemented on Pregel. [SW14]
◦ A careful study of different Graph Coloring Algorithms on Pregel is lacking.
Introduction: My Work
• Studies 5 Pregel Graph Coloring Algorithms
◦ Local Maxima First(LMF)
◦ Local Minima-Maxima First(LMMF)
◦ Local Largest Degree First(LLDF)
◦ Local Smallest-Largest Degree First(LSLDF)
◦ Local Incident Degree First(LIDF)
• Apache Giraph is used to implement the algorithms, as it is the more
suitable open-source Pregel-based platform [HDA+14].
• Evaluated the performance of the Pregel Graph Coloring Algorithms on
large real-world graphs on an 8-node Apache Hadoop cluster.
Background
• The minimum number of colors required to properly color a graph is
called the chromatic number of that graph.
• Finding the chromatic number of a graph is a well-known NP-hard
problem. [GJ79]
• The chromatic number is also hard to approximate within any useful
bound. [FK96]
• If the requirement of using the minimum number of colors is relaxed,
many polynomial-time sequential algorithms exist for the graph coloring problem.
• Maximal Independent Set (MIS) algorithms, which can be easily
parallelized, can be used to solve the graph coloring problem in
parallel.
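As an illustration of a polynomial-time sequential algorithm once minimality is relaxed, here is a minimal sketch of first-fit greedy coloring in Python. It is a standard textbook heuristic, not one of the algorithms studied in these slides; it never needs more than (maximum degree + 1) colors. The adjacency-dict input format and the example graph are assumptions of this sketch.

```python
def greedy_coloring(adj):
    """Visit vertices in dict order; give each the smallest color
    number not already used by one of its colored neighbors."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color


# Hypothetical example: a triangle 0-1-2 with a pendant vertex 3 on vertex 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
coloring = greedy_coloring(adj)
print(coloring)  # {0: 1, 1: 2, 2: 3, 3: 1}
```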
Related Work: MapReduce
• Problems with MapReduce Graph Algorithms
◦ Iterative MR-Jobs
◦ High I/O
◦ Not intuitive for Graph Algorithms
• No attempts have been made to design a Graph Coloring Algorithm
with the MapReduce model
• The Pregel model is more suitable for iterative graph computation
than the MapReduce model on top of Hadoop [QWH12]
Related Work: Pregel
• Pregel [MABD10], Graph Processing System
◦ In-memory Computation
◦ Vertex-centric high-level programming model
◦ Batch-oriented processing
◦ Based on Valiant's Bulk Synchronous Parallel Model [Val90]
Related Work: Pregel Model
• Graph G=(V,E); the graph is mutable during execution of the algorithm.
• The computation starts simultaneously in all vertices, and
proceeds in discrete rounds.
• The number of rounds that elapse from the beginning of the
algorithm until its end is called the running time of the algorithm.
• Vertices are allowed to perform unbounded local computations.
• Each vertex can be in either the Active or the Inactive state. Only Active
vertices take part in local computation in each round.
• In each round, each vertex v is allowed to send a message to each
of its neighbors.
• A vertex is allowed to send distinct messages to distinct neighbors.
• The vertices communicate over the edges of E in a synchronous
manner.
Related Work: Pregel Model
• Pregel works in iterations called Supersteps
• Program Flow:
For Superstep Si=S1,S2,S3,...,Sn
◦ For each Active Vertex,
Execute Compute:
• Messages are received
• Local computation
• Messages are Sent
• Graph Mutation
• VoteToHalt
◦ Termination (when both hold):
• All Vertices are in Inactive state
• No Messages are sent
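The program flow above can be sketched as a small single-process simulation in Python. This is only an illustration of the superstep loop under stated assumptions, not Pregel's or Giraph's API: `run_pregel`, the `compute` signature and the `max_value` example are hypothetical names, and graph mutation is left out.

```python
from collections import defaultdict


def run_pregel(adj, values, compute, max_supersteps=100):
    """Call compute(v, value, messages, neighbors, superstep) on every
    vertex that is active or has incoming messages.

    compute returns (new_value, outgoing, halt), where outgoing maps a
    target vertex to one message. A halted vertex is reactivated when a
    message arrives for it.
    """
    active = set(adj)
    inbox = defaultdict(list)
    superstep = 0
    # Stop when every vertex is inactive and no messages are in flight.
    while superstep < max_supersteps and (active or inbox):
        outbox = defaultdict(list)
        for v in sorted(active | set(inbox)):
            new_value, outgoing, halt = compute(
                v, values[v], inbox.get(v, []), adj[v], superstep)
            values[v] = new_value
            for target, msg in outgoing.items():
                outbox[target].append(msg)
            if halt:
                active.discard(v)   # VoteToHalt
            else:
                active.add(v)
        inbox = outbox              # delivered at the next superstep
        superstep += 1
    return values, superstep


def max_value(v, value, messages, neighbors, superstep):
    """Example compute: spread the largest value through the graph."""
    best = max([value] + messages)
    changed = superstep == 0 or best > value
    outgoing = {u: best for u in neighbors} if changed else {}
    return best, outgoing, True     # always vote to halt


adj = {0: [1], 1: [0, 2], 2: [1]}   # path 0-1-2
final, steps = run_pregel(adj, {0: 3, 1: 6, 2: 1}, max_value)
print(final, steps)  # {0: 6, 1: 6, 2: 6} 3
```

Every vertex votes to halt in each superstep and is woken only by incoming messages, so the run ends once no vertex learns a larger value.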
Related Work: Pregel Model
• Vertex
◦ VertexId
◦ VertexValue
• Edge
◦ Target Vertex
◦ Weight
• Vertex State
◦ Active
◦ Inactive
Related Work: Distributed Algorithms
• MIS-based algorithms color the graph by repeatedly finding an
independent set and giving it a new color
• Related MIS-based and ordering-based algorithms
◦ Luby's MIS algorithm [Lub86]
◦ Jones-Plassmann algorithm [JP93]
◦ Welsh-Powell algorithm [WP67]
◦ E G Boman et al. algorithm [BBCG]
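The idea of coloring by repeated independent sets can be sketched as follows. This Python sketch uses random priorities in the spirit of Luby's MIS algorithm and the Jones-Plassmann heuristic; it is a simplified sequential simulation, not a faithful reproduction of either paper, and all function names are hypothetical.

```python
import random


def random_priority_mis(adj, candidates, rng):
    """Build a maximal independent set of `candidates` by rounds:
    a vertex wins a round if its random priority beats all of its
    remaining neighbors; winners and their neighbors then drop out."""
    remaining = set(candidates)
    mis = set()
    while remaining:
        priority = {v: (rng.random(), v) for v in remaining}
        winners = {v for v in remaining
                   if all(priority[v] > priority[u]
                          for u in adj[v] if u in remaining)}
        mis |= winners
        remaining -= winners
        for v in winners:
            remaining -= set(adj[v])
    return mis


def mis_coloring(adj, seed=0):
    """Color the graph by repeatedly extracting an MIS of the
    uncolored vertices and giving it the next color number."""
    rng = random.Random(seed)
    color, uncolored, c = {}, set(adj), 0
    while uncolored:
        c += 1
        mis = random_priority_mis(adj, uncolored, rng)
        for v in mis:
            color[v] = c
        uncolored -= mis
    return color


# Hypothetical example: triangle 0-1-2 with a tail 2-3-4.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
color = mis_coloring(adj)
print(color)
```

Because each color class is maximal among the vertices still uncolored, a vertex colored in round c has a neighbor in every earlier class, so this scheme also stays within (maximum degree + 1) colors.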
Pregel Graph Coloring Algorithms
• Heuristic Approach
• Does not give optimal solutions
• Based on computing a Maximal Independent Set in parallel
• Certain assumptions are made for these algorithms.
◦ The graph is undirected and unweighted
◦ Each vertex has a unique identifier
◦ Each vertex has one storage variable, and the assigned color is stored in
that variable
◦ Instead of colors, we assign numbers to vertices
Pregel Graph Coloring Algorithms: Local Maxima
First(LMF)
• Simple Heuristic Approach
• Uses only the VertexId of each vertex
• Among active vertices, those with the maximum VertexId in their
neighborhood are selected
• Each superstep generates one MIS and colors it
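A minimal sequential simulation of the LMF rule in Python, as one reading of the bullets above. It assumes that only still-active neighbors are compared, that the round number is used as the color, and that colored vertices become inactive; the slides do not spell out these details. The sketch only checks that the selected set is independent, not that it is maximal.

```python
def lmf_coloring(adj):
    """Each round, active vertices whose id exceeds the ids of all
    their active neighbors take the round number as their color."""
    color = {}
    active = set(adj)
    superstep = 0
    while active:
        superstep += 1
        selected = {v for v in active
                    if all(v > u for u in adj[v] if u in active)}
        for v in selected:
            color[v] = superstep
        active -= selected      # colored vertices become inactive
    return color, superstep


# Hypothetical example: the path 1-2-3-4.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
color, rounds = lmf_coloring(adj)
print(color, rounds)  # {4: 1, 3: 2, 2: 3, 1: 4} 4
```

On this path, whose ids increase along the edges, the sketch peels off one vertex per round and ends with four colors, which hints at how sensitive an id-only heuristic is to the vertex numbering.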
Pregel Graph Coloring Algorithms: Local
Minima-Maxima First(LMMF)
• Improvement over LMF
• Uses only the VertexId of each vertex
• Among active vertices, those with the minimum or the maximum
VertexId in their neighborhood are selected
• Each superstep generates one or two MIS and colors them
differently
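A minimal Python sketch of the LMMF rule as read from the bullets above. The color numbering (odd numbers for local maxima, even numbers for local minima) and the handling of a vertex that is both are assumptions of this sketch, not details given in the slides.

```python
def lmmf_coloring(adj):
    """Each round, local maxima and local minima (by id, among active
    neighbors) are colored with two different colors."""
    color = {}
    active = set(adj)
    superstep = 0
    while active:
        superstep += 1
        nbrs = {v: [u for u in adj[v] if u in active] for v in active}
        maxima = {v for v in active if all(v > u for u in nbrs[v])}
        # A vertex with no active neighbors is both a maximum and a
        # minimum; it is counted once, as a maximum.
        minima = {v for v in active if all(v < u for u in nbrs[v])} - maxima
        for v in maxima:
            color[v] = 2 * superstep - 1
        for v in minima:
            color[v] = 2 * superstep
        active -= maxima | minima
    return color, superstep


# Hypothetical example: the path 1-2-3-4.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
color, rounds = lmmf_coloring(adj)
print(color, rounds)  # {4: 1, 1: 2, 3: 3, 2: 4} 2
```

On this path the sketch finishes in two rounds while still using four colors: selecting from both ends shortens the run but does not by itself reduce the color count.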
Pregel Graph Coloring Algorithms: Local Largest
Degree First(LLDF)
• Better heuristic than the previous approaches
• Uses the degree of each vertex
• Each superstep generates one MIS and colors it
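A minimal Python sketch of an LLDF-style round. It assumes the degree is the static degree in the input graph and that ties are broken by vertex id; neither detail is stated in the slides, so both are labeled as assumptions in the code.

```python
def lldf_coloring(adj):
    """Each round, active vertices with the locally largest
    (degree, id) pair take the round number as their color."""
    # Assumption: static degree, ties broken by vertex id.
    priority = {v: (len(adj[v]), v) for v in adj}
    color, active, superstep = {}, set(adj), 0
    while active:
        superstep += 1
        selected = {v for v in active
                    if all(priority[v] > priority[u]
                           for u in adj[v] if u in active)}
        for v in selected:
            color[v] = superstep
        active -= selected
    return color, superstep


# Hypothetical example: the path 1-2-3-4 (degrees 1, 2, 2, 1).
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
color, rounds = lldf_coloring(adj)
print(color, rounds)  # {3: 1, 2: 2, 4: 2, 1: 3} 3
```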
Pregel Graph Coloring Algorithms: Local
Smallest-Largest Degree First(LSLDF)
• Improvement over LLDF
• Uses the degree of each vertex
• Each superstep generates one or two MIS and colors them
differently
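A minimal Python sketch of an LSLDF-style round, under the same assumptions as a degree-based local-maximum rule (static degree, ties broken by vertex id) plus an assumed color numbering: odd numbers for the locally largest set, even numbers for the locally smallest set.

```python
def lsldf_coloring(adj):
    """Each round, vertices with the locally largest and the locally
    smallest (degree, id) pair are colored with two different colors."""
    # Assumption: static degree, ties broken by vertex id.
    priority = {v: (len(adj[v]), v) for v in adj}
    color, active, superstep = {}, set(adj), 0
    while active:
        superstep += 1
        nbrs = {v: [u for u in adj[v] if u in active] for v in active}
        largest = {v for v in active
                   if all(priority[v] > priority[u] for u in nbrs[v])}
        smallest = {v for v in active
                    if all(priority[v] < priority[u] for u in nbrs[v])} - largest
        for v in largest:
            color[v] = 2 * superstep - 1
        for v in smallest:
            color[v] = 2 * superstep
        active -= largest | smallest
    return color, superstep


# Hypothetical example: the path 1-2-3-4 (degrees 1, 2, 2, 1).
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
color, rounds = lsldf_coloring(adj)
print(color, rounds)  # {3: 1, 1: 2, 4: 2, 2: 3} 2
```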
Pregel Graph Coloring Algorithms: Local Incident
Degree First(LIDF)
• Dynamic-ordering-based heuristic
• Uses the incident degree of each vertex
• Every two supersteps generate one MIS and color it
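One possible reading of the LIDF bullets, sketched in Python. It assumes the incident degree of a vertex is the number of its already-colored neighbors, that one superstep recomputes and shares this value and the next one selects local maxima, that ties are broken by vertex id, and that each selected set gets the round number as its color. The slides do not confirm any of these details, so this is a guess at the mechanism rather than the author's algorithm.

```python
def lidf_coloring(adj):
    """Two simulated supersteps per round: (A) recompute incident
    degrees, (B) color the local maxima by (incident degree, id)."""
    color, supersteps = {}, 0
    uncolored = set(adj)
    round_no = 0
    while uncolored:
        round_no += 1
        # Superstep A: count already-colored neighbors.
        incident = {v: sum(1 for u in adj[v] if u in color)
                    for v in uncolored}
        # Superstep B: local maxima among uncolored neighbors are selected.
        selected = {v for v in uncolored
                    if all((incident[v], v) > (incident[u], u)
                           for u in adj[v] if u in uncolored)}
        for v in selected:
            color[v] = round_no
        uncolored -= selected
        supersteps += 2
    return color, supersteps


# Hypothetical example: the path 1-4-2-3.
adj = {1: [4], 4: [1, 2], 2: [4, 3], 3: [2]}
color, steps = lidf_coloring(adj)
print(color, steps)  # {4: 1, 3: 1, 1: 2, 2: 2} 4
```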
Experiments: Cluster Configuration
Parameter                        Details
Number of Nodes                  8
RAM for Each Node                2 GB
Hard Disk for Each Node          100 GB
Operating System for Each Node   Ubuntu Desktop 14.04 (Linux 3.13.0-24-generic)
Hadoop Version                   1.2.1 MR1
Pregel-like System Name          Apache Giraph
Pregel-like System Version       1.2.0
Configured Workers               4 per node
Table: Hadoop Cluster Configuration Details
Experiments: Dataset
Dataset              |V|         |E|          Max. degree
Internet-Topology    1,696,415   11,095,298   35,455
Youtube              1,138,499   2,990,443    28,754
Texas Road Network   1,379,917   1,921,660    12
Flickr               1,715,255   22,613,981   27,236
Table: Real World Datasets from Stanford Network Analysis Platform
Experiments & Result: Performance on Color
Colors Used   Internet-Topology   Youtube   Texas Road Network   Flickr
LMF           1586                704       344                  4303
LMMF          1587                705       345                  4303
LLDF          484                 261       123                  1667
LSLDF         478                 267       139                  1653
LIDF          648                 283       19                   3133
Table: Colors Used by Different Graph Coloring Algorithms on Different
Datasets
• LLDF & LSLDF use fewer colors than the others on every dataset except the
Texas Road Network, and are very close to each other.
• LMF & LMMF perform almost identically and worse than the others.
• LIDF performs better than LMF and LMMF and, except on the Texas Road
Network where it uses the fewest colors (19), worse than LLDF and LSLDF.
Experiments & Result: Run Time (second)
Run Time (s)   Internet-Topology   Youtube   Texas Road Network   Flickr
LMF            2700                407       66                   648122
LMMF           2460                233       49                   218556
LLDF           1783                350       47                   217031
LSLDF          1380                94        44                   2113
LIDF           2597                343       51                   1080588
Table: Time (in seconds) taken by Different Graph Coloring Algorithms
Experiments & Result: Supersteps
Supersteps   Internet-Topology   Youtube   Texas Road Network   Flickr
LMF          1587                705       345                  4304
LMMF         794                 353       173                  2153
LLDF         485                 262       124                  1667
LSLDF        241                 135       120                  827
LIDF         1293                567       39                   6267
Table: Supersteps taken by Different Graph Coloring Algorithms
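The superstep counts can be related to the reported color counts with a little arithmetic. The Python sketch below divides supersteps by colors used for each algorithm and dataset; the numbers are copied from the two result tables in these slides, while the variable names are illustrative.

```python
# Columns follow the table order: Internet-Topology, Youtube,
# Texas Road Network, and the fourth (Flickr) dataset.
colors = {
    "LMF":   [1586, 704, 344, 4303],
    "LMMF":  [1587, 705, 345, 4303],
    "LLDF":  [484, 261, 123, 1667],
    "LSLDF": [478, 267, 139, 1653],
    "LIDF":  [648, 283, 19, 3133],
}
supersteps = {
    "LMF":   [1587, 705, 345, 4304],
    "LMMF":  [794, 353, 173, 2153],
    "LLDF":  [485, 262, 124, 1667],
    "LSLDF": [241, 135, 120, 827],
    "LIDF":  [1293, 567, 39, 6267],
}

# Supersteps needed per color used, rounded to two decimals.
ratio = {alg: [round(s / c, 2) for s, c in zip(supersteps[alg], colors[alg])]
         for alg in colors}
for alg, row in ratio.items():
    print(f"{alg:6s}", row)
```

The ratios come out near 1 for LMF and LLDF, near 0.5 for LMMF and LSLDF (with the Texas Road Network as an outlier for LSLDF at 0.86), and near 2 for LIDF, which matches the one, one-or-two, and two-supersteps-per-MIS descriptions of the algorithms.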
Conclusion
• Effective Graph Coloring is possible using various heuristics with
Pregel on Hadoop
• Among the algorithms presented, LLDF performs best on the metric
of colors used in most cases for social network graphs.
• LSLDF comes out as the overall best performer in terms of time and
colors used.
• LMF & LMMF are not good approaches to coloring graphs in general.
• LIDF performs best on sparse graphs but takes more time than
the others.
Future Work
• Graph coloring algorithms with performance guarantees on Pregel
• Custom graph partitioning for performance tuning
References (1)
[BBCG] Erik G Boman, Doruk Bozdağ, Umit Catalyurek, and Gebremedhin, A scalable parallel graph
coloring algorithm for distributed memory computers, Euro-Par 2005 Parallel Processing,
Springer, pp. 241–251.
[Cha11] David Chalupa, On the ability of graph coloring heuristics to find substructures in social
networks, Information Sciences and Technologies, Bulletin of ACM Slovakia 3 (2011), no. 2,
51–54.
[FK96] Uriel Feige and Joe Kilian, Zero knowledge and the chromatic number, Computational
Complexity, 1996. Proceedings., Eleventh Annual IEEE Conference on, IEEE, 1996,
pp. 278–287.
[GJ79] M R Garey and D S Johnson, Computers and intractability, Freeman (1979).
[HDA+14] Minyang Han, Khuzaima Daudjee, Khaled Ammar, M Tamer Ozsu, Xingfang Wang, and Tianqi
Jin, An experimental comparison of pregel-like graph processing systems, Proceedings of the
VLDB Endowment 7 (2014), no. 12, 1047–1058.
References (2)
[JP93] Mark T Jones and Paul E Plassmann, A parallel graph coloring heuristic, SIAM Journal on
Scientific Computing 14 (1993), no. 3, 654–669.
[Ko06] Bong Jun Ko, Distributed, self-organizing replica placement in large scale networks, Columbia
University, 2006.
[Lub86] Michael Luby, A simple parallel algorithm for the maximal independent set problem, SIAM
Journal on Computing 15 (1986), no. 4, 1036–1053.
[MABD10] Grzegorz Malewicz, Matthew H Austern, Aart JC Bik, and Dehnert, Pregel: a system for
large-scale graph processing, Proceedings of the 2010 ACM SIGMOD International Conference
on Management of data, ACM, 2010, pp. 135–146.
[QWH12] Louise Quick, Paul Wilkinson, and David Hardcastle, Using pregel-like large scale graph
processing frameworks for social network analysis, Proceedings of the 2012 International
Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012), IEEE
Computer Society, 2012, pp. 457–463.
References (3)
[RPM05] Janne Riihijärvi, Marina Petrova, and Petri Mähönen, Frequency allocation for WLANs using
graph colouring techniques, WONS, vol. 5, 2005, pp. 216–222.
[SW14] Semih Salihoglu and Jennifer Widom, Optimizing graph algorithms on pregel-like systems.
[Val90] Leslie G Valiant, A bridging model for parallel computation, Communications of the ACM 33
(1990), no. 8, 103–111.
[WP67] Dominic JA Welsh and Martin B Powell, An upper bound for the chromatic number of a graph
and its application to timetabling problems, The Computer Journal 10 (1967), no. 1, 85–86.
Thank You

Graph Coloring Algorithms on Pregel Model using Hadoop

  • 1. Indian Institute of Technology, Patna Graph Coloring Algorithms on Pregel Model using Hadoop Supervisor Dr. Rajiv Misra Candidate Nishant M Gandhi Roll No: 1311CS05 March 29, 2015
  • 2. Contents • Introduction • Related Work • Pregel Graph Coloring Algorithms ◦ Algorithms ◦ Analysis/Result • Conclusion & Future Work • References 2 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 3. Introduction • Challange: ◦ Graph Coloring (Total Vertex Coloring) of Large Scale Graph on top of Hadoop • Graph Coloring: ◦ G = (V , E) undirected graph ◦ V is set of vertices and E is set of edges ◦ The problem of graph coloring is to assign color to each vertex such that for all (i, j) ∈ E; i and j does not get same color. 3 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 4. Introduction: Applications • Finding substructure in social network [Cha11] • Frequency Assignment [RPM05] • Content Delivery Network • Distibuted Resource Directory Service [Ko06] 4 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 5. Introduction • Motivation: ◦ MapReduce model is not suitable for iterative graph computation such as Graph Coloring. Pregel is more suitable for that. ◦ Existing work on Graph Coloring Algorithms on Pregel are like demonstration of Graph Coloring can also be implemented on Pregel. [SW14] ◦ Lack careful study of different Graph Coloring Algorithms on Pregel. 5 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 6. Introduction: My Work • Studies 5 Pregel Graph Coloring Algorithms ◦ Local Maxima First(LMF) ◦ Local Minima-Maxima First(LMMF) ◦ Local Largest Degree First(LLDF) ◦ Local Smallest-Largest Degree First(LSLDF) ◦ Local Incident Degree First(LIDF) • Being more suitable Pregel based open source platform [HDA+14], Apache Giraph is used to implement algorithms. • Evaluated performace of Pregel Graph Coloring Algorithms with large real-world graphs on 8 node Apache Hadoop cluster. 6 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 7. Background • Minimum number required to properly color graph is called chromatic number of that graph. • Finding chromatic number of a graph is well known NP-Hard Problem. [GJ79] • It is not possible to approximate chromatic number into considerable bound. [FK96] • Relax chromatic number and many polynomial time sequential algorithm exist for simple graph coloring problem. • Maximal Independent Set(MIS) algorithms, which can be easly parallelized can be used for solving graph coloring problem in parallel. 7 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 8. Related Work: MapReduce • Problem with MapReduce Graph Algorithm ◦ Iterative MR-Jobs ◦ High I/O ◦ Not intuitive for Graph Algorithm • No attempts are made in designing Graph Coloring Algorithm with MapReduce model • Pregel model is more suitable for iterative graph computation than MapReduce model on top of Hadoop [QWH12] 8 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 9. Related Work: Pregel • Pregel [MABD10], Graph Processing System ◦ In-memory Computation ◦ Vertex-Centic High-level programing model ◦ Batch oriented processing ◦ Based on Valient’s Bulk Synchronization Parallel Model [Val90] 9 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 10. Related Work: Pregel Model • Graph G=(V,E), Graph is mutable during execution of Algorithm. • The computation starts simultaneously in all vertices, and proceeds in discrete rounds. • The number of rounds that elapse from the beginning of the algorithm until its end is called the running time of the algorithm. • Vertices are allowed to perform unbounded local computations. • Each Vertex can be in either Active or Inactive State. Only Active vertices in each round take part in local computation. • In each round, each vertex v is allowed to send message to each of its neighbors. • A vertex is allowed to send distict messages to distict neighbor. • The vertices communicate over the edges of E in the synchronous manner. 10 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 11. Related Work: Pregel Model • Pregel works in iterations called Supersteps • Program Flow: For Superstep Si=S1,S2,S3,...,Sn ◦ For each Active Vertex, Execute Compute: • Messages are received • Local computation • Messages are Sent • Graph Mutation • VoteToHalt ◦ Termination: • All Vertices are in Inactive state • No Messages are sent 11 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 12. Related Work: Pregel Model • Vertex ◦ VertexId ◦ VertexValue • Edge ◦ Target Vertex ◦ Weight • Vertex State ◦ Active ◦ Inactive 12 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 13. Related Work: Distributed Algorithms • MIS algorithms colors the graph by repeatedly finding Independent Set • Randomized Algorithms to find MIS ◦ Luby’s MIS algorithm [Lub86] ◦ Jones-Placement algorithm [JP93] ◦ Welsh-Powell algorithm [WP67] ◦ E G Boman et al. algorithm [BBCG] 13 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 14. Pregel Graph Coloring Algorithms • Heuristic Approach • Does not give optimal solutions • Based on computing Maximul Independent Set in parallel • Certain assuptions are made for this algorithms. ◦ Graph is undirected and unweighted ◦ Each vertex has unique identifier ◦ Each vertex has one storage variable and assigned color is stored in that variable ◦ Instead of color, we assign number to vertices 14 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 15. Pregel Graph Coloring Algorithms: Local Maxima First(LMF) • Simple Heuristic Approach • Use only VertexId of Vertex • Among Active Vertices, Vertices with maximum VertexId in neighbors are selected • Each Supersteps generate one MIS and color it 15 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 16. Pregel Graph Coloring Algorithms: Local Maxima First(LMF) 16 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 17. Pregel Graph Coloring Algorithms: Local Minima-Maxima First(LMMF) • Improvement over LMF • Use only VertexId of Vertex • Among Active Vertices, Vertices with minimum and maximum VertexId in neighbors are selected • Each Supersteps generate one or two MIS and color them differently 17 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 18. Pregel Graph Coloring Algorithms: Local Minima-Maxima First(LMMF) 18 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 19. Pregel Graph Coloring Algorithms: Local Largest Degree First(LLDF) • Better Heuristic than previous approch • Use Degree of a Vertex • Each Supersteps generate one MIS and color it 19 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 20. Pregel Graph Coloring Algorithms: Local Largest Degree First(LLDF) 20 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 21. Pregel Graph Coloring Algorithms: Local Smallest Largest Degree First(LSLDF) • Improvement over LLDF • Use Degree of a Vertex • Each Supersteps generate one or two MIS and color them differently 21 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 22. Pregel Graph Coloring Algorithms: Local Smallest Largest Degree First(LSLDF) 22 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 23. Pregel Graph Coloring Algorithms: Local Smallest Largest Degree First(LSLDF) 23 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 24. Pregel Graph Coloring Algorithms: Local Incident Degree First(LIDF) • Dynemic Ordering based Heuristic • Use Incident Degree of a Vertex • Two Supersteps generate one MIS and color it 24 of 38 Nishant M Gandhi, Roll No: 1311CS05 - Graph Coloring Algorithms on Pregel Model using Hadoop
  • 25. Pregel Graph Coloring Algorithms: Local Incident Degree First (LIDF)
  • 26. Pregel Graph Coloring Algorithms: Local Incident Degree First (LIDF)
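A rough sketch of the LIDF idea follows, taking incident degree to mean the number of already-colored neighbors (the usual definition, not spelled out on the slide). Several details are assumptions because the slides do not state them: ties are broken by vertex id, each round is counted as two supersteps (one to select and color, one to propagate updated incident degrees), and `lidf_coloring` is a hypothetical name.

```python
def lidf_coloring(adj):
    """Simulate LIDF. adj maps vertex id -> set of neighbor ids (undirected)."""
    incident = {v: 0 for v in adj}   # number of already-colored neighbors
    color = {}
    active = set(adj)
    rounds = 0
    while active:
        # Phase 1: select vertices whose (incident degree, id) beats all
        # still-active neighbors, and give them this round's color.
        winners = [
            v for v in active
            if all((incident[v], v) > (incident[u], u)
                   for u in adj[v] if u in active)
        ]
        for v in winners:
            color[v] = rounds
        active.difference_update(winners)
        # Phase 2: uncolored neighbors of the winners update their incident degree.
        for v in winners:
            for u in adj[v]:
                if u in active:
                    incident[u] += 1
        rounds += 1
    return color, 2 * rounds         # assumed: two supersteps per round


# Hypothetical input: the path 1 - 2 - 3 - 4
lidf_path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
lidf_colors, lidf_supersteps = lidf_coloring(lidf_path)
```

Because the ordering changes after every round, the priority is recomputed from `incident` each time rather than fixed up front as in the degree-based sketches.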
  • 27. Experiments: Cluster Configuration
    Parameter                       Details
    Number of Nodes                 8
    RAM for Each Node               2 GB
    Hard Disk for Each Node         100 GB
    Operating System for Each Node  Ubuntu Desktop 14.04 (Linux 3.13.0-24-generic)
    Hadoop Version                  1.2.1 MR1
    Pregel-like System Name         Apache Giraph
    Pregel-like System Version      1.2.0
    Configured Workers              4 per node
    Table: Hadoop Cluster Configuration Details
  • 28. Experiments: Dataset
    Dataset             |V|        |E|         (header missing)
    Internet-Topology   1,696,415  11,095,298  35,455
    Youtube             1,138,499  2,990,443   28,754
    Texas Road Network  1,379,917  1,921,660   12
    Flickr              1,715,255  22,613,981  27,236
    Table: Real World Datasets from Stanford Network Analysis Platform
  • 29. Experiments & Result: Performance on Color
    Colors Used  Internet-Topology  Youtube  Texas Road Network  Flickr
    LMF          1586               704      344                 4303
    LMMF         1587               705      345                 4303
    LLDF         484                261      123                 1667
    LSLDF        478                267      139                 1653
    LIDF         648                283      19                  3133
    Table: Colors Used by Different Graph Coloring Algorithms on Different Datasets
    ◦ LLDF and LSLDF perform better than the others and are very close to each other.
    ◦ LMF and LMMF perform equally poorly, worse than all the others.
    ◦ LIDF performs better than LMF and LMMF but worse than LLDF and LSLDF, except on the Texas Road Network, where it uses the fewest colors (19).
  • 30. Experiments & Result: Run Time (seconds)
    Run Time  Internet-Topology  Youtube  Texas Road Network  Flickr
    LMF       2700               407      66                  648122
    LMMF      2460               233      49                  218556
    LLDF      1783               350      47                  217031
    LSLDF     1380               94       44                  2113
    LIDF      2597               343      51                  1080588
    Table: Time (in seconds) taken by Different Graph Coloring Algorithms
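The run-time table can be read programmatically to find the fastest algorithm per dataset. The snippet below only re-enters the numbers reported on the slide; the variable names are illustrative.

```python
datasets = ["Internet-Topology", "Youtube", "Texas Road Network", "Flickr"]

# Run times in seconds, copied from the slide (one entry per dataset, same order).
run_time = {
    "LMF":   [2700, 407, 66, 648122],
    "LMMF":  [2460, 233, 49, 218556],
    "LLDF":  [1783, 350, 47, 217031],
    "LSLDF": [1380, 94, 44, 2113],
    "LIDF":  [2597, 343, 51, 1080588],
}

# For each dataset, pick the algorithm with the smallest reported run time.
fastest = {
    name: min(run_time, key=lambda alg: run_time[alg][i])
    for i, name in enumerate(datasets)
}

for name, alg in fastest.items():
    print(f"{name}: {alg}")
```

With the reported figures, LSLDF has the smallest run time on all four datasets.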
  • 31. Experiments & Result: Supersteps
    Supersteps  Internet-Topology  Youtube  Texas Road Network  Flickr
    LMF         1587               705      345                 4304
    LMMF        794                353      173                 2153
    LLDF        485                262      124                 1667
    LSLDF       241                135      120                 827
    LIDF        1293               567      39                  6267
    Table: Supersteps taken by Different Graph Coloring Algorithms
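Dividing the reported supersteps by the reported colors gives a quick consistency check against the algorithm descriptions (one color per superstep, two colors per superstep, or one color per two supersteps). The snippet below uses only the numbers from the two tables; the variable names are illustrative.

```python
# Colors used and supersteps taken, copied from the slides.
# Column order: Internet-Topology, Youtube, Texas Road Network, Flickr.
colors = {
    "LMF":   [1586, 704, 344, 4303],
    "LMMF":  [1587, 705, 345, 4303],
    "LLDF":  [484, 261, 123, 1667],
    "LSLDF": [478, 267, 139, 1653],
    "LIDF":  [648, 283, 19, 3133],
}
supersteps = {
    "LMF":   [1587, 705, 345, 4304],
    "LMMF":  [794, 353, 173, 2153],
    "LLDF":  [485, 262, 124, 1667],
    "LSLDF": [241, 135, 120, 827],
    "LIDF":  [1293, 567, 39, 6267],
}

# Supersteps per color, per algorithm and dataset.
steps_per_color = {
    alg: [s / c for s, c in zip(supersteps[alg], colors[alg])]
    for alg in colors
}

for alg, ratios in steps_per_color.items():
    print(alg, [round(r, 2) for r in ratios])
```

The ratios are close to 1 for LMF and LLDF, close to 0.5 for LMMF, and close to 2 for LIDF. LSLDF is close to 0.5 on three datasets but noticeably higher on the Texas Road Network, which suggests that many supersteps there produced only one set rather than two.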
  • 32. Conclusion
    ◦ Effective graph coloring is possible using various heuristics with Pregel on Hadoop.
    ◦ Among the algorithms presented, LLDF performs best on the metric of colors used in most of the social network graphs.
    ◦ LSLDF comes out as the best overall performer in terms of time and colors used.
    ◦ LMF and LMMF are not good approaches to coloring graphs in general.
    ◦ LIDF performs best on sparse graphs but takes more time than the others.
  • 33. Future Work
    ◦ Graph coloring algorithms with performance guarantees on Pregel
    ◦ Custom graph partitioning for performance tuning
  • 35. References (1)
    ◦ Erik G Boman, Doruk Bozdağ, Umit Catalyurek, and Gebremedhin, A scalable parallel graph coloring algorithm for distributed memory computers, Euro-Par 2005 Parallel Processing, Springer, pp. 241–251.
    ◦ David Chalupa, On the ability of graph coloring heuristics to find substructures in social networks, Information Sciences and Technologies, Bulletin of ACM Slovakia 3 (2011), no. 2, 51–54.
    ◦ Uriel Feige and Joe Kilian, Zero knowledge and the chromatic number, Computational Complexity, 1996. Proceedings., Eleventh Annual IEEE Conference on, IEEE, 1996, pp. 278–287.
    ◦ M R Garey and D S Johnson, Computers and intractability, Freeman (1979).
    ◦ Minyang Han, Khuzaima Daudjee, Khaled Ammar, M Tamer Ozsu, Xingfang Wang, and Tianqi Jin, An experimental comparison of pregel-like graph processing systems, Proceedings of the VLDB Endowment 7 (2014), no. 12, 1047–1058.
  • 36. References (2)
    ◦ Mark T Jones and Paul E Plassmann, A parallel graph coloring heuristic, SIAM Journal on Scientific Computing 14 (1993), no. 3, 654–669.
    ◦ Bong Jun Ko, Distributed, self-organizing replica placement in large scale networks, Columbia University, 2006.
    ◦ Michael Luby, A simple parallel algorithm for the maximal independent set problem, SIAM Journal on Computing 15 (1986), no. 4, 1036–1053.
    ◦ Grzegorz Malewicz, Matthew H Austern, Aart JC Bik, and Dehnert, Pregel: a system for large-scale graph processing, Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, ACM, 2010, pp. 135–146.
    ◦ Louise Quick, Paul Wilkinson, and David Hardcastle, Using pregel-like large scale graph processing frameworks for social network analysis, Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012), IEEE Computer Society, 2012, pp. 457–463.
  • 37. References (3)
    ◦ Janne Riihijärvi, Marina Petrova, and Petri Mähönen, Frequency allocation for WLANs using graph colouring techniques, WONS, vol. 5, 2005, pp. 216–222.
    ◦ Semih Salihoglu and Jennifer Widom, Optimizing graph algorithms on pregel-like systems.
    ◦ Leslie G Valiant, A bridging model for parallel computation, Communications of the ACM 33 (1990), no. 8, 103–111.
    ◦ Dominic JA Welsh and Martin B Powell, An upper bound for the chromatic number of a graph and its application to timetabling problems, The Computer Journal 10 (1967), no. 1, 85–86.
  • 38. Thank You