Parallel Algorithm Models
Martin Coronel
The data-parallel model is one of the simplest algorithm models. In this model, the tasks are statically or semi-statically mapped onto processes, and each task performs similar operations on different data. Parallelism that results from identical operations being applied concurrently to different data items is called data parallelism. The work may be done in phases, and the data operated upon in different phases may be different. Typically, data-parallel computation phases are interspersed with interactions to synchronize the tasks or to get fresh data to the tasks. Since all tasks perform similar computations, the decomposition of the problem into tasks is usually based on data partitioning, because a uniform partitioning of data followed by a static mapping is sufficient to guarantee load balance.
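As a concrete illustration, here is a minimal data-parallel sketch in Python (the partitioning scheme and all names are ours, not from the text): the data is uniformly partitioned, and every process applies the same operation to its own chunk, so a static mapping balances the load.

from multiprocessing import Pool

def process_chunk(chunk):
    # Identical operation applied concurrently to different data items.
    return [x * x for x in chunk]

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_workers = 4
    # Uniform partitioning of the data, one chunk per process.
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with Pool(n_workers) as pool:
        # Static mapping: each chunk is handled by one worker process.
        results = pool.map(process_chunk, chunks)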
Data-parallel algorithms can be implemented in both the shared-address-space and message-passing paradigms. However, the partitioned address space in a message-passing paradigm may allow better control of placement, and thus may offer a better handle on locality. On the other hand, a shared address space can ease the programming effort, especially if the distribution of data differs in different phases of the algorithm. Interaction overheads in the data-parallel model can be minimized by choosing a locality-preserving decomposition and, if applicable, by overlapping computation and interaction and by using optimized collective interaction routines. A key characteristic of data-parallel problems is that for most problems the degree of data parallelism increases with the size of the problem, making it possible to use more processes to effectively solve larger problems.
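For instance, an optimized collective interaction routine in the message-passing paradigm might look like the following hedged sketch (it assumes mpi4py and an MPI runtime are available; run with something like mpiexec -n 4 python script.py):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each process computes on its own partition of the data...
local_sum = sum(range(rank * 1000, (rank + 1) * 1000))

# ...and one optimized collective call combines the partial results,
# typically cheaper than hand-rolled point-to-point exchanges.
total = comm.allreduce(local_sum, op=MPI.SUM)
if rank == 0:
    print("global sum:", total)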
 
The computations in any parallel algorithm can be viewed as a task-dependency graph. The task-dependency graph may be either trivial, as in the case of matrix multiplication, or nontrivial (Problem 3.5). However, in certain parallel algorithms, the task-dependency graph is explicitly used in mapping. In the task graph model, the interrelationships among the tasks are utilized to promote locality or to reduce interaction costs. This model is typically employed to solve problems in which the amount of data associated with the tasks is large relative to the amount of computation associated with them. Usually, tasks are mapped statically to help optimize the cost of data movement among tasks. Sometimes a decentralized dynamic mapping may be used, but even then, the mapping uses information about the task-dependency graph structure and the interaction pattern of tasks to minimize interaction overhead. Work is more easily shared in paradigms with a globally addressable space, but mechanisms are available to share work in a disjoint address space. Typical interaction-reducing techniques applicable to this model include reducing the volume and frequency of interaction by promoting locality while mapping the tasks based on their interaction pattern, and using asynchronous interaction methods to overlap the interaction with computation.
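A minimal scheduler for the task graph model might look like this sketch (our own construction, assuming an acyclic graph; the graph and task bodies are illustrative): a task becomes ready once all of its predecessors have finished, and ready tasks run concurrently.

from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

# Task-dependency graph: task -> set of tasks it depends on (assumed acyclic).
deps = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}

def run_task(name):
    print("running", name)

def execute(deps):
    remaining = {t: set(d) for t, d in deps.items()}
    futures = {}
    with ThreadPoolExecutor() as pool:
        while remaining or futures:
            # Submit every task whose dependencies are all satisfied.
            for t in [t for t, d in remaining.items() if not d]:
                futures[pool.submit(run_task, t)] = t
                del remaining[t]
            done, _ = wait(futures, return_when=FIRST_COMPLETED)
            for f in done:
                finished = futures.pop(f)
                for d in remaining.values():
                    d.discard(finished)  # unblock successors of the finished task

execute(deps)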
The work pool or task pool model is characterized by a dynamic mapping of tasks onto processes for load balancing, in which any task may potentially be performed by any process. There is no desired premapping of tasks onto processes. The mapping may be centralized or decentralized. Pointers to the tasks may be stored in a physically shared list, priority queue, hash table, or tree, or they could be stored in a physically distributed data structure. The work may be statically available at the beginning, or could be dynamically generated; i.e., the processes may generate work and add it to the global (possibly distributed) work pool. If the work is generated dynamically and a decentralized mapping is used, then a termination detection algorithm (Section 11.4.4) is required so that all processes can actually detect the completion of the entire program (i.e., the exhaustion of all potential tasks) and stop looking for more work. In the message-passing paradigm, the work pool model is typically used when the amount of data associated with tasks is relatively small compared to the computation associated with them. As a result, tasks can readily be moved around without causing too much data interaction overhead. The granularity of the tasks can be adjusted to attain the desired tradeoff between load imbalance and the overhead of accessing the work pool to add and extract tasks.
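A minimal work-pool sketch (illustrative, with names of our choosing): pointers to tasks live in a physically shared queue, and any worker may execute any task. The work here is statically available up front; with dynamically generated work, a real termination-detection algorithm would have to replace the simple poison-pill scheme below.

import multiprocessing as mp

def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is None:              # poison pill: no more work
            break
        results.put(item * item)      # the computation for this task

if __name__ == "__main__":
    tasks, results = mp.Queue(), mp.Queue()
    n_workers = 4
    for i in range(100):
        tasks.put(i)                  # work statically available at the start
    for _ in range(n_workers):
        tasks.put(None)               # one pill per worker
    procs = [mp.Process(target=worker, args=(tasks, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    out = [results.get() for _ in range(100)]
    for p in procs:
        p.join()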
In the master-slave or manager-worker model, one or more master processes generate work and allocate it to worker processes. The tasks may be allocated a priori if the manager can estimate the size of the tasks or if a random mapping can do an adequate job of load balancing. In another scenario, workers are assigned smaller pieces of work at different times. The latter scheme is preferred if it is time consuming for the master to generate work, making it undesirable to have all workers wait until the master has generated all work pieces. In some cases, work may need to be performed in phases, and work in each phase must finish before work in the next phase can be generated. In this case, the manager may cause all workers to synchronize after each phase. Usually, there is no desired premapping of work to processes, and any worker can do any job assigned to it. The manager-worker model can be generalized to the hierarchical or multi-level manager-worker model, in which the top-level manager feeds large chunks of tasks to second-level managers, who further subdivide the tasks among their own workers and may perform part of the work themselves. This model is generally equally suitable to the shared-address-space and message-passing paradigms, since the interaction is naturally two-way; i.e., the manager knows that it needs to give out work and workers know that they need to get work from the manager.
While using the master-slave model, care should be taken to ensure that the master does not become a bottleneck, which may happen if the tasks are too small (or the workers are relatively fast). The granularity of tasks should be chosen such that the cost of doing work dominates the cost of transferring work and the cost of synchronization. Asynchronous interaction may help overlap interaction and the computation associated with work generation by the master. It may also reduce waiting times if the nature of requests from workers is non-deterministic.
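One way to see this granularity tradeoff is the chunksize parameter of a process pool, as in this hedged Python sketch (the pool plays the manager; the work function and the chunk size of 64 are our illustrative choices, not prescribed by the text):

from multiprocessing import Pool

def work(item):
    return item * item  # stand-in for a real piece of work

if __name__ == "__main__":
    with Pool(4) as pool:
        # Small chunksize: better load balance, but more traffic to the manager.
        # Large chunksize: fewer interactions, but risk of imbalance at the tail.
        results = list(pool.imap_unordered(work, range(10_000), chunksize=64))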
In the pipeline model, a stream of data is passed on through a succession of processes, each of which performs some task on it. This simultaneous execution of different programs on a data stream is called stream parallelism. With the exception of the process initiating the pipeline, the arrival of new data triggers the execution of a new task by a process in the pipeline. The processes could form such pipelines in the shape of linear or multidimensional arrays, trees, or general graphs with or without cycles. A pipeline is a chain of producers and consumers: each process in the pipeline can be viewed as a consumer of a sequence of data items from the process preceding it and as a producer of data for the process following it. The pipeline need not be a linear chain; it can be a directed graph. The pipeline model usually involves a static mapping of tasks onto processes. Load balancing is a function of task granularity. The larger the granularity, the longer it takes to fill up the pipeline, i.e., for the trigger produced by the first process in the chain to propagate to the last process, thereby keeping some of the processes waiting. However, too fine a granularity may increase interaction overheads, because processes will need to interact to receive fresh data after smaller pieces of computation. The most common interaction-reduction technique applicable to this model is overlapping interaction with computation.
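An illustrative two-stage linear pipeline (our own construction, here with threads rather than processes): each stage consumes from the stage before it and produces for the stage after it, and the arrival of a new item triggers the next task.

import threading, queue

def stage(fn, inbox, outbox):
    while True:
        item = inbox.get()
        if item is None:              # propagate end-of-stream down the chain
            outbox.put(None)
            break
        outbox.put(fn(item))          # new data triggers this stage's task

q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=stage, args=(lambda x: x + 1, q0, q1)).start()
threading.Thread(target=stage, args=(lambda x: x * 2, q1, q2)).start()

for x in range(5):
    q0.put(x)                         # the stream of data entering the pipeline
q0.put(None)

while (item := q2.get()) is not None:
    print(item)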
In some cases, more than one model may be applicable to the problem at hand, resulting in a hybrid algorithm model. A hybrid model may be composed either of multiple models applied hierarchically or multiple models applied sequentially to different phases of a parallel algorithm. In some cases, an algorithm formulation may have characteristics of more than one algorithm model. For instance, data may flow in a pipelined manner in a pattern guided by a task-dependency graph. In another scenario, the major computation may be described by a task-dependency graph, but each node of the graph may represent a supertask comprising multiple subtasks that may be suitable for data-parallel or pipelined parallelism.
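A hedged sketch of the last scenario (entirely our own construction): the outer structure is a two-node task-dependency graph, and each node is a supertask whose subtasks run data-parallel.

from multiprocessing import Pool

def supertask_a(data, pool):
    # Node A of the task graph: its subtasks run data-parallel.
    return pool.map(abs, data)

def supertask_b(data, pool):
    # Node B depends on A's output; its subtasks also run data-parallel.
    return pool.map(str, data)

if __name__ == "__main__":
    with Pool(4) as pool:
        a_out = supertask_a(range(-8, 8), pool)
        b_out = supertask_b(a_out, pool)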
Grama, A., Gupta, A., Karypis, G., and Kumar, V. Introduction to Parallel Computing, 2nd Edition. Pearson Addison-Wesley, 2003.