Parallel and Distributed
Computing
CST342-3
Vajira Thambawita
Learning Outcomes
At the end of the course, the students will be able to:
• define parallel algorithms
• recognize parallel speedup and performance analysis
• identify task decomposition techniques
• perform parallel programming
• apply acceleration strategies for algorithms
Contents
• Sequential computing, history of parallel computation, Flynn's
taxonomy, processes, threads, pipelines, parallel models, shared
memory (UMA, NUMA, CC-UMA), ring, mesh and hypercube topologies,
cost and complexity analysis of interconnection networks, task
partitioning, data decomposition, task mapping, tasks and
decomposition, processes and mapping, processes versus
processors, granularity, processing elements, speedup, efficiency,
overhead, practicals, introduction to the Pthreads library, CUDA programs,
MPICH, introduction to distributed computing, centralized systems,
comparison, minicomputer and workstation models, process pool,
analysis, distributed OS, remote procedure call (RPC), Sun RPC,
distributed resource management, fault tolerance
References
• Grama, A., Gupta, A., Karypis, G. and Kumar, V., 2003, Introduction to
Parallel Computing, 2nd Edition, Addison Wesley
Optional References:
• CUDA Toolkit Documentation
• Introduction to Parallel Computing, Second Edition By Ananth Grama,
Anshul Gupta, George Karypis, Vipin Kumar
• Programming on Parallel Machines, Norm Matloff
• Introduction to High Performance Computing for Scientists and
Engineers, Georg Hager, Gerhard Wellein
Evaluation
• Continuous Assessment:
• 60% - Lab assignments, tutorials, quizzes
• End Semester Examination:
• 40% - 2-hour or 3-hour paper
Prerequisite knowledge
• Data structures and algorithms
• C programming
History of computing
Four decades of computing
• Batch Era
• Time sharing Era
• Desktop Era
• Network Era
Batch era
• Batch processing is the execution of a series of programs on a computer
without manual intervention
• The term originated in the days when users entered
programs on punch cards
Time-sharing Era
• Time-sharing is the sharing of a computing
resource among many users by means of
multiprogramming and multitasking
• The goal was to develop systems that supported multiple
users at the same time
Desktop Era
• Personal Computers (PCs)
• Connected through wide area networks (WANs)
Network Era
• Systems with:
• Shared memory
• Distributed memory
• Examples of parallel computers: Intel iPSC, nCUBE
Flynn's taxonomy of computer
architecture
Two types of information flow into a processor:
• Instructions
• Data
What are instructions and data?
Flynn's taxonomy of computer
architecture
1. single-instruction single-data streams (SISD)
2. single-instruction multiple-data streams (SIMD)
3. multiple-instruction single-data streams (MISD)
4. multiple-instruction multiple-data streams (MIMD)
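The difference between the first two classes can be sketched with a toy model. This is only an illustration, not how real hardware is programmed: the function names and the `lanes` parameter are hypothetical, and "instructions" are simply counted in software. The SISD version issues one add per pair of elements, while the SIMD version issues one add per group of `lanes` pairs.

```python
def sisd_add(a, b):
    """Single instruction stream, single data stream: one pair per instruction."""
    result, instructions = [], 0
    for x, y in zip(a, b):
        result.append(x + y)      # one add instruction handles one pair
        instructions += 1
    return result, instructions


def simd_add(a, b, lanes=4):
    """Single instruction stream, multiple data streams.

    Each counted instruction applies the same add to up to `lanes` pairs.
    `lanes` is a hypothetical vector width chosen for this sketch.
    """
    result, instructions = [], 0
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1         # one add instruction handles a whole chunk
    return result, instructions


a = list(range(8))
b = [10] * 8
sisd_result, sisd_count = sisd_add(a, b)
simd_result, simd_count = simd_add(a, b, lanes=4)
print(sisd_result == simd_result, sisd_count, simd_count)  # True 8 2
```

Both versions compute the same result; only the number of issued instructions differs.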
Parallel computing?
Serial computing
Parallel computing?
Parallel Computers
• All stand-alone computers today are parallel from a hardware
perspective
Parallel Computers
• Networks connect multiple stand-alone computers (nodes) to make
larger parallel computer clusters.
Why Use Parallel Computing?
• SAVE TIME AND/OR MONEY
Why Use Parallel Computing?
• SOLVE LARGER / MORE COMPLEX PROBLEMS
Grand Challenge Problems?
Why Use Parallel Computing?
• PROVIDE CONCURRENCY
Why Use Parallel Computing?
• TAKE ADVANTAGE OF NON-LOCAL RESOURCES
Why Use Parallel Computing?
• MAKE BETTER USE OF UNDERLYING PARALLEL HARDWARE
• Modern computers, even laptops, are parallel in architecture with multiple
processors/cores
Back to Flynn's Classical Taxonomy
Single Instruction Single Data
(SISD)
• A serial (non-parallel) computer
• This is the oldest type of computer
• Examples: UNIVAC I, IBM 360, Cray-1, CDC 7600, PDP-1
Single Instruction Multiple Data
(SIMD)
• Examples: ILLIAC IV, MasPar, Cray X-MP, Cray Y-MP, Cell Processor (GPU)
Multiple Instruction Single Data
(MISD)
• Example: the Space Shuttle flight control computers
Multiple Instruction Multiple Data
(MIMD)
• Examples: IBM POWER5, HP/Compaq AlphaServer, Intel IA-32, AMD Opteron
What are we going to learn?
Shared Memory System
• A shared memory system typically accomplishes
interprocessor coordination through a global memory shared
by all processors.
• Examples: server systems, GPGPUs
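A minimal sketch of the shared memory idea, assuming Python threads stand in for processors: all workers read and write one global variable, and a lock coordinates their access. The names `counter` and `worker` are chosen for this illustration. The sketch shows the coordination model only and is not meant to demonstrate a speedup.

```python
import threading

counter = 0                    # lives in memory visible to every thread
lock = threading.Lock()        # coordinates access to the shared variable


def worker(increments):
    """Add to the shared counter; the lock makes each update atomic."""
    global counter
    for _ in range(increments):
        with lock:
            counter += 1


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4000
```

Without the lock, two threads could read the same old value and one update could be lost, which is the typical hazard of coordinating through global memory.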
Message Passing System
(Distributed Memory)
• This kind of system typically combines local
memory and a processor at each node of the
interconnection network
• There is no global memory
• Message passing is used to move data from
one local memory to another
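The message passing model can be sketched as follows. This is a simulation under stated assumptions: threads stand in for nodes, a `queue.Queue` stands in for the interconnection network, and the names `node` and `distributed_sum` are hypothetical. Each node works only on its own local data and shares its result by sending a message; a real distributed memory program would use a library such as MPI instead.

```python
import queue
import threading


def node(local_data, channel):
    """A 'node' that owns local_data and communicates only by sending messages."""
    partial = sum(local_data)   # computed from this node's local memory only
    channel.put(partial)        # send the partial result to the coordinator


def distributed_sum(chunks):
    """Give each node one chunk, then combine the messages it sends back."""
    channel = queue.Queue()
    workers = [threading.Thread(target=node, args=(chunk, channel))
               for chunk in chunks]
    for w in workers:
        w.start()
    total = sum(channel.get() for _ in workers)   # receive one message per node
    for w in workers:
        w.join()
    return total


print(distributed_sum([[1, 2, 3], [4, 5], [6]]))  # 21
```

No node touches another node's chunk; the only data that crosses between them is the partial sum carried in each message.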
Limits and Costs of Parallel Programming
• Amdahl's Law:
Amdahl's Law states that potential program speedup is defined by the
fraction of code (P) that can be parallelized:
$$\text{speedup} = \frac{1}{1 - P}$$
• If none of the code can be parallelized, P = 0 and the speedup = 1 (no
speedup).
• If all of the code is parallelized, P = 1 and the speedup is infinite (in
theory).
Limits and Costs of Parallel Programming
• If 50% of the code can be parallelized, maximum speedup = 2,
meaning the code will run twice as fast.
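The three cases above (P = 0, P = 0.5, P = 1) can be checked with a few lines of code. This is a small sketch of the formula speedup = 1 / (1 - P); the function name `max_speedup` is chosen for this illustration.

```python
def max_speedup(p):
    """Upper bound on speedup when a fraction p of the code can be parallelized."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p == 1.0:
        return float("inf")     # fully parallel code: no bound in theory
    return 1.0 / (1.0 - p)


print(max_speedup(0.0))   # 1.0 (no speedup)
print(max_speedup(0.5))   # 2.0 (twice as fast)
print(max_speedup(1.0))   # inf
```

The bound grows slowly until P is close to 1: a program that is 90% parallelizable is still limited to a speedup of about 10.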
Limits and Costs of Parallel Programming
• Introducing the number of processors performing the parallel fraction
of work, the relationship can be modeled by:
$$\text{speedup} = \frac{1}{\frac{P}{N} + S}$$
• where P = parallel fraction, N = number of processors and S = serial
fraction
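A short sketch of this relationship, assuming the serial fraction is S = 1 - P (the function name `speedup` is chosen for this illustration). It prints the speedup for a program that is 50% parallelizable as the number of processors grows.

```python
def speedup(p, n):
    """Speedup with parallel fraction p on n processors; serial fraction s = 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    s = 1.0 - p
    return 1.0 / (p / n + s)


for n in (1, 2, 4, 8, 1000):
    print(n, round(speedup(0.5, n), 3))
```

With P = 0.5 the values rise from 1.0 on one processor toward 2 but never reach it, because the serial half of the program takes the same time however many processors are added.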
Limits and Costs of Parallel Programming
Next
• Parallel Computer Memory Architectures

Lecture 1 introduction to parallel and distributed computing
