Parallel Programming Patterns


                Audience: developers

                Oleksandr Pavlyshak, 2011
                pavlyshak@gmail.com
Contents
-   Trend
-   Key terms
-   Managing state
-   Parallelism
-   Tools
Yesterday
Today
Tomorrow
What is happening?
-   CPU clock frequency growth has slowed down
-   Because of physical limits
-   Free lunch is over
-   Software no longer gets faster by itself
Current trends
- Manycore, multicore
- GPGPU, GPU acceleration, heterogeneous
  computing
- Distributed computing, HPC
Key concepts
- Concurrency
  - Many interleaved threads of control
- Parallelism
  - Same result, but faster
- Concurrency != Parallelism
  - It is not always necessary to care about concurrency
    while implementing parallelism
- Multithreading
- Asynchrony
Tasks
- CPU-bound
  - number crunching
- I/O-bound
  - network, disk
State
- Shared
  - accessible by more than one thread
  - sharing is transitive
- Private
  - used by single thread only
Task-based program

         Application

       Tasks (CPU, I/O)

Runtime (queuing, scheduling)

Processors (threads, processes)
Managing state
Isolation
- Avoiding shared state
- Own copy of state
- Examples:
  - process isolation
  - intraprocess isolation
  - by convention
Immutability
-   Multiple read -- not a problem!
-   All functions are pure
-   Requires immutable collections
-   Functional way: Haskell, F#, Lisp
Synchronization
- The only thing that remains to deal with
  shared mutable state
- Kinds:
  - data synchronization
  - control synchronization
Data synchronization
- Why? To avoid race conditions and data
  corruption
- How? Mutual exclusion
- Data remains consistent
- Critical regions
  - locks, monitors, critical sections, spin locks
- Code-centered
  - rather than associated with data
Critical region
|Thread 1               |Thread 2
|// ...                 |// ...
|lock (locker)          |
|{                      |
|   // ...              |
|   data.Operation();   |
|   // ...              |
|}                      |
|// ...                 |lock (locker)
|                       |{
|                       |   // ...
|                       |   data.Operation();
|                       |   // ...
|                       |}
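
A minimal runnable sketch of the critical region above, assuming a shared counter as the protected data. The names CriticalRegionDemo, Locker and Run are hypothetical and not from the slides.

```csharp
using System.Threading.Tasks;

public static class CriticalRegionDemo
{
    // Hypothetical shared state; every access goes through the same lock object.
    private static readonly object Locker = new object();
    private static int _counter;

    // Starts `workers` tasks that each increment the shared counter
    // `incrementsPerWorker` times inside a critical region.
    public static int Run(int workers, int incrementsPerWorker)
    {
        _counter = 0;
        var tasks = new Task[workers];
        for (int w = 0; w < workers; w++)
        {
            tasks[w] = Task.Run(() =>
            {
                for (int i = 0; i < incrementsPerWorker; i++)
                {
                    lock (Locker)   // only one task at a time runs this block
                    {
                        _counter++;
                    }
                }
            });
        }
        Task.WaitAll(tasks);        // join before reading the result
        return _counter;
    }
}
```

With the lock in place, CriticalRegionDemo.Run(4, 100000) returns 400000. Without it, concurrent increments can be lost.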
Control synchronization
- To coordinate control flow
  - exchange data
  - orchestrate threads
- Waiting, notifications
  - spin waiting
  - events
  - alternative: continuations
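
A small sketch of control synchronization through an event, assuming one task prepares a value and another thread waits for the notification. ControlSyncDemo and the value 42 are illustrative only.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class ControlSyncDemo
{
    // One task publishes a value and signals; the caller waits for the signal.
    public static int Run()
    {
        int payload = 0;
        using (var ready = new ManualResetEventSlim(false))
        {
            var producer = Task.Run(() =>
            {
                payload = 42;   // prepare the data to exchange
                ready.Set();    // notify the waiting thread
            });

            ready.Wait();       // block until the notification arrives
            int observed = payload;
            producer.Wait();    // let the producer finish before disposing the event
            return observed;
        }
    }
}
```

The waiting thread does not poll. It blocks on the event and resumes only after Set() is called.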
Three ways to manage state
- Isolation: simple, loosely coupled, highly scalable, right data structures, locality
- Immutability: avoids sync
- Synchronization: complex, runtime overheads, contention

- Prefer them in that order
Parallelism
Approaches to decomposing problems
- Data parallelism
- Task parallelism
- Message based parallelism
Data parallelism
How?

- Data is divided up among hardware processors
- Same operation is performed on elements
- Optionally -- final aggregation
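
The three steps above can be sketched with Parallel.For and per-thread subtotals. This is an illustrative example, not code from the talk. DataParallelSum and SumOfSquares are hypothetical names.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class DataParallelSum
{
    // Sums the squares of all items: the runtime partitions the index range,
    // each worker keeps a private subtotal, and subtotals are merged at the end.
    public static long SumOfSquares(int[] items)
    {
        long total = 0;
        Parallel.For(0, items.Length,
            () => 0L,                                    // private subtotal per worker
            (i, loopState, subtotal) =>
                subtotal + (long)items[i] * items[i],    // same operation on every element
            subtotal => Interlocked.Add(ref total, subtotal)); // final aggregation
        return total;
    }
}
```

The only synchronized step is the final Interlocked.Add, once per worker rather than once per element.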
Data parallelism
When?

- Large amounts of data
- Processing operation is costly
- or both
Data parallelism
Why?

- To achieve speedup

- For example, with GPU acceleration:
  - hours instead of days!
Data parallelism
Embarrassingly parallel problems
- parallelizable loops
- image processing



Non-embarrassingly parallel problems
- parallel QuickSort
Data parallelism
[Figure: data items divided between Thread 1 and Thread 2]
Data parallelism
Structured parallelism

- Well defined begin and end points
- Examples:
  - CoBegin
  - ForAll
CoBegin

var firstDataset = new DataItem[1000];
var secondDataset = new DataItem[1000];
var thirdDataset = new DataItem[1000];

Parallel.Invoke(
    () => Process(firstDataset),
    () => Process(secondDataset),
    () => Process(thirdDataset)
    );
Parallel For

var items = new DataItem[1000 * 1000];
// ...
Parallel.For(0, items.Length,
    i =>
        {
            Process(items[i]);
        });
Parallel ForEach

var tickers = GetNasdaqTickersStream();
Parallel.ForEach(tickers,
    ticker =>
        {
            Process(ticker);
        });
Striped Partitioning
[Figure: items striped across Thread 1 and Thread 2]
Iterate complex data structures

var tree = new TreeNode();
// ...
Parallel.ForEach(
    TraversePreOrder(tree),
    node =>
        {
            Process(node);
        });
Iterate complex data
[Figure: elements of the data structure processed by Thread 1 and Thread 2]
Declarative parallelism
var items = new DataItem[1000 * 1000];
// ...
var validItems =
    from item in items.AsParallel()
    let processedItem = Process(item)
    where processedItem.Property > 42
    select Convert(processedItem);

foreach (var item in validItems)
{
    // ...
}
Data parallelism
Challenges

-   Partitioning
-   Scheduling
-   Ordering
-   Merging
-   Aggregation
-   Concurrency hazards: data races, contention
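
As one illustration of the ordering challenge: a PLINQ query does not preserve source order by default, and AsOrdered() asks for it explicitly. A sketch with hypothetical names (OrderedQuery, SquaresInOrder):

```csharp
using System.Linq;

public static class OrderedQuery
{
    // Squares every item in parallel while keeping the source order in the result.
    public static int[] SquaresInOrder(int[] items)
    {
        return items.AsParallel()
                    .AsOrdered()            // preserve source ordering when merging
                    .Select(x => x * x)
                    .ToArray();
    }
}
```

Order preservation adds merging overhead, so it is worth requesting only when the consumer depends on it.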
Task parallelism
How?

- Programs are already functionally partitioned:
statements, methods etc.
- Run independent pieces in parallel
- Control synchronization
- State isolation
Task parallelism
Why?

- To achieve speedup
Task parallelism
Kinds
- Structured
  - clear begin and end points
- Unstructured
  - often demands explicit synchronization
Fork/join
-   Fork: launch tasks asynchronously
-   Join: wait until they complete
-   CoBegin, ForAll
-   Recursive decomposition
Fork/join
[Figure: sequential code forks into Task 1, Task 2 and Task 3, then joins back into sequential code]
Fork/join


Parallel.Invoke(
    () => LoadDataFromFile(),
    () => SavePreviousDataToDB(),
    () => RenewOtherDataFromWebService());
Fork/join
Task loadData =
    Task.Factory.StartNew(() => {
        // ...
    });
Task saveAnotherDataToDB =
    Task.Factory.StartNew(() => {
        // ...
    });
// ...
Task.WaitAll(loadData, saveAnotherDataToDB);
// ...
Fork/join
void Walk(TreeNode node) {
  // Stop at missing children; otherwise the recursion dereferences null.
  if (node == null) return;
  var tasks = new[] {
      Task.Factory.StartNew(() =>
          Process(node.Value)),
      Task.Factory.StartNew(() =>
          Walk(node.Left)),
      Task.Factory.StartNew(() =>
          Walk(node.Right))
  };
  Task.WaitAll(tasks);
}
Fork/join recursive


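[Figure: sequential code forks at the root into Node, Left and Right tasks; each subtree forks again the same way before joining back into sequential code]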
Dataflow parallelism: Futures
Task<DataItem[]> loadDataFuture =
    Task.Factory.StartNew(() =>
    {
        //...
        return LoadDataFromFile();
    });

var dataIdentifier = SavePreviousDataToDB();
RenewOtherDataFromWebService(dataIdentifier);
//...
DisplayDataToUser(loadDataFuture.Result);
Dataflow parallelism: Futures

[Figure: a future runs alongside sequential code]
Dataflow parallelism: Futures

[Figure: several futures started from sequential code]
Continuations

[Figure: a chain of tasks running alongside sequential code]
Continuations
var loadData = Task.Factory.StartNew(() => {
        return LoadDataFromFile();
    });

var writeToDB = loadData.ContinueWith(dataItems =>
    {
        WriteToDatabase(dataItems.Result);
    });

var reportToUser = writeToDB.ContinueWith(t =>
    {
        // ...
    });
reportToUser.Wait();
Producer/consumer
[Figure: pipeline: reading -> lines -> parsing -> parsed lines -> storing -> DB]
Producer/consumer
[Figure: pipeline data flow: lines -> parsed lines -> DB]
Producer/consumer
var lines =
    new BlockingCollection<string>();

Task.Factory.StartNew(() =>
  {
    foreach (var line in File.ReadLines(...))
        lines.Add(line);
    lines.CompleteAdding();
  });
Producer/consumer
var dataItems =
  new BlockingCollection<DataItem>();

Task.Factory.StartNew(() =>
    {
        foreach (var line in
          lines.GetConsumingEnumerable()
        )
            dataItems.Add(Parse(line));
        dataItems.CompleteAdding();
    });
Producer/consumer
var dbTask = Task.Factory.StartNew(() =>
    {
        foreach (var item in
          dataItems.GetConsumingEnumerable()
        )
            WriteToDatabase(item);
    });

dbTask.Wait();
Task parallelism
Challenges

- Scheduling
- Cancellation
- Exception handling
- Concurrency hazards:
deadlocks, livelocks, priority inversions etc.
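
Two of these challenges, exception handling and cancellation, can be sketched as follows. This is a hedged illustration: TaskHazardsDemo, Fail, CountFailures and CancelWorker are hypothetical names, and the failing tasks are contrived.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskHazardsDemo
{
    private static void Fail(string message)
    {
        throw new InvalidOperationException(message);
    }

    // Exceptions from joined tasks surface together as one AggregateException.
    public static int CountFailures()
    {
        var tasks = new[]
        {
            Task.Run(() => Fail("load failed")),
            Task.Run(() => Fail("save failed")),
            Task.Run(() => { })                 // this one succeeds
        };
        try
        {
            Task.WaitAll(tasks);
            return 0;
        }
        catch (AggregateException ex)
        {
            return ex.InnerExceptions.Count;    // one entry per failed task
        }
    }

    // Cooperative cancellation: the worker polls the token and stops itself.
    public static bool CancelWorker()
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            Task worker = Task.Run(() =>
            {
                while (!token.IsCancellationRequested)
                    Thread.Sleep(1);            // stand-in for a unit of work
                token.ThrowIfCancellationRequested();
            }, token);

            cts.Cancel();
            try { worker.Wait(); }
            catch (AggregateException) { /* cancellation is reported here */ }
            return worker.IsCanceled;
        }
    }
}
```

CountFailures() returns 2, and CancelWorker() returns true because the task ends in the Canceled state rather than Faulted.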
Message based parallelism
- Accessing shared state vs. local state
- No distinction, unfortunately
- Idea: encapsulate shared state changes into
  messages
- Async events
- Actors, agents
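
A minimal agent sketch of this idea, assuming a BlockingCollection as the mailbox: the state is private to one task, and other threads change it only by posting messages. CounterAgent, Post and Complete are hypothetical names, not an actor library API.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class CounterAgent
{
    private readonly BlockingCollection<int> _mailbox = new BlockingCollection<int>();
    private readonly Task<int> _loop;

    public CounterAgent()
    {
        _loop = Task.Run(() =>
        {
            int total = 0;  // state owned by this task only, so no lock is needed
            foreach (int delta in _mailbox.GetConsumingEnumerable())
                total += delta;         // apply messages one at a time
            return total;
        });
    }

    // Any thread may post; the message is queued, not applied directly.
    public void Post(int delta)
    {
        _mailbox.Add(delta);
    }

    // Closes the mailbox and returns the final state once all messages are processed.
    public int Complete()
    {
        _mailbox.CompleteAdding();
        return _loop.Result;
    }
}
```

Posting 1 from a hundred parallel callers and then calling Complete() yields 100, with no synchronization in the caller's code.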
Tools
Concurrent data structures
-   Concurrent Queues, Stacks, Sets, Lists
-   Blocking collections
-   Work stealing queues
-   Lock free data structures
-   Immutable data structures
Synchronization primitives
-   Critical sections
-   Monitors
-   Auto- and Manual-Reset Events
-   Countdown Events
-   Mutexes
-   Semaphores
-   Timers
-   RW locks
-   Barriers
Thread local state
- A way to achieve isolation


var parser = new ThreadLocal<Parser>(
    () => CreateParser());

Parallel.ForEach(items,
    item => parser.Value.Parse(item));
Thread pools
ThreadPool.QueueUserWorkItem(_ =>
    {
        // do some work
    });
Async
Task.Factory.StartNew(() =>
    {
        //...
        return LoadDataFromFile();
    })
    .ContinueWith(dataItems =>
        {
            WriteToDatabase(dataItems.Result);
        })
    .ContinueWith(t =>
        {
            // ...
        });
Async
var dataItems =
    await LoadDataFromFileAsync();

textBox.Text = dataItems.Count.ToString();

await WriteToDatabaseAsync(dataItems);

// continue work
Technologies
-   TPL, PLINQ, C# async, TPL Dataflow
-   PPL, Intel TBB, OpenMP
-   CUDA, OpenCL, C++ AMP
-   Actors, STM
-   Many others
Summary
-   Programming for many CPUs
-   Concurrency != parallelism
-   CPU-bound vs. I/O-bound tasks
-   Private vs. shared state
Summary
- Managing state:
  - Isolation
  - Immutability
  - Synchronization
     - Data: mutual exclusion
     - Control: notifications
Summary
- Parallelism:
  - Data parallelism: scalable
  - Task parallelism: less scalable
  - Message based parallelism
Summary
- Data parallelism
  -   CoBegin
  -   Parallel ForAll
  -   Parallel ForEach
  -   Parallel ForEach over complex data structures
  -   Declarative data parallelism
- Challenges:
  partitioning, scheduling, ordering, merging,
  aggregation, concurrency hazards
Summary
- Task parallelism: structured, unstructured
  - Fork/Join
     - CoBegin
     - Recursive decomposition
  - Futures
  - Continuations
  - Producer/consumer (pipelines)
- Challenges:
  scheduling, cancellation, exceptions,
  concurrency hazards
Summary
- Tools
  -   Compilers, libraries
  -   Concurrent data structures
  -   Synchronization primitives
  -   Thread local state
  -   Thread pools
  -   Async invocations
  -   ...
Q/A
