Using Functional Directives to Analyze Code Complexity and Communication
Daniel Rubio Bonilla
HLRS, University of Stuttgart
TMPA 2017
CPU Evolution
Hazel Hen
CPU: E5-2680 v3, 12 cores, 30 MiB cache, 2.5 GHz
Node: 2 CPUs (24 cores), 128 GB
Compute nodes: 7,712
Total cores: 185,088
Performance: 7,420 TFlops
Storage: ~10 PB
Weight: 61.5 t
Power: 3,200 kW
Amdahl's Law
[Figure: speedup vs. number of processing units, for code that is 100%, 98%, 90%, and 50% parallelizable]
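The plot follows Amdahl's law: if a fraction p of the run time can be parallelized and n processing units are available, the speedup is 1 / ((1 - p) + p / n). A minimal C sketch of that formula is given here; the function names are illustrative and do not come from the talk.

```c
/* Amdahl's law: speedup on n processing units when a fraction p
 * (0 <= p <= 1) of the run time can be parallelized. */
double amdahl_speedup(double p, double n)
{
    return 1.0 / ((1.0 - p) + p / n);
}

/* Upper bound on the speedup as n grows without limit (requires p < 1). */
double amdahl_limit(double p)
{
    return 1.0 / (1.0 - p);
}
```

Under this model, code that is 98% parallelizable stays below a speedup of 50, even on all 185,088 cores of Hazel Hen.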
Amdahl's Law in Practice
[Figure: speedup vs. number of processing units, for code that is 100%, 98%, 90%, and 50% parallelizable]
The Problem
In High Performance Computing…
• Performance is increased by
  • Integrating more cores (millions!?)
  • Using heterogeneous accelerators (GPU, FPGA, ...)
• Issues
  • Programmability
  • Portability
New Approaches
Different Programming Model
• Focused on mathematical problems
  • Engineering
  • Science
• To enable:
  • Parallelization and concurrency
  • Portability across different hardware and accelerators
Our Approach
Obtain the structural information of the application by annotating the imperative code with functional-like directives (mathematical / algorithmic structure)
• The main difficulties in this approach are:
  • “deriving” the structure of the application
  • matching the structure to the source code
Higher Order Functions
• Higher Order Functions are mathematical functions that
  • Take one or more functions as arguments
  • Can return a function as a result
• They have a clear repetitive execution structure
• These structures can be transformed into equivalent ones
  • But with different non-functional properties
Higher Order Functions
• Apply to all:
  map :: (a -> b) -> [a] -> [b]
  map (*2) [1,2,3,4] = [2,4,6,8]
Higher Order Functions
• Apply to all:
  map :: (a -> b) -> [a] -> [b]
  map (*2) [1,2,3,4] = [2,4,6,8]
• Reduction:
  foldl :: (a -> b -> a) -> a -> [b] -> a
  foldl (+) 0 [1,2,3,4] = 10
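For readers more used to imperative code, map and foldl can be sketched in C with function pointers. This is only an illustration restricted to int; map_int, foldl_int, times_two, and add are hypothetical names, not part of any library or of the talk.

```c
#include <stddef.h>

/* map: apply f to every element of in[0..n-1], writing to out[0..n-1].
 * Each iteration is independent of the others. */
void map_int(int (*f)(int), const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = f(in[i]);
}

/* foldl: combine the elements left to right, starting from init.
 * Each iteration depends on the accumulator of the previous one. */
int foldl_int(int (*op)(int, int), int init, const int *in, size_t n)
{
    int acc = init;
    for (size_t i = 0; i < n; i++)
        acc = op(acc, in[i]);
    return acc;
}

/* Operators matching the two examples: (*2) and (+). */
int times_two(int x) { return x * 2; }
int add(int a, int b) { return a + b; }
```

Calling map_int(times_two, v, w, 4) on v = {1,2,3,4} fills w with {2,4,6,8}, and foldl_int(add, 0, v, 4) yields 10. The loop shapes already hint at the structure the directives expose: the map loop has no cross-iteration dependency, while the fold loop carries one.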
Other Higher Order Functions
Higher Order Functions Transformations
total = foldl (+) 0 vs
One possible transformation
• Only if the operation is associative and we know its neutral element
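The transformation itself appeared as a figure that was not recovered. One common reading, assumed here, is to split vs into chunks, fold each chunk from the neutral element, and then fold the partial results. A hedged C sketch of that idea follows; sum_seq and sum_chunked are hypothetical names.

```c
#include <stddef.h>

/* Sequential form: total = foldl (+) 0 vs */
long sum_seq(const long *vs, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += vs[i];
    return total;
}

/* Transformed form: split vs into `chunks` pieces, fold each piece starting
 * from the neutral element 0, then fold the partial results.
 * Valid because + is associative and 0 is its neutral element. */
long sum_chunked(const long *vs, size_t n, size_t chunks)
{
    long total = 0;
    if (chunks == 0)
        chunks = 1;
    size_t len = (n + chunks - 1) / chunks;   /* ceiling division */
    for (size_t c = 0; c < chunks; c++) {     /* chunks are independent */
        size_t begin = c * len;
        size_t end = begin + len < n ? begin + len : n;
        long partial = 0;                     /* neutral element */
        for (size_t i = begin; i < end; i++)
            partial += vs[i];
        total += partial;                     /* combine partial results */
    }
    return total;
}
```

Because the chunk folds do not depend on each other, the outer loop is a candidate for parallel execution. With a non-associative operator, such as floating-point addition under strict rounding or subtraction, the two forms may give different results.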
Transformations
• Changes in the mathematical formulation
  • Or the algorithm execution
• Produce equivalent code
  • Change computing load
  • Change memory distribution
  • Modify communication
• Allow adaptation to different architectures
  • While maintaining correctness!
Hierarchical Structures
• Functional annotations allow the construction of multiple structural levels:
  • Emerging complexity of the structural information
• We distinguish between:
  • Output of one Higher Order Function is input of another
    • This can be achieved by analyzing the data dependencies between the functions
  • The operator of one (Higher Order) Function is composed of other functions
Flat Structure
Graph of a complex structure of two Higher Order Functions (HOFs) at the same level
• The output of one HOF is the input for another HOF
foldl (+) 0 (map (*2) [0..n-1])
foldl :: (a -> b -> a) -> a -> [b] -> a
map :: (a -> b) -> [a] -> [b]
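As an imperative illustration, the flat expression foldl (+) 0 (map (*2) [0..n-1]) corresponds to two loops chained through a temporary array. The sketch below also shows a fused variant, which is one transformation this data dependency permits; loop fusion is an example chosen here, not one shown in the talk, and both function names are hypothetical.

```c
#include <stddef.h>

/* Two separate HOFs: map (*2) fills tmp, then foldl (+) 0 reads it.
 * tmp must have room for n elements. */
long flat_two_pass(size_t n, long *tmp)
{
    for (size_t i = 0; i < n; i++)      /* map (*2) [0..n-1] */
        tmp[i] = 2 * (long)i;

    long total = 0;                     /* foldl (+) 0 */
    for (size_t i = 0; i < n; i++)
        total += tmp[i];
    return total;
}

/* Equivalent fused form: each mapped value feeds the fold directly,
 * so no temporary array is needed. */
long flat_fused(size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += 2 * (long)i;
    return total;
}
```

Both functions return the same value, n * (n - 1), but they differ in memory traffic, which is the kind of non-functional property the transformations are meant to trade.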
Hierarchical Structure
Graph of a complex hierarchical structure of two Higher Order Functions (HOFs) at different levels
• The operator of one HOF is another HOF
map (foldl (+) 0) [[..]..[..]]
foldl :: (a -> b -> a) -> a -> [b] -> a
map :: (a -> b) -> [a] -> [b]
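In imperative terms, map (foldl (+) 0) over a list of lists is a loop nest in which the inner loop is the operator of the outer one. A minimal C sketch, assuming a row-major matrix and the hypothetical names row_sum and row_sums:

```c
#include <stddef.h>

/* Inner HOF: foldl (+) 0 over one row of `cols` elements. */
static long row_sum(const long *row, size_t cols)
{
    long acc = 0;
    for (size_t j = 0; j < cols; j++)
        acc += row[j];
    return acc;
}

/* Outer HOF: map row_sum over the rows of a rows x cols matrix stored
 * row-major in m. The outer iterations are independent of each other. */
void row_sums(const long *m, size_t rows, size_t cols, long *out)
{
    for (size_t i = 0; i < rows; i++)
        out[i] = row_sum(m + i * cols, cols);
}
```

Here the two levels can be mapped differently onto hardware, for example distributing the outer map while keeping each inner fold sequential.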
Other requirements
• Strong binding between directives and code
• Description of memory organization
Example - Heat
 
1-D heat dissipation function
Discretization
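The equations on this slide did not survive extraction. A plausible reconstruction, assuming the standard 1-D heat equation and an explicit finite-difference scheme, is given here. The discrete form agrees term by term with the kernel y + K * (x - 2*y + z) used in the partitioning slides; the symbols u, \alpha, \Delta t, and \Delta x are assumptions, not taken from the talk.

```latex
\frac{\partial u}{\partial t} = \alpha \, \frac{\partial^2 u}{\partial x^2}
\qquad\Longrightarrow\qquad
u_i^{t+1} = u_i^{t} + K \left( u_{i-1}^{t} - 2\,u_i^{t} + u_{i+1}^{t} \right),
\quad K = \frac{\alpha \, \Delta t}{\Delta x^{2}}
```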
Complexity
[Figure: heat-diffusion code annotated with per-construct complexity: O(N_ELEM), O(N_ITER), O(N_ELEM), O(1), O(1), O(1)]
Complexity
[Figure: heat-diffusion code annotated with per-construct complexity: O(N_ELEM), O(N_ITER), O(N_ELEM), O(1), O(1), O(1)]
O(1) + O(1) + O(N_ITER) * (O(N_ELEM)*O(1) + O(N_ELEM))
= O(N_ITER * N_ELEM)
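The annotated code on these slides was an image and was not recovered. The sketch below is a guess at a loop nest with the same shape, built from the kernel used elsewhere in the talk, with each construct labelled by its cost. The value of K, the function name, and the convention that hm[0] and hm[n_elem-1] are fixed boundary cells are all assumptions.

```c
#include <stddef.h>

#define K 0.1f  /* assumed diffusion constant */

/* Runs n_iter time steps on hm[0..n_elem-1], using hm_tmp as scratch space.
 * Returns the number of kernel evaluations performed. */
unsigned long heat_diffusion(float *hm, float *hm_tmp,
                             size_t n_elem, size_t n_iter)
{
    unsigned long kernel_calls = 0;                        /* O(1) */
    for (size_t t = 0; t < n_iter; t++) {                  /* O(N_ITER) */
        for (size_t i = 1; i + 1 < n_elem; i++) {          /* O(N_ELEM) */
            hm_tmp[i] = hm[i]
                      + K * (hm[i - 1] + hm[i + 1] - 2 * hm[i]);  /* O(1) */
            kernel_calls++;
        }
        for (size_t i = 1; i + 1 < n_elem; i++)            /* O(N_ELEM) copy */
            hm[i] = hm_tmp[i];
    }
    return kernel_calls;
}
```

The returned count is n_iter * (n_elem - 2) for n_elem >= 2, which grows as N_ITER * N_ELEM, in line with the combined bound.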
Transformations – Partitioning 1
let heatDiffusion = itn HEATTIMESTEP hm_array N_ITER
PAR1 v = stencil1D TKernel 1 v
where TKernel x y z = y + K * (x - 2*y + z)
HEATTIMESTEP vs = map PAR1 vs
Transformations – Partitioning 2
let heatDiffusion = itn HEATTIMESTEP hm_array N_ITER
PAR2 v = stencil1D TKernel 1 v
where TKernel x y z = y + K * (x - 2*y + z)
PAR1 vs = map PAR2 vs
HEATTIMESTEP vss = map PAR1 vss
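A rough shared-memory picture of the two-level partitioning, written as plain C loops: the outer loop plays the role of map PAR1 (for instance one block per process) and the inner loop the role of map PAR2 (for instance one sub-block per thread). The function names, the value of K, and the way the interior cells are cut into blocks are assumptions made for this sketch; in a distributed setting the neighbour reads at block edges would become halo exchanges.

```c
#include <stddef.h>

#define K 0.1f  /* assumed diffusion constant */

/* PAR2: stencil1D TKernel over the cells [begin, end) of one sub-block. */
static void stencil_block(const float *hm, float *hm_tmp,
                          size_t begin, size_t end)
{
    for (size_t i = begin; i < end; i++)
        hm_tmp[i] = hm[i] + K * (hm[i - 1] - 2 * hm[i] + hm[i + 1]);
}

/* Start of piece k when [lo, hi) is cut into `pieces` near-equal pieces. */
static size_t cut(size_t lo, size_t hi, size_t pieces, size_t k)
{
    return lo + (hi - lo) * k / pieces;
}

/* HEATTIMESTEP vss = map PAR1 vss, with PAR1 vs = map PAR2 vs. */
void heat_timestep_2level(const float *hm, float *hm_tmp, size_t n_elem,
                          size_t outer, size_t inner)
{
    if (n_elem < 3 || outer == 0 || inner == 0)
        return;
    size_t lo = 1, hi = n_elem - 1;                 /* interior cells */
    for (size_t p = 0; p < outer; p++) {            /* PAR1 level */
        size_t b0 = cut(lo, hi, outer, p);
        size_t b1 = cut(lo, hi, outer, p + 1);
        for (size_t q = 0; q < inner; q++)          /* PAR2 level */
            stencil_block(hm, hm_tmp,
                          cut(b0, b1, inner, q), cut(b0, b1, inner, q + 1));
    }
}
```

Whatever values of outer and inner are chosen, every interior cell is updated exactly once, so the result matches the unpartitioned time step.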
Platform Specific Transformations
• OpenMP:
  • Relatively straightforward
• MPI:
  • Communication
  • Halos
Transformed Code
/* Send the last owned cell to the right neighbour, which stores it in its left halo hm[0]. */
if (rank < size - 1)
    MPI_Send(&hm[LOCAL_N_ELEM], 1, MPI_FLOAT, rank + 1, 0, MPI_COMM_WORLD);
if (rank > 0)
    MPI_Recv(&hm[0], 1, MPI_FLOAT, rank - 1, 0, MPI_COMM_WORLD, &status);
/* Send the first owned cell to the left neighbour, which stores it in its right halo. */
if (rank > 0)
    MPI_Send(&hm[1], 1, MPI_FLOAT, rank - 1, 1, MPI_COMM_WORLD);
if (rank < size - 1)
    MPI_Recv(&hm[LOCAL_N_ELEM + 1], 1, MPI_FLOAT, rank + 1, 1, MPI_COMM_WORLD, &status);

/* Local stencil over the owned cells 1..LOCAL_N_ELEM. */
#pragma polca stencil1D 1 G hm hm_tmp
#pragma omp parallel for
for (i = 1; i < LOCAL_N_ELEM + 1; i++)
{
#pragma polca G
#pragma polca input (hm[i-1] hm[i] hm[i+1]) output(hm_tmp[i])
    hm_tmp[i] = hm[i] + K * (hm[i-1] + hm[i+1] - 2 * hm[i]);
}
Example - NBody
N-Body Problem
Three steps
1) Calculate Forces
2) Update Velocities
3) Calculate Position
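The three steps can be sketched as a direct-sum simulation. To stay short, this illustration assumes bodies on a line (1-D), units in which G = 1, an explicit Euler update, and distinct starting positions; none of these choices, nor the name nbody_run, come from the talk. The function also counts pair interactions, so the cost of the force step can be observed directly.

```c
#include <stddef.h>

/* Runs n_iters steps of a 1-D direct-sum N-body simulation (G = 1).
 * Returns the number of pair interactions computed. */
unsigned long nbody_run(double *pos, double *vel, double *force,
                        const double *mass, size_t n_bodies,
                        size_t n_iters, double dt)
{
    unsigned long interactions = 0;
    for (size_t t = 0; t < n_iters; t++) {
        /* 1) Calculate forces: every body against every other. */
        for (size_t i = 0; i < n_bodies; i++) {
            double f = 0.0;
            for (size_t j = 0; j < n_bodies; j++) {
                if (i == j)
                    continue;
                double dx = pos[j] - pos[i];
                double dist = dx < 0.0 ? -dx : dx;
                f += mass[i] * mass[j] * dx / (dist * dist * dist);
                interactions++;
            }
            force[i] = f;
        }
        /* 2) Update velocities. */
        for (size_t i = 0; i < n_bodies; i++)
            vel[i] += dt * force[i] / mass[i];
        /* 3) Calculate positions. */
        for (size_t i = 0; i < n_bodies; i++)
            pos[i] += dt * vel[i];
    }
    return interactions;
}
```

The returned count is n_iters * n_bodies * (n_bodies - 1): the force step is quadratic in the number of bodies, while the velocity and position updates are linear.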
Structure
[Figure: N-body code annotated with per-construct complexity: O(1), O(1), O(1), O(nIters), O(nBodies), O(nBodies), O(nBodies), O(nBodies)]
Structure
[Figure: the same code with nested constructs combined: O(nIters), O(nBodies^2), O(nBodies), O(nBodies)]
Structure
[Figure: annotated code with O(nBodies^2), O(nBodies), O(nBodies) inside the iteration loop]
O(nIters) * (O(nBodies^2) + 2*O(nBodies))
= O(nIters * nBodies^2)
Communication
[Figure: code regions labelled Parallel, Parallel, Sequential, and Parallel (with caution)]
Conclusion
• Functional semantics can enable code:
  • Transformation
  • Adaptation
• But also...
  • Algorithmic complexity analysis
  • Communication pattern analysis
• This information helps to predict the application's behavior
Questions
Thank you!
Contact:
rubio@hlrs.de
Projects:
POLCA www.polca-project.eu
Smart-Dash www.dash-project.org
CλaSH www.clash-lang.org