High-Performance Python
Python is fast!
• Python is fast to write, but pure Python code typically runs 10x-100x slower than C.
• Python has great C interop, so you can use C for the slow parts.
• This makes Python competitive with C.
Before you try this at home…
• “Premature optimization is the root of all evil.”
• Use external standards for how fast your code needs to be.
• Remember: performance is a tradeoff against readability, maintainability, and developer time.
Part I:
General Optimization
Profile Your Code
• 95%+ of your code is irrelevant to performance.
• A profiler will tell you which 5% is important.
In Python, use cProfile:
source: https://ymichael.com/2014/03/08/profiling-python-with-cprofile.html
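A minimal sketch of a cProfile run, assuming a hypothetical hot spot `slow_sum` and entry point `main` (neither comes from the talk). It profiles one call to `main` and prints the ten most expensive entries sorted by cumulative time:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Hypothetical hot spot: a pure-Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    # Hypothetical entry point that calls the hot spot several times.
    return [slow_sum(20_000) for _ in range(5)]


profiler = cProfile.Profile()
profiler.enable()
results = main()
profiler.disable()

# Sort by cumulative time and keep the ten most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

For a whole script, `python -m cProfile -s cumulative script.py` gives a similar report without any code changes.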
Basics
• Make sure your Big-O performance is optimal.
• Move operations outside of loops.
• Use caching for repeated calculations.
• Apply algebraic simplifications.
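Two of these basics can be sketched together. The example below is illustrative only: `shipping_cost` and `total_cost` are hypothetical names, and the cost formula is made up. It caches a repeated calculation with `functools.lru_cache` and moves a loop-invariant multiplication setup outside the loop:

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def shipping_cost(distance_km):
    # Hypothetical expensive calculation; the cache returns the stored
    # result when the same distance is requested again.
    return round(5.0 + 0.8 * math.sqrt(distance_km), 2)


def total_cost(distances, tax_rate):
    # Loop-invariant work is hoisted: the multiplier is computed once,
    # not once per iteration.
    multiplier = 1.0 + tax_rate
    return sum(shipping_cost(d) * multiplier for d in distances)


distances = [100, 400, 100, 400, 100]
total = total_cost(distances, tax_rate=0.25)
info = shipping_cost.cache_info()
print(total, info)
```

Only two distinct distances occur, so the function body runs twice and the other three calls are served from the cache.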
Accidentally Quadratic
The *most* common issue:
def find_intersection(list_one, list_two):
    intersection = []
    for a in list_one:
        if a in list_two:
            intersection.append(a)
    return intersection
Fix: build a set first, so each membership test is O(1) on average:
def find_intersection(list_one, list_two):
    intersection = []
    list_two = set(list_two)  # one O(n) conversion up front
    for a in list_one:
        if a in list_two:
            intersection.append(a)
    return intersection
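A rough timing sketch of the list-versus-set difference. The names `find_intersection_list` and `find_intersection_set` are hypothetical renamings so both versions can live side by side, and the input sizes are arbitrary; actual timings depend on the machine:

```python
import timeit


def find_intersection_list(list_one, list_two):
    # O(len(list_one) * len(list_two)): `in` scans the list each time.
    return [a for a in list_one if a in list_two]


def find_intersection_set(list_one, list_two):
    # O(len(list_one) + len(list_two)): set membership is O(1) on average.
    lookup = set(list_two)
    return [a for a in list_one if a in lookup]


list_one = list(range(0, 2000, 2))
list_two = list(range(0, 2000, 3))

slow_result = find_intersection_list(list_one, list_two)
fast_result = find_intersection_set(list_one, list_two)

slow = timeit.timeit(lambda: find_intersection_list(list_one, list_two), number=5)
fast = timeit.timeit(lambda: find_intersection_set(list_one, list_two), number=5)
print(f"list lookup: {slow:.4f}s, set lookup: {fast:.4f}s")
```

The set version requires hashable elements; for unhashable items a different lookup structure is needed.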
Business Logic
Leverage business logic. You’ll often have NP-complete optimization problems to solve.
The underlying business reasoning should guide your approximations.
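As one illustration (not from the talk), a knapsack-style selection is NP-complete in its decision form, but a greedy heuristic is often good enough when the business can tolerate a near-optimal answer. The function `greedy_select` and the data below are hypothetical:

```python
def greedy_select(items, capacity):
    """Pick items by value density until the capacity is used up.

    `items` is a list of (name, value, weight) tuples. This is a heuristic:
    it is fast but does not guarantee the optimal selection.
    """
    chosen = []
    remaining = capacity
    # Highest value per unit of weight first.
    by_density = sorted(items, key=lambda it: it[1] / it[2], reverse=True)
    for name, value, weight in by_density:
        if weight <= remaining:
            chosen.append(name)
            remaining -= weight
    return chosen


# Hypothetical data: (name, value, weight)
orders = [("a", 60, 10), ("b", 100, 20), ("c", 120, 30)]
selection = greedy_select(orders, capacity=50)
print(selection)
```

Here the heuristic picks "a" and "b" (value 160), while "b" and "c" would give 220. Whether that gap matters is exactly the kind of question the business context should answer.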
Part II:
Python Optimization
Libraries
• Use numpy, scipy, pandas, scikit-learn, etc.
• Incredible built-in functionality. If you need something esoteric, try combining built-ins or adapting a more general built-in approach.
• Extremely fast, thoroughly optimized, and best of all, already written.
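A small sketch of "combining built-ins": a moving average assembled from `np.cumsum` and slicing instead of a hand-written loop. The function name `moving_average` is hypothetical, and this is one possible approach, not the only one:

```python
import numpy as np


def moving_average(values, window):
    # Prefix sums let each window total be computed as a difference of
    # two entries, so no Python-level loop is needed.
    x = np.asarray(values, dtype=float)
    prefix = np.concatenate(([0.0], np.cumsum(x)))
    return (prefix[window:] - prefix[:-window]) / window


smoothed = moving_average([1, 2, 3, 4, 5], window=2)
print(smoothed)
```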
Pure Python Tips
• Function calls are expensive. Avoid them in hot loops, and avoid recursion.
• Check the runtime complexity of operations on built-in data types.
• Make variables local. Global lookups are expensive.
• Use map/filter/reduce with built-in functions instead of for loops; the iteration then runs in C. (In Python 3, reduce lives in functools.)
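A sketch of the local-variable and map tips, using hypothetical function names. All three versions compute the same result; how much faster the later ones are depends on the interpreter version, so measure with timeit before relying on it:

```python
import math


def norms_loop(points):
    # Plain loop: looks up `math.hypot` and `result.append` on every pass.
    result = []
    for x, y in points:
        result.append(math.hypot(x, y))
    return result


def norms_local(points):
    # Bind the function to a local name once; local lookups are cheaper
    # than global or attribute lookups.
    hypot = math.hypot
    return [hypot(x, y) for x, y in points]


def norms_map(xs, ys):
    # map() with a C-implemented function avoids a Python-level loop body.
    return list(map(math.hypot, xs, ys))


points = [(3, 4), (5, 12), (8, 15)]
xs, ys = zip(*points)
print(norms_loop(points), norms_local(points), norms_map(xs, ys))
```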
Mixed Tips
• Vectorize! numpy arrays are much faster than lists.
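import numpy as np

# List version: every transformation loops in Python.
def complex_sum(in_list):
    in_list = [(a + 2) for a in in_list]
    # more transformations
    return sum(in_list)

# numpy version: each transformation is one vectorized operation.
def complex_sum(in_list):
    in_list = np.array(in_list)
    in_list += 2
    # more transformations
    return in_list.sum()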
• Array allocation can be a bottleneck. Try moving it outside of loops.
import numpy as np

n = 10 ** 3

# Slow: allocates a new array on every iteration.
output = 0
for i in range(10**9):
    result = np.zeros(n)
    ## calculations ##
    output += result.sum()

# Faster: allocate once, then reset the array in place.
result = np.zeros(n)
output = 0
for i in range(10**9):
    result[:] = 0  # zero out array
    ## calculations ##
    output += result.sum()
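A related sketch, under the assumption that the per-iteration calculation is a numpy ufunc such as `np.multiply`: the `out=` argument writes into a buffer allocated once, which avoids the hidden allocation of a temporary array. The arrays `a` and `b` and the loop count are made up for illustration:

```python
import numpy as np

n = 10 ** 3
a = np.arange(n, dtype=float)
b = np.ones(n)

# Allocate the scratch buffer once, outside the loop.
result = np.empty(n)
output = 0.0
for i in range(100):
    # `out=` writes the product into the existing buffer instead of
    # allocating a fresh array on every iteration.
    np.multiply(a, b, out=result)
    output += result.sum()

print(output)
```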
Last Resort: C
• Cython: inline C code directly into Python.
# Pure Python
def fib(n):
    a = 0
    b = 1
    while b < n:
        temp = b
        b = a + b
        a = temp
    return b
# Cython: the same function with C types declared
def fib(int n):
    cdef int a, b, temp
    a = 0
    b = 1
    while b < n:
        temp = b
        b = a + b
        a = temp
    return b
• C extensions: write C and call it from Python.
• Limit these techniques to hot loops.
Things I haven’t mentioned
• multithreading: basically doesn’t work for CPU-bound Python code, because of the global interpreter lock (GIL)
• pypy: A Python JIT compiler with a different ecosystem
Warning
Optimization is addictive.
Conclusions
• Avoid premature optimizations! Have objective benchmarks you’re trying to hit.
• Profile your code. You will be surprised by the results.
• The gold standard for performance is highly-tuned C (that’s already been written by someone else)
Resources
General
• Programming Pearls (Jon Bentley)
• accidentallyquadratic.tumblr.com
• Performance Engineering of Software Systems, 6.172, MIT OpenCourseWare
Python Specific
• cProfile Docs
• Cython Docs
• Guido van Rossum’s advice: python.org/doc/essays/list2str
Contact me: ben@caffeinatedanalytics.com
