Debugging Live Python Web Applications
      Graham Dumpleton / Amjith Ramanujam
            PyCon AU - August 2012
Follow along.




   http://www.slideshare.net/GrahamDumpleton
What is debugging?



   Debugging is a methodical process of
    finding and reducing the number of
  bugs, or defects, in a computer program
   or a piece of electronic hardware, thus
       making it behave as expected.
            http://en.wikipedia.org/wiki/Debugging
Common types of computer bugs.



•   Arithmetic bugs
•   Logic bugs
•   Syntax bugs
•   Resource bugs
•   Multi-threading programming bugs
•   Interfacing bugs
•   Performance bugs
•   Teamworking bugs

                    http://en.wikipedia.org/wiki/Software_bug
Things we want to avoid.


•   Crashing the whole web site.
•   Corrupting all your customer data.
•   Making your customer data visible to everyone.
•   Losing your company lots of money.
•   Losing your own job because you did something stupid.
•   Causing all your workmates to lose their jobs as well.
•   Getting what you did posted on Slashdot.
Managing risk.


•   Use software to restrict what you can do.
•   Script changes and procedures to avoid errors.
•   Test what you are going to do on a separate system.
•   Develop and document contingency plans.
Passive monitoring.


•   Collection of log file information.
•   Collection of details about Python exceptions.
•   Collection of performance data for the server host.
•   Collection of performance data for the web server.
•   Collection of performance data for the web application.
Log file collation and analysis.


•   Open Source
    •   logstash (http://logstash.net)
    •   graylog2 (http://www.graylog2.org)

•   Commercial Services
    •   Loggly (http://www.loggly.com)
    •   Splunk (http://www.splunk.com)
    •   LogLogic (http://www.loglogic.com)
Recording Python exceptions.


•   Open Source
    •   Sentry (http://pypi.python.org/pypi/sentry) - Also available as a paid service.

•   Commercial Services
    •   New Relic (http://newrelic.com) - Pro feature.
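
As a rough illustration of what recording exception details involves, here is a minimal sketch of a WSGI middleware that logs unhandled exceptions with their tracebacks. It uses only the standard library; the class name and the logger name are hypothetical, and it is not how Sentry or New Relic implement their collection. It is written for Python 3.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("wsgi.errors")


class ExceptionLoggingMiddleware(object):
    """Log exceptions raised while calling the wrapped WSGI application.

    Only exceptions raised by the call itself are caught here; errors
    raised later, while the response iterable is consumed, are not.
    """

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        try:
            return self.application(environ, start_response)
        except Exception:
            # exc_info=True attaches the full traceback to the log record.
            logger.error(
                "Unhandled exception for %s %s",
                environ.get("REQUEST_METHOD"),
                environ.get("PATH_INFO"),
                exc_info=True,
            )
            # Re-raise so the web server still sees the failure.
            raise
```

Wrapping an application is then a matter of `application = ExceptionLoggingMiddleware(application)`, with log handlers configured to ship the records wherever the log collation tools expect them.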
Server monitoring.


•   Open Source
    •   Monit (http://mmonit.com)
    •   Munin (http://munin-monitoring.org)
    •   Cacti (http://www.cacti.net)
    •   Nagios (http://www.nagios.org)

•   Commercial Services
    •   New Relic (http://newrelic.com) - Free feature.
Application performance monitoring.


•   Commercial Services
    •   New Relic (http://newrelic.com) - Lite (Free), Standard and Pro subscriptions.
Your web site is slow.
Web page performance analysis.


•   Online services.
    •   YSlow (http://developer.yahoo.com/yslow/)
    •   Google PageSpeed (https://developers.google.com/speed/pagespeed/)
    •   WebPageTest (http://www.webpagetest.org/)


•   Browser plugins.
    •   YSlow for Firefox (https://addons.mozilla.org/en-US/firefox/addon/yslow/)
    •   Firebug (http://getfirebug.com/)
World Wide Web Consortium.


•   Resource timing specification.
    •   W3 Resource Timing Specification (http://www.w3.org/TR/resource-timing/)
Application performance analysis.
Slow transaction summary.
Slow transaction details.
Unknown consumers of time.




                             ????
Instrumentation via code change.



  import newrelic.agent

  class _Database(UserDict.DictMixin):

      @newrelic.agent.function_trace()
      def _commit(self):
          ...

  @newrelic.agent.function_trace()
  def open(file, flag=None, mode=0666):
      ...
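
To make the idea of a tracing decorator concrete, here is a minimal standard-library sketch of a decorator that records how long each call takes. It is a stand-in for illustration only, not the implementation of `newrelic.agent.function_trace`; the `timings` dictionary and the `name` parameter are assumptions of this sketch, which targets Python 3.

```python
import functools
import time

# Collected timings: label -> list of call durations in seconds.
timings = {}


def function_trace(name=None):
    """Return a decorator that records the duration of every call."""

    def decorator(wrapped):
        label = name or "%s:%s" % (wrapped.__module__, wrapped.__qualname__)

        @functools.wraps(wrapped)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return wrapped(*args, **kwargs)
            finally:
                # Record the duration even when the call raises.
                elapsed = time.perf_counter() - start
                timings.setdefault(label, []).append(elapsed)

        return wrapper

    return decorator
```

The drawback the slide title hints at remains: the decorator has to be added to the source of the code being measured.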
Instrumentation via configuration.




    [newrelic]
    transaction_tracer.function_trace =
            dumbdbm:open
            dumbdbm:_Database._commit
Instrumentation by monkey patching.



[import-hook:dumbdbm]
enabled = true
execute = dumbdbm_instrumentation:instrument

# dumbdbm_instrumentation.py

from newrelic.api.function_trace import wrap_function_trace

def instrument(module):
     wrap_function_trace(module, 'open')
     wrap_function_trace(module, '_Database._commit')
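
The mechanics of patching a function by its dotted path can be sketched with the standard library alone. The helper below is hypothetical (`wrap_function` and `recording` are names invented here) and is not New Relic's `wrap_function_trace`; it only shows the general technique of looking up an attribute and replacing it with a wrapper. It handles plain functions and instance methods, not static or class methods.

```python
import functools

# Names of wrapped functions, appended each time one is called.
calls = []


def wrap_function(module, object_path, wrapper_factory):
    """Replace module.<object_path> with wrapper_factory(original)."""
    parts = object_path.split(".")
    parent = module
    # Walk down to the object that owns the final attribute.
    for part in parts[:-1]:
        parent = getattr(parent, part)
    original = getattr(parent, parts[-1])
    setattr(parent, parts[-1], wrapper_factory(original))
    return original


def recording(original):
    """Example wrapper factory: note each call, then delegate."""

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        calls.append(original.__name__)
        return original(*args, **kwargs)

    return wrapper
```

Because the patch is applied from outside, the target module's source stays untouched, which is the point of the import-hook approach.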
Thread sampling.
Profiling tools.


•   Thread sampling.
    •   plop (http://tech.dropbox.com/?p=272)
    •   statprof (http://pypi.python.org/pypi/statprof/)


•   Full profiling.
    •   cProfile (http://docs.python.org/library/profile.html)
    •   pytrace (http://pypi.python.org/pypi/pytrace)
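
Thread sampling can be approximated with `sys._current_frames()`, which returns the topmost stack frame of every running thread. The following is a minimal sketch of that idea for Python 3, not the implementation used by plop or statprof; the function names and the default interval are assumptions.

```python
import collections
import sys
import threading
import time


def sample_once(counts, skip_thread_id=None):
    """Add one sample per thread to counts, keyed by code location."""
    for thread_id, frame in sys._current_frames().items():
        if thread_id == skip_thread_id:
            continue
        code = frame.f_code
        counts[(code.co_filename, frame.f_lineno, code.co_name)] += 1


def sample_for(duration, interval=0.01):
    """Sample all other threads for roughly `duration` seconds."""
    counts = collections.Counter()
    deadline = time.time() + duration
    while time.time() < deadline:
        # Skip the sampling thread itself so it does not dominate.
        sample_once(counts, skip_thread_id=threading.get_ident())
        time.sleep(interval)
    return counts
```

Running `sample_for` in a background thread and looking at `counts.most_common(10)` gives a coarse picture of where the other threads spend their time, at much lower overhead than full profiling.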
Targeted function profiling.




 @function_profile(filename='/tmp/profile.dat',
         delay=1.0, checkpoint=30)
 def open(file, flag=None, mode=0666):
     ...
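
The slide does not show how `function_profile` is defined. One way such a decorator could be built on `cProfile` is sketched below for Python 3. This is an assumption, not the authors' code: it accepts only `filename`, and the `delay` and `checkpoint` parameters are left out because the slides do not say what they mean.

```python
import cProfile
import functools


def function_profile(filename):
    """Profile every call of the decorated function into `filename`.

    Sketch only: a single profiler is shared by all calls, so this is
    not safe for recursive or concurrent use.
    """
    profile = cProfile.Profile()

    def decorator(wrapped):
        @functools.wraps(wrapped)
        def wrapper(*args, **kwargs):
            profile.enable()
            try:
                return wrapped(*args, **kwargs)
            finally:
                profile.disable()
                # Overwrite the file with the data accumulated so far.
                profile.dump_stats(filename)

        return wrapper

    return decorator
```

The resulting file can be inspected later with `pstats.Stats(filename)`.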
Controlling what is profiled.



   class FunctionProfile(object):

       def __init__(self, profile):
           self.profile = profile

       def __enter__(self):
           self.profile.enable()
           return self

       def __exit__(self, exc, value, tb):
           self.profile.disable()
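
A short usage sketch shows how this context manager limits profiling to a chosen block. The class is repeated so the example is self-contained; `profiled_names` is a hypothetical helper added here, and the example assumes Python 3 with the standard `cProfile` and `pstats` modules.

```python
import cProfile
import pstats


class FunctionProfile(object):
    """Enable the given profiler only inside the `with` block."""

    def __init__(self, profile):
        self.profile = profile

    def __enter__(self):
        self.profile.enable()
        return self

    def __exit__(self, exc, value, tb):
        self.profile.disable()


def profiled_names(profile):
    """Return the names of all functions the profiler has seen."""
    stats = pstats.Stats(profile)
    return set(name for (_filename, _lineno, name) in stats.stats)
```

Code that runs outside the `with FunctionProfile(profile):` block is not recorded, so the overhead of profiling is paid only where it is wanted.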
Manual metric collection.


•   Open Source
    •   metrology - http://metrology.readthedocs.org/en/latest/index.html
    •   mmstats - https://github.com/schmichael/mmstats
    •   pymetrics - https://github.com/jgardner1/Python-Metrics
    •   django-app-metrics - http://pypi.python.org/pypi/django-app-metrics
    •   django-statsd - http://django-statsd.readthedocs.org/en/latest/
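
For a sense of what manual metric collection looks like in code, here is a minimal in-process sketch with counters and timers. It is not the API of any package listed above; the `Metrics` class and its method names are invented for this illustration, and it targets Python 3.

```python
import contextlib
import threading
import time


class Metrics(object):
    """Thread-safe store for named counters and timing samples."""

    def __init__(self):
        self._lock = threading.Lock()
        self.counters = {}
        self.timings = {}

    def increment(self, name, amount=1):
        with self._lock:
            self.counters[name] = self.counters.get(name, 0) + amount

    @contextlib.contextmanager
    def timed(self, name):
        """Record how long the enclosed block takes, in seconds."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            with self._lock:
                self.timings.setdefault(name, []).append(elapsed)
```

A request handler might call `metrics.increment("requests")` and wrap a database call in `with metrics.timed("db.commit"):`; the collected values would then be reported to whichever backend is in use.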
Interacting via the browser.


•   Open Source
    •   Paste Error Middleware - http://pythonpaste.org/modules/exceptions.html
    •   django-debug-toolbar - https://github.com/django-debug-toolbar/django-debug-toolbar/
    •   Paste Debugger - http://pythonpaste.org/modules/evalexception.html
    •   Flask Debugger - http://werkzeug.pocoo.org/docs/debug/
Application backdoors.




    import logging
    import logging.config

    logging.config.fileConfig("logging.conf")
    backdoor = logging.config.listen()
    backdoor.start()
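
The listener started by `logging.config.listen()` accepts a new configuration over a socket, sent as the configuration bytes preceded by a four-byte length packed with `struct.pack('>L', n)`. The client side could look roughly like the sketch below (Python 3). The function names are hypothetical, and the host and port defaults assume the listener runs locally on the default port.

```python
import logging.config
import socket
import struct


def build_payload(config_text):
    """Prefix the configuration with its length as a big-endian long."""
    data = config_text.encode("utf-8")
    return struct.pack(">L", len(data)) + data


def send_config(config_text, host="localhost",
                port=logging.config.DEFAULT_LOGGING_CONFIG_PORT):
    """Send a new logging configuration to a running listener."""
    with socket.create_connection((host, port)) as connection:
        connection.sendall(build_payload(config_text))
```

Since a received configuration can change what the process does, the Python documentation advises treating this port as a security risk and restricting who can reach it.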
Interactive access.


•   Embedded interpreter prompt.
    •   eventlet.backdoor - http://eventlet.net/doc/modules/backdoor.html
    •   guppy.heapy.Console - http://guppy-pe.sourceforge.net
    •   twisted.manhole - http://www.lothar.com/tech/twisted/manhole.xhtml

•   Code injection mechanisms.
    •   pyrasite - http://pyrasite.readthedocs.org/en/latest/index.html

•   Remote code debuggers.
    •   Komodo IDE - http://www.activestate.com/komodo-ide
    •   PyCharm IDE - http://www.jetbrains.com/pycharm/
    •   Wing IDE - http://wingware.com/
    •   PyDev IDE - http://pydev.org/
Introducing ispyd.


•   Download site.
    •   https://github.com/GrahamDumpleton/wsgi-shell


•   Aims of the package.
    •   Provide a generic framework for implementing an interactive console.
    •   The commands you can run are targeted at a specific purpose.
    •   Plugin based, so you can control what is available and extend it.
    •   Remotely accessible, with scriptable execution of commands.
Connecting to processes.



    $ ispy ispyd.ini

    (ispyd) servers
    1: (1, '/tmp/ispyd-14905.sock')
    2: (1, '/tmp/ispyd-14906.sock')
    3: (1, '/tmp/ispyd-14907.sock')

    (ispyd) connect 1

    (ispyd:11345) plugins
    ['debugger', 'process', 'python', 'wsgi']
Executing commands.




(ispyd:11345) shell process

(process:11345) help

Documented commands (type help <topic>):
========================================
cwd egid euid exit gid help pid prompt     uid

(process:11345) cwd
/Users/graham
Power users.

(ispyd:11345) shell python

(python:11345) console
Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license"
for more information.
(EmbeddedConsole)
>>> import os
>>> os.getcwd()
'/Users/graham'
>>> exit()
Post-mortem debugging.


(ispyd:11345) shell debugger

(debugger:11345) insert __main__:function

(debugger:11345) tracebacks
{'__main__:function': <traceback object at
0x1013a11b8>}
(debugger:11345) debug __main__:function
> /Users/graham/wsgi.py(15)function()
-> raise RuntimeError('xxx')
(Pdb) dir()
[]
(Pdb) __file__
'wsgi.py'
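
The session above relies on a traceback having been saved when the function failed. A minimal sketch of that idea follows, assuming a decorator stores the traceback under a `module:function` label so that `pdb.post_mortem()` can be run on it later. The names `capture_traceback`, `tracebacks` and `debug` are hypothetical and this is not ispyd's implementation.

```python
import functools
import pdb
import sys

# Most recent traceback per wrapped function, keyed by "module:function".
tracebacks = {}


def capture_traceback(wrapped):
    """Remember the traceback of the last exception raised by `wrapped`."""
    label = "%s:%s" % (wrapped.__module__, wrapped.__name__)

    @functools.wraps(wrapped)
    def wrapper(*args, **kwargs):
        try:
            return wrapped(*args, **kwargs)
        except Exception:
            tracebacks[label] = sys.exc_info()[2]
            raise

    return wrapper


def debug(label):
    """Open an interactive pdb session on a saved traceback."""
    pdb.post_mortem(tracebacks[label])
```

Keeping a traceback alive also keeps every frame and local variable it references alive, so a real tool would want to limit how many are retained.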
Extending what is monitored.




(ispyd:11345) shell newrelic

(newrelic:11345) function_trace dumbdbm:open
(newrelic:11345) function_trace dumbdbm:_Database._commit
Capacity Analysis




   Rolling server restart.
Was the customer impacted?
Stuck requests and backlog.
Active requests.

  (ispyd:11345) shell requests

  (requests:11345) requests
  ==== 707 ====

  thread_id = 140735076232384
  start_time = Mon Apr 9 21:49:54 2012
  duration = 0.013629 seconds

  CONTENT_LENGTH = ''
  ...

  File: "wsgi.py", line 25, in <module>
    application.run(host='0.0.0.0', port=port)
  ...
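
A listing like the one above needs some record of which requests are in flight. One way to keep that record is a WSGI middleware that registers each request on entry and removes it once the response has been consumed. This is a sketch under that assumption, written for Python 3; the class and field names are hypothetical and it is not the ispyd plugin's code.

```python
import threading
import time


class ActiveRequests(object):
    """Track the requests currently being handled by a WSGI application."""

    def __init__(self, application):
        self.application = application
        self._lock = threading.Lock()
        self._next_id = 0
        self._active = {}

    def __call__(self, environ, start_response):
        with self._lock:
            self._next_id += 1
            request_id = self._next_id
            self._active[request_id] = {
                "thread_id": threading.get_ident(),
                "start_time": time.time(),
                "path": environ.get("PATH_INFO", ""),
            }
        result = None
        try:
            result = self.application(environ, start_response)
            for chunk in result:
                yield chunk
        finally:
            # Honour the WSGI close() protocol, then forget the request.
            if hasattr(result, "close"):
                result.close()
            with self._lock:
                del self._active[request_id]

    def snapshot(self):
        """Return details of all active requests, oldest first."""
        now = time.time()
        with self._lock:
            return [
                dict(details, request_id=request_id,
                     duration=now - details["start_time"])
                for request_id, details in sorted(self._active.items())
            ]
```

A request that appears in successive snapshots with an ever larger `duration` is a candidate for the stuck requests mentioned earlier.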
Multiprocess web applications.



    $ ispy --batch - ispyd.ini << EOF
    prompt off
    shell requests
    requests
    exit
    shell newrelic
    function_trace dumbdbm:open
    function_trace dumbdbm:_Database._commit
    exit
    exit
    EOF
Creating plugins.

import psutil

class Shell(object):

    name = 'psutil'

    def do_num_cpus(self, line):
        print >> self.stdout, psutil.NUM_CPUS

    def do_cpu_times(self, line):
        print >> self.stdout, psutil.cpu_times()

    def do_virtual_memory(self, line):
        print >> self.stdout, psutil.virtual_memory()

    def do_swap_memory(self, line):
        print >> self.stdout, psutil.swap_memory()
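
The `do_*` method convention and the help output shown in the earlier session match the standard library's `cmd` module, which suggests, though the slides do not confirm it, that plugins are dispatched in a similar way. The sketch below shows that dispatch using `cmd.Cmd` directly in Python 3; `ProcessShell` is a hypothetical stand-in and does not use ispyd's plugin loading.

```python
import cmd
import os


class ProcessShell(cmd.Cmd):
    """A tiny command shell with a few process-related commands."""

    name = "process"

    def do_pid(self, line):
        """Display the process ID."""
        self.stdout.write("%d\n" % os.getpid())

    def do_cwd(self, line):
        """Display the current working directory."""
        self.stdout.write(os.getcwd() + "\n")

    def do_exit(self, line):
        """Leave the shell."""
        # Returning a true value tells cmd.Cmd to stop its command loop.
        return True
```

Calling `ProcessShell().onecmd("pid")` runs a single command, which is the kind of entry point that makes scripted, non-interactive use possible.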
Ideas for third party plugins.


•   Memory.
    •   Process memory usage.
    •   Statistics on objects in use (heapy).
    •   State of the garbage collector.

•   Profiling.
    •   Initiate sampled profiling for selected functions.

•   Django.
    •   Current configuration.
    •   Details of loaded applications.
    •   Details of registered middleware.
    •   Details of template libraries.
    •   Testing URLs against URL resolver.
    •   Statistics on cache usage.
What am I trying to say?


•   Use monitoring so you know when problems arise.
•   One tool alone is not going to provide everything.
•   Use complementary tools to get a full picture.
•   Build in mechanisms that allow deeper debugging.
•   Treat debugging like any other defined process.
New Relic




    30 Day Free Pro Trial
   http://newrelic.com/30


  Graham.Dumpleton@gmail.com
       @GrahamDumpleton
PyCon AU 2012 - Debugging Live Python Web Applications

  • 1. Debugging Live Python Web Applications Graham Dumpleton / Amjith Ramanujam PyCon AU - August 2012
  • 2. Follow along. http://www.slideshare.net/GrahamDumpleton
  • 3. What is debugging? Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece of electronic hardware, thus making it behave as expected. http://en.wikipedia.org/wiki/Debugging
  • 4. Common types of computer bugs. Arithmetic bugs Logic bugs Syntax bugs Multi-threading Resource bugs Interfacing bugs programming bugs Performance bugs Teamworking bugs http://en.wikipedia.org/wiki/Software_bug
  • 5. Things we want to avoid. • Crashing the whole web site. • Corrupt all your customer data. • Make you customer data visible to everyone. • Loose your company lots of money. • Loose your own job because you did something stupid. • Cause all your work mates to loose their jobs as well. • Getting what you did posted on Slashdot.
  • 6. Managing risk. • Use software to restrict what you can do. • Script changes and procedures to avoid errors. • Test what you are going to do on a separate system. • Develop and document contingency plans.
  • 7. Passive monitoring. • Collection of log file information. • Collection of details about Python exceptions. • Collection of performance data for the server host. • Collection of performance data for the web server. • Collection of performance data for the web application.
  • 8. Log file collation and analysis. • Open Source • logstash (http://logstash.net) • graylog2 (http://www.graylog2.org) • Commercial Services • Loggly (http://www.loggly.com) • Splunk (http://www.splunk.com) • LogLogic (http://www/loglogic.com)
  • 9. Recording Python exceptions. • Open Source • Sentry (http://pypi.python.org/pypi/sentry) - Also as paid service. • Commercial Services • New Relic (http://newrelic.com) - Pro feature.
  • 10. Server monitoring. • Open Source • Monit (http://mmonit.com) • Munin (http://munin-monitoring.org) • Cacti (http://www.cacti.net) • Nagios (http://www.nagios.org) • Commercial Services • New Relic (http://newrelic.com) - Free feature.
  • 11. Application performance monitoring. • Commercial Services • New Relic (http://newrelic.com) - Lite (Free), Standard and Pro subscriptions.
  • 12. Your web site is slow.
  • 13. Web page performance analysis. • Online services. • YSlow (http://developer.yahoo.com/yslow/) • Google PageSpeed (https://developers.google.com/speed/pagespeed/) • WebPageTest (http://www.webpagetest.org/) • Browser plugins. • YSlow for Firefox (https://addons.mozilla.org/en-US/firefox/addon/yslow/) • Firebug (http://getfirebug.com/)
  • 14. World Wide Web Consortium. • Resource timing specification. • W3 Resource Timing Specification (http://www.w3.org/TR/resource-timing/)
  • 18. Unknown consumers of time. ????
  • 19. Instrumentation via code change.
        import newrelic.agent
        class _Database(UserDict.DictMixin):
            @newrelic.agent.function_trace()
            def _commit(self):
                ...
        @newrelic.agent.function_trace()
        def open(file, flag=None, mode=0666):
            ...
  • 20. Instrumentation via configuration.
        [newrelic]
        transaction_tracer.function_trace = dumbdbm:open dumbdbm:_Database._commit
  • 21. Instrumentation by monkey patching.
        [import-hook:dumbdbm]
        enabled = true
        execute = dumbdbm_instrumentation:instrument
        # dumbdbm_instrumentation.py
        from newrelic.api.function_trace import wrap_function_trace
        def instrument(module):
            wrap_function_trace(module, 'open')
            wrap_function_trace(module, '_Database._commit')
  • 23. Profiling tools. • Thread sampling. • plop (http://tech.dropbox.com/?p=272) • statprof (http://pypi.python.org/pypi/statprof/) • Full profiling. • cProfile (http://docs.python.org/library/profile.html) • pytrace (http://pypi.python.org/pypi/pytrace)
  • 24. Targeted function profiling.
        @function_profile(filename='/tmp/profile.dat', delay=1.0, checkpoint=30)
        def open(file, flag=None, mode=0666):
            ...
  • 25. Controlling what is profiled.
        class FunctionProfile(object):
            def __init__(self, profile):
                self.profile = profile
            def __enter__(self):
                self.profile.enable()
                return self
            def __exit__(self, exc, value, tb):
                self.profile.disable()
  • 26. Manual metric collection. • Open Source • metrology - http://metrology.readthedocs.org/en/latest/index.html • mmstats - https://github.com/schmichael/mmstats • pymetrics - https://github.com/jgardner1/Python-Metrics • django-app-metrics - http://pypi.python.org/pypi/django-app-metrics • django-statsd - http://django-statsd.readthedocs.org/en/latest/
  • 27. Interacting via the browser. • Open Source • Paste Error Middleware - http://pythonpaste.org/modules/exceptions.html • django-debug-toolbar - https://github.com/django-debug-toolbar/django-debug-toolbar/ • Paste Debugger - http://pythonpaste.org/modules/evalexception.html • Flask Debugger - http://werkzeug.pocoo.org/docs/debug/
  • 28. Application backdoors.
        import logging
        import logging.config
        logging.config.fileConfig("logging.conf")
        backdoor = logging.config.listen()
        backdoor.start()
  • 29. Interactive access. • Embedded interpreter prompt. • eventlet.backdoor - http://eventlet.net/doc/modules/backdoor.html • guppy.heapy.Console - http://guppy-pe.sourceforge.net • twisted.manhole - http://www.lothar.com/tech/twisted/manhole.xhtml • Code injection mechanisms. • pyrasite - http://pyrasite.readthedocs.org/en/latest/index.html • Remote code debuggers. • Komodo IDE - http://www.activestate.com/komodo-ide • PyCharm IDE - http://www.jetbrains.com/pycharm/ • Wing IDE - http://wingware.com/ • PyDev IDE - http://pydev.org/
  • 30. Introducing ispyd. • Download site. • https://github.com/GrahamDumpleton/wsgi-shell • Aims of the package. • Provide a generic framework for implementing an interactive console. • The commands you can run are targeted at a specific purpose. • Plugin based so can control what is available and also extendable. • Remotely accessible and execution of commands scriptable.
  • 31. Connecting to processes.
        $ ispy ispyd.ini
        (ispyd) servers
        1: (1, '/tmp/ispyd-14905.sock')
        2: (1, '/tmp/ispyd-14906.sock')
        3: (1, '/tmp/ispyd-14907.sock')
        (ispyd) connect 1
        (ispyd:11345) plugins
        ['debugger', 'process', 'python', 'wsgi']
  • 32. Executing commands.
        (ispyd:11345) shell process
        (process:11345) help
        Documented commands (type help <topic>):
        ========================================
        cwd  egid  euid  exit  gid  help  pid  prompt  uid
        (process:11345) cwd
        /Users/graham
  • 33. Power users.
        (ispyd:11345) shell python
        (python:11345) console
        Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
        [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
        Type "help", "copyright", "credits" or "license" for more information.
        (EmbeddedConsole)
        >>> import os
        >>> os.getcwd()
        '/Users/graham'
        >>> exit()
  • 34. Post-mortem debugging.
        (ispyd:11345) shell debugger
        (debugger:11345) insert __main__:function
        (debugger:11345) tracebacks
        {'__main__:function': <traceback object at 0x1013a11b8>}
        (debugger:11345) debug __main__:function
        > /Users/graham/wsgi.py(15)function()
        -> raise RuntimeError('xxx')
        (Pdb) dir()
        []
        (Pdb) __file__
        'wsgi.py'
  • 35. Extending what is monitored.
        (ispyd:11345) shell newrelic
        (newrelic:11345) function_trace dumbdbm:open
        (newrelic:11345) function_trace dumbdbm:_Database._commit
  • 36. Capacity Analysis Rolling server restart.
  • 37. Was the customer impacted?
  • 38. Stuck requests and backlog.
  • 39. Active requests.
        (ispyd:11345) shell requests
        (debugger:11345) requests
        ==== 707 ====
        thread_id = 140735076232384
        start_time = Mon Apr 9 21:49:54 2012
        duration = 0.013629 seconds
        CONTENT_LENGTH = ''
        ...
        File: "wsgi.py", line 25, in <module>
        application.run(host='0.0.0.0', port=port)
        ...
  • 40. Multiprocess web applications.
        $ ispy --batch - ispyd.ini << EOF
        prompt off
        shell requests
        requests
        exit
        shell newrelic
        function_trace dumbdbm:open
        function_trace dumbdbm:_Database._commit
        exit
        exit
        EOF
  • 41. Creating plugins.
        import psutil
        class Shell(object):
            name = 'psutil'
            def do_num_cpus(self, line):
                print >> self.stdout, psutil.NUM_CPUS
            def do_cpu_times(self, line):
                print >> self.stdout, psutil.cpu_times()
            def do_virtual_memory(self, line):
                print >> self.stdout, psutil.virtual_memory()
            def do_swap_memory(self, line):
                print >> self.stdout, psutil.swap_memory()
  • 42. Ideas for third party plugins. • Memory. • Process memory usage. • Statistics on objects in use (heapy). • State of the garbage collector. • Profiling. • Initiate sampled profiling for selected functions. • Django. • Current configuration. • Details of loaded applications. • Details of registered middleware. • Details of template libraries. • Testing URLs against URL resolver. • Statistics on cache usage.
  • 43. What am I trying to say? • Use monitoring so you know when problems arise. • One tool alone is not going to provide everything. • Use complementary tools to get a full picture. • Build in mechanisms that allow deeper debugging. • Treat debugging like any other defined process.
  • 44. New Relic 30 Day Free Pro Trial http://newrelic.com/30 [email protected] @GrahamDumpleton
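
Slides 24 and 25 show only the use of a `function_profile` decorator and the `FunctionProfile` context manager, not the decorator itself. The following is a minimal sketch of how the two could fit together, not the implementation used in the talk. It is written for Python 3, whereas the slides use Python 2. It assumes that `delay` is the minimum number of seconds between profiled calls and that `checkpoint` is the number of profiled calls after which the aggregated statistics are written to `filename`, as the speaker notes describe. The module-level `_profiling_lock` is an addition made here so that only one profiled call runs at a time.

```python
import cProfile
import functools
import threading
import time

# Assumption: allow only one profiled call at a time across the process.
_profiling_lock = threading.Lock()


class FunctionProfile(object):
    """Context manager from slide 25: profile only the enclosed block."""

    def __init__(self, profile):
        self.profile = profile

    def __enter__(self):
        self.profile.enable()
        return self

    def __exit__(self, exc, value, tb):
        self.profile.disable()


def function_profile(filename, delay=1.0, checkpoint=30):
    """Profile the wrapped function at most once every `delay` seconds and
    write the aggregated stats to `filename` every `checkpoint` samples."""

    def decorator(wrapped):
        profile = cProfile.Profile()
        state = {'last': None, 'samples': 0}

        @functools.wraps(wrapped)
        def wrapper(*args, **kwargs):
            # If a profiled call is already in progress, run unprofiled.
            if not _profiling_lock.acquire(False):
                return wrapped(*args, **kwargs)
            try:
                now = time.monotonic()
                due = state['last'] is None or now - state['last'] >= delay
                if due:
                    state['last'] = now
                    with FunctionProfile(profile):
                        result = wrapped(*args, **kwargs)
                    state['samples'] += 1
                    if state['samples'] % checkpoint == 0:
                        profile.dump_stats(filename)
                    return result
            finally:
                _profiling_lock.release()
            # Not due yet: call the function without any profiling overhead.
            return wrapped(*args, **kwargs)

        return wrapper

    return decorator
```

The file written by `dump_stats()` can be loaded with `pstats.Stats(filename)` for analysis. Because the same `cProfile.Profile` object is enabled and disabled around each sampled call, the statistics accumulate across samples.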

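Slide 41 shows a plugin as a class whose `do_*` methods write to `self.stdout`, and the speaker notes say the command interface is driven by Python's `cmd` module. As a rough illustration of that pattern only, the sketch below subclasses `cmd.Cmd` directly. How ispyd itself wraps plugin classes is not shown in the slides, so `ProcessShell` and `run_commands` are hypothetical names, and the code targets Python 3 rather than the Python 2 used in the talk.

```python
import cmd
import io
import os


class ProcessShell(cmd.Cmd):
    """Hypothetical stand-alone shell exposing a few read-only commands."""

    name = 'process'
    prompt = '(process) '

    def do_pid(self, line):
        """pid: print the ID of the current process."""
        print(os.getpid(), file=self.stdout)

    def do_cwd(self, line):
        """cwd: print the current working directory."""
        print(os.getcwd(), file=self.stdout)

    def do_exit(self, line):
        """exit: leave the shell."""
        return True  # a true return value tells cmd.Cmd to stop


def run_commands(shell_class, commands):
    """Run `commands` in order, as a batch mode might, and return the output."""
    output = io.StringIO()
    shell = shell_class(stdin=io.StringIO(''), stdout=output)
    shell.use_rawinput = False
    for command in commands:
        if shell.onecmd(command):
            break
    return output.getvalue()
```

Only the commands defined on the class can be run, which is the property the talk relies on to restrict what can be done against a live process.
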
Editor's Notes

  • #4: So you have written what you believe is the most amazing web site in the world and deployed it to production. Real customers are using it, it is making money for you, but something is going wrong with it. You don't quite know what, and because it is a real live production web site, you can't necessarily just go in and start playing with it. What are you going to do? How are you going to debug the problems?
  • #5: For some types of problems where you get a nice Python exception traceback the cause may be obvious, but the cause of other things such as data corruption, memory leaks, thread locking issues and general performance problems can be more elusive. Trying to duplicate issues in a development system may sometimes work, but more often than not things only show up once code is deployed to production.
  • #6: As developers we would love to be able to just dive in and start poking around in the live web application, but operations staff aren't going to like that one bit. If we are going to try and do things with a live web application, it has to be things that aren't going to make things worse. The results of the things we do need to be predictable, with the effect of doing them able to be validated in advance.
  • #7: Whatever we do, it is all about managing risk. We don't want a loose cannon that is going to cause more damage than good. There is no reason though why we can't do things which do have some level of risk. We just need to be controlled in what we do and make sure we understand the consequences. If making changes, script the actions you are going to take, test them beforehand and develop contingency plans to cope with when things do go pear-shaped.
  • #8: The most benign thing you can do is passive monitoring. That is where you set up mechanisms in advance to collect data on a continual basis. In the event of a problem, you at least then have some forensic information to try and analyse what went wrong. Monitoring can take many forms. This can include collecting log files, details of application exceptions or quite specific performance data.
  • #9: In the case of log files, they can come from many sources including the operating system, your web server, your web application, backend application services and databases. These can be spread all over the place. To make sense of them and make it easier to find and correlate information, various free and commercial products exist to help. These tools in simple terms are search engines for log information.
  • #10: Log file analysis can only work though if an application actually logs something about an event. In web applications, exceptions are often translated to HTTP 500 errors and no details are logged. In this situation an extra step needs to be taken to configure the framework to record details of exceptions, or to add in additional tools which can intercept exceptions and report them back to a service for storage and later analysis.
  • #11: When we move up to server monitoring there are a range of open source choices. What these monitor can be quite extensive but they also can be quite hard to set up and manage depending on the product. For many users the simplicity of a preconfigured solution can be just as beneficial, if not easier to deal with, than a highly configurable and highly complex solution. Your mileage may therefore vary depending on the product chosen.
  • #12: If you want to dive deeper into what is going on inside your Python web application, New Relic is definitely your friend. In addition to providing server monitoring, New Relic provides real user monitoring and application performance monitoring. For your web application it gives a deeper level of introspection into where time is being spent within your application code, as well as including time spent calling out to external databases and web services.
  • #13: So you can easily bring together a set of monitoring tools. The question then is what value they have in debugging an issue, as opposed to telling you there is a problem in the first place. The big ticket item with web sites is performance. A high level view which looks across end user time, application time and that of back end services allows you to quickly drill down to where the problem may lie.
  • #14: End user monitoring can help you realise that the actual issue is with the page content you are generating rather than the mechanism of generating it. From there you can use various web page performance analysis tools. Keep in mind though that these operate not from the perspective of your actual users but from where the online service is located, or from your own browser if using a browser plugin.
  • #15: In the future, advances like the browser resource timing specification coming out of the World Wide Web Consortium could make such analysis more representative of what the real users are seeing, as it would then be technically possible to report such information directly from users' browsers, giving you a much larger data set to work from.
  • #16: What now if the problem is in your application? If using New Relic you can start to drill down and look at the performance of individual request handlers, seeing their throughput and response times. You can also get a more detailed view of individual sample slow transactions.
  • #17: The performance breakdown in a slow transaction summary gives you a high level overview of where time is being spent for that specific slow transaction. The summary doesn't necessarily provide you, though, with any context of where in your code the time consuming operation was made.
  • #18: Some level of context can be obtained by drilling down and looking at the details of slow transaction traces, but it is limited to those functions which have been deemed of interest. It needs to be limited in this way to ensure that the overhead of monitoring does not impact the performance of your web application. To do full profiling is just going to be too big of an overhead and affect application performance.
  • #19: Because instrumentation is targeted only to areas such as time spent in middleware, view handlers, template rendering and template blocks, eventually you get situations where you get blocks of time where you lack sufficient detail. This is where a monitoring tool can need a bit more help through you indicating what else is of interest in your specific application.
  • #20: You have a few choices of how you can do this. The first is to make changes to your actual code base. You can apply function decorators to existing functions, or you can use context manager objects to time blocks of code within a function. Such changes are obviously intrusive though, which could be an issue. Plus it also doesn't help when you want to measure time spent in third party code.
  • #21: A second approach is to nominate functions of interest by way of a configuration file. This avoids you needing to change code and so can be used with any Python code no matter the origin. It would usually though be limited to simple function tracing.
  • #22: A final option is monkey patching. Here you specify a function to be called when a specific module is imported. That function would then go in and monkey patch the code. Whichever approach is used, the problem here is that to get added visibility you need to make a change of some sort and redeploy and restart your application before you will see the additional instrumented functions. It does not provide you a here and now way of delving down any further.
  • #23: A partial solution is thread sampling. This is where, when required, you start up a profiling session, taking a periodic snapshot of what each thread is doing at a specific point in time, and from that produce a call tree showing what percentage of time code at a specific point was executing. Unfortunately right now, New Relic at least doesn't do this for Python, although we have been looking at doing it for a while.
  • #24: Separate thread sampling tools do though exist. Dropbox recently announced 'plop' along with a pretty visualisation tool to try and make sense of the data. Another is 'statprof', which advertises itself as being able to trace down to line level. The premise behind sampling at least is that the overhead is lower than traditional full profiling such as provided by the Python profile modules.
  • #25: Ultimately, thread sampling is still an estimate and not as accurate as full profiling. A middle ground though is not to run profiling all the time, but collect samples there as well. That is, don't profile the whole program, target specific functions and only collect a full profile sample for a call every so often. We could for instance have the criteria be that we collect samples a minimum of 1 second apart and write out the aggregated results after 30 successive calls.
  • #26: This can be achieved using the 'cProfile' module, a decorator and a bit of context manager magic. Add in a gating mechanism to control how often it is done and we can achieve full profiling for a function of interest, but where it is done infrequently enough that the overhead need not necessarily be a factor in the context of the overall web application.
  • #27: New Relic is by no means the only way of instrumenting web applications to collect metrics, although it arguably gives you the most value out of the box with immediate actionable data. Whatever the solution used, at this level we have the same problem. You still need to manually modify your code to add new instrumentation to further explore a problem and then redeploy your web application. Getting more in depth useful data can therefore be a long process.
  • #28: What is lacking is the ability to prod your live web application to get it to start yielding the additional data you need while the problem is occurring. Some tools give you this interactivity, but they are only suitable for development environments as they display data back into the browser the request is made from. Sentry provides separate analysis of tracebacks and stack variables after the fact but we still don't have a way of changing the way the application is running.
  • #29: Application backdoors to effect change are not new. The logging module in Python even supplies such a backdoor. Enable this and it will listen on a socket for connections and allow you to pass the application a new configuration for the logging subsystem. Dangers do exist with such mechanisms. The logging module actually runs eval() on parts of the configuration file, meaning that you can actually inject arbitrary code into your application.
  • #30: If you are not concerned about execution of arbitrary code, you could instead elect to expose a full embedded Python interpreter prompt. Go a step further again and you have the rather scary concept of pyrasite, which uses gdb to perform code injection into an arbitrary unmodified Python process. We want something that allows realtime interaction but we also want that access to be more controlled than a full on interpreter or debugger.
  • #31: Providing a means for interactive access to running processes is something I have toyed with in trying to help people debug WSGI applications. Following on from PyCon US this year I finally sat down and created a package incorporating some of the ideas I had played with and had code lying around for. Initially it was intended as a shell for WSGI applications but it can be used in any long running service. Eventually the package was called ispyd.
  • #32: Depending on your application architecture, the process would listen on either an INET or UNIX domain socket. To hide the details, an ispy client program is used to make and manage the connection. The command interface is driven using the cmd module from Python. Once connected you can list all the plugins which you have configured the system to make available.
  • #33: Change to the context of a specific plugin and you can then issue the specific commands which the plugin makes available. Because it isn't a full interpreter prompt, you can control, via which plugins you enable, what commands are available. This way you restrict what can be done and ensure that you can't do too much damage.
  • #34: If you are addicted to power however, then no problem: enable the optional embedded interpreter support from the configuration file and you can jump into the plugin for Python, fire one up and do as much damage as you want.
  • #35: If you are comfortable monkey patching a live web application there are a range of other things one could do. One could introduce a wrapper that catches details of exceptions and enables you to then perform post-mortem debugging within the live process. This is similar to tools like the Flask debugger, but done using pdb directly in the live process.
  • #36: Finally, monkey patching can also help with our original problem of how one changes what is being monitored by a live web application without a restart. With an interactive console like this it becomes feasible to have commands that would allow us to monkey patch the live system to add the additional function traces. These would only exist until the process exited, but it does at least provide us some coverage until we can make a more permanent change.
  • #37: A further problem area where monitoring can be useful is in answering the perennial question of how many processes/threads should I configure my WSGI server to use. Capacity can be viewed relative to normal traffic loads, but can also be used to gauge whether you have sufficient capacity in a farm of servers when you need to perform a rolling restart during a deploy.
  • #38: If you have done your homework and have the available capacity, then although you will see a jump in how much of your capacity is used when some servers are taken offline, application response times will not be affected. Get it wrong though and you could start to see a backlog, with an increase in request queuing time and overall response times, and with users subsequently getting increasingly frustrated as the site slows down.
  • #39: A further cause of backlogging due to inadequate capacity is when requests block and the effective number of available threads drops. Monitoring systems will though often only report on a web transaction once it completes. If a request never completes, you will not get any metrics nor a slow transaction trace.
  • #40: This is where an interactive console can again help. In particular you could run a command to dump out details on all active WSGI requests, including request environ details and a Python stack trace. You then just need to find those which have been running for a longer than expected time and see where in the code they appear to be stuck.
  • #41: Being an interactive console though, we can only talk to one process at a time. What do we do about multiprocess web applications? Obviously if interacting with an embedded interpreter or debugger session the answer is that there is nothing we can do. What though if we only wish to dump out details of a process or perform monkey patching? What we want here is an ability in the client program to automatically apply a set of commands across a set of servers.
  • #42: Because a console oriented interface is being used rather than trying to wrap up things in some higher level message oriented service abstraction, writing new plugins is relatively easy. All that is necessary is to provide a method for each command that writes the response to the output stream object set up for that instance of the shell. For more complicated plugins which require further input, such as an embedded interpreter, the input stream would also be used.
  • #43: By virtue of ispyd trying to define one generic infrastructure for managing the console and interaction with it, the goal then is that the wider community will get behind it and develop additional plugins which could be downloaded from PyPI. One could see lots of useful plugins being developed. A good in-process memory analysis tool for tracking memory growth would for example be particularly interesting and valuable when trying to debug memory problems.
  • #44: In conclusion, what am I trying to say? That is that production systems need not be treated as this special sanctum that only the anointed operations people can touch. Use monitoring systems so you know what problems arise, but be prepared and also put in place mechanisms to help you debug the issues that do arise. Do it in a way though that is controllable and scriptable so that results are predictable. Doing debugging then becomes a normal procedure in the same way deploys are.
  • #45: Obviously we would hope that you would see New Relic as a part of your tool set. Whatever you do though, use some level of monitoring. If you have no monitoring at all then not only will you not know immediately when there is a problem, but you will not even know where to start looking to debug it. So become a data nerd and deploy New Relic today. If you are interested in ispyd and want to help with that then contact me afterwards.
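
The notes for slides 39 and 40 describe dumping details of all active WSGI requests, including a stack trace, so that stuck requests can be found. The sketch below shows one way such tracking could be done with a small WSGI middleware. It is an illustration in Python 3 and not the ispyd `requests` plugin: the class name `ActiveRequests`, the `report()` method and its `threshold` parameter are assumptions made here. It relies on `sys._current_frames()`, a CPython function that returns the current frame of every thread.

```python
import sys
import threading
import time
import traceback


class ActiveRequests(object):
    """WSGI middleware that remembers which requests are still in progress."""

    def __init__(self, application):
        self.application = application
        self.lock = threading.Lock()
        self.requests = {}  # thread id -> (start time, environ)

    def __call__(self, environ, start_response):
        thread_id = threading.get_ident()
        with self.lock:
            self.requests[thread_id] = (time.time(), environ)
        try:
            result = self.application(environ, start_response)
            try:
                # Simplification: consume the body here so the request
                # counts as active until the whole response is produced.
                return list(result)
            finally:
                close = getattr(result, 'close', None)
                if close is not None:
                    close()
        finally:
            with self.lock:
                self.requests.pop(thread_id, None)

    def report(self, threshold=0.0):
        """Describe requests that have been running for `threshold` seconds."""
        frames = sys._current_frames()
        now = time.time()
        with self.lock:
            active = list(self.requests.items())
        lines = []
        for thread_id, (start, environ) in active:
            duration = now - start
            if duration < threshold:
                continue
            lines.append('==== %d ====\n' % thread_id)
            lines.append('duration = %f seconds\n' % duration)
            lines.append('PATH_INFO = %r\n' % environ.get('PATH_INFO', ''))
            frame = frames.get(thread_id)
            if frame is not None:
                lines.extend(traceback.format_stack(frame))
        return ''.join(lines)
```

Calling `report(threshold=30.0)` from a console command or a separate thread would list only requests that have been running for at least 30 seconds, together with the point in the code where each one currently is.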