The State of Open Source Monitoring Tools
Michael Richardson (@m_richo)
Energized Work
What tools are we currently using to monitor and troubleshoot our systems?
• Nagios
• ssh + grep <something_bad> /some/random/log/file.log
• tail -f /some/random/log/file.log
• Others?
Nagios
Nagios – The lovers
Nagios Love-meter
Where are you on the scale?
0: Nagios shits me to tears
10: Sign me up to Nagios World Conference 2013!!!!
Alternatives?
Yep, there’s lots: some are better and some are worse.
Today let’s check out
• Graphite
• StatsD
• Logstash
• Sensu
Graphite
• Metric storage
• Complex graph creation
• http://graphite.wikidot.com
• Apache 2.0 license
• Send time-series data that you are interested in graphing
Graphite
Components
1. Web
2. Whisper
3. Carbon
Graphite
• Everything stored in Graphite has a path with components delimited by dots, e.g.

servers.HOSTNAME.METRIC
applications.APPNAME.METRIC

servers.database01.memfree
applications.trading.loginattempts
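
As a rough sketch of how a dotted path ends up in Graphite: Carbon accepts a plaintext line of the form "path value timestamp", by default on TCP port 2003. The host name graphite.example.com and the helper names below are assumptions for illustration only.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of Carbon's plaintext protocol: path, value, Unix time."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="graphite.example.com", port=2003):
    """Open a TCP connection to Carbon and send a single data point.

    The host is a placeholder; 2003 is Carbon's default plaintext port.
    """
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Dry run: print the line instead of sending it over the network.
    print(format_metric("servers.database01.memfree", 2048), end="")
```

Because no end-point has to be declared first, sending a line with a new path is enough to create the metric.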
Graphite
• No need to pre-define metric end-points
• Determine granularity of data upfront.

/opt/graphite/conf/storage-schemas.conf
[stats]
pattern = ^stats.*
retentions = 10:2160,60:10080,600:262974
[catchall]
priority = 0
pattern = ^.*
retentions = 30:86400,300:525600
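
To make the granularity trade-off concrete, the sketch below decodes a retentions entry, assuming the seconds-per-point:number-of-points form used in the schema above. The function name is hypothetical.

```python
def parse_retentions(spec):
    """Return (seconds_per_point, points, total_seconds) for each archive."""
    archives = []
    for part in spec.split(","):
        seconds_per_point, points = (int(x) for x in part.strip().split(":"))
        archives.append((seconds_per_point, points, seconds_per_point * points))
    return archives


if __name__ == "__main__":
    # The [stats] retentions from the schema above.
    for step, points, total in parse_retentions("10:2160,60:10080,600:262974"):
        days = total / 86400
        print(f"{step}s resolution, {points} points, about {days:.2f} days")
```

Under that assumption, the [stats] schema keeps 10-second points for 6 hours, 1-minute points for 7 days, and 10-minute points for roughly 5 years.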
Graphite
What should I graph/trend?
1. Application Profiling Data
2. Operational Profiling Data
3. Regression Testing (releases)
Why should I graph/trend?
1. Trends can tell you when something is about to break.
2. …instead of hearing from your customers that it’s broken.
3. Data can tell you when something is already broken but you don’t yet know it (regression).

Source: Jason Dixon (@obfuscurity)
Graphite
Demo

Image source - http://joemiller.me/2011/11/05/correlating-puppet-changes-to-events-in-your-infrastructure/
StatsD
• Measure Anything, Measure Everything
• Created and released by Etsy
• Aggregate counters and timers
• http://github.com/etsy/statsd
StatsD
• Written in Node.js
• ~400 lines of JavaScript
• Listens for statistics (counters & timers) and sends aggregates to backend services (like Graphite).
• Simple
StatsD
Don’t like JavaScript or Node.js?
Google “statsd alternatives”…
20+ rewrites/clones for you, including Ruby, Python, Scala, Python+Twisted, Erlang, Clojure, C, Groovy
StatsD
Concepts
• Buckets (a name that translates to graphite end-point)
• Values
• Flush (default 10 seconds)
Counter metrics
successfullogins:1|c|@0.1
Timing metrics
apitimer:320|ms
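
A minimal sketch of emitting the two metric types above from Python. StatsD listens for UDP datagrams (port 8125 by default); the host name statsd.example.com and the helper names are assumptions, and a real project would normally use an existing client library.

```python
import random
import socket


def counter_line(bucket, value=1, sample_rate=1.0):
    """Format a counter, e.g. 'successfullogins:1|c|@0.1'."""
    line = f"{bucket}:{value}|c"
    if sample_rate < 1.0:
        line += f"|@{sample_rate}"
    return line


def timer_line(bucket, milliseconds):
    """Format a timer, e.g. 'apitimer:320|ms'."""
    return f"{bucket}:{milliseconds}|ms"


def send(line, host="statsd.example.com", port=8125):
    """Fire-and-forget one metric over UDP (host is a placeholder)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


def increment(bucket, sample_rate=1.0, host="statsd.example.com", port=8125):
    """Send a counter only sample_rate of the time.

    The |@rate suffix tells StatsD the counter was sampled so it can scale
    the count back up.
    """
    if random.random() < sample_rate:
        send(counter_line(bucket, 1, sample_rate), host, port)


if __name__ == "__main__":
    # Dry run: show the wire format without touching the network.
    print(counter_line("successfullogins", 1, 0.1))
    print(timer_line("apitimer", 320))
```

UDP keeps the instrumented application decoupled from StatsD: if the daemon is down, the metric is simply lost rather than blocking the caller.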
StatsD
Counter examples
• Successful customer login attempts
• Failed customer login attempts
• Register a new customer
• Hit 3rd party API
StatsD
Timer examples
• How fast is our function blah()
• How fast is a database query
• How fast is our 3rd party API service
• How fast is our internet access
• How fast are our page response times.
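
One way to answer "how fast is our function blah()" is to wrap the call and report the elapsed milliseconds as a timer metric. The decorator below is a hypothetical sketch: emit stands in for whatever sends the line to StatsD and defaults to print.

```python
import functools
import time


def timed(bucket, emit=print):
    """Decorator that reports a function's wall-clock duration as a timer line."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Report even if the function raised, so failures are timed too.
                elapsed_ms = int((time.perf_counter() - start) * 1000)
                emit(f"{bucket}:{elapsed_ms}|ms")
        return wrapper
    return decorator


@timed("functions.blah")
def blah():
    """Stand-in for the code being measured."""
    time.sleep(0.05)
    return "done"


if __name__ == "__main__":
    blah()
```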
StatsD

demo
LogStash
• Tool for managing events and logs
• http://logstash.net
• https://github.com/logstash/logstash
• Apache 2.0 license
• Created by Jordan Sissel (@jordansissel)
LogStash
• Written in Ruby.
• Built with JRuby and ships as a jar file.
LogStash
The LogStash agent is an event pipeline with 3 parts.
1. Inputs
2. Filters
3. Outputs
LogStash
1. Inputs – generate events
2. Filters – modify them
3. Outputs – ship them somewhere
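
The three stages can be pictured as a chain of generators. This is a conceptual sketch in Python, not Logstash code or configuration; every function name and the event shape are invented for illustration.

```python
import json


def lines_input(lines, source="stdin"):
    """Input: turn raw lines into event dicts."""
    for line in lines:
        yield {"message": line.rstrip("\n"), "source": source}


def grep_filter(events, needle):
    """Filter: keep only events whose message contains needle."""
    for event in events:
        if needle in event["message"]:
            yield event


def tag_filter(events, tag):
    """Filter: modify each event by adding a tag."""
    for event in events:
        event.setdefault("tags", []).append(tag)
        yield event


def json_output(events, write=print):
    """Output: ship each event somewhere, here as one JSON document per line."""
    for event in events:
        write(json.dumps(event, sort_keys=True))


if __name__ == "__main__":
    raw = ["GET /login 200\n", "GET /login 500\n"]
    json_output(tag_filter(grep_filter(lines_input(raw), " 500"), "error"))
```

Each stage only sees events from the stage before it, which is why inputs, filters, and outputs can be mixed and matched freely.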
LogStash
Inputs include:
amqp, drupal_dblog, eventlog, exec, file,
ganglia, gelf, gemfire, generator, heroku,
irc, log4j, lumberjack, pipe, redis, relp, sqs,
stdin, stomp, syslog, tcp, twitter, udp, xmpp,
zenoss, zeromq
LogStash
Filters include:
alter, anonymize, checksum, csv, date, dns,
environment, gelfify, geoip, grep, grok,
grokdiscovery, json, kv, metrics, multiline,
mutate, noop, split, syslog_pri, urldecode,
xml, zeromq
LogStash
Outputs include:
amqp, boundary, circonus, cloudwatch,
datadog, elasticsearch, elasticsearch_http,
elasticsearch_river, email, exec, file,
ganglia, gelf, gemfire, graphite, graphtastic,
http, internal, irc, juggernaut, librato, loggly,
lumberjack, metriccatcher, mongodb,
nagios, nagios_nsca, null, opentsdb,
pagerduty, pipe, redis, riak, riemann, sns,
sqs, statsd, stdout, stomp, syslog, tcp,
websocket, xmpp, zabbix, zeromq
LogStash
Typical setup
LogStash
Shipper alternatives?
• Syslog (rsyslog, syslog-ng)
• Lumberjack
https://github.com/jordansissel/lumberjack

• Beaver
https://github.com/josegonzalez/beaver

• Woodchuck
https://github.com/danryan/woodchuck
LogStash
Kibana
• Web interface for viewing Logstash records stored in Elasticsearch
• http://kibana.org/
• http://github.com/rashidkpc/Kibana
• Search for records
• Stream records (near real time)
• Create RSS feeds based on search results
• Score, trend data
LogStash
Kibana – search data

Image source - http://kibana.org/
LogStash
Kibana – trend data

Image source - http://kibana.org/
LogStash
Demo
(Syslog & Apache access logs)
LogStash
TIP: Go buy The Logstash Book by James Turnbull (@kartar):
http://logstashbook.com/
It’s a great introduction to using Logstash.
Sensu
• https://github.com/sensu/sensu
• Creator – Sean Porter (@portertech)
• Ruby, RabbitMQ, Redis
• <1200 lines of code
• Omnibus installation packages
Sensu
Components
• Sensu-server
• Sensu-client
• Sensu-api
• Sensu-dashboard
Sensu
• Message-oriented architecture (messages are JSON objects)
• Described as a monitoring router
• Connects “check” scripts on Sensu clients to “handler” scripts on Sensu servers
Sensu
Checks can
• Determine whether a service like Apache is up and running (via the check’s exit code)
• Collect metrics like page views or database cache usage.
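
A check is just a script whose exit code carries the result. The sketch below follows the Nagios plugin convention (0 OK, 1 warning, 2 critical, 3 unknown), which Sensu checks share. It uses disk usage as a stand-in for a service check; the thresholds and function names are assumptions for illustration.

```python
import shutil
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(percent_used, warn=80.0, crit=90.0):
    """Map a disk usage percentage to (exit_code, message)."""
    if percent_used >= crit:
        return CRITICAL, f"CRITICAL: disk {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"WARNING: disk {percent_used:.1f}% used"
    return OK, f"OK: disk {percent_used:.1f}% used"


def main(path="/"):
    """Measure disk usage for path, print one status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"UNKNOWN: {exc}")
        return UNKNOWN
    code, message = evaluate(100.0 * usage.used / usage.total)
    print(message)
    return code


if __name__ == "__main__":
    sys.exit(main())
```

The single line printed to standard output becomes the check output that is passed along with the exit code.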
Sensu
The output of checks is routed to one or more handlers, which determine what to do.
• Send alerts via email, PagerDuty, IRC, Twitter, Basecamp, XMPP, HipChat, Campfire, etc.
• Feed metrics to backend services like Graphite, Librato, OpenTSDB, etc.
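
On the other side, a handler receives the check result and decides what to do with it. The sketch assumes the event arrives as JSON on standard input with client and check objects, as Sensu pipe handlers are commonly described; treat the exact field names as assumptions and verify them against your Sensu version.

```python
import json
import sys

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def summarize(event):
    """Turn an event dict into a one-line alert message."""
    client = event.get("client", {}).get("name", "unknown-client")
    check = event.get("check", {})
    status = STATUS_NAMES.get(check.get("status"), "UNKNOWN")
    output = str(check.get("output", "")).strip()
    return f"{status} {client}/{check.get('name', 'unknown-check')}: {output}"


def main(stream=sys.stdin):
    """Read one JSON event and act on it."""
    event = json.load(stream)
    # A real handler would email, page, or post to chat here; this one prints.
    print(summarize(event))


if __name__ == "__main__":
    main()
```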
Sensu
demo
Questions??
Thank you

Editor's Notes

  • #5: Anyone want a quick rundown of how it works? Fault detection, notifications, escalations, acknowledgements, adding new nodes, no Ajax.
  • #17: Graphite is a highly scalable real-time graphing system, written in Python, Apache 2.0 license.
  • #18: Graphite is a highly scalable real-time graphing system, written in Python, Apache 2.0 license.
  • #19: Web – Django. Whisper – metrics database format (similar to RRDTool); accepts out-of-order data and supports pipelining of data in a single operation. Carbon – storage engine (agent + cache + persister).
  • #20: Web – Django. Whisper – database for storing time-series data. Carbon – listening service for capturing data.
  • #21: Web – Django. Whisper – database for storing time-series data. Carbon – listening service for capturing data.
  • #22: Why graphing and trending: application profiling data, operational profiling data.
  • #23: Why graphing and trending: application profiling data, operational profiling data.
  • #30: Counter example: add 1 to the particular bucket. The count is sent at the flush interval and reset to 0. @0.1 tells StatsD that the counter is sampled 1/10th of the time. Timing example: the API service took 320 ms to complete. StatsD determines percentiles, average (mean), standard deviation, sum, and lower and upper bounds for the flush interval. It can also store a histogram of values (not the default).
  • #32: Mean, upper, lower, stddev, upper 90, lower 90, count
  • #42: Embedded web server and embedded Elasticsearch. Lead in to shipper alternatives.
  • #51: Designed with CM in mind
  • #52: Designed with CM in mind
  • #53: Designed with CM in mind. Describe how the client registers with the server.
  • #54: Reuse Nagios plugins