Application Metrics
with Prometheus examples
Rafael Dohms @rdohms
How do you do
metrics?
“The Prometheus Scientist Method”
I hope not.
jobs.usabilla.com
Rafael Dohms
Staff Engineer
@rdohms
doh.ms
We are hiring!

jobs.usabilla.com
Let’s talk about metrics.
But let’s do it with a
concrete example.
Kafka / DDD / Autonomous Microservices / Monitoring
Metrics are insights into
the current state of your
application.
Metrics tell you if your
service is healthy.
Canary Deploys
Oksana Latysheva
Metrics tell you what
is wrong.
Metrics tell you what
is right.
Metrics tell you what
will soon be wrong.
Metrics tell you where
to start looking.
Site Reliability Engineering
SLIs, SLOs, SLAs
SLIs
Service Level Indicators
“A quantitative measure of some
aspect of your application”
The response time of a request was 150ms
Source: Site Reliability Engineering - O’Reilly
SLOs
Service Level Objectives
“A target value or a range of values
for something measured by an SLI”
Request response times should be below 200ms
Source: Site Reliability Engineering - O’Reilly
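The relationship between the two definitions can be sketched in a few lines: the SLI is a measured value, the SLO is the target it is compared against. This is a minimal sketch, not code from the talk. The function names `measureMs` and `meetsSlo` and the `usleep` stand-in for request handling are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Runs $handler and returns its wall-clock duration in milliseconds.
 * The returned duration is the SLI: a measured fact about one request.
 */
function measureMs(callable $handler): float
{
    $start = hrtime(true);               // monotonic clock, in nanoseconds
    $handler();
    return (hrtime(true) - $start) / 1e6;
}

/** The SLO is the target value the SLI is compared against. */
function meetsSlo(float $sliMs, float $sloMs): bool
{
    return $sliMs < $sloMs;              // "should be below" the target
}

$sloMs = 200.0;                          // SLO from the slide: below 200 ms
$sliMs = measureMs(function (): void {
    usleep(5000);                        // hypothetical stand-in for real work
});

printf("SLI: %.1f ms, SLO met: %s\n", $sliMs, meetsSlo($sliMs, $sloMs) ? 'yes' : 'no');
```

A single measurement says little on its own; in practice the comparison is made over many requests, usually through a percentile.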
Help you drive architectural
decisions, like optimisation
SLOs
Response time SLO: 150 ms
95th percentile of processing time (PHP time): 5 ms
As a result we decided to invest more time in exploring the
problem domain and not optimising our stack.
SLAs
Service Level Agreements
“An explicit or implicit contract with
your customer, that includes
consequences of missing their SLOs”
The 99th percentile of request response times should meet our SLO, or
we will refund users
Source: Site Reliability Engineering - O’Reilly
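An SLA phrased as "the 99th percentile should meet our SLO" implies computing a percentile over many measurements. The sketch below uses the nearest-rank method, which is one common definition and not necessarily the one a monitoring system applies. The sample data and the name `percentile` are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Nearest-rank percentile: the smallest sample such that at least
 * $p percent of all samples are less than or equal to it.
 *
 * @param array<int|float> $samples
 */
function percentile(array $samples, float $p): float
{
    if ($samples === [] || $p <= 0.0 || $p > 100.0) {
        throw new InvalidArgumentException('Need at least one sample and 0 < p <= 100.');
    }
    sort($samples);
    $rank = (int) ceil($p * count($samples) / 100);   // 1-based rank
    return (float) $samples[$rank - 1];
}

// Hypothetical response times in milliseconds, with one slow outlier.
$responseTimesMs = [120, 95, 180, 140, 110, 450, 130, 105, 160, 125];
$sloMs = 200.0;

$p50 = percentile($responseTimesMs, 50);
$p99 = percentile($responseTimesMs, 99);
$slaMet = $p99 < $sloMs;

printf("p50 = %.0f ms, p99 = %.0f ms, SLA met: %s\n", $p50, $p99, $slaMet ? 'yes' : 'no');
```

The median looks healthy here while the 99th percentile does not, which is why tail percentiles are the usual basis for this kind of agreement.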
Measuring
“If it moves, we track it.”
– Etsy Engineering
https://codeascraft.com/2011/02/15/measure-anything-measure-everything/
Metrics
Statistics
What is happening right
now?
How often does this
happen?
Telemetry
Telemetry
“the process of recording and transmitting the readings of an instrument”
Statistics / Analytics
“the practice of collecting and analysing numerical data in large quantities”
I really miss Ayrton Senna
Statistics
Incoming feedback items
with origin information
Telemetry
response time of public
endpoints
“If it moves, we track it.”
Request Latency
System Throughput
Error Rate
Availability
Resource Usage
Incoming Data
Peak frequency
CPU
Memory
Disk Space
Bandwidth
node
PHP
NginX
Database
Measure Monitoring
Measure measurements
Metrics, Everywhere.
Application Metrics (with Prometheus examples) #PHPDD18
SLIs
Picking good SLIs
SLIs may change
according to who is
looking at the data.
Understanding the
nature of your system
User-facing serving system?
availability, throughput, latency
Storage System?
availability, durability, latency
Big Data Systems?
throughput, end-to-end latency
User-Facing and Big Data Systems
๏ SLIs (more relevant to development team)
- Response time in the “receive” endpoint
- Turnaround time, from “receive” to “show”
- Individual processing time per step
- Data counting: how many, what nature
๏ Other Metrics (more relevant to infrastructure team)
- node, nginx, php-fpm, java metrics
- server metrics: CPU, memory, disk space
- Size of cluster
- Kafka health
Picking Targets
Target value
SLI value >= target
Target Range
lower bound <= SLI value <= upper bound
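The two target shapes above translate directly into two predicates. This is a minimal sketch with hypothetical function names and example values, following the comparison directions exactly as the slide states them.

```php
<?php
declare(strict_types=1);

/** Target value: the SLI must reach at least the target. */
function meetsTarget(float $sli, float $target): bool
{
    return $sli >= $target;
}

/** Target range: the SLI must lie between both bounds, inclusive. */
function withinRange(float $sli, float $lower, float $upper): bool
{
    return $lower <= $sli && $sli <= $upper;
}

// Hypothetical examples: availability as a target value,
// response time in milliseconds as a target range.
$availabilityOk = meetsTarget(99.95, 99.9);
$latencyOk = withinRange(120.0, 0.0, 150.0);

var_dump($availabilityOk, $latencyOk);
```

A "higher is better" indicator such as availability fits the target-value form, while a latency objective is more naturally written as a range with a lower bound of zero.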
Don’t pick a target based
on current performance
What is the business need?
What are users trying to achieve?
How much impact does it have on the user experience?
How long can it take between
the user clicking submit and a confirmation
that our servers received the data?
“Immediate”
“We sell as real time”
“500ms, too much HTML”
“I don’t know”
What is human perception of
immediate? 100ms
Collection API should respond within 150ms
Some, but not too many.
can you settle an argument or priority based on it?
Don’t over achieve.
The Chubby example.
Adapt. Evolve.
re-define SLOs as your product evolves.
Meeting Expectations.
Attach consequences
to your Objectives.
The night is dark and
full of loopholes.
take a friend from legal with you.
Safety Margins.
like setting the alarm 5 minutes before the meeting.
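One way to read the safety margin is as an internal threshold that is stricter than the objective itself, so that a warning fires before the objective is missed. The sketch below assumes a latency SLO and a margin expressed as a fraction. The function names, the 10% margin and the three status labels are invented for illustration.

```php
<?php
declare(strict_types=1);

/** Internal threshold: the SLO tightened by a safety margin (0.1 = 10%). */
function alertThresholdMs(float $sloMs, float $margin): float
{
    return $sloMs * (1.0 - $margin);
}

/** Classifies a measured latency against the SLO and its safety margin. */
function sloStatus(float $sliMs, float $sloMs, float $margin): string
{
    if ($sliMs >= $sloMs) {
        return 'breached';      // the objective itself is missed
    }
    if ($sliMs >= alertThresholdMs($sloMs, $margin)) {
        return 'threatened';    // inside the margin: act now
    }
    return 'ok';
}

$sloMs = 150.0;                 // "Collection API should respond within 150ms"
foreach ([120.0, 140.0, 160.0] as $sliMs) {
    printf("%.0f ms: %s\n", $sliMs, sloStatus($sliMs, $sloMs, 0.1));
}
```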
Metrics in Practice.
prometheus.io
Push Model
scale this!
Pull Model
scale this!
Prometheus
Telemetry Statistics
Prometheus
StatsD, InfluxDB, etc…
+
Long Term Storage
Counter
Cumulative metric that represents a single number that only increases
Histogram
Samples and count of observations over time
Gauge
A counter that can go up or down
Summary
Same as a histogram but with a stream of quantiles over a sliding window
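The difference between a counter and a gauge can be made concrete with a small in-memory sketch. The classes `SketchCounter` and `SketchGauge` are invented for illustration and are not the client library's classes. Real clients also handle labels and shared storage, which are left out here.

```php
<?php
declare(strict_types=1);

/** A counter only ever increases (for example, requests served). */
final class SketchCounter
{
    private float $value = 0.0;

    public function incBy(float $amount): void
    {
        if ($amount < 0.0) {
            throw new InvalidArgumentException('A counter can only increase.');
        }
        $this->value += $amount;
    }

    public function get(): float
    {
        return $this->value;
    }
}

/** A gauge can go up or down (for example, items currently in a queue). */
final class SketchGauge
{
    private float $value = 0.0;

    public function set(float $value): void
    {
        $this->value = $value;
    }

    public function incBy(float $amount): void
    {
        $this->value += $amount;
    }

    public function decBy(float $amount): void
    {
        $this->value -= $amount;
    }

    public function get(): float
    {
        return $this->value;
    }
}

$requests = new SketchCounter();
$requests->incBy(1.0);
$requests->incBy(5.0);

$queueSize = new SketchGauge();
$queueSize->set(10.0);
$queueSize->decBy(3.0);

printf("requests: %.0f, queue size: %.0f\n", $requests->get(), $queueSize->get());
```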
jimdo/prometheus_client_php
- your code writes to local storage
- /metrics reads from local storage
- Prometheus reads from /metrics
<?php
use Prometheus\Counter;
use Prometheus\Histogram;
use Prometheus\Storage\APC;

require_once 'vendor/autoload.php';

// Storage adapter shared by the metrics below.
$adapter = new APC();

// Histogram of response times, labelled by status and URL.
$histogram = new Histogram(
    $adapter,
    'my_app',
    'response_time_ms',
    'This measures ....',
    ['status', 'url'],
    [0, 10, 50, 100]
);
// Record one observation together with its label values.
$histogram->observe(15, ['200', '/url']);

// Counter with the same label names.
$counter = new Counter($adapter, 'my_app', 'count_total',
    'How many...', ['status', 'url']);
$counter->inc(['200', '/url']);       // add 1
$counter->incBy(5, ['200', '/url']);  // add 5
Storage adapters: APC / APCu, Redis
Histogram arguments: namespace, metric name, help, label names, buckets
observe() arguments: measurement, label values
Counter arguments: namespace, metric name, help, labels
<?php
use Prometheus\RenderTextFormat;
use Prometheus\Storage\APC;

require_once 'vendor/autoload.php';

// Collect everything the application stored and print it as text.
$adapter = new APC();
$renderer = new RenderTextFormat();
$result = $renderer->render($adapter->collect());
echo $result;
# HELP my_app_count_total How many...
# TYPE my_app_count_total counter
my_app_count_total{status="200",url="/url"} 6
# HELP my_app_response_time_ms This measures ....
# TYPE my_app_response_time_ms histogram
my_app_response_time_ms_bucket{status="200",url="/url",le="0"} 0
my_app_response_time_ms_bucket{status="200",url="/url",le="10"} 0
my_app_response_time_ms_bucket{status="200",url="/url",le="50"} 1
my_app_response_time_ms_bucket{status="200",url="/url",le="100"} 1
my_app_response_time_ms_bucket{status="200",url="/url",le="+Inf"} 1
my_app_response_time_ms_count{status="200",url="/url"} 1
my_app_response_time_ms_sum{status="200",url="/url"} 15
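The `_bucket` lines in this output are cumulative: each `le` bucket counts every observation less than or equal to its upper bound, and `+Inf` counts all of them. The sketch below reproduces that bookkeeping in plain PHP for the bucket bounds used on the slides. The function `histogramSamples` is a hypothetical illustration of the idea, not how the client library is implemented.

```php
<?php
declare(strict_types=1);

/**
 * Derives cumulative bucket counts, the observation count and the sum
 * from raw observations, in the shape a histogram exposes them.
 *
 * @param array<int|float> $observations
 * @param array<int|float> $upperBounds  bucket upper bounds ("le"), ascending
 */
function histogramSamples(array $observations, array $upperBounds): array
{
    $buckets = [];
    foreach ($upperBounds as $bound) {
        $buckets[(string) $bound] = 0;
    }
    $buckets['+Inf'] = 0;

    foreach ($observations as $value) {
        // An observation is counted in every bucket whose bound it fits under.
        foreach ($upperBounds as $bound) {
            if ($value <= $bound) {
                $buckets[(string) $bound]++;
            }
        }
        $buckets['+Inf']++;
    }

    return [
        'buckets' => $buckets,
        'count'   => count($observations),
        'sum'     => array_sum($observations),
    ];
}

// Same buckets and single observation as in the slide code.
$result = histogramSamples([15], [0, 10, 50, 100]);

foreach ($result['buckets'] as $le => $count) {
    printf("response_time_ms_bucket{le=\"%s\"} %d\n", $le, $count);
}
printf("response_time_ms_count %d\n", $result['count']);
printf("response_time_ms_sum %s\n", $result['sum']);
```

Because buckets are cumulative, choosing bounds close to the SLO target (for example 100 and 150 for a 150 ms objective) makes it cheap to ask what share of requests met the objective.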
“I’ll just try this live demo again.”
– Also Rafael (today)
http://localhost:9090/graph http://localhost:8180/metrics
“Demos always fail.”
– Rafael (yesterday)
http://localhost:8180/index
https://github.com/rdohms/talk-app-metrics
You can’t act on what
you can’t see.
Metrics without
actionability are just
numbers on a screen.
Act as soon as an SLO is threatened.
Thank you.
Drop me some feedback at Usabilla and make this talk better.
@rdohms

http://slides.doh.ms
https://joind.in/talk/56e55
