Update on OpenTSDB and AsyncHBase

HBaseCon Update
Distributed, Scalable Time Series Database
Chris Larsen clarsen@yahoo-inc.com

Who Am I? (no really, who am I?)
Chris Larsen
Maintainer for OpenTSDB
Software Engineer @ Yahoo!
Monitoring Team

What Is OpenTSDB?
Open Source Time Series Database
Store trillions of data points
Sucks up all data and keeps going
Never lose precision
Scales using HBase, Cassandra
Or Bigtable

What good is it?
Systems Monitoring & Measurement
Servers
Networks
Sensor Data
The Internet of Things
SCADA
Financial Data
Scientific Experiment Results

Use Cases
Backing store for Argus:
Open source monitoring
and alerting system
15 HBase Servers
6 month retention
10M writes per minute
95th percentile latency for queries spanning < 30 days: 200ms
Moving to a 200 node cluster writing at 100M per minute

Use Cases
●Monitoring system, network and application
performance and statistics
110 region servers, 10M writes/s ~ 2PB
Multi-tenant and Kerberos secure HBase
~200k writes per second per TSD
Central monitoring for all Yahoo properties
Over 2 billion time series served

What Are Time Series?
Time Series: data points for an identity
over time
Typical Identity:
Dotted string: web01.sys.cpu.user.0
OpenTSDB Identity:
Metric: sys.cpu.user
Tags (name/value pairs):
host=web01 cpu=0

What Are Time Series?
Data Point:
Metric + Tags
+ Value: 42
+ Timestamp: 1234567890
sys.cpu.user 1234567890 42 host=web01 cpu=0
^ a data point ^

Writing Data
1) Open Telnet style socket, write:
put sys.cpu.user 1234567890 42 host=web01 cpu=0
2) ..or, post JSON to:
http://<host>:<port>/api/put
3) .. or import big files with CLI
No schema definition
No RRD file creation
Just write!
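
As a rough sketch of the first two write paths, the snippet below formats one data point as a telnet-style put line and as a JSON body for /api/put. The helper names are hypothetical, and the JSON field names (metric, timestamp, value, tags) follow the OpenTSDB 2.x HTTP API as commonly documented; the slides do not show the body, so treat that shape as an assumption.

```python
import json


def telnet_put_line(metric, timestamp, value, tags):
    """Format one data point as a telnet-style `put` command (hypothetical helper)."""
    tag_str = " ".join(f"{k}={v}" for k, v in tags.items())
    return f"put {metric} {timestamp} {value} {tag_str}"


def http_put_body(metric, timestamp, value, tags):
    """Build a JSON body for POSTing to /api/put (field names are an assumption)."""
    return json.dumps(
        {"metric": metric, "timestamp": timestamp, "value": value, "tags": tags}
    )


# The data point used throughout the slides.
point = ("sys.cpu.user", 1234567890, 42, {"host": "web01", "cpu": "0"})
line = telnet_put_line(*point)
body = http_put_body(*point)
print(line)
print(body)
```

Either string can then be sent over a socket or an HTTP client of your choice; no table or schema needs to be created first.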

Querying Data
Graph with the GUI
CLI tools
HTTP API
Aggregate multiple series
Simple query language
To average all CPUs on host:
start=1h-ago
avg sys.cpu.user
host=web01
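
The query above could be sent over the HTTP API roughly as follows. This is a sketch only: the /api/query endpoint with start and m parameters follows the OpenTSDB 2.x HTTP API, while the host name, the port 4242, and the helper name are assumptions for illustration.

```python
from urllib.parse import urlencode


def query_url(host, port, start, aggregator, metric, tags):
    """Build an /api/query URL for one aggregated metric (hypothetical helper)."""
    tag_filter = ",".join(f"{k}={v}" for k, v in tags.items())
    # The metric expression has the form aggregator:metric{tagk=tagv,...}
    m = f"{aggregator}:{metric}"
    if tag_filter:
        m += f"{{{tag_filter}}}"
    return f"http://{host}:{port}/api/query?" + urlencode({"start": start, "m": m})


# Average sys.cpu.user across all CPUs on web01 for the last hour.
url = query_url("localhost", 4242, "1h-ago", "avg", "sys.cpu.user", {"host": "web01"})
print(url)
```

Because the cpu tag is left out of the filter, every cpu value on web01 falls into the same average.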

HBase Data Tables
tsdb - Data point table. Massive
tsdb-uid - Name to UID and UID to
name mappings
tsdb-meta - Time series index and
meta-data
tsdb-tree - Config and index for
hierarchical naming schema

Data Table Schema
Row key is a concatenation of UIDs and time:
metric + timestamp + tagk1 + tagv1… + tagkN + tagvN
sys.cpu.user 1234567890 42 host=web01 cpu=0
\x00\x00\x01\x49\x95\xFB\x70\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02
Timestamp normalized on 1 hour boundaries
All data points for an hour are stored in one row
Enables fast scans of all time series for a metric
…or pass a row key filter for specific time series with
particular tags
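
To make the row key layout concrete, here is a minimal sketch that rebuilds the example key. It assumes 3-byte UIDs (as in the example bytes), that sys.cpu.user, host, web01, cpu and 0 were assigned the UIDs shown, and that tag pairs are ordered by tag name UID; the function names are hypothetical and salting is not included.

```python
import struct

UID_WIDTH = 3  # bytes per UID in the slide's example (assumption for this sketch)


def uid(n, width=UID_WIDTH):
    """Encode a numeric UID as a fixed-width big-endian byte string."""
    return n.to_bytes(width, "big")


def row_key(metric_uid, timestamp, tag_uids):
    """Concatenate metric UID, hour-aligned base time, and tagk/tagv UID pairs."""
    base_time = timestamp - (timestamp % 3600)  # normalize to the 1 hour boundary
    key = uid(metric_uid) + struct.pack(">I", base_time)
    for tagk, tagv in sorted(tag_uids):  # assumed ordering: by tag name UID
        key += uid(tagk) + uid(tagv)
    return key


# sys.cpu.user -> 1, host -> 1, web01 -> 1, cpu -> 2, "0" -> 2 (assumed UIDs)
key = row_key(1, 1234567890, [(1, 1), (2, 2)])
print(key.hex())  # 0000014995fb70000001000001000002000002
```

The 1890 seconds dropped by the normalization (1234567890 minus 1234566000) identify the data point's position within the hour's row.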

New for OpenTSDB 2.2
● Append writes (no more need for TSD
Compactions)
● Row salting and random metric IDs
● Downsampling Fill Policies
● Query filters (wildcard, regex, group by or not)
● Storage Exception plugin for retrying writes
● Released February 2016

New for OpenTSDB 2.3
● Graphite style expressions
● Cross-metric expressions
● Calendar based downsampling
● New data stores
● UID assignment plugin interface
● Datapoint write filter plugin interface
● RC1 released May 2016
● New Committer, Jonathan Creasy

Fuzzy Row Filter
How do you find a single time
series out of 1 million?
For a day?
For a month?

Fuzzy Row Filter
Instead of running a regex
string comparator over each
byte array formatted key…
(?s)^.{9}(?:.{8})*\Q\x00\x00\x00\x02\E(?:\Q\x00\x0F\x42\x2B\E)(?:.{8})*$
TSDB query takes 1.6 seconds
for 89,726 rows
KEY
Match -> m t1 tagk1 tagv1
No Match -> m t1 tagk1 tagv2
No Match -> m t1 tagk1 tagv1 tagk2 tagv3
No Match -> m t1 tagk1 tagv2 tagk2 tagv4
Match -> m t2 tagk tagv1
No Match -> m t2 tagk tagv2
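
The regex approach can be sketched in Python as below. This is an illustration rather than the TSD's own code: it assumes a 9-byte prefix (1 salt byte, 4-byte metric UID, 4-byte timestamp) and 8-byte tag pairs, uses re.escape in place of Java's \Q...\E quoting, and uses fullmatch instead of explicit anchors. All names and key bytes are made up for the example.

```python
import re


def tag_pair_regex(tagk_uid, tagv_uid, prefix_len=9, pair_len=8):
    """Compile a regex matching keys that contain the given tagk/tagv UID pair."""
    pattern = (
        b".{%d}(?:.{%d})*" % (prefix_len, pair_len)  # skip prefix and earlier pairs
        + re.escape(tagk_uid + tagv_uid)             # literal bytes, e.g. 0x2B is "+"
        + b"(?:.{%d})*" % pair_len                   # skip any later pairs
    )
    return re.compile(pattern, re.DOTALL)  # DOTALL so "." also matches byte 0x0A


PREFIX = b"\x01" + b"\x00\x00\x00\x01" + b"\x57\x38\xf1\x60"  # salt, metric, time
TAGK1, TAGK2 = b"\x00\x00\x00\x01", b"\x00\x00\x00\x02"
WANTED = b"\x00\x0f\x42\x2b"

regex = tag_pair_regex(TAGK2, WANTED)
key_hit = PREFIX + TAGK1 + b"\x00\x00\x00\x05" + TAGK2 + WANTED
key_miss = PREFIX + TAGK1 + b"\x00\x00\x00\x05" + TAGK2 + b"\x00\x0f\x42\x2c"

print(regex.fullmatch(key_hit) is not None)
print(regex.fullmatch(key_miss) is not None)
```

Every key in the scan range is run through the regex, which is why the cost grows with the number of rows scanned rather than the number of rows matched.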

Fuzzy Row Filter
Use a byte mask!
● Use the bloom filter to skip-scan
to the next candidate row.
● Combine with regex (after fuzzy
filter) to filter further.
FuzzyFilter{[FuzzyFilterPair{row_key=[18, 68,
-3, -82, 120, 87, 56, -15, 96, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0],
mask=[0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]}]}
Now it takes 0.239 seconds
KEY
Match -> m t1 tagk1 tagv1
Skip -> m t1 tagk1 tagv1 tagk2 tagv3
m t1 tagk1 tagv2 tagk2 tagv4
m t1 tagk3 tagv5
m t1 tagk3 tagv6
Match -> m t2 tagk tagv1
No Match -> m t2 tagk tagv2
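
The byte mask comparison itself can be sketched as follows, using the key and mask from the filter above (Java signed bytes converted to unsigned). In this sketch a mask byte of 0 means the position must match and 1 means any byte is accepted. It is a simplified model that only shows the per-row comparison, assumes candidate rows must have the same length as the pattern, and does not model how the server seeks to the next candidate row.

```python
JAVA_KEY = [18, 68, -3, -82, 120, 87, 56, -15, 96, 0, 0, 0, 1, 0,
            0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0]
MASK = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
        1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]

PATTERN = bytes(b & 0xFF for b in JAVA_KEY)  # signed Java bytes -> unsigned


def fuzzy_match(row, pattern=PATTERN, mask=MASK):
    """Return True if every fixed (mask 0) position of row equals the pattern."""
    if len(row) != len(pattern):  # simplification: same, fixed length only
        return False
    return all(m == 1 or r == p for r, p, m in zip(row, pattern, mask))


# Same salt, metric and tag names; different timestamp and tag values.
same_series_shape = bytearray(PATTERN)
same_series_shape[5:9] = b"\x57\x39\x00\x00"
same_series_shape[13:17] = b"\x00\x00\x00\x07"

# Second tag name differs (UID 3 instead of 2), so a fixed position mismatches.
other_tagk = bytearray(PATTERN)
other_tagk[17:21] = b"\x00\x00\x00\x03"

print(fuzzy_match(bytes(same_series_shape)))
print(fuzzy_match(bytes(other_tagk)))
```

Here the salt, metric UID and both tag name UIDs are fixed, while the timestamp and tag value positions are wildcards.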

Fuzzy Row Filter
Pros:
● Can improve scan latency by orders of magnitude
● Combines nicely with other filters
Cons:
● All row keys for the match have to be the same, fixed
length
● Doesn’t help much when matching the majority of a set
OR if a set has uniform key lengths
● Doesn’t support bitmasks, only byte masks

AsyncHBase
AsyncHBase is a fully asynchronous, multi-
threaded HBase client
Supports HBase 0.90 to 1.x
Faster and less resource intensive than the
native HBase client
Support for scanner filters, META prefetch,
“fail-fast” RPCs

AsyncHBase in YCSB
● New Yahoo! Cloud Serving Benchmark (YCSB)
module for testing AsyncHBase
● Test Params:
○ 1 YCSB worker thread with workload A for run and load
○ Ran consecutive Async -> HBase -> Async -> HBase… (new YCSB JVM
each run) 50 times
○ HBase 1.0.0 stock Apache with default configs
○ Local host, Macbook Pro
○ 10K rows written/read
○ Async writes for both

AsyncHBase in YCSB
HBase client threads: 238
AsyncHBase client threads: 22

Upcoming in 1.8
●Reverse Scanning
●Multi-Get requests
●Netty 4
●Lots of bug fixes
○Stuck NSRE bugs
○Region client resource leaks

OpenTSDB on Bigtable
● Bigtable
○Hosted Google Service
○Client uses HTTP/2 and gRPC for communication
● OpenTSDB heads home
○Based on a time series store on Bigtable at Google
○Identical schema as HBase
○Same filter support (fuzzy filters are coming)

OpenTSDB on Bigtable
● AsyncBigtable
○Implementation of AsyncHBase’s API for drop-in use
○https://github.com/OpenTSDB/asyncbigtable
○Uses HTable API
○Moving to native Bigtable API
● Thanks to Christos of Pythian, Solomon, Carter, Misha,
and the rest of the Google Bigtable team
● https://www.pythian.com/blog/run-opentsdb-google-bigtable/#

OpenTSDB on Cassandra
● AsyncCassandra - Implementation of AsyncHBase’s
API for drop-in use
● Wraps Netflix’s Astyanax for asynchronous calls
● Requires the ByteOrderedPartitioner and legacy
API
● Same schema as HBase/Bigtable
● Scan filtering performed client side
● May not work with future Cassandra versions
if they drop the API

Community
Salesforce Argus
●Time series monitoring
and alerting
●Multi-series annotations
●Dashboards
Thanks to Tom Valine and the Salesforce engineers
https://medium.com/salesforce-open-source/argus-time-series-monitoring-and-alerting-d2941f67864#.ez7mbo3ek
https://github.com/SalesforceEng/Argus

Community
Turn Splicer
●API to shard TSDB queries
●Locality advantage hosting
TSDs on region servers
●Query caching
Thanks to Jonathan Creasy and the Turn engineers
https://github.com/turn/splicer

The Future
Reworked query pipeline for selective ordering
of operations
Histogram support
Flexible query caching framework
Distributed queries
Greater data store abstraction

More Information
Thank you to everyone who has helped test, debug and add to OpenTSDB
2.2 and 2.3 including, but not limited to:
Kyle, Ivan, Davide, Liu, Utkarsh, Andy, Anna, Camden, Can, Carlos, Hugo, Isaih, Kevin, Ping, Jonathan
Contribute at github.com/OpenTSDB/opentsdb
Website: opentsdb.net
Documentation: opentsdb.net/docs/build/html
Mailing List: groups.google.com/group/opentsdb
Images
http://photos.jdhancock.com/photo/2013-06-04-212438-the-lonely-vacuum-of-space.html
http://en.wikipedia.org/wiki/File:Semi-automated-external-monitor-defibrillator.jpg
http://upload.wikimedia.org/wikipedia/commons/1/17/Dining_table_for_two.jpg
http://upload.wikimedia.org/wikipedia/commons/9/92/Easy_button.JPG
https://www.flickr.com/photos/verbeeldingskr8/15563333617
http://www.flickr.com/photos/ladydragonflyherworld/4845314274/
http://lego.cuusoo.com/ideas/view/96
