© 2018 Bloomberg Finance L.P. All rights reserved.
Serving Billions of Queries in Millisecond Latency
HBaseConAsia 2018
August 17, 2018
Biju Nair
bnair10@bloomberg.net
Agenda
• Need for low latency
• HBase principles
• Modeling
• Implementation
• Monitoring and Tuning
Bloomberg by the numbers
• Founded in 1981
• 325,000 subscribers in 170 countries
• Over 19,000 employees in 192 locations
• More news reporters than The New York Times + Washington Post + Chicago Tribune combined
Bloomberg Tech
• More than 5,000 software engineers (and growing)
• 100+ engineers and data scientists devoted to machine learning
• One of the largest private networks in the world
• 100B+ tick messages per day, with a peak of more than 10 million messages/second
• 2M news stories ingested / published each day (that's >500 news stories ingested/second)
• News content from 125K+ sources
• More than a billion messages (emails and IB chats) processed each day
Bloomberg in a nutshell
Data Storage and Retrieval
• Files
• VSAM
• Network
• Hierarchical
• Relational
• MPP
Using RDBMS
• Use Case
• Entities and Relations
• Logical data model
• Physical data model
• Implementation and tuning
HBase Principles
• Ordered Key Value Store
• Distributed
Key Value
Key-9999 Value-a
Key-9998 Value-b
Key-9997 Value-c
Key-9996 Value-d
…
…
Key-9995 Value-e
Key-9994 Value-f
Ordered Key Value
Key-9999 Value-a
Key-9998 Value-b
Key-9997 Value-c
Key-9996 Value-d
…
…
Key-9995 Value-e
Key-9994 Value-a
Key-9993 Value-g
Lexicographic order
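Lexicographic ordering compares keys byte by byte, so numeric suffixes sort numerically only when they all have the same width. A minimal Python sketch of that effect follows; the sample keys and the `pad_key` helper are hypothetical and not part of the talk.

```python
# Row keys are compared byte by byte, as in an ordered key-value store.
keys = [b"Key-9", b"Key-10", b"Key-100", b"Key-2"]

# Plain byte-wise sort: b"Key-10" sorts before b"Key-2" because b"1" < b"2".
lexicographic = sorted(keys)


def pad_key(key: bytes, width: int = 4) -> bytes:
    """Zero-pad the numeric suffix so byte order matches numeric order."""
    prefix, number = key.rsplit(b"-", 1)
    return prefix + b"-" + number.zfill(width)


# With fixed-width suffixes the byte-wise order is also the numeric order.
padded = sorted(pad_key(k) for k in keys)
```

Fixed-width keys, as in the Key-9993 to Key-9999 example, avoid this surprise.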
Distributed Ordered Key Value
Each partition holds its own ordered run of keys:
Key-199, Key-198, Key-197, …
Key-499, Key-498, Key-497, …
Key-299, Key-298, Key-297, …
Key-599, Key-598, Key-597, …
Key-399, Key-398, Key-397, …
Key-999, Key-998, Key-997, …
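Because each partition owns a contiguous, ordered key range, finding the partition for a key reduces to a search over the range start keys. The sketch below illustrates the idea only; the boundaries, region names and `find_region` helper are assumptions, not HBase's actual lookup code.

```python
import bisect

# Hypothetical region boundaries: each region owns the keys from its start key
# (inclusive) up to the next region's start key (exclusive).
region_start_keys = [b"", b"Key-2", b"Key-4", b"Key-6"]
region_names = ["region-0", "region-1", "region-2", "region-3"]


def find_region(row_key: bytes) -> str:
    """Return the region whose key range contains row_key."""
    # bisect_right finds the first start key greater than row_key;
    # the region just before that position owns the key.
    index = bisect.bisect_right(region_start_keys, row_key) - 1
    return region_names[index]
```

For example, `find_region(b"Key-299")` falls between the start keys `b"Key-2"` and `b"Key-4"` and so maps to the second region.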
Abstraction
• Table row view
• Versioning
• ACIDity
Table Row View
Key: Row Id | Column Id | Timestamp
Value: Value
Table Row View
Key11|col1|1234567 Value-A
Key11|col2|1234567 Value-B
Key11|col3|1234567 Value-C
Key11|col4|1234567 Value-D
Row view:
Row Id  Col1     Col2     Col3     Col4
Key11   Value-A  Value-B  Value-C  Value-D
Versioning
Key11|col1|1234567 Value-A1
Key11|col1|1234566 Value-A
Key11|col2|1234567 Value-B
Key11|col3|1234567 Value-CC
Key11|col3|1234563 Value-C
Key11|col4|1234567 Value-DD
Key11|col4|1234560 Value-D1
Key11|col4|1234557 Value-D
Versions of a column are stored in descending timestamp order.
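The ordering in the versioning example can be reproduced with a composite sort key: row and column ascending, timestamp descending. This is a small Python sketch using the cells from the slide; the `latest` helper is hypothetical and only illustrates why the newest version is found first.

```python
# Cells from the slide: (row id, column id, timestamp, value), in arbitrary order.
cells = [
    ("Key11", "col4", 1234557, "Value-D"),
    ("Key11", "col1", 1234566, "Value-A"),
    ("Key11", "col3", 1234567, "Value-CC"),
    ("Key11", "col1", 1234567, "Value-A1"),
    ("Key11", "col4", 1234567, "Value-DD"),
    ("Key11", "col2", 1234567, "Value-B"),
    ("Key11", "col3", 1234563, "Value-C"),
    ("Key11", "col4", 1234560, "Value-D1"),
]

# Row and column ascending, timestamp descending (newest version first).
ordered = sorted(cells, key=lambda c: (c[0], c[1], -c[2]))


def latest(row: str, column: str) -> str:
    """Return the newest value: the first match in the sorted order."""
    for r, col, _timestamp, value in ordered:
        if (r, col) == (row, column):
            return value
    raise KeyError((row, column))
```

Because the newest version sorts first, a read of the current value can stop at the first matching cell.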
ACIDity
• Atomic at row level
• Consistent to a point in time before the request
• Isolation through MVCC (reads) and row locks (mutations)
• Durability is guaranteed for all successful mutations
Data Modeling
• Fitness for key value store
— Can’t build relations
— No secondary indexes
— De-normalization
• Understand queries to design key
— Data Skew
— Query Skew
Data Skew
Rows per key in the example:
Key-e ×13, Key-a ×4, Key-b ×3, Key-h ×3, Key-d ×3, Key-f ×2, Key-x ×2, Key-z ×2, Key-y ×2
Hotspot: Key-e
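Data skew of this kind can be spotted before a key design is finalized by tallying rows per key. The following Python sketch uses the row counts from the example; the 25% threshold and the `skewed_keys` helper are assumptions chosen for illustration.

```python
from collections import Counter

# Row counts per key as drawn on the slide.
rows = (["Key-e"] * 13 + ["Key-a"] * 4 + ["Key-b"] * 3 + ["Key-h"] * 3 +
        ["Key-d"] * 3 + ["Key-f"] * 2 + ["Key-x"] * 2 + ["Key-z"] * 2 +
        ["Key-y"] * 2)

counts = Counter(rows)
total = sum(counts.values())


def skewed_keys(threshold: float = 0.25) -> list[str]:
    """Keys holding more than `threshold` of all rows (threshold is an assumption)."""
    return [key for key, n in counts.items() if n / total > threshold]
```

With these counts only Key-e crosses the threshold, which matches the hotspot marked on the slide.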
Query Skew
[Diagram: keys spread evenly over ordered partitions (Key-199…, Key-499…, Key-299…, Key-599…, Key-399…, Key-999…). The queries concentrate on one part of the key space, which becomes a hotspot.]
HBase Data Write
[Diagram: a write proceeds in three numbered steps across memory and the file system, which holds the WAL and the store files of table t1. A store file is made of blocks: data, index (Idx) and Bloom filter (Blm).]
HBase Data Read
[Diagram: a read is served in numbered steps from the memstore and the block cache. Blocks come from the store files of table t1 on the file system, which also holds the WAL.]
Cache
• Pack more blocks into cache
— Block size
— Column Family
• Large cache
Block Size and Cache Use
[Diagram: two BlockCaches side by side. With 64 K blocks, each cached block is split into a used and an unused portion. With 16 K blocks, the cache holds more blocks and each is marked as used.]
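The cache effect of block size comes down to simple arithmetic: smaller blocks mean more blocks fit in the same cache, and a larger share of each cached block is data the query actually asked for. This rough Python sketch uses made-up numbers; the 1 GiB cache and the 4 KiB row are assumptions, not figures from the talk.

```python
KIB = 1024


def blocks_in_cache(cache_bytes: int, block_bytes: int) -> int:
    """Number of whole blocks that fit in the cache."""
    return cache_bytes // block_bytes


def useful_fraction(row_bytes: int, block_bytes: int) -> float:
    """Share of a cached block that a single-row Get actually needs."""
    return min(row_bytes, block_bytes) / block_bytes


cache = 1024 * 1024 * KIB  # hypothetical 1 GiB block cache
row = 4 * KIB              # hypothetical 4 KiB row

blocks_64k = blocks_in_cache(cache, 64 * KIB)
blocks_16k = blocks_in_cache(cache, 16 * KIB)
used_64k = useful_fraction(row, 64 * KIB)
used_16k = useful_fraction(row, 16 * KIB)
```

Under these assumptions the 16 K configuration caches four times as many blocks, and a quarter of each block is useful to a random Get instead of a sixteenth.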
Block Size vs. Read Latency
Get Performance (ms) – 64 K Block
BAvg 16.731 16.728 16.761 16.763 16.418 16.371 16.37 16.431 16.152 16.14 16.169 16.158 16.308 16.29 16.325 16.307 16.34 16.381 16.391 16.352
BMedian 14 14 14 14 13 13 13 13 15 15 15 15 13 13 13 13 13 13 13 13
B95% 41 41 41 41 41 41 41 41 43 43 43 43 40 40 40 40 41 41 41 41
B99% 55 55 55 55 54 54 54 54 55 55 55 55 54 54 54 54 54 54 55 54
B99.9% 71 71 71 71 70 70 70 70 67 67 67 67 71 70 70 71 71 71 71 70
BMax 545 1062 559 567 1075 1027 561 567 564 541 558 1062 1062 561 1075 1072 1067 563 1035 1032
Get Performance (ms) – 16 K Block
Avg 3.002 5.362 5.361 5.357 6.419 6.369 6.405 6.383 6.188 6.196 6.182 6.174 6.246 6.264 6.268 6.253 5.194 5.207 5.219 3.031
Median 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
95% 10 15 15 15 18 18 18 18 18 18 17 17 18 18 18 18 15 15 15 10
99% 15 26 26 26 30 30 30 30 28 28 28 28 29 29 29 29 25 24 25 15
99.9% 26 41 41 41 45 45 45 45 43 43 43 43 44 44 44 44 41 41 41 26
Max 2261 127 185 102 90 106 92 102 93 106 119 114 89 140 132 82 81 150 93 1910
Note: Smaller block size increases index block size
Block Size vs. Index Size
16 K Blocks
Index size (K)  Bloom size (K)
266346          2368
247895          2240
225561          2096
253633          2368
224862          2016
225685          2096
8 K Blocks
Index size (K)  Bloom size (K)
472058          2432
574239          2944
331899          1792
471362          2304
517272          2560
469543          2432
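The index-size cost of smaller blocks can be read directly off the two tables. This short Python sketch averages the index sizes listed on the slide and compares them; it assumes the rows of the two tables describe comparable data.

```python
from statistics import mean

# Index sizes (K), one value per row of the slide's two tables.
index_16k = [266346, 247895, 225561, 253633, 224862, 225685]
index_8k = [472058, 574239, 331899, 471362, 517272, 469543]

# Ratio of the average index size with 8 K blocks to that with 16 K blocks.
ratio = mean(index_8k) / mean(index_16k)
```

The ratio comes out just under 2, so halving the block size roughly doubles the index that has to be held in memory.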
Column Family
Key11|col1|1234567 Value-A1
Key11|col1|1234566 Value-A
Key11|col2|1234567 Value-B
Key11|col3|1234567 Value-CC
Key11|col3|1234563 Value-C
Key11|col4|1234567 Value-DD
Key11|col4|1234560 Value-D1
Key11|col4|1234557 Value-D
[Diagram: a BlockCache in which every block is in use.]
Column Family
Store files on the file system are kept per column family: t1:cf1, t1:cf2
Row Id | cf1:col1 | Timestamp   Value
Row Id | cf2:col1 | Timestamp   Value
Column Family
[Diagram: two BlockCaches. In the first, every block belongs to cf1. In the second, cf1 blocks share the cache with cf2 blocks.]
Compaction
[Diagram: before compaction, the store files on the file system hold several K-x entries and a D-1 entry. After compaction, the store files hold a single K-x entry.]
Compaction
• Part of regular HBase operations
• Minor Compaction
• Major Compaction
• Utilizes server and HBase resources
• Major compaction can be scheduled
Short Circuit Read
[Diagram: two read paths. Without short-circuit read, the HBase/DFS client fetches data from HDFS over TCP, and HDFS reads it from the file system. With short-circuit read, HDFS opens the file and passes the file descriptor (FD) to the client, which then reads the data from the file system directly.]
Garbage Collection
[Diagram: the read path with the HBase memstore, the block cache, and the WAL and store files of table t1 on the file system, shown twice: first with the block cache alone, then with an off-heap cache added as an additional numbered step.]
Large Cache
61 GB of Cache
Avg 2.693 2.814 2.836 2.842 2.812
Median 1 1 1 1 1
95% 8 8 8 8 8
99% 14 14 14 14 15
99.90% 20 20 20 20 20
99.99% 32 31 32 32 33
100.00% 313 319 315 376 341
Max latency 1049 1046 1048 1044 1235
93 GB of Cache
Avg 3.872 3.995 3.936 4.007 4.052
Median 1 1 1 1 1
95% 14 14 14 15 15
99% 20 20 20 20 20
99.90% 27 27 27 28 28
99.99% 36 36 36 37 37
100.00% 208 310 332 207 232
Max latency 1360 1906 1736 1359 1363
Garbage Collection
• Fine tune Garbage Collector
• For example, some CMS GC options to look at
— ExplicitGCInvokesConcurrent
— CMSInitiatingOccupancyFraction
— UseCMSInitiatingOccupancyOnly
— ParallelGCThreads
— UseParNewGC
• Log GC info which will help with tuning
— PrintGCDetails
— Xloggc
— PrintTenuringDistribution
— …
Region Replication
[Diagram: HMaster, ZK and region servers RS1 to RS6, with one server labelled "system". Region r1 of table T1 appears on two region servers.]
Region Replication
• Requires changes to cluster configuration
— hbase.region.replica.replication.enabled
— hbase.regionserver.storefile.refresh.period (not the complete list)
• Need to specify region replication in table definition
— create 't1', 'f1', {REGION_REPLICATION => 2}
• Client needs to specify when to read secondary
— get1.setConsistency(Consistency.TIMELINE);
— hbase.client.primaryCallTimeout.get
— hbase.client.primaryCallTimeout.multiget
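The client-side behaviour behind these settings can be modelled without a cluster. The Python sketch below is a simplified model, not the HBase client: `timeline_get`, `ReadResult` and the latency figures are hypothetical. It shows how the primary call timeout decides whether a read is answered by the primary or, possibly stale, by a replica.

```python
from dataclasses import dataclass


@dataclass
class ReadResult:
    source: str        # "primary" or "replica"
    stale: bool        # True when a replica answered
    latency_ms: float  # time until the first answer arrived


def timeline_get(primary_ms: float, replica_ms: float,
                 primary_call_timeout_ms: float) -> ReadResult:
    """Model a TIMELINE read: ask the primary first; if it has not answered
    within the timeout, also ask the replica and take whichever answers first."""
    if primary_ms <= primary_call_timeout_ms:
        return ReadResult("primary", False, primary_ms)
    # The replica call starts only once the timeout has elapsed.
    replica_done = primary_call_timeout_ms + replica_ms
    if replica_done < primary_ms:
        return ReadResult("replica", True, replica_done)
    return ReadResult("primary", False, primary_ms)
```

In this model a short timeout bounds the latency of a slow primary at roughly the timeout plus the replica latency, at the price of more stale answers.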
Region Replication
PrimaryCall Timeout vs. Stale Calls
Timeout (ms)  Readers  Total queries  Total stale  Stale %
3,000         512      1,520,207      0            0.00%
3,000         512      1,520,207      0            0.00%
3,000         512      1,520,207      0            0.00%
1,000         512      1,520,207      0            0.00%
1,000         512      1,520,207      0            0.00%
1,000         512      1,520,207      0            0.00%
100           512      1,520,207      5,101        0.34%
100           512      1,520,207      1,476        0.10%
100           512      1,520,207      74           0.00%
50            512      1,520,207      6,173        0.41%
50            512      1,520,207      4,785        0.31%
50            512      1,520,207      5,263        0.35%
10            512      1,520,207      22,518       1.48%
10            512      1,520,207      16,818       1.11%
10            512      1,520,207      19,050       1.25%
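The percentage column follows directly from the stale and total counts. This small Python check recomputes it for the runs that saw stale reads; the `stale_percent` helper is illustrative only.

```python
TOTAL_QUERIES = 1_520_207

# (primary call timeout in ms, stale calls) for the runs with stale reads.
runs = [(100, 5101), (100, 1476), (100, 74),
        (50, 6173), (50, 4785), (50, 5263),
        (10, 22518), (10, 16818), (10, 19050)]


def stale_percent(stale_calls: int, total: int = TOTAL_QUERIES) -> float:
    """Stale calls as a percentage of all queries, rounded to two decimals."""
    return round(100 * stale_calls / total, 2)


percentages = [stale_percent(stale) for _timeout, stale in runs]
```

Even at a 10 ms timeout, fewer than 1.5% of the calls in these runs were answered with stale data.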
Application
• Table pre-split
• Connection reuse
• No read cache/scan cache/batch requests
• Bulk load instead of Put/Batch Mutate
• Column Identifiers
• Code on server
— Co-processor
— Filters
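For the table pre-split item, split points have to be chosen before any data is loaded. The sketch below is one possible approach, assuming row keys begin with a uniformly distributed hexadecimal prefix such as a hash; the `hex_split_points` helper is hypothetical. It computes evenly spaced split keys that could then be supplied at table creation, for example through the `SPLITS` option of the HBase shell's `create` command.

```python
def hex_split_points(num_regions: int, prefix_len: int = 2) -> list[str]:
    """Evenly spaced split points for keys that start with a hex prefix."""
    space = 16 ** prefix_len  # number of distinct prefixes
    return [format(i * space // num_regions, f"0{prefix_len}x")
            for i in range(1, num_regions)]


# Four regions need three split points.
splits = hex_split_points(4)
```

If the key prefix is not uniformly distributed, evenly spaced split points would reproduce the data skew described earlier, so the split keys should follow the real key distribution.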
Monitoring
• Cache hit ratio
• Data locality
• GC pause
• Compactions
• Call queue
• Read latencies
• Server metrics
Summary
• Familiarize yourself with system principles
• Use DB development lifecycle
• Ascertain natural fitness for HBase
Thank You!
Reference: http://hbase.apache.org
Connect with Bloomberg’s Hadoop Team:
hadoop@bloomberg.net

Serving queries at low latency using HBase

  • 1. © 2018 Bloomberg Finance L.P. All rights reserved. Serving Billions of Queries in Millisecond Latency HBaseConAsia 2018 August 17, 2018 Biju Nair [email protected]
  • 2. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Agenda • Need for low latency • HBase principles • Modeling • Implementation • Monitoring and Tuning
  • 3. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Bloomberg by the numbers • Founded in 1981 • 325,000 subscribers in 170 countries • Over 19,000 employees in 192 locations • More News reporters than The New York Times + Washington Post + Chicago Tribune
  • 4. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Bloomberg Tech • More than 5,000 software engineers (and growing) • 100+ engineers and data scientists devoted to machine learning • One of the largest private networks in the world • 100B+ tick messages per day, with a peak of more than 10 million messages/second • 2M news stories ingested / published each day (that's >500 news stories ingested/second) • News content from 125K+ sources • More than a billion messages (emails and IB chats) processed each day
  • 5. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Bloomberg in a nutshell
  • 6. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Data Storage and Retrieval • Files • VSAM • Network • Hierarchical • Relational • MPP
  • 7. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Using RDBMS • Use Case • Entities and Relations • Logical data model • Physical data model • Implementation and tuning
  • 8. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. HBase Principles • Ordered Key Value Store • Distributed
  • 9. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Key Value Key-9999 Value-a Key-9998 Value-b Key-9997 Value-c Key-9996 Value-d … … Key-9995 Value-e Key-9994 Value-f
  • 10. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Ordered Key Value Key-9999 Value-a Key-9998 Value-b Key-9997 Value-c Key-9996 Value-d … … Key-9995 Value-e Key-9994 Value-a Lexicographicorder Key-9993 Value-g
  • 11. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Distributed Order Key Value Key-199 Value Key-198 Value Key-197 Value Key-499 Value … Key-498 Value Key-497 Value ordered … Key-299 Value … Key-298 Value Key-297 Value ordered … … … Key-599 Value … Key-598 Value Key-597 Value … Key-399 Value … Key-398 Value Key-397 Value … Key-999 Value … Key-998 Value Key-997 Value …
  • 12. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Abstraction • Table row view • Versioning • ACIDity
  • 13. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Table Row View Key Value Column Id ValueRow Id Timestamp
  • 14. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Table Row View Key11|col1|1234567 Value-A Key11|col2|1234567 Value-B Key11|col3|1234567 Value-C Key11|col4|1234567 Value-D Key11 Value-A Value-B Value-C Col1 Col2 Col3 Value-D Col4
  • 15. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Versioning Key11|col1|1234567 Value-A1 Key11|col1|1234566 Value-A Key11|col2|1234567 Value-B Key11|col3|1234567 Value-CC Key11|col3|1234563 Value-C Key11|col4|1234567 Value-DD Key11|col4|1234560 Value-D1 Key11|col4|1234557 Value-D Descendingorder
  • 16. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. ACIDity • Atomic at row level • Consistent to a point in time before the request • Isolation through MVCC (reads) and row locks (mutations) • Durability is guaranteed for all successful mutations
  • 17. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Data Modeling • Fitness for key value store — Can’t build relations — No secondary indexes — De-normalization • Understand queries to design key — Data Skew — Query Skew
  • 18. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Data Skew Key-e Value Key-e Value Key-e Value Key-e Value Key-a Value Key-a Value Key-a Value Key-b Value Key-b Value Key-b Value Key-a Value Key-h Value Key-h Value Key-h Value Key-d Value Key-d Value Key-d Value Key-f Value Key-f Value Key-x Value Key-x Value Key-z Value Key-z Value Key-y Value Key-y Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value
  • 19. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Data Skew Key-e Value Key-e Value Key-e Value Key-e Value Key-a Value Key-a Value Key-a Value Key-b Value Key-b Value Key-b Value Key-a Value Key-h Value Key-h Value Key-h Value Key-d Value Key-d Value Key-d Value Key-f Value Key-f Value Key-x Value Key-x Value Key-z Value Key-z Value Key-y Value Key-y Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value Key-e Value Hotspot
  • 20. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Query Skew Key-199 Value Key-198 Value Key-197 Value Key-499 Value … Key-498 Value Key-497 Value … Key-299 Value … Key-298 Value Key-297 Value … … … Key-599 Value … Key-598 Value Key-597 Value … Key-399 Value … Key-398 Value Key-397 Value … Key-999 Value … Key-998 Value Key-997 Value …
  • 21. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Query Skew Key-199 Value Key-198 Value Key-197 Value Key-499 Value … Key-498 Value Key-497 Value … Key-299 Value … Key-298 Value Key-297 Value … … … Key-599 Value … Key-598 Value Key-597 Value … Key-399 Value … Key-398 Value Key-397 Value … Key-999 Value … Key-998 Value Key-997 Value … Queries Queries Queries
  • 22. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Query Skew Key-199 Value Key-198 Value Key-197 Value Key-499 Value … Key-498 Value Key-497 Value … Key-299 Value … Key-298 Value Key-297 Value … … … Key-599 Value … Key-598 Value Key-597 Value … Key-399 Value … Key-398 Value Key-397 Value … Key-999 Value … Key-998 Value Key-997 Value … Queries Queries Queries Hotspot
  • 23. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. HBase Data Write Memory File System WAL t1 t1 t1 Store files Write 1 2 3 Block Block Block Data Idx Blm
  • 24. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. HBase Data Read Memstore File System WAL Read 1 Block Cache 2 Block t1 t1 t1 Store files
  • 25. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Cache • Pack more blocks into cache — Block size — Column Family • Large cache
  • 26. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Block Size and Cache Use BlockCache unuseduse 64 K Blocks unuseduse unuseduse unuseduse unuseduse unuseduse unuseduse unuseduse unuseduse unuseduse
  • 27. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Block Size and Cache Use BlockCache unuseduse 64 K Blocks unuseduse unuseduse unuseduse unuseduse unuseduse unuseduse unuseduse unuseduse unuseduse BlockCache use 16 K Blocks use use use use use use use use use use use use use use use use use use use use use
  • 28. © 2018 Bloomberg Finance L.P. All rights reserved.© 2018 Bloomberg Finance L.P. All rights reserved. Block Size vs. Read Latency BAvg 16.731 16.728 16.761 16.763 16.418 16.371 16.37 16.431 16.152 16.14 16.169 16.158 16.308 16.29 16.325 16.307 16.34 16.381 16.391 16.352 BMedian 14 14 14 14 13 13 13 13 15 15 15 15 13 13 13 13 13 13 13 13 B95% 41 41 41 41 41 41 41 41 43 43 43 43 40 40 40 40 41 41 41 41 B99% 55 55 55 55 54 54 54 54 55 55 55 55 54 54 54 54 54 54 55 54 B99.9% 71 71 71 71 70 70 70 70 67 67 67 67 71 70 70 71 71 71 71 70 BMax 545 1062 559 567 1075 1027 561 567 564 541 558 1062 1062 561 1075 1072 1067 563 1035 1032 Avg 3.002 5.362 5.361 5.357 6.419 6.369 6.405 6.383 6.188 6.196 6.182 6.174 6.246 6.264 6.268 6.253 5.194 5.207 5.219 3.031 Median 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 95% 10 15 15 15 18 18 18 18 18 18 17 17 18 18 18 18 15 15 15 10 99% 15 26 26 26 30 30 30 30 28 28 28 28 29 29 29 29 25 24 25 15 99.90% 26 41 41 41 45 45 45 45 43 43 43 43 44 44 44 44 41 41 41 26 Max 2261 127 185 102 90 106 92 102 93 106 119 114 89 140 132 82 81 150 93 1910 Get Performance (ms) – 64 K Block Get Performance (ms) – 16 K Block Note: Smaller block size increases index block size
  • 29. Block Size vs. Index Size
      16 K Blocks
      Idx Sz (K)  Bloom (K)
          266346       2368
          247895       2240
          225561       2096
          253633       2368
          224862       2016
          225685       2096
      8 K Blocks
      Idx Sz (K)  Bloom (K)
          472058       2432
          574239       2944
          331899       1792
          471362       2304
          517272       2560
          469543       2432
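The index-size cost of smaller blocks can be read off the slide's own numbers. The sketch below only totals the "Idx Sz K" columns from the two tables and compares them; treating the two sets of files as covering comparable data is an assumption, since the slide does not say how the rows correspond.

```python
# "Idx Sz K" values copied from the slide, one entry per listed file.
idx_16k = [266346, 247895, 225561, 253633, 224862, 225685]
idx_8k = [472058, 574239, 331899, 471362, 517272, 469543]

total_16k = sum(idx_16k)
total_8k = sum(idx_8k)
ratio = total_8k / total_16k

print(f"16 K blocks: total index size {total_16k} K")
print(f" 8 K blocks: total index size {total_8k} K")
print(f"8 K / 16 K index size ratio: {ratio:.2f}")
```

Halving the block size roughly doubles the total index size in this data, so the index itself starts to compete with data blocks for cache space.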
  • 30. Column Family
      Key11|col1|1234567   Value-A1
      Key11|col1|1234566   Value-A
      Key11|col2|1234567   Value-B
      Key11|col3|1234567   Value-CC
      Key11|col3|1234563   Value-C
      Key11|col4|1234567   Value-DD
      Key11|col4|1234560   Value-D1
      Key11|col4|1234557   Value-D
      [Figure: BlockCache; every block marked "use"]
  • 31. Column Family
      [Figure: table t1 with two column families. Cells of each family (Row Id, cf1:col1 or cf2:col1, Timestamp, Value) go to separate store files on the file system: t1:cf1 and t1:cf2]
  • 32. Column Family
      [Figure: BlockCache in which every block belongs to cf1]
  • 33. Column Family
      [Figure: two BlockCaches. In the first, every block belongs to cf1. In the second, cf2 blocks are interleaved with the cf1 blocks]
  • 34. Compaction
      [Figure: file system with several store files containing entries K-x and D-1]
  • 35. Compaction
      [Figure: before compaction, the file system has several store files containing entries K-x and D-1; after compaction, a single store file contains K-x]
  • 36. Compaction
      • Part of regular HBase operations
      • Minor Compaction
      • Major Compaction
      • Utilizes server and HBase resources
      • Major compaction can be scheduled
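What a major compaction does to store files can be sketched as a merge of sorted files. This is a deliberately simplified model, not HBase's implementation: it assumes one version is kept per key, that the newest cell for a key decides its fate (a put is kept, a delete marker removes the key), and that no two cells for the same key share a timestamp. The cell layout and the names `major_compact`, `PUT`, and `DELETE` are illustrative.

```python
import heapq

PUT, DELETE = "put", "delete"


def major_compact(store_files):
    """Merge sorted store files into one list of live cells.

    Each store file is a list of (key, timestamp, kind, value) cells sorted
    by key ascending, then timestamp descending. Only the newest cell per
    key is considered: a put is kept, a delete marker drops the key.
    """
    merged = heapq.merge(*store_files, key=lambda cell: (cell[0], -cell[1]))
    compacted = []
    current_key = None
    for key, timestamp, kind, value in merged:
        if key == current_key:
            continue  # older version of a key that was already decided
        current_key = key
        if kind == PUT:
            compacted.append((key, timestamp, PUT, value))
    return compacted


file1 = [("K-1", 3, PUT, "v3"), ("K-2", 1, PUT, "a")]
file2 = [("K-1", 2, PUT, "v2"), ("K-2", 5, DELETE, None)]
file3 = [("K-1", 1, PUT, "v1"), ("K-3", 4, PUT, "z")]

result = major_compact([file1, file2, file3])
print(result)
```

After the merge, a read touches one file instead of three and no longer has to skip over old versions or delete markers, which is why compaction state shows up in read latency.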
  • 37. Short Circuit Read
      [Figure: two read paths. In the first, the HBase/DFS client receives data from HDFS over TCP. In the second, the client opens the file, HDFS passes the file descriptor (FD), and the client reads the data from the file system]
  • 38. Garbage Collection
      [Figure: HBase with Memstore and WAL above the file system store files (t1). Read path: (1) Block Cache, (2) block from the store files]
  • 39. Garbage Collection
      [Figure: HBase with Memstore and WAL above the file system store files (t1). Read path: (1) Block Cache, (2) off-heap cache block, (3) block from the store files]
  • 40. Large Cache
      61 GB of Cache
      Avg           2.693  2.814  2.836  2.842  2.812
      Median            1      1      1      1      1
      95%               8      8      8      8      8
      99%              14     14     14     14     15
      99.90%           20     20     20     20     20
      99.99%           32     31     32     32     33
      100.00%         313    319    315    376    341
      Max latency    1049   1046   1048   1044   1235
      93 GB of Cache
      Avg           3.872  3.995  3.936  4.007  4.052
      Median            1      1      1      1      1
      95%              14     14     14     15     15
      99%              20     20     20     20     20
      99.90%           27     27     27     28     28
      99.99%           36     36     36     37     37
      100.00%         208    310    332    207    232
      Max latency    1360   1906   1736   1359   1363
  • 41. Garbage Collection
      • Fine-tune the garbage collector
      • For example, some CMS GC options to look at
        - ExplicitGCInvokesConcurrent
        - CMSInitiatingOccupancyFraction
        - UseCMSInitiatingOccupancyOnly
        - ParallelGCThreads
        - UseParNewGC
      • Log GC info, which will help with tuning
        - PrintGCDetails
        - Xloggc
        - PrintTenuringDistribution
        - …
  • 42. Region Replication
      [Figure: cluster with HMaster, ZK, and region servers RS1 to RS6, hosting the system tables and region T1 r1]
  • 43. Region Replication
      [Figure: cluster with HMaster, ZK, and region servers RS1 to RS6, hosting the system tables and region T1 r1]
  • 44. Region Replication
      [Figure: cluster with HMaster, ZK, and region servers RS1 to RS6, hosting the system tables; region T1 r1 now appears twice]
  • 45. Region Replication
      • Requires changes to cluster configuration (not the complete list)
        - hbase.region.replica.replication.enabled
        - hbase.regionserver.storefile.refresh.period
      • Need to specify region replication in table definition
        - create 't1', 'f1', {REGION_REPLICATION => 2}
      • Client needs to specify when to read secondary
        - get1.setConsistency(Consistency.TIMELINE);
        - hbase.client.primaryCallTimeout.get
        - hbase.client.primaryCallTimeout.multiget
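How the primary call timeout trades latency for staleness can be illustrated with a small model. The sketch below is not the HBase client; it assumes a single secondary replica and models a TIMELINE read as: send the request to the primary, and if it has not answered within the primary call timeout, also send it to the secondary and take whichever answer arrives first. The function name `timeline_get` and all latencies are hypothetical.

```python
def timeline_get(primary_ms, secondary_ms, primary_call_timeout_ms):
    """Simulate one TIMELINE-consistency read.

    primary_ms / secondary_ms are the response times of each replica,
    measured from the moment its request is sent.
    Returns (observed latency in ms, whether the answer may be stale).
    """
    if primary_ms <= primary_call_timeout_ms:
        return primary_ms, False  # primary answered before the timeout
    # The secondary is only contacted once the timeout has expired.
    secondary_done_ms = primary_call_timeout_ms + secondary_ms
    if secondary_done_ms < primary_ms:
        return secondary_done_ms, True  # secondary won: possibly stale
    return primary_ms, False  # primary still answered first


print(timeline_get(primary_ms=5, secondary_ms=3, primary_call_timeout_ms=10))
print(timeline_get(primary_ms=200, secondary_ms=3, primary_call_timeout_ms=10))
print(timeline_get(primary_ms=12, secondary_ms=5, primary_call_timeout_ms=10))
```

In this model a lower timeout caps tail latency sooner but sends more reads to the secondary, so more answers are potentially stale.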
  • 46. Region Replication
      PrimaryCall Timeout Vs Stale Calls
      Time (ms)  readers  totalQuery  totalStale   %age
          3,000      512   1,520,207           0  0.00%
          3,000      512   1,520,207           0  0.00%
          3,000      512   1,520,207           0  0.00%
          1,000      512   1,520,207           0  0.00%
          1,000      512   1,520,207           0  0.00%
          1,000      512   1,520,207           0  0.00%
            100      512   1,520,207       5,101  0.34%
            100      512   1,520,207       1,476  0.10%
            100      512   1,520,207          74  0.00%
             50      512   1,520,207       6,173  0.41%
             50      512   1,520,207       4,785  0.31%
             50      512   1,520,207       5,263  0.35%
             10      512   1,520,207      22,518  1.48%
             10      512   1,520,207      16,818  1.11%
             10      512   1,520,207      19,050  1.25%
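The %age column is totalStale divided by totalQuery. The sketch below recomputes it from the slide's figures and averages the three runs at each timeout; the averaging step and the names used are additions for illustration, not something the slide reports.

```python
TOTAL_QUERIES = 1_520_207

# totalStale for the three runs at each primary call timeout (ms).
stale_by_timeout_ms = {
    3000: [0, 0, 0],
    1000: [0, 0, 0],
    100: [5101, 1476, 74],
    50: [6173, 4785, 5263],
    10: [22518, 16818, 19050],
}


def stale_pct(stale, total=TOTAL_QUERIES):
    """Percentage of queries answered with stale data, rounded to 2 places."""
    return round(100.0 * stale / total, 2)


for timeout_ms, runs in stale_by_timeout_ms.items():
    per_run = [stale_pct(stale) for stale in runs]
    mean_pct = stale_pct(sum(runs) / len(runs))
    print(f"{timeout_ms:>5} ms: runs {per_run}, mean {mean_pct}%")
```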
  • 47. Application
      • Table pre-split
      • Connection reuse
      • No read cache/scan cache/batch requests
      • Bulk load instead of Put/Batch Mutate
      • Column Identifiers
      • Code on server
        - Co-processor
        - Filters
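For the table pre-split item, split points have to be chosen before any data is loaded. The sketch below assumes row keys begin with a uniformly distributed, fixed-width, lowercase hex prefix (for example a hash of the natural key); that key design and the helper name `hex_split_points` are assumptions for illustration, not something the talk specifies.

```python
def hex_split_points(num_regions, width=8):
    """Return num_regions - 1 evenly spaced split keys over a hex keyspace.

    Assumes row keys start with `width` lowercase hex characters that are
    uniformly distributed.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    keyspace = 16 ** width
    return [
        format(i * keyspace // num_regions, f"0{width}x")
        for i in range(1, num_regions)
    ]


print(hex_split_points(4, width=2))
```

The resulting keys can be supplied at table creation, for example through the SPLITS option of the HBase shell's create command, so that load is spread across region servers from the first request.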
  • 48. Monitoring
      • Cache hit ratio
      • Data locality
      • GC pause
      • Compactions
      • Call queue
      • Read latencies
      • Server metrics
  • 49. Summary
      • Familiarize yourself with the system's principles
      • Use the DB development lifecycle
      • Ascertain natural fitness for HBase
  • 50. Thank You!
      Reference: http://hbase.apache.org
      Connect with Bloomberg’s Hadoop Team: [email protected]