© 2017 Bloomberg Finance L.P. All rights reserved.
HBaseCon West 2017
June 12, 2017
Anirudha Jadhav
ajadhav2@bloomberg.net
Biju Nair
bnair10@bloomberg.net
Cursors in Apache Phoenix
Leading data and analytics provider for the financial industry
Bloomberg
Bloomberg is a data company
Reality of working with data
• The data model changes over time
• Users querying the data model don’t necessarily change
• Alternate query patterns for the same dataset
• Data infrastructure usage needs to be simple
Apache Phoenix
• Recipes of best practices for using HBase over a familiar SQL’ish grammar
• It is so much more than SQL
o User defined functions for push-down
o Secondary indices
o Statistics collections, optimizations based on heuristics
o ORM libraries
o JDBC, ODBC support with Query servers
o Integrations: Spark, Kafka, MR and others
Extending Apache Phoenix
• A very active and helpful community
• Our ongoing work
o Apache Calcite
o Distributed tests and nightly performance build
o Multi-DC replication
o Deep paging with cursor implementation
HBase
[Architecture diagram. Components shown: Application, HBase Client, ZooKeeper Quorum, HBase Master, three RegionServers, and three HDFS DataNodes.]
Phoenix
https://www.slideshare.net/enissoz/apache-phoenix-past-present-and-future-of-sql-over-hbase
http://phoenix.apache.org/presentations/OC-HUG-2014-10-4x3.pdf
[Architecture diagram. HBase components shown: Application, HBase Client, ZooKeeper Quorum, HBase Master, three RegionServers, and three HDFS DataNodes. Phoenix additions shown: Phoenix Client, two Phoenix RPC endpoints, Phoenix Coprocessors, and the SYSTEM.CATALOG and SYSTEM.STATS tables.]
Phoenix Client
[Component diagram. Phoenix Client layers shown: Authentication, SQL Parsing, Query rewrite/Optimization, Query Plan Generation, Transaction Management, Connection Management, and the HBase Client. Side annotations shown: ANTLR4, Hints/Rules, Rules, Tephra.]
Phoenix query execution
Connection con =
    DriverManager.getConnection("jdbc:phoenix:zkquorum:2181:/hbase:principal:keytabfile");
…
PreparedStatement statement = con.prepareStatement("select * from TBL");
…
ResultSet rset = statement.executeQuery();
…
while (rset.next())
…
rset.close();
…
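The connection string above packs several colon-separated fields into one URL. The sketch below shows how those fields fit together, assuming the usual Phoenix layout of ZooKeeper quorum, port, znode root, and an optional Kerberos principal and keytab. `PhoenixUrlSketch` and its parameter names are hypothetical helpers for illustration, not part of Phoenix.

```java
import java.util.StringJoiner;

/** Illustrative helper (not part of Phoenix) that assembles a Phoenix JDBC URL. */
final class PhoenixUrlSketch {
    private PhoenixUrlSketch() {}

    /**
     * Joins the colon-separated parts of a Phoenix JDBC URL.
     * principal and keytab may both be null for an unsecured cluster.
     */
    static String build(String zkQuorum, int zkPort, String znodeRoot,
                        String principal, String keytab) {
        StringJoiner url = new StringJoiner(":");
        url.add("jdbc:phoenix")
           .add(zkQuorum)
           .add(Integer.toString(zkPort))
           .add(znodeRoot);
        // The security fields are appended only when both are supplied.
        if (principal != null && keytab != null) {
            url.add(principal).add(keytab);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Reproduces the connection string used on the slide.
        System.out.println(build("zkquorum", 2181, "/hbase", "principal", "keytabfile"));
    }
}
```

The resulting string would be passed to `DriverManager.getConnection` as on the slide.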
Phoenix query execution
[Flow diagram. Steps shown: Connect to HBase, Parse SQL Statement, Read/Cache Metadata, Validate SQL statement, Create query plan, Optimize query plan, Create Phoenix Result Set, Create Result Iterator, Close ResultSet. JDBC calls labelling the steps: getConnection, prepareStatement, executeQuery, close().]
Phoenix Server
[Architecture diagram. Client side: Application, Phoenix Client, HBase Client. Requests shown: Meta Data Request, Read Request, Index Write Request. Server side: RegionServers hosting SYSTEM.CATALOG and USER_TABLE, with the coprocessors MetaDataEndPointImpl, MetaDataRegionObserver, UngroupedAggregateRO, GroupedAggregateRO, ScanRegionObserver, Indexer, and ServerCachingEndpointImpl.]
Cursors
• To support row pagination
o Should support forward and backward traversal
• Support required for select queries only
• Data needs to be consistent during traversal
Cursors
• DECLARE tCursor CURSOR FOR SELECT * FROM TBL
• OPEN tCursor
• FETCH NEXT 10 ROWS FROM tCursor
• FETCH PRIOR 5 ROWS FROM tCursor
• CLOSE tCursor
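To make the FETCH NEXT / FETCH PRIOR behaviour concrete, here is a toy in-memory cursor over a fixed list of rows. It is only a sketch: `ListCursorSketch` is a hypothetical class, and the choice that FETCH PRIOR returns the rows immediately before the current position (in table order) and moves the position back is an assumption, since the slides do not spell out the exact semantics.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory cursor; illustrative only, not Phoenix's implementation. */
final class ListCursorSketch<T> {
    private final List<T> snapshot; // rows fixed when the cursor is created
    private int position = 0;       // index of the next row a forward fetch returns

    ListCursorSketch(List<T> rows) {
        // Copy the rows so later changes to the source list are not visible.
        this.snapshot = new ArrayList<>(rows);
    }

    /** Returns up to n rows after the current position and advances past them. */
    List<T> fetchNext(int n) {
        int end = Math.min(position + n, snapshot.size());
        List<T> page = new ArrayList<>(snapshot.subList(position, end));
        position = end;
        return page;
    }

    /** Returns up to n rows before the current position (assumed semantics) and moves back. */
    List<T> fetchPrior(int n) {
        int start = Math.max(position - n, 0);
        List<T> page = new ArrayList<>(snapshot.subList(start, position));
        position = start;
        return page;
    }
}
```

With twelve rows numbered 0 to 11, `fetchNext(10)` yields rows 0 to 9 and a following `fetchPrior(5)` yields rows 5 to 9 under the semantics assumed here.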
Implementation options
• PHOENIX-2606
• Use row value constructors
o Requires query rewrite; complex
• Wrapper over existing query ResultSets
o Leverages existing ResultSets, so relatively simple
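For comparison, the row value constructor option would page by rewriting the query so that each page starts after the last primary-key tuple already seen. The sketch below builds such a statement as a string. `RvcPagingSketch`, the table name, and the column names PK1 and PK2 are hypothetical, and the helper assumes its identifiers come from trusted code, since they are concatenated rather than bound.

```java
import java.util.Collections;
import java.util.List;

/** Illustrative query builder for row-value-constructor paging; not Phoenix code. */
final class RvcPagingSketch {
    private RvcPagingSketch() {}

    /**
     * Builds a "next page" query: rows whose primary-key tuple sorts after
     * the bound tuple, in primary-key order, limited to pageSize rows.
     */
    static String nextPageSql(String table, List<String> pkColumns, int pageSize) {
        String cols = String.join(", ", pkColumns);
        // One bind marker per primary-key column, filled with the last row seen.
        String binds = String.join(", ", Collections.nCopies(pkColumns.size(), "?"));
        return "SELECT * FROM " + table
                + " WHERE (" + cols + ") > (" + binds + ")"
                + " ORDER BY " + cols
                + " LIMIT " + pageSize;
    }
}
```

Every page needs a freshly rewritten and re-executed query, and backward traversal needs a second, mirrored rewrite, which suggests why the slide calls this option complex.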
Cursor Lifecycle
PreparedStatement statement = con.prepareStatement(
    "DECLARE tCursor CURSOR FOR SELECT * FROM TBL");
statement.execute();
…
statement = con.prepareStatement("OPEN tCursor");
statement.execute();
statement = con.prepareStatement("FETCH NEXT FROM tCursor");
ResultSet rset = statement.executeQuery();
while (rset.next())
…
statement = con.prepareStatement("CLOSE tCursor");
statement.execute();
Cursor lifecycle
[Flow diagram. Statements shown: DECLARE CURSOR, OPEN CURSOR, FETCH, CLOSE. Steps shown: Parse SQL Statement, Create/Optimize QueryPlan, Create CursorWrapper, Set Cursor Status to Open, Execute CursorFetchPlan, Create CursorResultIterator, Create Phoenix ResultSet, Close Cursor.]
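The DECLARE, OPEN, FETCH, CLOSE sequence can be read as a small state machine. The sketch below is one possible reading, not the Phoenix implementation: `CursorLifecycleSketch` and its status names are hypothetical, and it assumes that FETCH is valid only while the cursor is open and that a closed cursor is not reopened.

```java
/** Illustrative cursor state machine; status names and rules are assumptions. */
final class CursorLifecycleSketch {
    enum Status { DECLARED, OPEN, CLOSED }

    // A new instance stands for a cursor that has just been declared.
    private Status status = Status.DECLARED;

    Status status() {
        return status;
    }

    /** OPEN: allowed only directly after DECLARE. */
    void open() {
        require(Status.DECLARED, "OPEN");
        status = Status.OPEN;
    }

    /** FETCH: allowed only while the cursor is open; the status is unchanged. */
    void fetch() {
        require(Status.OPEN, "FETCH");
    }

    /** CLOSE: allowed only on an open cursor. */
    void close() {
        require(Status.OPEN, "CLOSE");
        status = Status.CLOSED;
    }

    private void require(Status expected, String statement) {
        if (status != expected) {
            throw new IllegalStateException(
                    statement + " requires status " + expected + " but was " + status);
        }
    }
}
```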
Cursor Challenges
• Data Consistency
o Query start timestamp provides snapshot consistency
• Optimization
o Use Scan object for non-aggregate queries
• Cache sizing
o Dynamic sizing
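The snapshot-consistency point can be illustrated from the application side. Phoenix accepts a `CurrentSCN` connection property that bounds what a connection reads by a timestamp. The slides do not say whether the cursor implementation uses this property internally, so the sketch below only shows how an application could pin its own reads in a similar spirit. `SnapshotPropsSketch` is a hypothetical helper.

```java
import java.util.Properties;

/** Illustrative helper that prepares timestamp-pinned connection properties. */
final class SnapshotPropsSketch {
    /** Name of the Phoenix connection property that sets the read timestamp. */
    static final String CURRENT_SCN = "CurrentSCN";

    private SnapshotPropsSketch() {}

    /** Returns properties that ask Phoenix to read as of the given epoch milliseconds. */
    static Properties pinnedTo(long timestampMillis) {
        Properties props = new Properties();
        props.setProperty(CURRENT_SCN, Long.toString(timestampMillis));
        return props;
    }
}
```

The returned properties would be passed as the second argument to `DriverManager.getConnection(url, props)`, with the timestamp captured once when paging begins.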
Contributors
• Gabriel Jimenez (MIT)
• Anirudha Jadhav (Bloomberg)
• Biju Nair (Bloomberg)
• Ankit Singhal (Hortonworks)
Thank You
Hardening Apache Phoenix @PhoenixCon tomorrow 2 PM PT
Q&A
