How to ensure Presto scalability in multi use case
Kai Sasaki
Treasure Data Inc.
Kai Sasaki (@Lewuathe)
Software Engineer at Treasure Data Inc.
Hadoop/Presto/Spark
Presto In TD
• 150000+ queries / day
• 190+ TB processing / day
• 10+ MB processing / query * sec
• 100+ million processed records / query
Presto In TD
[Architecture diagram. Components: BI Tool, TD API, Prestobase Proxy, PerfectQueue, Presto, Plazma. Edge labels: HTTP, query, data.]
How to make it scalable
• Prestobase Proxy
• Node scheduler
• Resource Group
Prestobase proxy
Prestobase proxy
Prestobase proxy provides an interface to Presto, mainly for BI tools
that connect through JDBC/ODBC, and is intended to replace Prestogres.
Prestobase proxy
• Written in Scala
• Finagle-based RPC proxy
• Runs as a Docker container
• Uses Airframe
• VCR-based lightweight test framework
Finagle
Finagle is an extensible RPC system for the JVM,
used to construct high-concurrency servers.
Finagle implements uniform client and server
APIs for several protocols, and is designed for
high performance and concurrency.
see: https://twitter.github.io/finagle/
Finagle
    // Each filter wraps the next one; the Presto client terminates the chain.
    protected val service: Service[Request, Response] =
      bind[SomeFilter] andThen
      bind[AnotherHandler] andThen
      LastFilter andThen
      prestoClient
Build the request pipeline by binding filters and handlers with Airframe.
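The composition idea behind the pipeline above can be sketched without Finagle. This is a minimal model under stated assumptions: `Service`, `Filter`, `Request`, and `Response` here are simplified stand-ins written for illustration, not Finagle's real types, and `TagSource`, `RejectEmpty`, and `FakePrestoClient` are hypothetical names that do not come from Prestobase.

```scala
// Simplified stand-ins for the Service and Filter concepts (not Finagle's API).
final case class Request(sql: String, headers: Map[String, String] = Map.empty)
final case class Response(status: Int, body: String)

trait Service { def apply(req: Request): Response }

trait Filter { self =>
  def apply(req: Request, next: Service): Response

  // Compose two filters: `self` runs first, then `other`.
  def andThen(other: Filter): Filter = new Filter {
    def apply(req: Request, next: Service): Response =
      self.apply(req, new Service {
        def apply(r: Request): Response = other.apply(r, next)
      })
  }

  // Terminate the chain with a service.
  def andThen(service: Service): Service = new Service {
    def apply(req: Request): Response = self.apply(req, service)
  }
}

// Hypothetical filter: tags each request with a source header.
object TagSource extends Filter {
  def apply(req: Request, next: Service): Response =
    next(req.copy(headers = req.headers + ("source" -> "prestobase")))
}

// Hypothetical filter: rejects empty statements before they reach the backend.
object RejectEmpty extends Filter {
  def apply(req: Request, next: Service): Response =
    if (req.sql.trim.isEmpty) Response(400, "empty query") else next(req)
}

// Hypothetical stand-in for the Presto client at the end of the pipeline.
object FakePrestoClient extends Service {
  def apply(req: Request): Response = {
    val source = req.headers.getOrElse("source", "unknown")
    Response(200, s"ran '${req.sql}' from $source")
  }
}

val pipeline: Service = TagSource andThen RejectEmpty andThen FakePrestoClient
```

Because every stage has the same shape, a filter can be swapped or removed without touching its neighbours, which is what makes binding stages through dependency injection practical.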
Airframe
Airframe is a trait-based dependency injection
framework that uses Scala macros
- https://github.com/wvlet/airframe
Airframe
- Dependency injection tailored to Scala
- Tagged binding with wvlet
https://github.com/wvlet/wvlet
- Object lifecycle management
Airframe
    import wvlet.airframe._

    val design: Design =
      newDesign
        .bind[X].toInstance(new X)  // Bind type X to a concrete instance
        .bind[Y].toSingleton        // Bind type Y to a singleton object
        .bind[Z].to[ZImpl]          // Bind type Z to an instance of ZImpl

    trait App {
      val x = bind[X]
      val y = bind[Y]
      val z = bind[Z]
      // Do something with X, Y, and Z
    }

    val session = design.newSession
    val app: App = session.build[App]
VCR testing framework
Record the HTTP interactions of a test suite so that
tests are stable and deterministic.
For more detail, see:
https://testing.googleblog.com/2016/11/what-test-engineers-do-at-google.html
VCR testing framework
On CI:
    protected val service: Service[Request, Response] =
      bind[SomeFilter] andThen
      bind[AnotherHandler] andThen
      QueryRewriter andThen
      bind[RequestVCR] andThen
      prestoClient
On Production:
    protected val service: Service[Request, Response] =
      bind[SomeFilter] andThen
      bind[AnotherHandler] andThen
      QueryRewriter andThen
      bind[NoRecording] andThen
      prestoClient
VCR testing framework
[Diagram, recording: Prestobase, RequestVCRClient, …, SQLite]
VCR testing framework
[Diagram, replaying: Prestobase, RequestVCRClient, …, SQLite]
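The record and replay modes can be sketched as two wrappers around a backend. This is only an illustration of the VCR idea, assuming an in-memory map in place of the SQLite store and a made-up `Backend` trait; `RecordingBackend`, `ReplayingBackend`, and `CountingBackend` are hypothetical names, not Prestobase classes.

```scala
import scala.collection.mutable

// A backend is anything that turns a request body into a response body.
trait Backend { def call(request: String): String }

// Recording mode: forward to the real backend and store each interaction.
// An in-memory map stands in for the SQLite store (assumption for brevity).
final class RecordingBackend(real: Backend, tape: mutable.Map[String, String]) extends Backend {
  def call(request: String): String = {
    val response = real.call(request)
    tape.update(request, response)
    response
  }
}

// Replaying mode: answer only from the tape and never touch the real backend.
final class ReplayingBackend(tape: collection.Map[String, String]) extends Backend {
  def call(request: String): String =
    tape.getOrElse(request, throw new NoSuchElementException(s"no recording for: $request"))
}

// Hypothetical backend that counts how often it is really called.
final class CountingBackend extends Backend {
  var calls = 0
  def call(request: String): String = { calls += 1; s"result-of($request)" }
}

val tape = mutable.Map.empty[String, String]
val real = new CountingBackend
val recorded = new RecordingBackend(real, tape).call("SELECT 1")
val replayed = new ReplayingBackend(tape).call("SELECT 1")
```

The replayed call returns the recorded response without reaching the real backend, so a test run no longer depends on a live cluster.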
Prestobase proxy
Will be open sourced soon
Node Scheduler
Node Scheduler
Submitting a query goes through the following steps:
- Analyze the query AST
- Build the logical and physical query plans
- Schedule each stage
Node Scheduler
[Diagram: a query divided into stage2, stage1, and stage0, with tasks task2-0, task2-1, task2-0 / task1-0, task1-1 / task0-0. Labels: Table Scan, output.]
Node Scheduler
NodeScheduler creates a NodeSelector, which
selects the worker nodes on which tasks are
scheduled. NodeSelector picks worker
nodes when there are splits available.
Node Scheduler in TD
Keeps a map of worker nodes that are
candidates for launching the next tasks.
- Ignore min candidates
- Limit by available memory pool
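One possible reading of these two bullets, sketched under assumptions: candidates are filtered by free memory in their pool, and no minimum number of candidates is enforced. The `Worker` fields and the `selectCandidates` function are hypothetical and are not taken from Presto or from the TD scheduler.

```scala
// Hypothetical view of a worker node; field names are illustrative only.
final case class Worker(id: String, freeMemoryBytes: Long, runningTasks: Int)

// Keep only workers with enough free memory in their pool, then prefer the
// least loaded ones. No minimum candidate count is enforced, so the result
// can be smaller than `wanted` (or empty) when memory is tight.
def selectCandidates(workers: Seq[Worker], requiredBytes: Long, wanted: Int): Seq[Worker] =
  workers
    .filter(_.freeMemoryBytes >= requiredBytes)
    .sortBy(w => (w.runningTasks, w.id))
    .take(wanted)

val workers = Seq(
  Worker("w1", freeMemoryBytes = 8L << 30, runningTasks = 4),
  Worker("w2", freeMemoryBytes = 1L << 30, runningTasks = 0),
  Worker("w3", freeMemoryBytes = 6L << 30, runningTasks = 1)
)

// w2 is idle but is skipped because its memory pool is nearly full.
val chosen = selectCandidates(workers, requiredBytes = 2L << 30, wanted = 2).map(_.id)
```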
Node Scheduler in TD
Memory pool usage returns to normal after the task completes.
Node Scheduler in TD
Challenges
- Smoothing CPU time metric
- Split type awareness
- Avoid problematic worker nodes
Resource Group
Resource Group
Resource Groups were introduced in Presto 0.147
→ https://prestodb.io/docs/current/admin/resource-groups.html
Resource Groups limit resource
usage per account, group, or query.
Resource Group
[Diagram: rootGroup with child groups general and adhoc. Limit sets shown: (softMemoryLimit: 100%, maxQueued: 5000, maxRunning: 1000); (softMemoryLimit: 100%, maxQueued: 100, maxRunning: 200); (softMemoryLimit: 100%, maxRunning: 1000).]
Resource Group limits
- maxQueued
- maxRunning
- softMemoryLimit: once exceeded, subsequent queries are queued
- softCpuLimit: once exceeded, a penalty is imposed on the maximum number of running queries
- hardCpuLimit: once exceeded, subsequent queries are queued
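How the first three limits interact can be sketched as a small admission function. This is a simplified model, not Presto's implementation: the CPU limits are left out, memory is tracked as a plain byte count, and `Limits`, `GroupState`, and `admit` are names invented for the example.

```scala
// Outcome of submitting one more query to a group.
sealed trait Admission
case object Run extends Admission
case object Queue extends Admission
case object Reject extends Admission

// Hypothetical containers for a group's limits and its current usage.
final case class Limits(maxRunning: Int, maxQueued: Int, softMemoryLimitBytes: Long)
final case class GroupState(running: Int, queued: Int, memoryUsageBytes: Long)

// A query runs only while both the running-query limit and the soft memory
// limit leave room; otherwise it waits in the queue, and it is rejected once
// the queue is full as well.
def admit(limits: Limits, state: GroupState): Admission = {
  val canRun =
    state.running < limits.maxRunning &&
      state.memoryUsageBytes < limits.softMemoryLimitBytes
  if (canRun) Run
  else if (state.queued < limits.maxQueued) Queue
  else Reject
}

val exampleLimits = Limits(maxRunning = 2, maxQueued = 1, softMemoryLimitBytes = 1000L)
```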
Resource Group scheduling
- schedulingPolicy
  - fair: FIFO
  - weighted: selected stochastically
  - query_priority: selected according to priority
- schedulingWeight
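The weighted policy can be illustrated with a proportional pick over schedulingWeight values. This is a sketch of the general technique, not Presto's code; `pickWeighted` is a hypothetical helper, and the random draw is passed in as a parameter so the behaviour stays deterministic for a given draw.

```scala
// Pick a sub-group in proportion to its schedulingWeight.
// `draw` must be in [0, 1); passing it in keeps the function deterministic.
def pickWeighted(groups: Seq[(String, Int)], draw: Double): String = {
  require(groups.nonEmpty && groups.forall(_._2 > 0), "need at least one positive weight")
  val total = groups.map(_._2).sum
  val target = draw * total
  // Cumulative upper bounds: the first group whose bound exceeds the target wins.
  val bounds = groups.scanLeft(0)((acc, g) => acc + g._2).tail
  groups.zip(bounds)
    .collectFirst { case ((name, _), bound) if target < bound => name }
    .getOrElse(groups.last._1)
}

// Hypothetical weights: general is selected three times as often as adhoc.
val weightedGroups = Seq("general" -> 3, "adhoc" -> 1)
```

In real use the draw would come from a random source, for example `pickWeighted(weightedGroups, scala.util.Random.nextDouble())`.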
Resource Group
Every query must be associated with a resource
group. Matching is done by the
configured selectors.
    {
      "user": "bob", "group": "general"
    },
    {
      "source": ".*adhoc.*", "group": "global.adhoc.adhoc_${USER}"
    }
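Selector matching can be sketched as a first-match search over rules. The example assumes that `user` and `source` are regular expressions, that the first matching selector wins, and that `${USER}` expands to the submitting user; `Selector` and `resolveGroup` are hypothetical names written for this illustration.

```scala
// One selector rule: optional regexes on user and source, plus a group template.
final case class Selector(user: Option[String], source: Option[String], group: String)

// Return the group of the first selector whose patterns all match,
// expanding ${USER} in the group template.
def resolveGroup(selectors: Seq[Selector], user: String, source: String): Option[String] =
  selectors.collectFirst {
    case s if s.user.forall(p => user.matches(p)) && s.source.forall(p => source.matches(p)) =>
      s.group.replace("${USER}", user)
  }

// The two selectors from the slide.
val selectors = Seq(
  Selector(user = Some("bob"), source = None, group = "general"),
  Selector(user = None, source = Some(".*adhoc.*"), group = "global.adhoc.adhoc_${USER}")
)
```

With these rules, Bob's queries land in `general` regardless of source, while any other user submitting from a source containing `adhoc` gets a per-user group.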
Resource Group
[Diagram: the same rootGroup / general / adhoc tree and limits, with Bob's queries added.]
Resource Group DI
Resource group configuration behavior can be changed
easily with Guice injection.
- ResourceGroupConfigurationManager
  - configure(ResourceGroup, SelectionContext)
- ResourceGroupSelector
  - match(Statement, SelectionContext)
SelectionContext
SelectionContext holds the information used to associate a
submitted query with a resource group.
Fields currently available by default:
- Authenticated
- User
- Source
- Query Priority
    {
      "runningQueryIds": ["query1", "query2"],
      "accountId": 1,
      "children": [{
        "memoryUsage": 12345,
        "runningQueryIds": ["query1"],
        "children": [],
        "runningQueries": 1,
        "queuedQueries": 0,
        "maxRunningQueries": 2,
        "resourceId": "general"
      }, {
        "memoryUsage": 26296,
        "runningQueryIds": ["query2"],
        "children": [],
        "runningQueries": 1,
        "queuedQueries": 0,
        "maxRunningQueries": 2,
        "resourceId": "scheduled"
      }],
      "runningQueries": 2,
      "maxRunningQueries": 30
    }
- Top-level runningQueryIds: queries in the parent group
- First child: running query in general
- Second child: running query in scheduled
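A state document like this one can be modelled as a small tree and queried, for example to find groups that have no running slot left. This is a sketch under assumptions: `GroupInfo` and `saturated` are hypothetical, only a subset of the fields is modelled, and the top-level node is given the made-up id `root` because the source shows no `resourceId` for it.

```scala
// Subset of the fields shown in the state document above.
final case class GroupInfo(
  resourceId: String,
  runningQueryIds: Seq[String],
  maxRunningQueries: Int,
  children: Seq[GroupInfo] = Nil
) {
  def runningQueries: Int = runningQueryIds.size
  def freeSlots: Int = maxRunningQueries - runningQueries
}

// Walk the tree and list every group that has no free running slot left.
def saturated(group: GroupInfo): Seq[String] = {
  val self = if (group.freeSlots <= 0) Seq(group.resourceId) else Nil
  self ++ group.children.flatMap(saturated)
}

// "root" is a hypothetical id for the top-level group.
val root = GroupInfo("root", Seq("query1", "query2"), 30, Seq(
  GroupInfo("general", Seq("query1"), 2),
  GroupInfo("scheduled", Seq("query2"), 2)
))
```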
Recap
A distributed system often requires each
component to be stable and scalable. We can
make the Presto ecosystem reliable through:
- Reliable code modification with DI
- VCR testing
- Multi-dimensional resource scheduling
- Resource isolation, which makes a multi-tenant
distributed SQL engine reliable
