Get More Out of
MySQL with TokuDB
Tim Callaghan
VP/Engineering, Tokutek
tim@tokutek.com
@tmcallaghan
Tokutek: Database Performance Engines
What is Tokutek?
Tokutek® offers high performance and scalability for MySQL,
MariaDB and MongoDB. Our easy-to-use open source solutions
are compatible with your existing code and application
infrastructure.
Tokutek Performance Engines Remove Limitations
• Improve insertion performance by 20X
• Reduce HDD and flash storage requirements up to 90%
• No need to rewrite code
Tokutek Mission:
Empower your database to handle the Big Data requirements of
today’s applications
A Global Customer Base
Housekeeping
• This presentation will be available for replay
following the event
• We welcome your questions; please use the console
on the right of your screen and we will answer
following the presentation
• A copy of the presentation is available upon request
Agenda
Let’s answer the following questions, “How can you…?”
• Easily install and configure TokuDB.
• Dramatically increase performance without rewriting
code.
• Reduce the total cost of your servers and storage.
• Simply perform online schema changes.
• Avoid becoming the support staff for your
application.
• And Q+A
How easy is it to install and configure
TokuDB for
MySQL or MariaDB?
What is TokuDB?
• TokuDB = MySQL* Storage Engine + Patches**
– * MySQL, MariaDB, Percona Server
– ** Patches are required for full functionality
– TokuDB is more than a plugin
• Transactional, ACID + MVCC
– Like InnoDB
• Drop-in replacement for MySQL
• Open Source
– http://github.com/Tokutek/ft-engine
Where can I get TokuDB?
• Tokutek offers MySQL 5.5 and MariaDB 5.5 builds
– www.tokutek.com
• MariaDB 5.5 and 10
– www.mariadb.org
– Also in MariaDB 5.5 from various package repositories
• Experimental Percona Server 5.6 builds
– www.percona.com
Is it truly a “drop in replacement”?
• No Foreign Key support
– you’ll need to drop them
• No Windows or OSX binaries
– Virtual machines are helpful in evaluations
• No 32-bit builds
• Otherwise, yes
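Since foreign keys have to go before a table can move to TokuDB, it helps to list them first. A minimal sketch, assuming a schema called mydb, a table called orders and a constraint called fk_orders_customer (all three names are hypothetical placeholders); information_schema.referential_constraints is a standard MySQL catalog table:

```sql
-- List every foreign key defined in the (hypothetical) schema mydb.
SELECT constraint_name, table_name, referenced_table_name
FROM information_schema.referential_constraints
WHERE constraint_schema = 'mydb';

-- Drop one of them before converting its table.
-- Table and constraint names are placeholders.
ALTER TABLE orders DROP FOREIGN KEY fk_orders_customer;
```

Any integrity checks those constraints enforced would then have to live in the application.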
How do I get started?
• Start Fresh
– create table <table> engine=tokudb;
– mysqldump / load data infile
• Use your existing MySQL data folder
– alter table <table-to-convert> engine=tokudb;
• Measure the differences
– compression : load/convert your tables
– performance : run your workload
– online schema changes : add a column
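The two starting paths can be sketched as follows. The table names t_new and t_existing and their columns are assumptions made for illustration only:

```sql
-- Check that the TokuDB engine is available on this server.
SHOW ENGINES;

-- Start fresh: create a new table stored in TokuDB.
CREATE TABLE t_new (
  id BIGINT NOT NULL PRIMARY KEY,
  payload VARCHAR(255)
) ENGINE=TokuDB;

-- Or convert an existing table in place (this rewrites the table).
ALTER TABLE t_existing ENGINE=TokuDB;

-- Confirm which engine each table in the current schema now uses.
SELECT table_name, engine
FROM information_schema.tables
WHERE table_schema = DATABASE();
```

For the compression measurement, comparing file sizes in the data directory before and after conversion is probably the safest approach, since the sizes reported by information_schema may not reflect on-disk compression.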
Before you dive in – check your my.cnf
• TokuDB uses sensible server parameter defaults, but
• Be mindful of your memory
– Reduce innodb_buffer_pool_size (InnoDB) and
key_buffer_size (MyISAM)
– Especially if converting tables
– tokudb_cache_size=?G
– Defaults to 50% of RAM; I recommend 80%
– tokudb_directio=1
• Leave everything else alone
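A quick way to see what the server is actually running with is to query the variables directly. This is a sketch, assuming the TokuDB plugin is loaded (otherwise the tokudb_ rows will simply be absent); key_buffer_size is the MySQL variable that sizes the MyISAM key cache:

```sql
-- Show the memory-related settings discussed above (sizes are in bytes).
SHOW GLOBAL VARIABLES
WHERE Variable_name IN (
  'tokudb_cache_size',
  'tokudb_directio',
  'innodb_buffer_pool_size',
  'key_buffer_size'
);
```

If the three cache sizes together approach or exceed physical RAM, the server is likely to swap, which is the situation the advice above is meant to avoid.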
How can I dramatically increase
performance without having to rewrite
code?
Where does the performance come from?
• Tokutek’s Fractal Tree® indexes
– Much faster than B-trees in > RAM workloads
– InnoDB and MyISAM use B-trees
– Significant IO reduction
– Messages defer IO on add/update/delete
– All reads and writes are compressed
– Enables users to add more indexes
– Queries go faster
• Lots of good webinar content on our website
– www.tokutek.com/resources/webinars
How much can I reduce my IO?
Converted from
InnoDB to TokuDB
How fast can I insert data into TokuDB?
• InnoDB’s B-trees
– Fast until the index no longer fits in RAM
• TokuDB’s Fractal Tree indexes
– Start fast, stay fast!
• iiBench benchmark
– Insert 1 billion rows
– 1000 inserts per batch
– Auto-increment PK
– 3 secondary indexes
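To make the shape of that workload concrete, here is a hypothetical table in the same spirit: an auto-increment primary key plus three secondary indexes. The table, column and index names are assumptions, not the benchmark's actual schema, and treating one batch as a single multi-row INSERT is also an assumption:

```sql
-- Hypothetical insert-benchmark table: auto-increment PK, 3 secondary indexes.
CREATE TABLE purchases (
  id           BIGINT NOT NULL AUTO_INCREMENT,
  purchased_at DATETIME NOT NULL,
  register_id  INT NOT NULL,
  customer_id  INT NOT NULL,
  product_id   INT NOT NULL,
  price        DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id),
  KEY idx_price_customer (price, customer_id),
  KEY idx_register_price (register_id, price, customer_id),
  KEY idx_price_date     (price, purchased_at, customer_id)
) ENGINE=TokuDB;

-- One batch as a multi-row INSERT. The benchmark uses 1000 rows per batch;
-- only three are shown here.
INSERT INTO purchases (purchased_at, register_id, customer_id, product_id, price)
VALUES
  ('2014-01-28 10:00:00', 17, 40021, 311, 19.99),
  ('2014-01-28 10:00:01', 42, 10877, 905,  4.50),
  ('2014-01-28 10:00:02',  3, 77310, 128, 99.00);
```

Every inserted row touches all four indexes, and because the secondary keys arrive in effectively random order, this is where index size relative to RAM starts to matter.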
How fast can I insert data into TokuDB?
How fast are mixed workloads?
• Fast, since > RAM mixed workloads generally contain…
– Index maintenance (insert, update, delete)
– Fractal Tree indexes FTW!
– Queries
– TokuDB enables richer indexing (more indexes)
• Sysbench benchmark
– 16 tables, 50 million rows per table
– Each Sysbench transaction contains
– 1 of each query : point, range, aggregation
– indexed update, unindexed update, delete, insert
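One such transaction might look roughly like the following. The table definition and literal values are illustrative assumptions rather than Sysbench's exact SQL:

```sql
-- Hypothetical table resembling one of the 16 benchmark tables.
CREATE TABLE sbtest1 (
  id  INT UNSIGNED NOT NULL AUTO_INCREMENT,
  k   INT UNSIGNED NOT NULL DEFAULT 0,
  c   CHAR(120) NOT NULL DEFAULT '',
  pad CHAR(60)  NOT NULL DEFAULT '',
  PRIMARY KEY (id),
  KEY k_1 (k)
) ENGINE=TokuDB;

START TRANSACTION;
SELECT c FROM sbtest1 WHERE id = 5000;                      -- point query
SELECT c FROM sbtest1 WHERE id BETWEEN 5000 AND 5099;       -- range query
SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 5000 AND 5099;  -- aggregation
UPDATE sbtest1 SET k = k + 1 WHERE id = 5000;               -- indexed update (k has an index)
UPDATE sbtest1 SET c = 'new value' WHERE id = 5001;         -- unindexed update
DELETE FROM sbtest1 WHERE id = 5002;                        -- delete
INSERT INTO sbtest1 (id, k, c, pad)
VALUES (5002, 1, 'c', 'pad');                               -- insert
COMMIT;
```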
How fast are mixed workloads?
How do secondary indexes work?
• InnoDB and TokuDB “cluster” the primary key index
– The key (PK) and all other columns are co-located in
memory and on disk
• Secondary indexes co-locate the “index key” and PK
– When a candidate row is found a second lookup
occurs into the PK index
– This means an additional IO is required
– MySQL’s “hidden join”
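The second lookup can be seen with EXPLAIN. A small sketch, assuming a hypothetical table t1 with one secondary index; the exact plan depends on table contents and server version:

```sql
CREATE TABLE t1 (
  pk BIGINT NOT NULL PRIMARY KEY,
  c1 INT NOT NULL,
  c2 INT NOT NULL,
  c3 VARCHAR(100),
  KEY idx_c1 (c1)
) ENGINE=TokuDB;

-- Answered from idx_c1 alone: each index entry already holds c1 and pk.
-- EXPLAIN typically reports "Using index" in the Extra column.
EXPLAIN SELECT pk, c1 FROM t1 WHERE c1 = 42;

-- c3 is stored only in the primary key index, so every match found in
-- idx_c1 needs a second lookup by pk: the "hidden join".
EXPLAIN SELECT pk, c1, c3 FROM t1 WHERE c1 = 42;
```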
What is a clustered secondary index?
• “Covering” indexes remove this second lookup, but
require putting the right columns into the index
– create index idx_1 on t1 (c1, c2, c3, c4, c5, c6);
– If c1/c2 are queried, only c3/c4/c5/c6 are covered
– No additional IO, but c7 isn’t covered
• TokuDB supports clustered secondary indexes
– create clustering index idx_1 on t1 (c1, c2);
– All columns in t1 are covered, forever
– Even if new columns are added to the table
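Side by side, the two approaches look like this. The table layout is an assumption made for illustration; CREATE CLUSTERING INDEX is the TokuDB-specific syntax shown on the slide and is not available in stock MySQL:

```sql
CREATE TABLE t1 (
  pk BIGINT NOT NULL PRIMARY KEY,
  c1 INT, c2 INT, c3 INT, c4 INT, c5 INT, c6 INT, c7 INT
) ENGINE=TokuDB;

-- Covering index: c1..c6 can be read from the index, c7 cannot.
CREATE INDEX idx_cover ON t1 (c1, c2, c3, c4, c5, c6);

-- Clustering index: keyed on (c1, c2) but carries every column of t1.
CREATE CLUSTERING INDEX idx_clust ON t1 (c1, c2);

-- This query needs c7. Through idx_cover it would require a lookup into
-- the PK index for each row; through idx_clust it would not.
SELECT c1, c2, c7
FROM t1
WHERE c1 = 10 AND c2 BETWEEN 1 AND 100;
```

The trade-off is storage: a clustering index holds a full copy of each row, which is more affordable when everything on disk is compressed.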
What are clustered secondary indexes good at?
• Two words, “RANGE SCANS”
• Several rows (maybe thousands) are scanned without
requiring additional lookups on the PK index
• Also, TokuDB blocks are much larger than InnoDB
– TokuDB = 4MB blocks = sequential IO
– InnoDB = 16KB blocks = random IO
• Can be orders of magnitude faster for range queries
Can SQL be optimized?
• Fractal Tree indexes support message injection
– The actual work (and IO) can be deferred
• Example:
– update t1 set k = k + 1 where pk = 5;
– InnoDB follows read-modify-write pattern
– If field “k” is not indexed, TokuDB avoids IO entirely
– An “increment” message is injected
• Current optimizations
– “replace into”, “insert ignore”, “update”, “insert on
duplicate key update”
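The four statement forms are ordinary MySQL, shown here against a hypothetical two-column table. Whether a given statement actually takes the deferred (message) path is an assumption that depends on the TokuDB version and configuration:

```sql
CREATE TABLE t1 (
  pk BIGINT NOT NULL PRIMARY KEY,
  k  BIGINT NOT NULL DEFAULT 0      -- deliberately not indexed
) ENGINE=TokuDB;

-- "replace into": overwrite without caring whether the row exists.
REPLACE INTO t1 (pk, k) VALUES (5, 0);

-- "insert ignore": silently skipped here because pk = 5 already exists.
INSERT IGNORE INTO t1 (pk, k) VALUES (5, 100);

-- "update": the increment from the slide.
UPDATE t1 SET k = k + 1 WHERE pk = 5;

-- "insert on duplicate key update": an upsert; pk = 5 exists, so k is incremented.
INSERT INTO t1 (pk, k) VALUES (5, 1)
  ON DUPLICATE KEY UPDATE k = k + 1;

SELECT k FROM t1 WHERE pk = 5;      -- returns 2
```

None of these statements needs the old value returned to the client, which is what makes it possible to defer the work.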
How can I reduce the total cost of my
servers and storage?
How can I use less storage?
• Compression, compression, compression!
• All IO in TokuDB is compressed
– Reads and writes
– Usually ~5x compression (but I’ve seen 25x or more)
• TokuDB [currently] supports 3 compression algorithms
– lzma = highest compression (and high CPU)
– zlib = high compression (and much less CPU)
– quicklz = medium compression (even less CPU)
– pluggable architecture, lz4 and snappy “in the lab”
But doesn’t InnoDB support compression?
• Yes, but the compression achieved is far lower
– InnoDB compresses 16K blocks, TokuDB is 64K or 128K
– InnoDB requires fixed on-disk size, TokuDB is flexible
*log style data
But doesn’t InnoDB support compression?
• And InnoDB performance is severely impacted by it
– Compression “misses” are costly
*iiBench workload
How do I compress my data in TokuDB?
create table t1 (c1 bigint not null primary key)
engine=tokudb
row_format=[tokudb_lzma | tokudb_zlib | tokudb_quicklz];
NOTE: Compression is not optional in TokuDB; we use
compression to provide performance advantages as well as to
save space.
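Written out in full, one table per algorithm looks like this. The table names are placeholders, and the note on changing the format of an existing table reflects my understanding of TokuDB's behavior rather than anything stated on the slide:

```sql
-- One table per compression algorithm, for side-by-side comparison.
CREATE TABLE t_lzma (c1 BIGINT NOT NULL PRIMARY KEY)
  ENGINE=TokuDB ROW_FORMAT=TOKUDB_LZMA;

CREATE TABLE t_zlib (c1 BIGINT NOT NULL PRIMARY KEY)
  ENGINE=TokuDB ROW_FORMAT=TOKUDB_ZLIB;

CREATE TABLE t_quicklz (c1 BIGINT NOT NULL PRIMARY KEY)
  ENGINE=TokuDB ROW_FORMAT=TOKUDB_QUICKLZ;

-- Switch an existing table to a different algorithm. As far as I know this
-- applies to newly written data; rewriting existing data may need a rebuild.
ALTER TABLE t_zlib ROW_FORMAT=TOKUDB_LZMA;

-- Verify the setting that is in effect.
SHOW CREATE TABLE t_zlib;
```

Loading the same data into each table and comparing file sizes in the data directory gives a direct view of the CPU versus space trade-off.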
How can I perform online schema
changes?
What is an “online” schema change?
My definition
“An online schema change is the ability to add or drop a column
on an existing table without blocking further changes to the
table or requiring substantial server resources (CPU, RAM, IO,
disk) to accomplish the operation.”
P.S., I’d like for it to be instantaneous!
What do blocking schema changes look like?
How have online schema changes evolved?
• MySQL 5.5
– Table is read-only while entire table is re-created
• “Manual” process
– Take slave offline, apply to slave, catch up to master,
switch places, repeat
• MySQL 5.6 (and, roughly, Percona’s pt-online-schema-change tool)
– Table is rebuilt “in the background”
– Changes are captured, and replayed on new table
– Uses significant RAM, CPU, IO, and disk space
• TokuDB
– alter table t1 add column new_column bigint;
– Done!
What online schema changes can TokuDB handle?
• Add column
• Drop column
• Expand column
– integer types
– varchar, char, varbinary
• Index creation
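Each of those operations corresponds to a plain DDL statement. This is a sketch against a hypothetical table; whether a particular statement runs online can depend on the TokuDB version and on details such as character set, and keeping one operation per statement is an assumption on my part:

```sql
CREATE TABLE t1 (
  pk   BIGINT NOT NULL PRIMARY KEY,
  qty  INT,
  name VARCHAR(50)
) ENGINE=TokuDB;

ALTER TABLE t1 ADD COLUMN new_column BIGINT;   -- add column
ALTER TABLE t1 DROP COLUMN new_column;         -- drop column
ALTER TABLE t1 MODIFY qty BIGINT;              -- expand an integer type
ALTER TABLE t1 MODIFY name VARCHAR(100);       -- expand a varchar
CREATE INDEX idx_name ON t1 (name);            -- index creation
```

Timing the same statements on an InnoDB copy of the table is a simple way to run the "add a column" comparison suggested earlier.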
How can I avoid becoming the support
staff for my application?
TokuDB is offered in 2 editions
• Community
– Community support (Google Groups “tokudb-user”)
• Enterprise subscription
– Commercial support
– Wouldn’t you rather be developing another application?
– Extra features
– Hot backup, more on the way
– Access to TokuDB experts
– Input to the product roadmap
Where can I get TokuDB support?
Tokutek: Database Performance Engines
Any Questions?
Download TokuDB at www.tokutek.com/products/downloads
Register for product updates, access to premium content, and
invitations at www.tokutek.com
Join the Conversation

