ADDING FAST ANALYTICS TO
MYSQL APPLICATIONS
WITH CLICKHOUSE
Robert Hodges and Altinity Engineering Team
Introduction to Presenter
www.altinity.com
Leading software and services
provider for ClickHouse
Major committer and community
sponsor in US and Western Europe
Robert Hodges - Altinity CEO
30+ years on DBMS plus
virtualization and security.
ClickHouse is DBMS #20
Introduction to ClickHouse
SQL optimized for analytics
Runs on bare metal to cloud
Stores data in columns
Parallel and vectorized execution
Scales to many petabytes
Is Open source (Apache 2.0)
Is WAY fast on analytic queries
Introduction to MySQL
Full SQL implementation
Runs on bare metal to cloud
Stores data in rows
Single-threaded, concurrent query
Scales to high transaction loads
Is Open source (GPL V2)
Is WAY fast on updates & point queries
(Diagram: row storage with B-tree indexes)
MySQL Trade-Offs
Queries on large MySQL tables are resource-intensive and inefficient:
● Enormous I/O load due to row organization
● Careful indexing required
● Compression of limited value
● Parallel query limited/unavailable
● Highly dependent on buffer pool size
For tables with more than 100M rows, MySQL analytic queries are very slow
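The I/O point above can be made concrete with a back-of-envelope calculation. The sketch below is only an illustration: the five-column table and its byte widths are assumed values, and it ignores indexes, caching, and compression. It compares how many bytes a full scan reads when a query touches a single column.

```python
# Back-of-envelope estimate; every size below is an assumed, illustrative value.
ROWS = 100_000_000          # the "100M rows" threshold mentioned above
COLUMN_BYTES = {            # hypothetical fixed widths for a five-column table
    "request_id": 8,
    "datetime": 8,
    "date": 4,
    "customer_id": 4,
    "sku_id": 4,
}

def bytes_scanned(columns_needed, layout):
    """Bytes a full scan must read, ignoring indexes, caching and compression."""
    if layout == "row":
        # A row store reads whole rows even if the query uses one column.
        width = sum(COLUMN_BYTES.values())
    elif layout == "column":
        # A column store reads only the columns the query references.
        width = sum(COLUMN_BYTES[c] for c in columns_needed)
    else:
        raise ValueError(f"unknown layout: {layout}")
    return ROWS * width

row_bytes = bytes_scanned(["customer_id"], "row")
col_bytes = bytes_scanned(["customer_id"], "column")
print(f"row store:    {row_bytes / 1e9:.1f} GB")
print(f"column store: {col_bytes / 1e9:.1f} GB")
print(f"ratio:        {row_bytes / col_bytes:.0f}x")
```

With these assumed widths the row layout reads 2.8 GB against 0.4 GB for the column layout, a 7x difference before compression is considered.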
Options for ClickHouse query acceleration
Full migration to ClickHouse
Move big tables, keep dimensions in MySQL
Other combinations are possible...
Accessing MySQL Tables from ClickHouse
MySQL sample tables
Database: repl
Table: traffic (request_id*, datetime, date, customer_id, sku_id)
Table: customer (id*, name)
Table: sku (id*, name)
Accessing MySQL data from ClickHouse
MySQL Database Engine
MySQL Table Function
MySQL Table Engine
MySQL Dictionary
Selecting data from tables in MySQL
-- Select data from all tables.
SELECT
t.datetime, t.date, t.request_id,
c.name customer, s.name sku
FROM traffic t
JOIN customer c ON t.customer_id = c.id
JOIN sku s ON t.sku_id = s.id
LIMIT 10;
Access a MySQL database from ClickHouse
CREATE DATABASE mysql_repl
ENGINE=MySQL(
'127.0.0.1:3306',
'repl',
'root',
'secret');
USE mysql_repl;
SHOW TABLES;
Database engine
Navigating MySQL tables from ClickHouse
Demo
Selecting data from MySQL
SELECT
t.datetime, t.date, t.request_id,
t.name customer, s.name sku
FROM (
SELECT t.* FROM traffic t
JOIN customer c ON t.customer_id = c.id) AS t
JOIN sku s ON t.sku_id = s.id
WHERE customer_id = 5
ORDER BY t.request_id LIMIT 10
Predicate pushed
down to MySQL
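Why pushdown matters can be shown with a small simulation. This is a sketch under stated assumptions, not ClickHouse internals: an in-memory sqlite3 database stands in for the remote MySQL server, the table contents are invented, and "rows transferred" is simply the number of rows the remote side returns.

```python
import sqlite3

# sqlite3 stands in for the remote MySQL server; the table and data are made up.
remote = sqlite3.connect(":memory:")
remote.execute(
    "CREATE TABLE traffic (request_id INTEGER PRIMARY KEY, customer_id INTEGER)"
)
remote.executemany(
    "INSERT INTO traffic VALUES (?, ?)",
    [(i, i % 10) for i in range(1, 1001)],   # 1000 rows, customer ids 0..9
)

def fetch_without_pushdown(customer_id):
    """Pull every row across the 'network', then filter locally."""
    transferred = remote.execute(
        "SELECT request_id, customer_id FROM traffic"
    ).fetchall()
    matching = [row for row in transferred if row[1] == customer_id]
    return matching, len(transferred)

def fetch_with_pushdown(customer_id):
    """Send the WHERE clause to the remote side so only matches are transferred."""
    transferred = remote.execute(
        "SELECT request_id, customer_id FROM traffic WHERE customer_id = ?",
        (customer_id,),
    ).fetchall()
    return transferred, len(transferred)

rows_a, moved_a = fetch_without_pushdown(5)
rows_b, moved_b = fetch_with_pushdown(5)
print("without pushdown:", moved_a, "rows transferred")
print("with pushdown:   ", moved_b, "rows transferred")
```

Both calls return the same matching rows; only the amount of data moved differs.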
ClickHouse performance beats MySQL!
Transferring data from MySQL Engine
-- Create a ClickHouse table from MySQL.
CREATE TABLE traffic AS mysql_repl.traffic
ENGINE = MergeTree
PARTITION BY toYYYYMM(datetime)
ORDER BY (customer_id, date);
-- Pull in MySQL data.
INSERT INTO traffic SELECT *
FROM mysql_repl.traffic;
SELECT count(*) FROM traffic;
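The PARTITION BY and ORDER BY clauses above decide how rows are grouped and sorted on disk. The sketch below imitates that grouping in plain Python. The sample rows are hypothetical, and the real MergeTree engine sorts within data parts and merges them in the background, which this simplification leaves out.

```python
from collections import defaultdict
from datetime import datetime

def to_yyyymm(ts: datetime) -> int:
    """Same arithmetic as toYYYYMM: year * 100 + month."""
    return ts.year * 100 + ts.month

# Hypothetical sample rows: (request_id, datetime, customer_id)
rows = [
    (1, datetime(2019, 1, 30, 23, 59), 5),
    (2, datetime(2019, 2, 1, 0, 1), 3),
    (3, datetime(2019, 2, 14, 12, 0), 5),
    (4, datetime(2019, 1, 2, 8, 30), 3),
]

# Group rows by partition key, as PARTITION BY toYYYYMM(datetime) would.
partitions = defaultdict(list)
for request_id, ts, customer_id in rows:
    partitions[to_yyyymm(ts)].append((customer_id, ts.date(), request_id))

# Within a partition, sort by (customer_id, date); request_id only breaks ties.
for key in partitions:
    partitions[key].sort()

for key in sorted(partitions):
    print(key, [request_id for _, _, request_id in partitions[key]])
```

A query that filters on a month only needs the matching partition, and a filter on customer_id can use the sort order inside it.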
Accessing data using MySQL table function
SELECT t.datetime, t.date, t.request_id, c.name customer
FROM traffic t
JOIN
mysql('127.0.0.1:3306', 'repl', 'customer', 'root',
'secret') c
ON t.customer_id = c.id
WHERE t.customer_id = 5
ORDER BY t.request_id LIMIT 10
Predicate pushdown
works from base table
Accessing data using MySQL table engine
CREATE TABLE mysql_customer (
id Int32,
name String
)
ENGINE = MySQL('127.0.0.1:3306', 'repl', 'customer', 'root',
'secret')
SELECT t.datetime, t.date, t.request_id, c.name customer
FROM traffic t
JOIN mysql_customer c ON t.customer_id = c.id
ORDER BY t.request_id LIMIT 10
Access a MySQL table using a Dictionary
<yandex>
  <dictionary>
    <name>mysql_sku</name>
    <source>
      <mysql>
        <host>localhost</host>
        <port>3306</port>
        <user>root</user>
        <password>********</password>
        <db>repl</db>
        <table>sku</table>
      </mysql>
    </source>
    <layout>
      <hashed/>
    </layout>
    <structure>
      <id>
        <name>id</name>
      </id>
      <attribute>
        <name>name</name>
        <type>String</type>
        <null_value></null_value>
      </attribute>
    </structure>
    <lifetime>0</lifetime>
  </dictionary>
</yandex>
Select local, remote, and dictionary data
SELECT
t.datetime,
t.date,
t.request_id,
c.name AS customer,
dictGetOrDefault('mysql_sku', 'name',
toUInt64(sku_id), 'NOT FOUND') AS sku
FROM traffic AS t
INNER JOIN mysql_customer AS c ON t.customer_id = c.id
ORDER BY t.request_id ASC
LIMIT 10
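The dictionary lookup in this query behaves like a key-value map with a fallback. A minimal sketch of that idea follows, assuming a hashed layout that loads the whole source table into memory. sqlite3 stands in for MySQL, the sku rows are invented, and the helper names are hypothetical rather than part of any ClickHouse API.

```python
import sqlite3

# sqlite3 stands in for MySQL; the sku rows are invented for the example.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sku (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany(
    "INSERT INTO sku VALUES (?, ?)",
    [(1, "widget"), (2, "gadget"), (3, "gizmo")],
)

def load_hashed_dictionary(conn):
    """Read the whole source table once into an in-memory hash map."""
    return {key: name for key, name in conn.execute("SELECT id, name FROM sku")}

def dict_get_or_default(dictionary, key, default):
    """Return the attribute for key, or the caller's default when the key is absent."""
    return dictionary.get(key, default)

mysql_sku = load_hashed_dictionary(source)
traffic_sku_ids = [3, 1, 42]  # 42 has no matching sku row
looked_up = [dict_get_or_default(mysql_sku, k, "NOT FOUND") for k in traffic_sku_ids]
print(looked_up)
```

Each lookup is a hash probe in local memory, so no join against the remote table is needed at query time.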
Figuring out what’s happening on MySQL
-- Enable MySQL query log
set global general_log=1;
(MySQL query log)
16 Connect root@localhost on repl using TCP/IP
16 Query SET NAMES utf8
16 Query SELECT `id`, `name` FROM `repl`.`sku` WHERE `id` = 3
Automatic Propagation of Changes
Replicate MySQL Data to ClickHouse
SELECT from MySQL Table
MySQL Row Replication
Kafka
SELECT Method: Setup
-- Create a tracking table on MySQL side.
CREATE TABLE last_request_id (
id bigint
);
INSERT INTO last_request_id VALUES (-1);
Ensure table is
visible in
ClickHouse
Load initial value
SELECT Method: Change Propagation
INSERT INTO traffic SELECT *
FROM mysql_repl.traffic
WHERE request_id >
(
SELECT max(id)
FROM mysql_repl.last_request_id
);
INSERT INTO mysql_repl.last_request_id SELECT max(request_id)
FROM traffic;
Select new
rows
Update tracking value
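The two statements above form a high-water-mark loop: copy rows above the tracked id, then record the new maximum. The sketch below runs that loop end to end, assuming two in-memory sqlite3 databases as stand-ins for MySQL and ClickHouse and a reduced two-column traffic table. The propagate function is a hypothetical helper, not part of either product.

```python
import sqlite3

# Two in-memory sqlite3 databases stand in for MySQL and ClickHouse.
mysql = sqlite3.connect(":memory:")
clickhouse = sqlite3.connect(":memory:")

mysql.execute(
    "CREATE TABLE traffic (request_id INTEGER PRIMARY KEY, customer_id INTEGER)"
)
mysql.execute("CREATE TABLE last_request_id (id INTEGER)")
mysql.execute("INSERT INTO last_request_id VALUES (-1)")  # initial value from setup
clickhouse.execute("CREATE TABLE traffic (request_id INTEGER, customer_id INTEGER)")

def propagate():
    """Copy rows above the tracked high-water mark, then record the new mark."""
    (mark,) = mysql.execute("SELECT max(id) FROM last_request_id").fetchone()
    new_rows = mysql.execute(
        "SELECT request_id, customer_id FROM traffic WHERE request_id > ?",
        (mark,),
    ).fetchall()
    clickhouse.executemany("INSERT INTO traffic VALUES (?, ?)", new_rows)
    (new_mark,) = clickhouse.execute(
        "SELECT max(request_id) FROM traffic"
    ).fetchone()
    if new_mark is not None:
        mysql.execute("INSERT INTO last_request_id VALUES (?)", (new_mark,))
    return len(new_rows)

mysql.executemany("INSERT INTO traffic VALUES (?, ?)", [(1, 5), (2, 3), (3, 5)])
first = propagate()    # copies rows 1..3
mysql.executemany("INSERT INTO traffic VALUES (?, ?)", [(4, 5), (5, 7)])
second = propagate()   # copies only rows 4..5
third = propagate()    # nothing new to copy
print(first, second, third)
```

Note that a loop like this only sees new rows with increasing request_id values; updates and deletes on existing rows are not captured.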
MySQL Replication Method: Setup
my.cnf:
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 10
max_binlog_size = 100M
binlog-format = row
(1) Ensure MySQL table(s) have primary keys
(2) Enable row replication
(3) Install and run clickhouse-mysql
https://github.com/Altinity/clickhouse-mysql-data-reader
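Before starting a binlog reader it can help to confirm that the settings for step (2) are actually present. The sketch below is one possible check, not part of clickhouse-mysql: it assumes the settings live under a [mysqld] section (the slide omits the section header), and check_row_replication is a hypothetical helper that insists on an explicit binlog-format even where a server default might already be row.

```python
import configparser

# The slide's settings, placed under an assumed [mysqld] section header.
MY_CNF = """
[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 10
max_binlog_size = 100M
binlog-format = row
"""

def normalize(name):
    """MySQL treats dashes and underscores in option names as interchangeable."""
    return name.replace("-", "_").lower()

def check_row_replication(text):
    """Return a list of problems found in the [mysqld] section (empty if none)."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(text)
    if not parser.has_section("mysqld"):
        return ["missing [mysqld] section"]
    options = {normalize(k): (v or "") for k, v in parser.items("mysqld")}
    problems = []
    if not options.get("server_id"):
        problems.append("server-id is not set")
    if "log_bin" not in options:
        problems.append("log_bin is not set, so no binary log is written")
    if options.get("binlog_format", "").lower() != "row":
        problems.append("binlog-format is not explicitly set to row")
    return problems

print(check_row_replication(MY_CNF))
```

An empty list means the three settings the check looks for are present; any other result names what is missing.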
MySQL Replication: Change Propagation
Demo
Kafka Method: Discussion
(Diagram labels: Binlog, Binlog, Binlog, Kafka Queue)
Food for thought
● ClickHouse works best on wide tables
● Consider triggers to add dimension info to base rows on MySQL
● Replication methods are complex to operate
● Use Kafka when many MySQL instances generate data
● Best approach is to migrate large tables completely to ClickHouse
○ No replication to manage
○ MySQL runs faster and requires fewer resources
ClickHouse MySQL Client Support
ClickHouse supports MySQL clients??!
<?xml version="1.0"?>
<yandex>
...
<!-- Enable MySQL wire protocol. -->
<mysql_port>33306</mysql_port>
...
</yandex>
Here’s the proof!
mysql -h127.0.0.1 -P33306 -udefault --password=''
...
mysql> use mysql_repl
mysql> SELECT
-> t.datetime, t.date, t.request_id,
-> t.name customer, s.name sku
-> FROM (
-> SELECT t.* FROM traffic t
-> JOIN customer c ON t.customer_id = c.id) AS t
-> JOIN sku s ON t.sku_id = s.id
-> WHERE customer_id = 5
-> ORDER BY t.request_id LIMIT 10;
...
Wrap-up
Key Takeaways
● ClickHouse has multiple ways to access data in MySQL
● Use replication to pull changes, if you have to
● ClickHouse supports MySQL Protocol so clients can connect directly
● Keep approach as simple as possible for maximum joy
ClickHouse can query MySQL data faster than MySQL!*
*Your mileage may vary
Thank you!
Special Offer:
Contact us for a
1-hour consultation!
Contacts:
info@altinity.com
Visit us at:
https://www.altinity.com
ClickHouse-MySQL:
https://github.com/Altinity/clickhouse-mysql-data-reader
Free Consultation:
https://blog.altinity.com/offer
