Query Kafka with SQL
Jove Zhong
Co-Founder and Head of Product, Timeplus
Gang Tao
Co-Founder and CTO, Timeplus
Why, how, and what’s next?
Sep 27, 2023
image credit: teacoffeecup.com
sed 's/coffee/SQL on Kafka/g'
Real-time data is everywhere, at the edge and cloud
46 ZB of data created by billions of IoT devices by 2025
30% of data generated will be real-time by 2025
Only 1% of data is analyzed; streaming data is primarily untapped
Why SQL on Kafka?
Why SQL on Database?
ret = open_database(&(my_stock->inventory_dbp)..);
my_database->get(my_database, NULL, &key, &data, 0);
client.get(key)
update_bins = {'b': u"\ud83d\ude04", 'i': aerospike.null()}
client.put(key, update_bins)
request = new GetItemRequest()
.withKey(key_to_get)
.withTableName(table_name);
SELECT * FROM tab WHERE id='id1'
UPDATE tab SET flag=FALSE WHERE id='id1'
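To make the contrast concrete, here is a minimal sketch that runs the two SQL statements from the slide against an in-memory SQLite database. SQLite is only a stand-in chosen because it ships with Python; the table name `tab` and the columns `id` and `flag` come from the slide, while the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database standing in for any SQL engine (assumption: SQLite,
# picked only because it needs no setup). Sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id TEXT PRIMARY KEY, flag INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", [("id1", 1), ("id2", 1)])

# Read: the statement says what is wanted, not how to fetch it.
before = conn.execute("SELECT * FROM tab WHERE id = 'id1'").fetchall()

# Write: SQLite stores booleans as integers, so FALSE is written as 0 here.
conn.execute("UPDATE tab SET flag = 0 WHERE id = 'id1'")
after = conn.execute("SELECT * FROM tab WHERE id = 'id1'").fetchall()

# The row that did not match the WHERE clause is left alone.
untouched = conn.execute("SELECT flag FROM tab WHERE id = 'id2'").fetchone()[0]
```

The same two statements would work unchanged on most SQL engines, which is the portability argument the slide is making against the engine-specific client calls listed above it.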
Why SQL on Kafka?
Reliable Fast Easy
Powerful Descriptive
FinTech
● Real-time post-trade analytics
● Real-time pricing
DevOps
● Real-time GitHub insights
● Real-time o11y and usage-based pricing
Security Compliance
● SOC2 compliance
● Container vulnerability monitoring
● Monitor Superblocks user activities
● Protect sensitive info in Slack
IoT
● Real-time fleet monitoring
Customer 360
● Auth0 notifications for new signups
● HubSpot custom dashboards/alerts
● Jitsu clickstream analytics
● Real-time Twitter marketing
Misc
● Wildfire monitoring and alerting
● Data-driven parent
Sample Use Cases
source: https://docs.timeplus.com/showcases
How do you like your coffee?
Flink ksqlDB Hazelcast
Druid Pinot
Trino
ClickHouse StarRocks
RisingWave Databend
Streaming
Processor
Streaming
Database
Real-time
Database
FlinkSQL
since 2016
Community ☕☕☕☕
Real-time ☕☕☕
Streaming ☕☕☕
Historical ☕
JOIN ☕☕☕☕
Large-scale ☕☕☕☕
Lightweight ☕☕
Easy to use ☕☕
ksqlDB
since 2019
Community ☕☕☕
Real-time ☕☕☕
Streaming ☕☕☕
Historical ☕☕
JOIN ☕☕☕
Large-scale ☕☕
Lightweight ☕☕
Easy to use ☕☕☕
Hazelcast: distributed computation and storage platform
No dependency on disk storage; it keeps all its operational state in the RAM of the cluster.
Flink ksqlDB Hazelcast
Druid Pinot
Trino
Streaming
Processor
Streaming
Database
Real-time
Database
1. Create a schema JSON file (columns, PKs)
2. Create a table configuration JSON file (streamType=Kafka)
3. docker run .. apachepinot/pinot:latest AddTable 
-schemaFile /tmp/transcript-schema.json 
-tableConfigFile /tmp/transcript-table-realtime.json 
..
-exec
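As a rough sketch of what steps 1 and 2 produce, the snippet below writes an abbreviated schema file and a real-time table configuration file. The key names follow the Apache Pinot stream-ingestion documentation as best recalled and should be checked against the Pinot version in use. The column names, topic name and broker address are assumptions for illustration, and the files go to a temporary directory rather than `/tmp`.

```python
import json
import tempfile
from pathlib import Path

# Step 1: schema (columns and primary key). Column names are illustrative.
schema = {
    "schemaName": "transcript",
    "dimensionFieldSpecs": [
        {"name": "studentID", "dataType": "INT"},
        {"name": "subject", "dataType": "STRING"},
    ],
    "metricFieldSpecs": [{"name": "score", "dataType": "FLOAT"}],
    "dateTimeFieldSpecs": [{
        "name": "timestampInEpoch",
        "dataType": "LONG",
        "format": "1:MILLISECONDS:EPOCH",
        "granularity": "1:MILLISECONDS",
    }],
    "primaryKeyColumns": ["studentID"],
}

# Step 2: real-time table config with streamType=kafka.
# Topic name and broker address are assumptions; a real config needs more keys.
table_config = {
    "tableName": "transcript",
    "tableType": "REALTIME",
    "segmentsConfig": {
        "timeColumnName": "timestampInEpoch",
        "schemaName": "transcript",
        "replicasPerPartition": "1",
    },
    "tenants": {},
    "tableIndexConfig": {
        "loadMode": "MMAP",
        "streamConfigs": {
            "streamType": "kafka",
            "stream.kafka.topic.name": "transcript-topic",
            "stream.kafka.broker.list": "localhost:9092",
            "stream.kafka.decoder.class.name":
                "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
            "stream.kafka.consumer.factory.class.name":
                "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
        },
    },
    "metadata": {},
}

# Write both files; these are what -schemaFile and -tableConfigFile point at.
out_dir = Path(tempfile.mkdtemp())
schema_path = out_dir / "transcript-schema.json"
table_path = out_dir / "transcript-table-realtime.json"
schema_path.write_text(json.dumps(schema, indent=2))
table_path.write_text(json.dumps(table_config, indent=2))
```

The point of the slide stands either way: querying Kafka from Pinot starts with two JSON documents and a CLI call, not with a SQL statement.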
1. Load the druid-kafka-indexing-service extension on both the Overlord and the MiddleManagers
2. Create a supervisor-spec.json file containing the Kafka supervisor spec.
3. curl -X POST -H 'Content-Type: application/json' -d @supervisor-spec.json http://localhost:8090/druid/indexer/v1/supervisor
Add a catalog properties file etc/catalog/kafka.properties for the Kafka connector.
$ ./trino --catalog kafka --schema aSchema
trino:aSchema> SELECT count(*) FROM customer;
ClickHouse StarRocks
Streaming
Processor
Streaming
Database
Real-time
Database
The next-generation streaming database (Kafka + Flink + ClickHouse)
SQL with streaming extension
[Architecture diagram. Labels: Data Ingestion (ingest, append stream); Unified Query Processing Pipeline (read historical, read streaming, query); streaming storage; historical storage; Kafka External Stream]
SELECT * FROM car_live_data
Stream tail
SELECT count(*) FROM car_live_data
Global
aggregation
SELECT window_start, count(*)
FROM tumble(car_live_data, 1m)
GROUP BY window_start
Window
aggregation
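For readers new to tumbling windows, the following is a minimal Python sketch of what the `tumble(car_live_data, 1m)` query computes: each event lands in exactly one fixed-size, non-overlapping window, and events are counted per window. It runs over a finite list for clarity, whereas a streaming engine would emit results continuously. The function name and the sample events are hypothetical.

```python
from collections import Counter

def tumble_counts(events, window_s=60):
    """Count events per tumbling window; returns {window_start: count}."""
    counts = Counter()
    for event_time, _payload in events:
        # Align the event time down to the start of its window.
        window_start = event_time - (event_time % window_s)
        counts[window_start] += 1
    return dict(sorted(counts.items()))

# (event_time_in_seconds, payload) pairs; values are made up.
events = [(0, "a"), (59, "b"), (60, "c"), (125, "d"), (130, "e")]
result = tumble_counts(events)
```

Here `result` maps the window starts 0, 60 and 120 to their event counts, mirroring the `window_start, count(*)` columns of the SQL.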
SELECT cid,
speed_kmh,
lag(speed_kmh) OVER
(PARTITION BY cid) AS last_spd
FROM car_live_data
Sub streams
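The sub-stream query can be pictured as keeping one "previous value" per car. The sketch below is a simplified model, not the engine's implementation: it tracks the last `speed_kmh` seen for each `cid` and returns it next to the current reading. `None` stands in for whatever default the engine returns when there is no earlier row, and the sample rows are invented.

```python
def lag_per_key(rows):
    """Yield (cid, speed_kmh, last_spd), with last_spd tracked per cid."""
    last_seen = {}
    for cid, speed_kmh in rows:
        # Emit the previous speed for this car before recording the new one.
        yield cid, speed_kmh, last_seen.get(cid)
        last_seen[cid] = speed_kmh

# Interleaved readings from two cars; values are made up.
rows = [("c1", 50), ("c2", 80), ("c1", 55), ("c2", 78), ("c1", 60)]
out = list(lag_per_key(rows))
```

Because the state is keyed by `cid`, readings from `c2` never affect the lag reported for `c1`, which is what `PARTITION BY cid` expresses.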
SELECT window_start, count(*)
FROM tumble(car_live_data, 5s)
GROUP BY window_start
EMIT AFTER WATERMARK AND DELAY 2s
Late event
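One way to read `EMIT AFTER WATERMARK AND DELAY 2s` is that a window stays open for a grace period after its end so that slightly late events still count. The sketch below models that idea under stated assumptions: the watermark is simply the largest event time seen so far, and a window is emitted once the watermark reaches window end plus delay. The actual engine semantics may differ in detail, and the function name and event times are hypothetical.

```python
def tumble_with_delay(event_times, window_s=5, delay_s=2):
    """Return (emitted, dropped) for events given in arrival order."""
    open_windows = {}            # window_start -> running count
    emitted = []                 # (window_start, count) in emission order
    dropped = []                 # events that arrived after their window closed
    watermark = float("-inf")    # assumption: max event time seen so far
    for t in event_times:
        start = t - (t % window_s)
        if watermark >= start + window_s + delay_s:
            dropped.append(t)    # too late: the window was already emitted
            continue
        open_windows[start] = open_windows.get(start, 0) + 1
        watermark = max(watermark, t)
        # Emit every open window whose end plus delay has been passed.
        for ws in sorted(open_windows):
            if watermark >= ws + window_s + delay_s:
                emitted.append((ws, open_windows.pop(ws)))
    return emitted, dropped

# Arrival order; 4 is late but within the delay, 2 arrives after emission.
emitted, dropped = tumble_with_delay([1, 3, 6, 4, 7, 2, 12])
```

In this run the event at time 4 arrives after the event at time 6 but is still counted in window 0, while the event at time 2 arrives after window 0 has been emitted and is dropped.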
SELECT *
FROM car_live_data
WHERE
_tp_time > now() - 1d
Time travel
Community ☕☕
Real-time ☕☕☕☕
Streaming ☕☕☕
Historical ☕☕☕☕
JOIN ☕☕☕☕
Large-scale ☕☕
Lightweight ☕☕☕☕
Easy to use ☕☕☕
since 2021
Mocha
one more thing
ClickHouse StarRocks
CREATE TABLE queue2 (
    timestamp UInt64,
    level String,
    message String
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'localhost:9092',
    kafka_topic_list = 'topic',
    kafka_group_name = 'group1',
    kafka_format = 'JSONEachRow',
    kafka_num_consumers = 4;
CREATE ROUTINE LOAD test_db.table102 ON table1
COLUMNS TERMINATED BY ",",
COLUMNS (user_id, user_gender, event_date, event_type)
WHERE event_type = 1
FROM KAFKA
(
    "kafka_broker_list" = "<kafka_broker_host>:<kafka_broker_port>",
    "kafka_topic" = "topic1",
    "property.kafka_default_offsets" = "OFFSET_BEGINNING"
);
ClickHouse features
● table engine and table function
● rich functions and data types
● not 100% ANSI SQL compatible
Streaming
Processor
Streaming
Database
Realtime
Database
RisingWave
Databend
Query Your Streaming Data on Kafka using SQL: Why, How, and What
dozer -c dozer-config.yaml
curl -X POST http://localhost:8080/tout/query --header 'Content-Type: application/json'
Community ☕☕☕
Real-time ☕☕☕
Streaming ☕☕☕
Historical ☕☕
JOIN ☕☕☕
Large-scale ☕☕☕
Lightweight ☕☕☕☕
Easy to use ☕☕☕
Cappuccino
Programming - turn data into insight
human
machine
1GL - machine language
2GL - assembly language
3GL - imperative language
4GL - descriptive language
5GL - intelligent language
data
insight
source
Streaming
Processor
● SQL as data pipeline
● No data storage
● Unbounded real-time
query
ETL / Data Pipeline
ingest
external
Realtime
Database
● Mostly leverages Kafka to ingest data
● Federated search/query
○ ClickHouse Kafka Engine
○ Trino
● Bounded batch query, no streaming query
Historical Report / Ad hoc Analysis
source
Streaming
Database
● Supports Kafka data storage
● Unbounded real-time
query
● combination of
real-time data and
historical data
Hybrid
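The difference between the three categories comes down to bounded versus unbounded queries. The sketch below is a conceptual model only, with hypothetical function names and made-up events: a bounded query scans what is stored and returns once, an unbounded query emits an updated answer per new event, and a hybrid query seeds the streaming answer with the historical result.

```python
from typing import Iterable, Iterator

def bounded_count(rows: Iterable) -> int:
    """Real-time database style: scan stored rows, return one answer."""
    return sum(1 for _ in rows)

def unbounded_count(stream: Iterable) -> Iterator[int]:
    """Streaming processor style: emit an updated answer per new event."""
    total = 0
    for _ in stream:
        total += 1
        yield total

def hybrid_count(historical: Iterable, live: Iterable) -> Iterator[int]:
    """Streaming database style: historical result plus live updates."""
    base = bounded_count(historical)
    for n in unbounded_count(live):
        yield base + n

historical = ["e1", "e2", "e3"]   # already stored events (invented)
live = ["e4", "e5"]               # events still arriving (invented)

batch_answer = bounded_count(historical)
streaming_answers = list(unbounded_count(live))
hybrid_answers = list(hybrid_count(historical, live))
```

With a real stream the unbounded and hybrid generators would never finish; the finite `live` list is used here only so the example terminates.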
Query Kafka with SQL: Open Source + Cloud + Source Available
Flink
ksqlDB Hazelcast Druid Pinot Trino
ClickHouse StarRocks
RisingWave
Databend
Streaming Processor Streaming Database Realtime Database
Community ☕☕☕☕
Real-time ☕☕☕
Streaming ☕☕☕
Historical ☕
JOIN ☕☕☕☕
Large-scale ☕☕☕☕
Lightweight ☕☕
Easy to use ☕☕
Community ☕☕☕
Real-time ☕☕☕
Streaming ☕☕☕
Historical ☕☕
JOIN ☕☕☕
Large-scale ☕☕
Lightweight ☕☕
Easy to use ☕☕☕
Community ☕☕
Real-time ☕☕☕☕
Streaming ☕☕☕
Historical ☕☕☕☕
JOIN ☕☕☕☕
Large-scale ☕☕
Lightweight ☕☕☕☕
Easy to use ☕☕☕
Community ☕☕☕
Real-time ☕☕☕
Streaming ☕☕☕
Historical ☕☕
JOIN ☕☕☕
Large-scale ☕☕☕
Lightweight ☕☕☕☕
Easy to use ☕☕☕
Q+A / Thank you!
Meet us at booth #407
Try Timeplus Proton (Open Source)
Or sign up for a free cloud account
timeplus.com
