REAL-TIME ANALYTICS WITH
APACHE FLINK
AND DRUID
Berlin Buzzwords 2016
Jan Graßegger - @gesundkrank
DATA ENGINEER @
OUR DATA
70,000 EVENTS PER SECOND
50 DIMENSIONS
20 METRICS
DRUID
‣ Online Analytical Processing (OLAP) System
‣ Column-oriented
‣ Distributed
‣ Built-in data sharding based on time windows
‣ JSON query language
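To make the "JSON query language" point concrete, here is a minimal sketch of a Druid native timeseries query assembled as a Python dict and serialized to JSON. The query shape (`queryType`, `dataSource`, `granularity`, `intervals`, `filter`, `aggregations`) follows Druid's native query format; the data source `events`, the dimension `top_private_domain` and the metric `impressions` are hypothetical names chosen for illustration, not taken from the talk.

```python
import json


def build_timeseries_query(data_source, interval, metric, domain):
    """Return a Druid native timeseries query as a plain dict."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": "hour",
        # ISO-8601 interval; Druid uses it to pick the time-based shards to scan.
        "intervals": [interval],
        # Selector filter: keep only rows whose dimension equals the given value.
        "filter": {
            "type": "selector",
            "dimension": "top_private_domain",  # hypothetical dimension name
            "value": domain,
        },
        # Sum a numeric metric column per hour.
        "aggregations": [
            {"type": "longSum", "name": metric, "fieldName": metric}
        ],
    }


query = build_timeseries_query(
    data_source="events",  # hypothetical data source
    interval="2016-06-05T00:00:00Z/2016-06-06T00:00:00Z",
    metric="impressions",  # hypothetical metric
    domain="battle.net",
)
print(json.dumps(query, indent=2))
```

The resulting JSON document would be POSTed to a Druid broker; that step is omitted here because it needs a running cluster.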
DATA STRUCTURES
Column
TOP PRIVATE DOMAIN
battle.net
battle.net
noxxic.com
noxxic.com
Strings to Integers
battle.net 5
noxxic.com 6
Encoded column data
[5, 5, 6, 6]
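The strings-to-integers step above can be sketched as a small dictionary encoder. This is an illustration of the idea only, not Druid's implementation; the function name and the `first_id` parameter are assumptions, with `first_id=5` chosen so the output matches the ids on the slide.

```python
def dictionary_encode(values, first_id=0):
    """Map each distinct string to an integer id and encode the column.

    Ids are assigned in order of first appearance, starting at first_id.
    """
    dictionary = {}
    encoded = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = first_id + len(dictionary)
        encoded.append(dictionary[value])
    return dictionary, encoded


column = ["battle.net", "battle.net", "noxxic.com", "noxxic.com"]
dictionary, encoded = dictionary_encode(column, first_id=5)
print(dictionary)  # {'battle.net': 5, 'noxxic.com': 6}
print(encoded)     # [5, 5, 6, 6]
```

Storing small integers instead of repeated strings keeps the column compact and makes equality comparisons cheap.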
DATA STRUCTURES
Column
TOP PRIVATE DOMAIN
battle.net
battle.net
noxxic.com
noxxic.com
Bitmap Indices
battle.net [1, 1, 0, 0]
noxxic.com [0, 0, 1, 1]
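The bitmap indices on this slide can be reproduced with a short sketch: one bitmap per distinct value, with a 1 at every row where the value occurs. Plain Python lists stand in for the compressed bitmaps a real system would use, and the helper names are assumptions for illustration.

```python
def build_bitmap_index(values):
    """Return {value: bitmap}, where bitmap[row] is 1 if the row holds that value."""
    index = {}
    for row, value in enumerate(values):
        bitmap = index.setdefault(value, [0] * len(values))
        bitmap[row] = 1
    return index


def bitmap_or(a, b):
    """Rows matching either filter."""
    return [x | y for x, y in zip(a, b)]


def bitmap_and(a, b):
    """Rows matching both filters."""
    return [x & y for x, y in zip(a, b)]


column = ["battle.net", "battle.net", "noxxic.com", "noxxic.com"]
index = build_bitmap_index(column)
print(index["battle.net"])  # [1, 1, 0, 0]
print(index["noxxic.com"])  # [0, 0, 1, 1]

# A filter such as "domain = noxxic.com" becomes a bitmap lookup.
matching_rows = [row for row, bit in enumerate(index["noxxic.com"]) if bit]
print(matching_rows)  # [2, 3]
```

Combining filters then reduces to bitwise operations: `bitmap_or` for "either domain", `bitmap_and` for conditions that must all hold.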
FIREHOSES
APACHE FLINK
PROCESSING
Kafka → Flink → ? → Druid
TRANQUILITY
‣ Helps ingest real-time data into Druid
‣ Provides adapters for Samza, Spark, Storm and Flink
‣ Standalone HTTP and Kafka applications
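For the standalone HTTP application, a rough sketch of how a client might hand events to Tranquility Server follows. It assumes the server's documented `/v1/post/<dataSource>` endpoint and a JSON array body; the host, port `8200`, the data source `events` and the event fields are assumptions for illustration. The request is only built, not sent, since sending needs a running server.

```python
import json
import urllib.request


def build_tranquility_request(base_url, data_source, events):
    """Build (but do not send) an HTTP POST carrying a JSON array of events."""
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/post/{data_source}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical event: one timestamp, one dimension, one metric.
events = [
    {
        "timestamp": "2016-06-05T12:00:00Z",
        "top_private_domain": "battle.net",
        "impressions": 1,
    }
]
request = build_tranquility_request("http://localhost:8200", "events", events)
print(request.get_method(), request.full_url)

# To actually send it against a running Tranquility Server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```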
PROCESSING
Kafka → Flink → Tranquility → Druid
Replays?
LAMBDA
KAPPA
PROCESSING
Kafka → Flink → Tranquility → Druid
HDFS (for replays)
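The Kappa idea behind this pipeline is that a replay reuses the same processing code as the live path, only with a different source. The sketch below simulates that with in-memory lists: `live_events` stands in for a Kafka topic and `archived_events` for the copy kept on HDFS. All names and the per-domain count are illustrative assumptions, not the job from the talk.

```python
from collections import Counter


def count_events_per_domain(events):
    """The single processing path, used for both live data and replays."""
    counts = Counter()
    for event in events:
        counts[event["top_private_domain"]] += 1
    return counts


# Stand-in for events arriving from Kafka.
live_events = [
    {"top_private_domain": "battle.net"},
    {"top_private_domain": "battle.net"},
    {"top_private_domain": "noxxic.com"},
]

# Stand-in for the same events archived on HDFS.
archived_events = list(live_events)

live_result = count_events_per_domain(live_events)
replay_result = count_events_per_domain(archived_events)

# Replaying the archive through the same code reproduces the live result.
print(replay_result == live_result)  # True
```

With a Lambda architecture the replay would instead run through a separate batch code path that has to be kept consistent with the streaming one.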
RESULTS
‣ Kappa-like architecture that’s able to do replays from HDFS & Kafka
‣ Added Flink sink to Tranquility
‣ “Hacked” replays into Tranquility
‣ Real-Time Reporting
QUESTIONS?
