Unlocking the Power of Apache Flink:
An Introduction in 4 Acts
David Anderson
Software Practice Lead, Confluent
Apache Flink Committer
Today’s consumers expect real-time services
Real-time services rely on stream processing
[Diagram: real-time data (a sale, a shipment, a trade, a customer experience) flows through real-time stream processing to rich front-end customer experiences and real-time backend operations]
Driving business value with Apache Flink

Real-time analytics
Continuously produce and update results, which are displayed and delivered to users as real-time data streams are consumed
● Ad/campaign performance
● Content performance
● Quality monitoring of telco networks
● Usage metering and billing

Event-driven applications
Recognize patterns and react to incoming events by triggering computations, state updates, or external actions
● Fraud detection
● Anomaly detection
● Business process monitoring
● Geo-fencing

Streaming data pipelines
Real-time data pipelines that continuously ingest, enrich, and transform data streams, loading them into destination systems for timely action (vs. batch processing)
● Continuous ETL
● Real-time search index building
● ML pipelines
● Data lake ingestion
Developers are choosing Flink because of its performance and rich feature set
● Scalability & high performance: Flink supports stream processing workloads at tremendous scale
● Language flexibility: Flink supports Java, Python, & SQL, enabling developers to work in their language of choice
● Unified processing: Flink supports stream processing, batch processing, and ad-hoc analytics through one technology
● Fault tolerance & high availability: Flink's checkpointing mechanism provides exactly-once guarantees automatically
Flink is a top 5 Apache project and has a very active community
@yourtwitterhandle | developer.confluent.io
The four cornerstones on which Flink is built
● Streaming
● State
● Time
● Snapshots
Streaming
● A stream is a sequence of events
● Business data is always a stream: bounded or unbounded
● For Flink, batch processing is just a special case in the runtime
[Diagram: a timeline (past, now, future) showing a bounded stream and an unbounded stream]
Real-time services rely on stream processing
[Diagram: sources feed real-time stream processing, which writes to sinks; sources and sinks include Kafka, databases, key/value stores, files, and apps]
The Job Graph (or Topology)
[Diagram: a job graph whose nodes are labeled OPERATOR and whose edges are labeled CONNECTION]
Stream processing
• Parallel
• Forward
• Repartition
• Rebalance
[Diagram, built up over four slides: events arrive at a parallel SOURCE grouped by shape, pass through a FILTER, are regrouped by color, and are counted per color by COUNT]
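To make the forward and repartition connections concrete, here is a minimal plain-Python sketch of the pipeline in the diagram. It is an illustration under stated assumptions, not Flink code: the parallelism of 2, the event dictionaries, and the helper names are all hypothetical, and `zlib.crc32` merely stands in for whatever hash the real runtime uses to route keys.

```python
import zlib
from collections import Counter

PARALLELISM = 2  # assumed number of parallel instances of the COUNT operator


def keep(event):
    """FILTER: a forward connection, so each event stays in its own pipeline."""
    return event["color"] != "orange"


def target_subtask(color, parallelism=PARALLELISM):
    """Repartition: a stable hash of the key selects the downstream subtask."""
    return zlib.crc32(color.encode("utf-8")) % parallelism


def run(source_partitions):
    """Filter every source partition, then route by color to one counter each."""
    counters = [Counter() for _ in range(PARALLELISM)]
    for partition in source_partitions:
        for event in partition:
            if keep(event):
                counters[target_subtask(event["color"])][event["color"]] += 1
    return counters


# Hypothetical input: the source partitions are grouped by shape, not by color.
source_partitions = [
    [{"shape": "circle", "color": "blue"},
     {"shape": "circle", "color": "orange"},
     {"shape": "circle", "color": "green"}],
    [{"shape": "square", "color": "blue"},
     {"shape": "square", "color": "green"},
     {"shape": "square", "color": "blue"}],
]
counters = run(source_partitions)
for index, counter in enumerate(counters):
    print(f"COUNT subtask {index}: {dict(counter)}")
```

Because routing depends only on the key, every event of a given color reaches the same COUNT subtask, which is what lets each subtask keep its counts locally.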
Stream processing with SQL
INSERT INTO results
SELECT color, COUNT(*)
FROM events
WHERE color <> 'orange'
GROUP BY color;
[Diagram, built up over five slides: events → WHERE color <> 'orange' → GROUP BY color → COUNT → results]
Flink’s APIs
[Diagram: a layered stack showing the Table / SQL API with its Optimizer / Planner, the DataStream API, and the Low-Level Stream Operator API, all running on the Apache Flink Runtime]
Runtime Architecture
Flink supports streaming and batch

Streaming
● Bounded or unbounded streams
● Entire pipeline must always be running
● Input must be processed as it arrives
● Results are reported as they become ready
● Failure recovery resumes from a recent snapshot
● Flink guarantees effectively exactly-once results despite out-of-order data and restarts due to failures, etc.

Batch
● Only bounded streams
● Execution proceeds in stages, running as needed
● Input may be pre-sorted by time and key
● Results are reported at the end of the job
● Failure recovery does a reset and full restart
● Effectively exactly-once guarantees are more straightforward
Stateful stream processing with Flink SQL
INSERT INTO results
SELECT color, COUNT(*)
FROM events
WHERE color <> 'orange'
GROUP BY color;
● Counting requires state: COUNT has to remember the current count for each color
[Diagram: events → WHERE color <> 'orange' → GROUP BY color → COUNT → results]
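Why counting requires state can be sketched in a few lines of plain Python. This is a hedged illustration rather than Flink's implementation: the `CountByColor` class, its `process` method, and the sample events are hypothetical names, and the dictionary stands in for Flink's keyed state.

```python
from collections import defaultdict


class CountByColor:
    """Hypothetical stand-in for the COUNT operator of the query above."""

    def __init__(self):
        # Keyed state: one counter per color, kept between events.
        self.state = defaultdict(int)

    def process(self, event):
        """Apply the WHERE clause, update state, and emit the updated row."""
        color = event["color"]
        if color == "orange":
            return None  # filtered out, state untouched
        self.state[color] += 1
        return (color, self.state[color])


events = [{"color": c} for c in ["blue", "orange", "green", "blue"]]
operator = CountByColor()

# On an unbounded stream there is no final answer, so each event yields an update.
updates = [row for row in map(operator.process, events) if row is not None]
# A sink that keeps the latest row per color ends up with the current counts.
results = dict(updates)

print(updates)
print(results)
```

The filter needs nothing but the event in front of it, whereas the count for the second blue event can only be produced because the first one was remembered.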
State
• Local
• Fast
• Fault tolerant
Time
• Synchronize
• Wait
• Timeout
Event time: when the event was created at its original source (e.g., 09:05:44)
Processing time: when the event is being processed (e.g., 09:08:01). This time varies between applications.
Out-of-order event streams
● Streams are (roughly) ordered by time
[Diagram: a stream of timestamped events, with events stamped 10:10 and 10:14 shown arriving out of order]
Coping with out of order events
[Diagram, built up over three slides: a stream of timestamped events, with one marked "This event will be read next" and the ones behind it marked "These events follow"]
Imagine a window counting events for the hour ending at 2:00. How long should this window wait before producing its results?
Watermarks measure progress of event time
● This watermark has been generated by assuming that the stream is at most 5 minutes out-of-order
● The watermark is the max timestamp seen so far, minus this out-of-orderness estimate: 1:50 - 5 minutes = 1:45
● A watermark is an assertion about the completeness of the stream: now this stream is complete up to 1:45
Imagine a window counting events for the hour ending at 2:00. How long should this window wait before producing its results?
It should wait for a watermark with a timestamp of at least 2:00.
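The watermark rule above (max timestamp seen so far, minus the out-of-orderness estimate) and the window that waits for it can be sketched in plain Python. This is a simplified model, not Flink's API: the class and function names and the sample arrival times are hypothetical, timestamps are minutes since midnight, and a watermark is computed after every event, whereas a real job typically emits watermarks periodically.

```python
def minutes(hhmm):
    """Convert an 'H:MM' clock string to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


class BoundedOutOfOrdernessWatermarks:
    """Watermark = max timestamp seen so far - out-of-orderness estimate."""

    def __init__(self, max_out_of_orderness):
        self.max_out_of_orderness = max_out_of_orderness
        self.max_timestamp = None

    def on_event(self, timestamp):
        if self.max_timestamp is None or timestamp > self.max_timestamp:
            self.max_timestamp = timestamp
        return self.max_timestamp - self.max_out_of_orderness


def count_window(timestamps, window_start, window_end, max_out_of_orderness):
    """Count events in [window_start, window_end) until the watermark passes the end.

    Returns (count, watermark that fired the window), or None if the
    watermark never reaches window_end.
    """
    generator = BoundedOutOfOrdernessWatermarks(max_out_of_orderness)
    count = 0
    for timestamp in timestamps:
        if window_start <= timestamp < window_end:
            count += 1
        watermark = generator.on_event(timestamp)
        if watermark >= window_end:
            return count, watermark
    return None


# Hypothetical arrival order; 1:47 and 1:59 arrive after later-stamped events.
arrivals = [minutes(t) for t in
            ["1:10", "1:42", "1:50", "1:47", "1:58", "2:03", "1:59", "2:06"]]
result = count_window(arrivals, minutes("1:00"), minutes("2:00"),
                      max_out_of_orderness=5)
print(result)
```

With these arrivals the window does not fire when 2:03 shows up (the watermark is only 1:58), so the straggler stamped 1:59 is still counted; it fires once 2:06 pushes the watermark to 2:01.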
What are watermarks for?
They make things happen when the time is right.
The idle stream problem
● Streams that are idle do not advance the watermark
● This prevents windows from producing results
Solutions
● Balance the partitions so none are empty or idle, or
● Send keep-alive events, or
● Configure the watermarking to use idleness detection
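A short plain-Python sketch shows why one idle partition can stall everything, and what idleness detection changes. It assumes, beyond what the slides state, that an operator reading several partitions takes the minimum of their watermarks; the function names, the 30-second timeout, and the sample numbers are hypothetical.

```python
import math


def combined_watermark(watermarks, idle=()):
    """Minimum watermark across all partitions that are not marked idle."""
    active = [wm for index, wm in enumerate(watermarks) if index not in idle]
    return min(active, default=-math.inf)


def detect_idle(last_event_seen_at, now, idle_timeout):
    """Mark partitions that have delivered no event for idle_timeout seconds."""
    return {index for index, seen in enumerate(last_event_seen_at)
            if now - seen >= idle_timeout}


WINDOW_END = 120          # event-time minutes since midnight, i.e. 2:00
watermarks = [125, 90]    # partition 0 has reached 2:05; partition 1 is stuck at 1:30
last_event_seen_at = [58, 0]  # processing-time seconds when each last delivered an event

# Without idleness detection the quiet partition holds the watermark back.
stalled = combined_watermark(watermarks)
print(stalled >= WINDOW_END)

# With idleness detection the quiet partition is ignored after the timeout.
idle = detect_idle(last_event_seen_at, now=60, idle_timeout=30)
advanced = combined_watermark(watermarks, idle=idle)
print(advanced >= WINDOW_END)
```

In the DataStream API, idleness detection is configured on the watermark strategy with `withIdleness`; the sketch only models the effect.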
Watermarks
● Not needed for applications that only use wall-clock (processing) time
● Not needed for batch processing
● Are needed for triggering actions based on event-time, e.g., closing a window
● Are generated based on an assumption of how out of order the data might be
● Provide control over the tradeoff between completeness and latency
● Flink SQL drops late events; the DataStream API offers more control
● Allow for consistent, reproducible results
● Potentially idle sources require special attention
A checkpoint is an automatic snapshot created by Flink, primarily for the purpose of failure recovery
A savepoint is a manual snapshot created for some operational purpose (e.g., a stateful upgrade)
Snapshots
[Diagram, built up over six slides: the query plan events → FILTER → GROUP BY color → COUNT → results runs as the job Source → Filter → Count by color → Sink, with two parallel instances]
What each stage contributes to the snapshot:
● Source: offsets for some partitions (one instance), offsets for other partitions (the other instance)
● Filter: nothing (shown as a blank entry)
● Count by color: counters for some colors, counters for other colors
● Sink: transaction ID
Taking a snapshot does NOT stop the world
Checkpoints and savepoints are created asynchronously, while the job continues to process events and produce results
Recovery
Because these are self-consistent, global snapshots
● Flink provides (effectively) exactly-once guarantees
● Recovery involves restarting the entire job from the most recent checkpoint
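The recovery idea can be sketched in plain Python: a snapshot pairs the source offset with the counters, so restoring both and replaying from that offset gives the same counts as a run with no failure. This is a deliberately simplified, single-threaded model with hypothetical names; it takes snapshots synchronously, unlike Flink, and it leaves out the sink's transactions.

```python
import copy
from collections import Counter


def run(events, checkpoint_every, fail_at=None):
    """Count non-orange colors, checkpointing every few events.

    If fail_at is given, simulate one crash just before that offset and
    recover from the most recent checkpoint.
    """
    counters = Counter()
    offset = 0
    checkpoint = {"offset": 0, "counters": Counter()}
    failed = False
    while offset < len(events):
        if fail_at is not None and offset == fail_at and not failed:
            failed = True
            # Recovery: restore the source offset and the counters together.
            offset = checkpoint["offset"]
            counters = copy.deepcopy(checkpoint["counters"])
            continue
        color = events[offset]
        if color != "orange":
            counters[color] += 1
        offset += 1
        if offset % checkpoint_every == 0:
            # A self-consistent snapshot: the offset and the state match.
            checkpoint = {"offset": offset, "counters": copy.deepcopy(counters)}
    return counters


events = ["blue", "green", "orange", "blue", "green", "blue", "blue"]
without_failure = run(events, checkpoint_every=3)
with_failure = run(events, checkpoint_every=3, fail_at=5)
print(without_failure == with_failure)
```

Events 3 and 4 are processed twice in the failing run, yet they are counted once in the final state, because the counters were rolled back to the point the offset was rolled back to. That is the sense of "effectively" exactly-once.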
Wrap-up

Streaming
Unfamiliar to many developers, but ultimately straightforward

State
Delightfully simple
● local
● key/value
● single-threaded

Event time and watermarks
Watermarks encapsulate something complex in one place: the sources
● how out-of-order?
● can it be idle?

State snapshots for recovery
Transparent to application developers
Where is the Flink community?
To subscribe to the mailing lists, or get an invite to Slack, see https://flink.apache.org/community/
Your Apache Flink® journey begins here: developer.confluent.io
