seven-ways-to-run-flink-on-aws.pdf
whoami
Software Engineer with
Focus on
Data-Intensive Applications
Site Reliability Engineering

Apache Flink
apache/flink
Stateful Computations over Data Streams
Deployment
YARN
Mesos
Kubernetes
Stream Processor
1. "Functional" DSL
2. Logical Execution Graph
3. Parallel Execution Graph
from("json-events")
    .map(MyPojo::fromJson)
    .keyBy(MyPojo::getUserId)
    .timeWindow(SECONDS.of(10))
    .count()
    .to("events-per-user")
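To make the semantics of the pipeline above concrete, here is a minimal plain-Java sketch of what "key by user, 10-second window, count" computes. It is not Flink code: `WindowedCountSketch`, `Event`, and `countPerUserAndWindow` are hypothetical names, and the sketch assumes a finite list of events with non-negative timestamps and tumbling windows aligned to multiples of 10 seconds. It ignores everything a stream processor adds on top, such as unbounded input, parallelism, and fault tolerance.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of keyBy + 10-second tumbling window + count.
// All names here are hypothetical; none of this is Flink API.
class WindowedCountSketch {

    // Hypothetical stand-in for the MyPojo event type on the slide.
    static final class Event {
        final String userId;
        final long timestampMillis;

        Event(String userId, long timestampMillis) {
            this.userId = userId;
            this.timestampMillis = timestampMillis;
        }
    }

    static final long WINDOW_MILLIS = 10_000L;

    // Counts events per (user, window); keys look like "userId@windowStartMillis".
    static Map<String, Long> countPerUserAndWindow(List<Event> events) {
        Map<String, Long> counts = new TreeMap<>();
        for (Event event : events) {
            // Tumbling window: round the timestamp down to a multiple of the window size.
            long windowStart = event.timestampMillis - (event.timestampMillis % WINDOW_MILLIS);
            counts.merge(event.userId + "@" + windowStart, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
            new Event("alice", 1_000L),
            new Event("alice", 9_999L),
            new Event("bob", 4_000L),
            new Event("alice", 10_000L));
        countPerUserAndWindow(events)
            .forEach((key, count) -> System.out.println(key + " -> " + count));
    }
}
```

In the sample data, the first two events for alice fall into the window starting at 0 and the third into the window starting at 10000, so they are counted separately.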
Flink stack
Flink cluster
Diagram: one Job Manager and three Task Managers.
Use case
The recommended way
EMR
https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/aws.html
EMR Architecture
Diagram: the Job Manager and each of the three Task Managers run in their own YARN container.
👍
Recommended
Battle-tested
👎
Need to manage Hadoop cluster
Flink version lag (currently 1.8.0)
The do-it-yourself way
EC2
EC2 Architecture
Diagram: one Job Manager and three Task Managers.
Reproducibility
ansible/ansible
hashicorp/packer
👍
Fully customizable
👎
Need to build everything on your own
Need to manage EC2 instances
The containerized way
ECS + EC2
ECS Architecture (with EC2)
Diagram: the Job Manager runs as one ECS task in its own ECS service; the three Task Managers run as three ECS tasks in a second ECS service.
👍
Useful model
👎
Proprietary primitives
Not common, little community support
Need to manage EC2 instances
The CaaS way
ECS + Fargate
ECS Architecture (with Fargate)
Diagram: the Job Manager runs as one ECS task in its own ECS service; the three Task Managers run as three ECS tasks in a second ECS service.
👍
Useful model
No instance-level management required
👎
Proprietary primitives
Not common, little community support
The cloud-native way
EKS
Kubernetes
… is an open source system for managing containerized applications across multiple hosts; providing basic mechanisms for deployment, maintenance, and scaling of applications.
EKS Architecture
Diagram: the Job Manager and each of the three Task Managers run in their own Kubernetes pod, managed by Kubernetes deployments.
weaveworks/eksctl
lyft/flinkk8soperator
Ververica Platform
👍
Powerful primitives
De-facto standard
Strong ecosystem
👎
Need to manage Kubernetes cluster
The hosted way
Kinesis Data Analytics for Java
Kinesis Data Analytics for Java Architecture
👍
Fully hosted
State migration feature
👎
Restricted to specific use case
Failure scenarios
Flink version lag (currently 1.6)
The serverless way
Lambda
Lambda Architecture
👍
Fun experiment
👎
Not a good fit
Recap
EMR: The recommended way
EC2: The do-it-yourself way
ECS + EC2: The containerized way
ECS + Fargate: The CaaS way
EKS: The cloud-native way
Kinesis Data Analytics for Java: The hosted way
Lambda: The serverless way
What's next?
State management & fault tolerance
  Checkpointing
  Upgrading Applications and Flink Versions
  Job Manager High Availability
Monitoring & alerting
  mbode/flink-prometheus-example
  Monitoring Flink with Prometheus (Flink Forward 2018)
Continuous Deployment
Infrastructure as Code
  hashicorp/terraform
Tuning
  Data Types & Serialization
  Tuning Checkpoints and Large State
  Improving throughput and latency with Flink's network stack (Flink Forward 2018)
maximilian.bode@tng.tech @mxpbode
maximilianbo.de mbode
