Distributed Stream Processing with Apache Kafka
TODO: Apps vs Analytics
• Twitter:
• @jaykreps
• @confluentinc
• @apachekafka
• http://confluent.io/blog
Download Apache Kafka
& Confluent Platform
confluent.io/download


Editor's Notes

  • #2: TODO: fix title. Introduce self. What is stream processing? Brief intro to Kafka. Kafka Streams.
  • #3: Exciting! Important!
  • #4: Doesn’t mean you drop everything on the floor if anything slows down. Streaming algorithms: online space. Can compute median.
  • #5: About how inputs are translated into outputs (very fundamental)
  • #6: HTTP/REST. All databases. Run all the time. Each request is totally independent; no real ordering. Can fail individual requests if you want. Very simple! About the future!
  • #7: “Ed, the MapReduce job never finishes if you watch it like that.” Job kicks off at a certain time. Cron! Processes all the input, produces all the output. Data is usually static. Hadoop! DWH, JCL. Archaic but powerful. Can do analytics! Complex algorithms! Also can be really efficient! Inherently high latency.
  • #8: Generalizes request/response and batch. Program takes some inputs and produces some outputs. Could be all inputs. Could be one at a time. Runs continuously forever!
  • #9: Companies == streams. What a retail store does. Streams. Retail: sales, shipments and logistics, pricing, re-ordering, analytics, fraud and theft.
  • #10: Quick run-through of the features in Kafka.
  • #11: Logs. Distributed. Fault-tolerant.
  • #12: Change to “Logs unify batch and stream processing”
  • #15: Can’t just scale storage; need to scale processing. Important: order.
  • #16: Streaming platform is the successor to messaging. Stream processing is how you build asynchronous services. That is going to be the key to solving my pipeline sprawl problem. Instead of having N^2 different pipelines, one for each pair of systems, I am going to have a central place that hosts all these event streams: the streaming platform. This is a central way that all these systems and applications can plug in to get the streams they need. So I can capture streams from databases, and feed them into DWH, Hadoop, monitoring and analytics systems. The key advantage is that there is a single integration point for each thing that wants data. Now obviously to make this work I’m going to need to ensure I have met the reliability, scalability, and latency guarantees for each of these systems.
  • #17: Current state
  • #23: OpenGL Triangle
  • #29: Add screenshot example
  • #30: Add screenshot example
  • #32: TODO: Summarize
  • #33: Change to “Logs make reprocessing easy”
  • #34: Time is hard. Need a model of time. Request/response ignores the issue; you just set an aggressive timeout. Batch usually solves the issue by just freezing all data for the day. Stream processing needs to actually address the issue.
  • #39: Kafka Streams: manages the set of live processors and routes data to them; uses Kafka’s group management facility. External framework: start and restart processes, package processes, deploy code.
  • #40: DBs handle tables. Stream processors handle streams.
  • #41: Companies == streams. What a retail store does. Streams. Retail: sales, shipments and logistics, pricing, re-ordering, analytics, fraud and theft.
  • #43: But…no notion of time
  • #62: Also: other talks, Kafka Summit, streaming data hackathon. Stop by the Confluent booth and ask your questions about Kafka or stream processing. Get a Kafka t-shirt and sticker. We’re also giving away a few books: the early release of Kafka: The Definitive Guide, Making Sense of Stream Processing, and I Heart Logs. Meet the authors and get your book signed. We also want to invite you to participate in the Stream Data Hackathon in San Francisco on the evening of April 25, the day before Kafka Summit. You might be interested in some of the other Confluent talks. If you missed it you’ll have access to the video recording.
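
Note #4 says a streaming algorithm can compute a median. The sketch below is one way to do that online. It is not taken from the slides. The class name `RunningMedian` and the two-heap approach are assumptions chosen for illustration. This version is exact but keeps every value it has seen, so memory grows with the stream; a bounded-memory variant would need an approximate method instead.

```python
import heapq


class RunningMedian:
    """Exact median of all values seen so far, updated one value at a time.

    Hypothetical helper for illustration; not part of Kafka or Kafka Streams.
    """

    def __init__(self) -> None:
        self._low = []   # max-heap (stored negated): the smaller half
        self._high = []  # min-heap: the larger half

    def add(self, x: float) -> None:
        # Route the new value through the lower half so that its largest
        # element moves to the upper half, then rebalance the sizes.
        heapq.heappush(self._low, -x)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self) -> float:
        if not self._low:
            raise ValueError("no values seen yet")
        if len(self._low) > len(self._high):
            return float(-self._low[0])
        return (-self._low[0] + self._high[0]) / 2.0


if __name__ == "__main__":
    rm = RunningMedian()
    for value in (5, 1, 9):
        rm.add(value)
        print(value, rm.median())
```

Each `add` costs O(log n), and `median` can be read at any point without waiting for the input to end, which is the property the note is pointing at.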
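
Note #16 contrasts "N^2 different pipelines" with a single integration point per system. The arithmetic can be sketched as below. The function names are hypothetical. The point-to-point count assumes one directed pipeline for every ordered pair of systems, which is N(N-1) and so grows on the order of N^2.

```python
def point_to_point_pipelines(n_systems: int) -> int:
    """Directed pipelines needed if every system feeds every other one directly."""
    return n_systems * (n_systems - 1)


def hub_connections(n_systems: int) -> int:
    """Connections needed if each system plugs into one central streaming platform."""
    return n_systems


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, point_to_point_pipelines(n), hub_connections(n))
```

Under these assumptions, ten systems need 90 point-to-point pipelines but only 10 connections to a central platform.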