Processing IoT Data with MQTT and Apache Kafka
Kafka-Native End-to-End IoT Data Integration and Processing
Kai Waehner, Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.confluent.io
www.kai-waehner.de
Agenda
1) IoT Use Cases
2) MQTT Standard
3) Apache Kafka Ecosystem
4) Kafka-Native End-to-End IoT Integration Architecture(s)
5) IoT Data Processing
Connected Intelligence (Cars, Machines, Robots, …)
Smart Cities
Smart Retail and Customer 360
Intelligent Applications (Early Part Scrapping, Predictive Maintenance, …)
Architecture (High Level)
[Diagram: Devices, IoT Gateway, a "?" marking the open integration question, Streaming Platform (Kafka Brokers, Connect w/ MQTT connector), and consumers for Device Tracking (Real Time), Predictive Maintenance (Near Real Time), and Log Analytics (Batch); zones labeled Edge and Data Center / Cloud.]
How to integrate?
Poll
Which IoT scenarios do you see in your company?
1) IoT ingestion into analytics cluster
2) Bi-directional communication to control IoT devices (e.g. connected cars, fleet management, logistics)
3) Real time stream processing using machine learning (e.g. predictive maintenance, early part scrapping)
4) No IoT scenarios today; maybe in the future
MQTT - Publish / subscribe messaging protocol
• Built on top of TCP/IP for constrained devices and unreliable networks
• Many (open source) broker implementations
• Many client libraries
• IoT-specific features for poor network connectivity (e.g. Keep Alive, Last Will and Testament)
• Widely used (mostly IoT, but also web and mobile apps via MQTT over WebSockets)
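To make "built for constrained devices" concrete, the following is a minimal sketch of how small a publish message is on the wire. It encodes an MQTT 3.1.1 PUBLISH packet with QoS 0 by hand; real devices would use a client library instead. The device id `car42` and the payload are made-up values that follow the deck's `[deviceid]/car` topic pattern.

```python
def encode_remaining_length(n: int) -> bytes:
    """Encode a length with MQTT's variable-length scheme (7 bits per byte)."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        digit, n = n % 128, n // 128
        if n > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        out.append(digit)
        if n == 0:
            return bytes(out)


def encode_publish(topic: str, payload: bytes) -> bytes:
    """Build an MQTT 3.1.1 PUBLISH packet with QoS 0, no DUP, no RETAIN."""
    topic_bytes = topic.encode("utf-8")
    # Variable header: topic name prefixed with its 2-byte big-endian length.
    # QoS 0 carries no packet identifier.
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    body = variable_header + payload
    # Fixed header: packet type 3 (PUBLISH) in the high nibble, flags all zero.
    fixed_header = bytes([0x30]) + encode_remaining_length(len(body))
    return fixed_header + body


# Hypothetical device id and reading.
packet = encode_publish("car42/car", b"21.5")
print(len(packet), packet.hex())
```

The protocol overhead here is two bytes of fixed header plus two bytes of topic length; everything else is the topic name and the payload itself.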
MQTT Architecture (no scale)
[Diagram: a single MQTT server (MQTT Server 1), topic: [deviceid]/car, and Processor 1, Processor 2, ...]
MQTT Architecture (clustering depends on broker implementation)
[Diagram: an MQTT Server Coordinator with MQTT Servers 1 to 4, topic: [deviceid]/car, and Processors 1 to 4.]
MQTT Architecture (clustering depends on broker implementation)
[Diagram: a Load Balancer with MQTT Servers 1 to 4, topic: [deviceid]/car, and Processors 1 to 4.]
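The topic pattern `[deviceid]/car` in these diagrams implies that a processor subscribes with a wildcard rather than one subscription per device. Below is a simplified sketch of MQTT topic-filter matching, where `+` matches exactly one level and `#` matches all remaining levels. It ignores the special handling of topics that start with `$`, and the device ids are made up.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic name matches a subscription filter.

    Simplified: '+' matches exactly one level, '#' matches the rest
    (and must be the last level of the filter).
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the final filter level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# A processor interested in all cars subscribes to '+/car'.
for name in ["car42/car", "car7/car", "car42/truck", "car42/car/engine"]:
    print(name, topic_matches("+/car", name))
```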
MQTT Trade-Offs
Pros
• Lightweight
• Simple API
• Built for poor connectivity / high latency scenarios
• Many client connections (tens of thousands per MQTT server)
Cons
• Queuing, not stream processing
• Can’t handle usage surges (no buffering)
• No high scalability
• Very asynchronous processing (devices are often offline for a long time)
• No good integration with the rest of the enterprise
• No reprocessing of events
Apache Kafka – The Rise of a Streaming Platform
[Diagram: The Log, Connectors, Producer, Consumer, Streaming Engine]
Apache Kafka == Distributed Commit Log with Replication
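As a rough illustration of the commit-log idea, here is a toy, single-partition, in-memory log. It is not Kafka's implementation and it leaves out replication entirely; it only shows that records are addressed by offset, that reading does not delete them, and that each consumer can rewind to reprocess events. All class and field names are invented for this sketch.

```python
class CommitLog:
    """Toy single-partition log: append-only records addressed by offset."""

    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset: int, max_records: int = 10) -> list:
        return self._records[offset:offset + max_records]


class Consumer:
    """Each consumer tracks its own position; reading never deletes data."""

    def __init__(self, log: CommitLog):
        self.log = log
        self.offset = 0

    def poll(self, max_records: int = 10) -> list:
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def seek(self, offset: int) -> None:
        self.offset = offset  # rewind to reprocess earlier events


log = CommitLog()
for reading in [20.1, 20.4, 35.9]:
    log.append({"sensor": "car42", "temp": reading})

tracking = Consumer(log)
analytics = Consumer(log)

first_pass = tracking.poll()
analytics_batch = analytics.poll(max_records=2)  # slower consumer, own pace

tracking.seek(0)                                 # reprocess from the start
second_pass = tracking.poll()
```

Because the two consumers keep independent offsets, a slow batch job and a real-time job can read the same data without interfering with each other.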
Kafka Trade-Offs (from IoT perspective)
Pros
• Stream processing, not just queuing
• High throughput
• Large scale
• High availability
• Long term storage and buffering
• Reprocessing of events
• Good integration with the rest of the enterprise
Cons
• Not built for tens of thousands of connections
• Requires stable network and good infrastructure
• No IoT-specific features like Keep Alive or Last Will and Testament
(De facto) Standards for Processing IoT Data
MQTT + Apache Kafka = A Match Made In Heaven
Architecture (High Level)
[Diagram: Devices, Gateway, a "?" marking the open integration question, Streaming Platform (Kafka Brokers, Connect w/ MQTT connector), and consumers for Device Tracking (Real Time), Predictive Maintenance (Near Real Time), and Log Analytics (Batch); zones labeled Edge and Data Center / Cloud.]
How to integrate?
End-to-End Integration from MQTT to Apache Kafka
[Diagram: MQTT Server Coordinator with MQTT Servers 1 to 4 (topic: [deviceid]/car); a "Kafka Integration" step carries the sensor data into the Kafka Cluster for stream processing.]
Design Questions for End-to-End Integration
• How much throughput?
• Ingest-only vs. processing of data?
• Analytical vs. operational deployments?
• Device publish only vs. device pub/sub?
• Pull vs. Push?
• Low-level client vs. integration framework vs. proxy?
• Integration patterns needed (transform, route, …)?
• IoT-specific features required (Last Will and Testament, …)?
Kafka-Native Integration Options between MQTT and Apache Kafka
• Kafka Connect
• MQTT Proxy
• REST Proxy
MQTT Source and Sink Connectors for Kafka Connect
https://www.confluent.io/hub/
https://www.confluent.io/connector/kafka-connect-mqtt/
Integration with Kafka Connect (Source and Sink)
[Diagram: MQTT Devices, MQTT Broker, Connect w/ MQTT connector (source and sink), Kafka Brokers, Kafka Consumer.]
MQTT Broker
• Persistent + offers MQTT-specific features
• Receives data pushed by IoT devices
Kafka Connect
• Kafka Consumer + Kafka Producer under the hood
• Pull-based (at own pace, without overwhelming the source or getting overwhelmed by the source)
• Out-of-the-box scalability and integration features (like connectors, converters, SMTs)
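The pull-based point can be pictured with a small stand-in: devices push a burst into a buffering source (the role the MQTT broker plays above), and the integration layer drains it in bounded batches at its own pace. This is only a sketch of the idea, not how the MQTT connector is implemented; `BufferedSource` and the batch size are assumptions.

```python
from collections import deque


class BufferedSource:
    """Stand-in for a source that accepts pushed messages and buffers them."""

    def __init__(self):
        self._buffer = deque()

    def push(self, message) -> None:
        self._buffer.append(message)

    def poll(self, max_records: int) -> list:
        """Hand out at most max_records messages, oldest first."""
        batch = []
        while self._buffer and len(batch) < max_records:
            batch.append(self._buffer.popleft())
        return batch

    def backlog(self) -> int:
        return len(self._buffer)


source = BufferedSource()
for i in range(25):
    source.push(f"reading-{i}")  # devices push a burst of 25 messages

batch_sizes = []
while source.backlog() > 0:
    batch = source.poll(max_records=10)  # the puller decides the batch size
    batch_sizes.append(len(batch))

print(batch_sizes)
```

The burst never reaches the puller as 25 simultaneous messages; it only ever sees batches it asked for.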
Kafka Connect components
Connector for MQTT Source + Single Message Transformation (SMT)
curl -s -X POST -H 'Content-Type: application/json' http://localhost:8083/connectors -d '{
  "name": "mqtt-source",
  "config": {
    "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
    "tasks.max": "1",
    "mqtt.server.uri": "tcp://127.0.0.1:1883",
    "mqtt.topics": "temperature",
    "kafka.topics": "mqtt.",
    "transforms": "filter",
    "transforms.filter.type": "com.github.kaiwaehner.kafka.connect.smt.StringFilter",
    "transforms.filter.topic.format": "fraud"
  }
}'
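The same registration request can be assembled programmatically, which makes it easier to sanity-check the configuration before sending it. The sketch below copies the values from the slide unchanged and builds the HTTP request without sending it. The helper `check_transforms` is a hypothetical convenience that relies only on the Kafka Connect convention that every alias listed under `transforms` needs a matching `transforms.<alias>.type` entry.

```python
import json
import urllib.request

# Values copied verbatim from the curl example above.
connector = {
    "name": "mqtt-source",
    "config": {
        "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
        "tasks.max": "1",
        "mqtt.server.uri": "tcp://127.0.0.1:1883",
        "mqtt.topics": "temperature",
        "kafka.topics": "mqtt.",
        "transforms": "filter",
        "transforms.filter.type": "com.github.kaiwaehner.kafka.connect.smt.StringFilter",
        "transforms.filter.topic.format": "fraud",
    },
}


def check_transforms(config: dict) -> list:
    """Return aliases listed in 'transforms' that lack a '.type' entry."""
    aliases = [a.strip() for a in config.get("transforms", "").split(",") if a.strip()]
    return [a for a in aliases if f"transforms.{a}.type" not in config]


request = urllib.request.Request(
    "http://localhost:8083/connectors",
    data=json.dumps(connector).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would send it to a running Connect worker;
# it is left out so the sketch runs offline.
print(check_transforms(connector["config"]))
```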
Kafka Connect - Converters
[Diagram: MQTT Broker (JSON message) → MQTT Source Connector (Connect data API format) → AvroConverter → byte[] (Avro) → byte[] (Avro) → AvroConverter (Connect data API format) → S3 Sink Connector → Amazon S3 object in S3 Object Storage]
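A converter's job in this picture is a round trip: Connect's in-memory record becomes bytes on the source side and is turned back into a record on the sink side. The sketch below shows that contract with plain JSON bytes as a stand-in wire format; it is not Avro and it omits schemas. The method names are loosely modeled on Kafka Connect's Java `Converter` interface, and the topic name and record fields are made up.

```python
import json


class JsonBytesConverter:
    """Stand-in converter: JSON text as the wire format instead of Avro."""

    def from_connect_data(self, topic: str, value: dict) -> bytes:
        # Source side: in-memory record -> bytes written to Kafka.
        return json.dumps(value, sort_keys=True).encode("utf-8")

    def to_connect_data(self, topic: str, data: bytes) -> dict:
        # Sink side: bytes read from Kafka -> in-memory record.
        return json.loads(data.decode("utf-8"))


converter = JsonBytesConverter()
record = {"sensor_id": "car42", "temperature": 21.5}  # hypothetical reading

wire_bytes = converter.from_connect_data("mqtt.temperature", record)
restored = converter.to_connect_data("mqtt.temperature", wire_bytes)
print(wire_bytes, restored)
```

Since source and sink connectors only see the in-memory form, the wire format can be swapped by changing the converter rather than the connectors.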
Live Demo: MQTT Integration with Kafka Connect
Kafka Connect + MQTT Connector
https://github.com/kaiwaehner/kafka-connect-iot-mqtt-connector-example
MQTT Proxy
[Diagram: MQTT Devices, MQTT Proxy, Kafka Brokers, Kafka Consumer.]
MQTT Proxy
• MQTT is push-based
• Horizontally scalable
• Receives data pushed by IoT devices and forwards it to the Kafka brokers at low latency
• Kafka Producer under the hood
• No MQTT Broker needed
Kafka Broker
• Source of truth
• Responsible for persistence, high availability, reliability
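A proxy that forwards MQTT publishes straight to Kafka has to decide which Kafka topic each MQTT topic lands in. One way to picture that is a list of regex rules, as sketched below. The rule format, the Kafka topic names, and the choice to keep the MQTT topic as the record key are assumptions of this sketch, not the proxy's documented behavior.

```python
import re

# Hypothetical mapping: Kafka topic -> regex over MQTT topic names.
TOPIC_RULES = [
    ("car-sensor-data", re.compile(r"[^/]+/car")),
    ("other-iot-data", re.compile(r".*")),  # catch-all, checked last
]


def kafka_topic_for(mqtt_topic: str) -> str:
    """Return the Kafka topic of the first rule whose regex matches fully."""
    for kafka_topic, pattern in TOPIC_RULES:
        if pattern.fullmatch(mqtt_topic):
            return kafka_topic
    raise LookupError(f"no rule for MQTT topic {mqtt_topic!r}")


def to_kafka_record(mqtt_topic: str, payload: bytes) -> dict:
    """Keep the MQTT topic as the record key so the device id is not lost."""
    return {"topic": kafka_topic_for(mqtt_topic), "key": mqtt_topic, "value": payload}


print(to_kafka_record("car42/car", b"21.5"))
print(to_kafka_record("plant/line1/robot", b"ok"))
```

Many fine-grained MQTT topics (one per device) collapse into a few Kafka topics this way, while the key still identifies the device.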
Details of Confluent’s MQTT Proxy Implementation
General and modular framework
• Based on Netty to not re-invent the wheel (network layer handling, thread pools)
• Scalable with standard load balancer
• Internally uses Kafka Connect formats (allows re-using transformations and other Connect constructs → coming soon)
Three pipeline stages
• Network (Netty)
• Protocol (MQTT with QoS 0, 1, 2 today; others later, e.g. WebSockets)
• Stream (Kafka clients: producers today, consumers later)
Missing parts in first release
• Only MQTT Publish; MQTT Subscribe coming soon
• MQTT-specific features like Last Will and Testament
Confluent REST Proxy
[Diagram: Non-Java Applications reach Kafka through the REST Proxy via REST / HTTP(S); Native Kafka Java Applications connect via TCP; Schema Registry.]
The "simple alternative" for IoT
• Simple and understood
• HTTP(S) Proxy → Push-based
• Security "easier"
• Scalable with standard load balancer (still synchronous HTTP)
• Not for very high throughput
• Implement Connect features in your client app
Confluent REST Proxy - Produce and Consume Messages
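For the produce side, a request to the REST Proxy might look like the sketch below, which follows the v2 JSON embedded format (`POST /topics/<topic>` with a `records` array). The host, port, topic name, and payload are assumptions, and the request is only built, not sent. Consuming needs several calls (create a consumer instance, subscribe, fetch records) and is not shown.

```python
import json
import urllib.request


def build_produce_request(base_url: str, topic: str, records: list) -> urllib.request.Request:
    """Build a REST Proxy v2 produce request from (key, value) pairs."""
    body = {"records": [{"key": key, "value": value} for key, value in records]}
    return urllib.request.Request(
        f"{base_url}/topics/{topic}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST",
    )


# Hypothetical proxy address, topic, and sensor reading.
request = build_produce_request(
    "http://localhost:8082",
    "car-sensor-data",
    [("car42", {"temperature": 21.5})],
)
# urllib.request.urlopen(request) would send it to a running REST Proxy.
print(request.full_url, request.data.decode("utf-8"))
```

Each call is a synchronous HTTP round trip, which is why the slide above rules this option out for very high throughput.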
Processing Options for MQTT Data with Apache Kafka
Kafka native (e.g. Kafka Streams) vs. additional big data cluster and technology (or others, you name it …)
Example: Anomaly Detection System to Predict Defects in Car Engine
[Diagram. At the edge: Car Sensors. On premise DC (Kubernetes + Confluent Operator): MQTT Proxy, Kafka Cluster, KSQL (Apply Analytic Model, Filter Anomalies), Kafka Connect. Other Components: Elasticsearch, Grafana, Real Time Emergency System. Data flows labeled "All Data" and "PotentialDefect". Legend: Kafka Ecosystem vs. Other Components.]
KSQL and Deep Learning (Auto Encoder) for Anomaly Detection
CREATE STREAM AnomalyDetection AS
  SELECT sensor_id, detectAnomaly(sensor_values)
  FROM car_engine;
User Defined Function (UDF): detectAnomaly
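The logic behind such a UDF can be sketched as thresholding a reconstruction error: a model tries to reproduce the input, and a large gap between input and reconstruction flags an anomaly. The example below uses a trivial stand-in "model" that always returns a fixed normal profile. It is not an autoencoder and not the UDF from the talk; the profile, threshold, and sensor readings are invented.

```python
from statistics import mean


def reconstruction_error(values, reconstructed):
    """Mean squared error between the input and its reconstruction."""
    return mean((v - r) ** 2 for v, r in zip(values, reconstructed))


def make_detector(reconstruct, threshold):
    """Build a detect_anomaly function around any reconstruction model."""
    def detect_anomaly(sensor_values):
        error = reconstruction_error(sensor_values, reconstruct(sensor_values))
        return error > threshold
    return detect_anomaly


# Stand-in "model": reconstructs every reading as a fixed normal profile.
NORMAL_PROFILE = [20.0, 1.2, 0.5]  # hypothetical temperature, pressure, vibration

detect_anomaly = make_detector(lambda values: NORMAL_PROFILE, threshold=1.0)

# Mirrors: SELECT sensor_id, detectAnomaly(sensor_values) FROM car_engine
car_engine = [
    {"sensor_id": "car42", "sensor_values": [20.3, 1.1, 0.6]},
    {"sensor_id": "car7", "sensor_values": [48.0, 1.3, 2.9]},
]
anomaly_detection = [
    (row["sensor_id"], detect_anomaly(row["sensor_values"])) for row in car_engine
]
print(anomaly_detection)
```

Swapping the lambda for a trained autoencoder's predict call would leave the rest unchanged, which is roughly the role the UDF plays in the KSQL statement.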
Live Demo: Processing IoT data with MQTT Proxy and KSQL
Deep Learning UDF for KSQL for Streaming Anomaly Detection of MQTT IoT Sensor Data
https://github.com/kaiwaehner/ksql-udf-deep-learning-mqtt-iot
Poll
What is the best choice for your IoT integration between MQTT and Kafka?
1. Kafka Connect
2. MQTT Proxy
3. REST Proxy
4. Custom Kafka Client (Java Client, NiFi, StreamSets, non-MQTT technology, …)
Questions? Feedback? Please contact me!
Kai Waehner, Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
www.confluent.io
