Apache Kafka allows us to move real-time data reliably between systems and applications. But we still need a processing engine to process and transform that data in order to derive value from it for the use case in question. Fortunately, a number of stream processing engines are available for this purpose, including, but not limited to, the following:
- Apache Spark: https://spark.apache.org/
- Apache Storm: http://storm.apache.org/
- Apache Flink: https://flink.apache.org/
- Apache Samza: http://samza.apache.org/
- Apache Kafka (via its Streams API): https://kafka.apache.org/documentation/
- KSQL: https://www.confluent.io/product/ksql/
Though a detailed comparison of the available stream processing engines is beyond the scope of this book, you are encouraged to explore the preceding links...