Publish and subscribe to streams of records, as with a message queue.
Kafka also acts as a storage system, so messages can be consumed
asynchronously: it writes data to a scalable disk structure and
replicates it for fault-tolerance. Producers can wait for write
acknowledgments.
Stream processing with the Kafka Streams API enables complex
aggregations or joins of input streams onto an output stream of
processed data.
Kafka has four core APIs:
- Producer: publish a stream of records to one or more Kafka topics.
- Consumer: subscribe to topics and process the stream of records.
- Streams: consume input streams from topics and produce output
  streams to topics (combines the producer and consumer roles).
- Connector: connect topics to existing systems, e.g. a connector
  that captures changes to a database table.
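The producer/consumer roles above can be pictured with a tiny in-memory sketch. This is a toy model, not the real Kafka client API; `ToyBroker`, `publish`, and `consume` are hypothetical names used only for illustration.

```python
from collections import defaultdict

class ToyBroker:
    """Toy in-memory stand-in for a Kafka broker (illustration only)."""
    def __init__(self):
        # topic name -> append-only list of records
        self.topics = defaultdict(list)

    def publish(self, topic, record):
        """Producer role: append a record to a topic's log."""
        self.topics[topic].append(record)

    def consume(self, topic, offset=0):
        """Consumer role: read all records from a given offset onward."""
        return self.topics[topic][offset:]

broker = ToyBroker()
broker.publish("orders", {"id": 1, "amount": 9.99})
broker.publish("orders", {"id": 2, "amount": 4.50})
records = broker.consume("orders")
```

Because the broker stores records rather than deleting them on delivery, a second consumer can later read the same topic from offset 0 — the asynchronous-consumption point made above.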
Communication between the clients and the servers is done with a
simple, high-performance, language-agnostic TCP protocol.
For each topic, the Kafka cluster maintains a partitioned log
Each partition is an ordered, immutable sequence of records that
is continually appended to a structured commit log.
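The partitioned log can be sketched as follows. This is a simplified model, assuming the common key-hashing strategy for partition choice; the class names are hypothetical.

```python
class Partition:
    """An ordered, append-only sequence of records.
    Each record is assigned a sequential offset on append."""
    def __init__(self):
        self._log = []

    def append(self, record):
        self._log.append(record)          # records are never mutated in place
        return len(self._log) - 1         # the offset assigned to this record

    def read(self, offset):
        return self._log[offset]

class Topic:
    """A topic is a set of partitions; a record's key picks its partition."""
    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def append(self, key, record):
        p = hash(key) % len(self.partitions)
        return p, self.partitions[p].append(record)

t = Topic(num_partitions=3)
p1, off1 = t.append("user-42", "clicked")
p2, off2 = t.append("user-42", "scrolled")
```

Records with the same key land in the same partition, so ordering is guaranteed per key within a partition, not across the whole topic.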
The Kafka cluster durably persists all published records, whether or
not they have been consumed, for a retention period that depends on
your configuration; it may be days or longer.
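Retention is time-based deletion of old log segments. As a sketch of the idea (the `expired` helper is hypothetical; `log.retention.hours=168`, i.e. 7 days, is a real broker config default):

```python
import time

def expired(record_ts, retention_seconds, now=None):
    """Toy retention check: a record older than the configured
    retention window is eligible for deletion from the log."""
    now = time.time() if now is None else now
    return now - record_ts > retention_seconds

SEVEN_DAYS = 7 * 24 * 3600  # mirrors log.retention.hours=168
old = expired(record_ts=0, retention_seconds=SEVEN_DAYS, now=8 * 24 * 3600)
fresh = expired(record_ts=100, retention_seconds=SEVEN_DAYS, now=200)
```

Note that deletion is driven purely by age (or size), never by whether a consumer has read the record — Kafka's performance is effectively constant with respect to retained data size.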
The offset within each partition is controlled by the consumer, which
can consume records in any order it likes — for example, resetting to
an older offset to reprocess data, or skipping ahead to the latest
record.
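Consumer-controlled offsets can be sketched like this (toy model; the real consumer API exposes a similar `seek` operation, but this `Consumer` class is illustrative only):

```python
class Consumer:
    """A consumer tracks its own offset into a partition's log
    and may move it freely (rewind to reprocess, or jump ahead)."""
    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self):
        """Return the next record and advance the offset, or None at the end."""
        if self.offset < len(self.log):
            rec = self.log[self.offset]
            self.offset += 1
            return rec
        return None

    def seek(self, offset):
        self.offset = offset  # consumer, not broker, owns this position

log = ["a", "b", "c"]
c = Consumer(log)
c.poll()   # consumes "a"
c.poll()   # consumes "b"
c.seek(0)  # rewind to reprocess from the beginning
```

Because the broker keeps records regardless of consumption, rewinding is cheap: one consumer's seek has no effect on any other consumer.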
Topic storage structure - (diagram omitted)
Kafka MirrorMaker provides geo-replication support for your clusters,
replicating messages across multiple datacenters or cloud regions.
Data written to Kafka is written to disk and replicated for
fault-tolerance.
Kafka Streams is a client library for building real-time
applications and microservices where the input and/or output data is
stored in Kafka clusters. Kafka Streams combines the simplicity of
writing and deploying standard Java and Scala applications on the
client side with the benefits of Kafka's server-side cluster
technology to make these applications highly scalable, elastic,
fault-tolerant, and distributed.
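The classic Kafka Streams example is a running word count over an input stream. Here is the same idea as a toy Python generator — not the Streams API itself, just the aggregation pattern it implements (each yielded dict plays the role of an updated, continuously maintained table of counts):

```python
from collections import Counter

def word_count(stream):
    """Aggregate an input stream of text lines into running word counts,
    yielding the updated counts after each record is processed."""
    counts = Counter()
    for line in stream:
        for word in line.lower().split():
            counts[word] += 1
        yield dict(counts)  # snapshot of the aggregate so far

updates = list(word_count(["hello kafka", "hello streams"]))
```

In real Kafka Streams the input would be a topic, the counts would be a fault-tolerant state store backed by a changelog topic, and each update would be emitted downstream — but the per-record update-an-aggregate loop is the same.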