Introduction to Apache Kafka

Mahfooz Ahamed
3 min read · May 12, 2022

What is Kafka?

We use Apache Kafka to enable communication between producers and consumers through message-based topics. Apache Kafka is a fast, scalable, fault-tolerant, publish-subscribe messaging system. Basically, it provides a platform for a new generation of high-end distributed applications. It also supports a large number of permanent or ad-hoc consumers.

One of the best features of Kafka is that it is highly available, resilient to node failures, and supports automatic recovery. This makes Apache Kafka well suited for communication and integration between the components of large-scale, real-world data systems.

Before going deeper into Kafka, let’s understand what the term Messaging System actually means in Kafka.

Messaging System in Kafka

  • We use a Messaging System to transfer data from one application to another. As a result, applications can focus on the data itself without worrying about how to share it.
  • Distributed messaging is based on the concept of reliable message queuing. Messages are queued asynchronously between the client applications and the messaging system.

There are two types of messaging patterns available:

  1. Point to Point
  • Here, messages are persisted in a queue. One or more consumers can read from the queue, but a particular message can be consumed by at most one consumer. As soon as a consumer reads a message, it disappears from the queue.

2. Publish-Subscribe Messaging System

  • Here, messages are persisted in a topic. In this system, Kafka consumers can subscribe to one or more topics and consume all the messages in those topics. Message producers are called publishers, and message consumers are called subscribers.
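
To make the difference between the two patterns concrete, here is a minimal in-memory sketch in Python. It is only a toy model and not how Kafka is implemented: the class names `PointToPointQueue` and `PubSubTopic`, and the subscriber names `billing` and `audit`, are made up for illustration.

```python
from collections import deque


class PointToPointQueue:
    """Toy queue: each message is delivered to at most one consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Reading removes the message, so no other consumer can see it.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Toy topic: messages are kept, and every subscriber reads all of them."""

    def __init__(self):
        self._log = []        # persisted messages, in publish order
        self._positions = {}  # subscriber name -> index of next unread message

    def subscribe(self, subscriber):
        self._positions.setdefault(subscriber, 0)

    def publish(self, message):
        self._log.append(message)

    def poll(self, subscriber):
        # Return everything this subscriber has not read yet.
        start = self._positions[subscriber]
        self._positions[subscriber] = len(self._log)
        return self._log[start:]


def demo():
    queue = PointToPointQueue()
    for message in ("a", "b"):
        queue.send(message)
    # Three reads against two messages: the third read finds the queue empty.
    queue_reads = [queue.receive(), queue.receive(), queue.receive()]

    topic = PubSubTopic()
    topic.subscribe("billing")
    topic.subscribe("audit")
    for message in ("a", "b"):
        topic.publish(message)
    # Each subscriber gets its own full copy of the stream.
    topic_reads = {name: topic.poll(name) for name in ("billing", "audit")}
    return queue_reads, topic_reads
```

Calling `demo()` shows that the queue hands out each message once, while both subscribers of the topic see every message.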

Why Should We Use an Apache Kafka Cluster?

  • Big Data involves an enormous volume of data, and it brings two main challenges: collecting that large volume of data, and analyzing the collected data.
  • To overcome those challenges, we need a messaging system, and this is where Apache Kafka has proved its utility.

Apache Kafka is useful for numerous tasks, such as:

  • Tracking web activities by storing/sending the events for real-time processing.
  • Alerting and reporting on operational metrics.
  • Transforming data into a standard format.
  • Continuous processing of data streamed to the topics.

Because of its wide use, Kafka is giving tough competition to some of the most popular messaging systems, such as ActiveMQ and RabbitMQ, as well as to messaging services on AWS.

Kafka Architecture

  1. Kafka Producer API

The Kafka Producer API allows an application to publish a stream of records to one or more Kafka topics.

2. Kafka Consumer API

The Kafka Consumer API allows an application to subscribe to one or more topics and process the stream of records produced to them.

3. Kafka Streams API

The Kafka Streams API allows an application to act as a stream processor: it consumes an input stream from one or more topics and produces an output stream to one or more output topics, effectively transforming the input streams into output streams.

4. Kafka Connector API

The Kafka Connector API allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector to a relational database might capture every change to a table.
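
To see how the producer, consumer, and stream processor roles fit together, here is a small Python sketch that imitates them with plain in-memory lists. It does not use a real Kafka client library, and everything in it (`InMemoryBroker`, `produce`, `Consumer`, `stream_process`, and the topic names) is a hypothetical name chosen for illustration. The Connector API is not modeled here.

```python
from collections import defaultdict


class InMemoryBroker:
    """Hypothetical stand-in for a Kafka cluster: topic name -> list of records."""

    def __init__(self):
        self.topics = defaultdict(list)


def produce(broker, topic, records):
    # Producer role: publish a stream of records to a topic.
    broker.topics[topic].extend(records)


class Consumer:
    """Consumer role: subscribe to topics and read records not seen before."""

    def __init__(self, broker, topics):
        self.broker = broker
        self.offsets = {topic: 0 for topic in topics}  # next position per topic

    def poll(self):
        records = []
        for topic, offset in self.offsets.items():
            new_records = self.broker.topics[topic][offset:]
            records.extend(new_records)
            # Remember how far we have read so the next poll skips these.
            self.offsets[topic] = offset + len(new_records)
        return records


def stream_process(broker, source, sink, transform):
    # Streams role: consume an input topic, transform each record,
    # and produce the results to an output topic.
    consumer = Consumer(broker, [source])
    produce(broker, sink, [transform(record) for record in consumer.poll()])


def demo():
    broker = InMemoryBroker()
    produce(broker, "page-views", ["/home", "/cart"])
    stream_process(broker, "page-views", "page-views-upper", str.upper)

    reader = Consumer(broker, ["page-views-upper"])
    first_poll = reader.poll()   # the transformed records
    second_poll = reader.poll()  # nothing new since the last poll
    return first_poll, second_poll
```

In this toy model, the records stay in the topic after they are read, and each consumer only tracks its own position, which is the same idea as the publish-subscribe pattern described earlier.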

That's all for this kick-start introduction to Kafka. I will write about it in more depth later.

Happy hacking, guys!
