As mentioned at https://kafka.apache.org/intro, Apache Kafka is a distributed streaming platform.
Kafka works on the publish/subscribe model.
Kafka Internal Processing
Kafka has four core APIs:
1. Producer API – publish a stream of records to one or more Kafka topics.
2. Consumer API – subscribe to one or more topics and process the stream of records.
3. Streams API – transform input streams into output streams.
4. Connector API – build reusable producers or consumers that connect Kafka topics to existing applications or data systems.
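All four APIs above revolve around the same publish/subscribe idea. As a rough illustration only (a toy in-memory sketch, not the real Kafka client API; ToyTopic, publish, and poll are made-up names for this example), producers append records to a topic's log and each subscriber reads the stream at its own pace:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory "topic" illustrating the publish/subscribe model.
// This is NOT the Kafka client API, just a conceptual sketch.
public class ToyTopic {
    private final List<String> log = new ArrayList<>();           // append-only record log
    private final Map<String, Integer> offsets = new HashMap<>(); // per-subscriber read position

    // Producer API role: append a record to the topic's log.
    public void publish(String record) {
        log.add(record);
    }

    // Consumer API role: return records the subscriber has not seen yet,
    // then advance that subscriber's offset to the end of the log.
    public List<String> poll(String subscriber) {
        int from = offsets.getOrDefault(subscriber, 0);
        List<String> records = new ArrayList<>(log.subList(from, log.size()));
        offsets.put(subscriber, log.size());
        return records;
    }

    public static void main(String[] args) {
        ToyTopic topic = new ToyTopic();
        topic.publish("r1");
        topic.publish("r2");
        // Two independent subscribers each see the whole stream.
        System.out.println(topic.poll("groupA"));
        System.out.println(topic.poll("groupB"));
    }
}
```

Real Kafka works the same way conceptually: the broker keeps the log, and each consumer tracks its own offset into it, which is why multiple subscribers can read the same topic independently.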
Topic
A topic is a feed name to which records are published.
Topics in Kafka are always multi-subscriber: one or more consumers can subscribe to a topic.
A topic can have multiple partitions; the diagram below shows a topic with two partitions.
Kafka uses partitions to scale a topic across many
servers for producer writes. In addition, Kafka also uses partitions to
facilitate parallel consumers. Consumers consume records in parallel up to the
number of partitions.
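To illustrate how records map onto a topic's partitions, here is a simplified stand-in for a producer-side partitioner (an assumption for illustration: real Kafka hashes keys with murmur2, not String.hashCode(), and SimplePartitioner is a made-up name):

```java
// Simplified sketch of how a producer picks a partition for a keyed record.
// Real Kafka uses murmur2 hashing; String.hashCode() is an illustrative stand-in.
public class SimplePartitioner {
    public static int partitionFor(String key, int numPartitions) {
        // Mask off the sign bit so the modulo result is non-negative,
        // mirroring how Kafka's default partitioner treats its hash.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition.
        System.out.println(partitionFor("order-42", 2));
        System.out.println(partitionFor("order-42", 2));
    }
}
```

Because the same key always lands in the same partition, per-key ordering survives even when a topic is spread over many partitions and servers.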
Kafka Topic Log Partition's Ordering and Cardinality
Kafka maintains record order only within a single partition.
A partition is an ordered, immutable sequence of records. Kafka continually
appends to partitions, using each partition as a structured commit log. Records
in a partition are assigned a sequential id number called the offset, which
identifies each record's location within the partition. Topic partitions
allow a Kafka log to scale beyond a size that would fit on a single server: each
partition must fit on the server that hosts it, but a topic can span many
partitions hosted on many servers. In addition, topic partitions are a unit of
parallelism - at any given time, each partition is consumed by only one consumer
within a consumer group. Consumers can run in their own process or their own thread. If a
consumer stops, Kafka spreads its partitions across the remaining consumers in the
same consumer group.
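The "one consumer per partition" rule can be sketched as a simple assignment loop (a toy round-robin sketch under assumed names; Kafka's actual assignors, such as range, round-robin, and sticky, are more involved, and GroupAssignment is made up for this example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of how a consumer group spreads topic partitions across its
// members. Each partition is owned by exactly one consumer in the group.
public class GroupAssignment {
    public static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String c : consumers) {
            assignment.put(c, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            // Round-robin: partition p goes to one consumer only.
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 2 consumers, 3 partitions: one consumer ends up with two partitions.
        System.out.println(assign(Arrays.asList("c1", "c2"), 3));
    }
}
```

If a consumer leaves the group, rerunning the assignment over the remaining members models the rebalance described above: its partitions are redistributed so every partition still has exactly one owner.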
Kafka can replicate partitions across a configurable
number of Kafka servers, which provides fault tolerance. Each partition has
one leader server and zero or more followers. The leader handles all read and
write requests for its partition.
Kafka Setup
1. Install the Java JDK.
2. Set JAVA_HOME with the below command, pasting your Java installation path in place of "Path":
setx JAVA_HOME "Path" /m
3. Install ZooKeeper: download the release from http://zookeeper.apache.org/releases.html and extract it.
4. Start the ZooKeeper server: zkServer.cmd
5. Optionally, start the ZooKeeper CLI (zkCli.cmd) to verify the server is running.
6. Download Kafka from https://kafka.apache.org/downloads and extract it.
7. Start the Kafka server with the below command (from Kafka's bin\windows directory):
kafka-server-start.bat ..\..\config\server.properties
8. Open a separate command prompt and run the below command
to create a topic:
kafka-topics.bat --create --zookeeper localhost:2181
--replication-factor 1 --partitions 1 --topic topic-name
9. Use the below command to list all topics:
kafka-topics.bat --list --zookeeper localhost:2181
10. To publish data from the command line, use the below command, then type messages at the prompt:
kafka-console-producer.bat --broker-list localhost:9092
--topic topic-name
<enter
message here>
11. To consume messages from a Kafka topic, use the below command:
kafka-console-consumer.bat --bootstrap-server localhost:9092
--from-beginning --topic <topic-name>