Kafka: from entry to mastery

Fengshuwan River Bridge 2020-11-11 20:10:23


Link to the original text: https://blog.csdn.net/qq_43323585/article/details/105824989

Kafka overview

What is Kafka?

Kafka is an Apache open-source stream processing platform that provides message publishing and subscription. It features high throughput, simplicity, and easy deployment.

What is Kafka used for?

  1. Message queue: used for system decoupling, asynchronous communication, peak shaving and valley filling of traffic, etc.
  2. Kafka Streams: real-time online stream processing.

Message queue working modes

Message queues have two working modes: (1) point-to-point, where each message is consumed at most once by a single consumer; (2) publish/subscribe, where there is no limit on how many subscribers can receive a message. As pictured:
(figure: the two message queue working modes)
So how does Kafka work? Read on.

Kafka Architecture and concepts

A few concepts are still worth knowing; they come up when coding and help with understanding. At the end I attach my Git address; between this article and my code repository, most Kafka work can be handled.

How Kafka produces and consumes messages

Kafka manages the cluster's messages (Records) classified by Topic. Every Record belongs to a Topic, and each Topic has multiple partitions that store Records. Each partition corresponds to a server (Broker) called the leader, and the Brokers holding that partition's replicas are called followers. Note that only the leader serves reads and writes. This naturally brings ZooKeeper to mind, along with the Paxos family of distributed consensus protocols (ZooKeeper itself runs ZAB, a Paxos-like protocol), which is also what underpins the reliability of Kafka's data. This may be a little abstract; look at the pictures:
(figures: Topic, partition, and replica layout across Brokers)

Partitions and logs

The "journal" here is the log, meaning the data itself, a bit like the append-only log inside Redis. It is not the printed log output we usually talk about, haha.
  1. Every Topic can be subscribed to by multiple consumers; Kafka manages partitions through Topics.
  2. Producers can write Records into partitions either by round-robin polling or by load-based routing (taking the Record's key modulo the partition count); see the sketch below.
  3. A partition is an ordered, immutable sequence of log entries; every Record has a unique offset, which is used to track consumption and supports the persistence strategy.
  4. Kafka's default configuration is log.retention.hours=168: whether or not a message has been consumed, Records are kept on disk for 168 hours. The specific persistence policy can be customized in the configuration file.
  5. A partition is ordered internally, but with multiple partitions do not expect that what is produced first is consumed first. Keep this in mind when writing business logic and code.
Why does Kafka give up global ordering in favor of local (per-partition) ordering? Like Hadoop, a distributed cluster breaks single-machine physical limits: capacity and concurrency both improve qualitatively, and one machine can never match a hundred. After all, Kafka is a big-data framework.
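To make per-partition ordering concrete, here is a minimal producer sketch in Java. The broker address is an assumption for illustration; the point is that the default partitioner hashes the Record key, so Records sharing a key always land in the same partition and keep their send order, while Records with different keys may interleave.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderingDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 10; i++) {
                // Same key -> same partition (the default partitioner hashes the key),
                // so the events of "order-42" are stored and consumed in send order.
                producer.send(new ProducerRecord<>("topic1", "order-42", "event-" + i));
            }
        } // close() flushes any buffered records
    }
}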

Consumer groups

Concept: a consumer group is one logical consumer made up of multiple consumer instances. If 4 independent consumers all subscribe to topic1 (4 partitions), each partition gets consumed 4 times; introducing consumer groups avoids this repeated consumption, because each partition is assigned to exactly one consumer in the group.
Code: consumers can use either subscribe or assign; when subscribing to a topic with subscribe, a consumer group must be specified!
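A minimal consumer sketch under the same assumptions (the broker address is illustrative; topic1 and group1 follow the example above). Because group.id is set and subscribe is used, Kafka assigns each partition of topic1 to exactly one instance in the group: start four copies of this program and each one gets one of the four partitions.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed broker address
        props.put("group.id", "group1");                   // required when using subscribe()
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("topic1"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            r.partition(), r.offset(), r.value());
                }
            }
        }
    }
}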

How Kafka achieves high performance

Why is Kafka's throughput so high?

Kafka can easily support millions of write requests while persisting the data to disk, which is frankly terrifying. Think about it: a high-performance technology is usually an encapsulation of the kernel. Redis, for example, calls epoll() at the bottom; the most powerful player of all is big brother OS.

Sequential writes and mmap

Sequential writes: a hard disk is a mechanical device, and seeking is an extremely time-consuming mechanical action, so Kafka performs sequential I/O, avoiding a great deal of overhead and I/O seek time.
mmap (Memory-Mapped Files): it uses the operating system's PageCache to map a file directly into physical memory. Writes land straight in the PageCache, and the operating system automatically flushes the user's memory operations to disk, greatly reducing I/O cost. mmap effectively lets user space directly access a region shared with the kernel, saving the user-to-kernel mode switch and a copy.
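Kafka's use of mmap happens inside the broker, but the same OS facility is visible from Java through FileChannel.map. A minimal sketch, with a made-up file path:

import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    public static void main(String[] args) throws Exception {
        try (FileChannel ch = FileChannel.open(Paths.get("/tmp/mmap-demo.log"), // illustrative path
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map 4 KB of the file into memory: writes land in the page cache directly,
            // and the OS flushes the dirty pages to disk in the background.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buf.put("hello page cache".getBytes(StandardCharsets.UTF_8));
            buf.force(); // optional: ask the OS to flush to disk now
        }
    }
}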

ZeroCopy

When the Kafka server responds to a client read, the bottom layer uses zero-copy technology: there is no need to copy from the disk into user space; the data is transferred directly through kernel space and never reaches user space.
I won't go over the I/O models here; I'll write about them later.
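In Java this zero-copy path is exposed as FileChannel.transferTo, which on Linux maps to the sendfile(2) system call. A minimal sketch (the file path and socket address are illustrative; this is of course not Kafka's actual server code):

import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    public static void main(String[] args) throws Exception {
        try (FileChannel file = FileChannel.open(Paths.get("/tmp/data.log"), StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9999))) {
            long pos = 0, size = file.size();
            // transferTo() moves bytes disk -> socket entirely inside the kernel;
            // the data never enters user space.
            while (pos < size) {
                pos += file.transferTo(pos, size - pos, socket);
            }
        }
    }
}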

Building a Kafka cluster

Please see the two articles I wrote on setting up ZooKeeper and Kafka. Reference: Kafka cluster setup.

Topic management

  • bootstrap-server: used by consumers to pull data. Old versions used the -zookeeper parameter here, which pushed a lot of data into ZooKeeper, and that was quite unreasonable.
  • broker-list: used by producers to push data.
  • partitions: the number of partitions.
  • replication-factor: the partition replication factor.

Create a topic

bin/kafka-topics.sh --bootstrap-server node01:9092,node02:9092,node03:9092 --create --partitions 3 --replication-factor 2 --topic debug

List topics

bin/kafka-topics.sh --bootstrap-server node01:9092,node02:9092,node03:9092 --list

Describe a topic

bin/kafka-topics.sh --bootstrap-server node01:9092,node02:9092,node03:9092 --describe --topic debug

Delete a topic

bin/kafka-topics.sh --bootstrap-server node01:9092,node02:9092,node03:9092 --delete --topic debug

Produce messages to a topic

bin/kafka-console-producer.sh --broker-list node01:9092,node02:9092,node03:9092 --topic debug

Consume messages from a topic

bin/kafka-console-consumer.sh --bootstrap-server node01:9092,node02:9092,node03:9092 --topic debug --from-beginning
bin/kafka-console-consumer.sh --bootstrap-server node01:9092,node02:9092,node03:9092 --topic debug --group group1

These commands are very useful; they help us test.

List consumer groups

bin/kafka-consumer-groups.sh --bootstrap-server node01:9092,node02:9092,node03:9092 --list

Kafka API

The material comes from the examples in my reference books and is very comprehensive. It has been uploaded to Git; I haven't had time to organize it yet, but I will later. If you think it's good, give it a star! It covers: production and consumption, custom partitioners, serialization, interceptors, transactions, Kafka stream processing, and more.
Git address: Kafka API
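As a small taste of what the repository covers, here is a minimal custom-partitioner sketch (the class name is made up; the hashing mirrors what Kafka's default partitioner does for keyed records). Register it on the producer with the partitioner.class property.

import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

// Routes each Record to a partition by hashing its key, like the default partitioner.
public class MyPartitioner implements Partitioner {
    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        if (keyBytes == null) {
            return 0; // no key: pin to partition 0 (a deliberately simplistic choice)
        }
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    @Override public void close() {}
    @Override public void configure(Map<String, ?> configs) {}
}
// Register with: props.put("partitioner.class", MyPartitioner.class.getName());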

Kafka features

The acks and retries mechanisms

The acks response mechanism:

  1. acks=1: return as soon as the leader write succeeds, without waiting for follower confirmation. Cannot survive a single point of failure.
  2. acks=0: return as soon as the message reaches the socket buffer. The lowest level of message safety.
  3. acks=-1: the leader returns only after at least one follower (the in-sync replicas) has acknowledged. For high-availability clusters.
The retries retry mechanism:
request.timeout.ms=30000: the default timeout after which a request is considered failed and retried.
retries: the retry count (default 2147483647, i.e. Integer.MAX_VALUE).
Kafka's resulting message semantics: messages are delivered at least once. A configuration sketch follows.
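A minimal producer-configuration sketch tying these settings together (the values are illustrative, not recommendations):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class AcksRetriesConfig {
    public static KafkaProducer<String, String> build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "node01:9092,node02:9092,node03:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");               // same as -1: wait for the in-sync replicas
        props.put("retries", 3);                // retry a failed send a few times
        props.put("request.timeout.ms", 30000); // per-request timeout before failing/retrying
        return new KafkaProducer<>(props);
    }
}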

Idempotent writing

Idempotency: making a request multiple times yields the same result as making it once.
Kafka's solution for idempotent writes, in brief, is for the broker to de-duplicate retried sends using the producer id and per-partition sequence numbers.
This approach can be borrowed for some business idempotency problems, but to truly solve idempotency at the business level you need things like blacklists, signatures, or tokens.
It should be noted that:
enable.idempotence=false, i.e. idempotence is off by default.
Prerequisites for turning it on: retries enabled and acks=all.
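Turning it on is one property plus the prerequisites; a minimal sketch, reusing the broker addresses assumed earlier:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class IdempotentProducerConfig {
    public static KafkaProducer<String, String> build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "node01:9092,node02:9092,node03:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("enable.idempotence", true);   // broker drops duplicates via producer id + sequence numbers
        props.put("acks", "all");                // prerequisite
        props.put("retries", Integer.MAX_VALUE); // prerequisite: retries must be enabled
        return new KafkaProducer<>(props);
    }
}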

Transactions

Because Kafka's idempotent write provides no guarantees across multiple partitions or across sessions, we need a stronger transactional guarantee: atomic writes spanning multiple partitions, where either all of the data is written successfully or none of it is. That is what Kafka transactions are.
The producer provides five transaction methods: initTransactions, beginTransaction, sendOffsetsToTransaction, commitTransaction, and abortTransaction.
The consumer provides the read_committed and read_uncommitted isolation levels.
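A minimal transactional-producer sketch (the transactional.id and topic are illustrative); a consumer that should see only committed data would set isolation.level=read_committed:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TxProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "node01:9092,node02:9092,node03:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("transactional.id", "tx-demo-1"); // required; it also implies idempotence

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            try {
                producer.send(new ProducerRecord<>("topic1", "k1", "v1"));
                producer.send(new ProducerRecord<>("topic1", "k2", "v2")); // possibly a different partition
                producer.commitTransaction(); // both records become visible atomically
            } catch (Exception e) {
                producer.abortTransaction();  // read_committed consumers will never see them
                throw e;
            }
        }
    }
}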

Kafka Eagle monitoring software

Open source address :https://github.com/smartloli/kafka-eagle
The book I read when I first started learning Kafka was also written by its author.

Installation steps

  1. Download kafka-eagle-bin-1.4.0.tar.gz and extract it.
  2. mv kafka-eagle-bin-1.4.0 /opt/
  3. There is another archive inside after extraction; extract that one as well.
  4. Modify the environment variables:
vim /etc/profile
------------------------
# kafka-eagle
export KE_HOME=/opt/kafka-eagle
# PATH
export PATH=$PATH:$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$KE_HOME/bin
------------------------
source /etc/profile
echo $PATH
OK~
  5. Modify the kafka-eagle configuration:
vim conf/system-config.properties
-----------------------------------------------------
kafka.eagle.zk.cluster.alias=cluster1
cluster1.zk.list=node01:2181,node02:2181,node03:2181
#cluster2.zk.list=xdn10:2181,xdn11:2181,xdn12:2181
cluster1.kafka.eagle.offset.storage=kafka
#cluster2.kafka.eagle.offset.storage=zk
kafka.eagle.metrics.charts=true
cluster1.kafka.eagle.sasl.enable=false
cluster1.kafka.eagle.sasl.protocol=SASL_PLAINTEXT
cluster1.kafka.eagle.sasl.mechanism=SCRAM-SHA-256
cluster1.kafka.eagle.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="kafka" password="kafka-eagle";
cluster1.kafka.eagle.sasl.client.id=
#cluster2.kafka.eagle.sasl.enable=false
#cluster2.kafka.eagle.sasl.protocol=SASL_PLAINTEXT
#cluster2.kafka.eagle.sasl.mechanism=PLAIN
#cluster2.kafka.eagle.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="kafka" password="kafka-eagle";
#cluster2.kafka.eagle.sasl.client.id=
######################################
# kafka sqlite jdbc driver address
######################################
#kafka.eagle.driver=org.sqlite.JDBC
#kafka.eagle.url=jdbc:sqlite:/hadoop/kafka-eagle/db/ke.db
#kafka.eagle.username=root
#kafka.eagle.password=www.kafka-eagle.org
######################################
# kafka mysql jdbc driver address
######################################
kafka.eagle.driver=com.mysql.jdbc.Driver
kafka.eagle.url=jdbc:mysql://127.0.0.1:3306/ke?useUnicode=true&characterEncoding=UTF-8&zeroDateTimeBehavior=convertToNull
kafka.eagle.username=root
kafka.eagle.password=123456
  6. Modify Kafka's startup script to enable JMX:
vim kafka-server-start.sh
-------------------------------------
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
export JMX_PORT="7379"
export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
fi
  7. Finally, start it. I found the script had no execute permission:
chmod u+x ke.sh
./ke.sh
Version 1.4.0
*******************************************************************
* Kafka Eagle Service has started success.
* Welcome, Now you can visit 'http://192.168.83.11:8048/ke'
* Account:admin ,Password:123456
*******************************************************************
* <Usage> ke.sh [start|status|stop|restart|stats] </Usage>
* <Usage> https://www.kafka-eagle.org/ </Usage>
*******************************************************************

Summary

That's all. With 3 ZooKeeper nodes + 3 Kafka brokers + Kafka Eagle monitoring, you can already do a lot of things, such as log collection, messaging systems, activity tracking, operational metrics, stream processing, and so on. I hope this gives some help to those who read it!
Supporting articles:
Kafka cluster: link
ZooKeeper cluster: link

Copyright notice
This article was created by [Fengshuwan River Bridge]. Please include a link to the original when reposting. Thank you.
