How to develop a perfect Kafka producer client?

Code farm architecture 2021-01-22 18:46:10

Kafka was originally developed at LinkedIn in the Scala language as a multi-partition, multi-replica distributed messaging system coordinated by ZooKeeper, and has since been donated to the Apache Foundation. Today Kafka is positioned as a distributed streaming platform, widely used to support streaming data processing thanks to its high throughput, persistence, and horizontal scalability. A growing number of open-source distributed processing systems, such as Cloudera, Storm, Spark, and Flink, integrate with Kafka.

Kafka's ever-growing popularity is inseparable from the three roles it plays:

  • Messaging system: Kafka offers everything a traditional messaging system (also called message middleware) provides: system decoupling, redundant storage, traffic peak shaving, buffering, asynchronous communication, scalability, and recoverability. In addition, Kafka provides message ordering guarantees and the ability to rewind consumption, features that most messaging systems find difficult to implement.
  • Storage system: Kafka persists messages to disk, which greatly reduces the risk of data loss compared with memory-based storage systems. Thanks to this persistence and its multi-replica mechanism, Kafka can serve as a long-term data storage system: simply set the corresponding data retention policy to "forever" or enable the topic's log compaction feature.
  • Streaming platform: Kafka not only provides a reliable data source for every popular stream-processing framework, but also ships with a complete stream-processing library supporting operations such as windowing, joins, transformations, and aggregations.

1|0 Basic concepts

A typical Kafka architecture includes several Producers, several Brokers, several Consumers, and a ZooKeeper cluster, as shown in the figure below. ZooKeeper is used by Kafka to manage cluster metadata, elect the controller, and perform other coordination tasks. Producers send messages to Brokers, Brokers store the received messages on disk, and Consumers subscribe to Brokers and consume messages from them.

The overall Kafka architecture introduces the following 3 terms:

  • Producer: the party that sends messages. The producer creates messages and delivers them to Kafka.
  • Consumer: the party that receives messages. A consumer connects to Kafka, receives messages, and performs the corresponding business-logic processing.
  • Broker: a service proxy node. For Kafka, a Broker can simply be regarded as an independent Kafka service node or Kafka service instance. In most cases a Broker can also be thought of as a Kafka server, provided that only one Kafka instance is deployed on that server. One or more Brokers form a Kafka cluster. By convention, the lowercase form "broker" is used to denote a service proxy node.

Kafka also has two particularly important concepts: the topic (Topic) and the partition (Partition). Messages in Kafka are organized by topic: producers send each message to a specific topic (every message sent to a Kafka cluster must be assigned a topic), and consumers subscribe to topics and consume their messages.

2|0 Client development

A normal production flow requires the following steps:

  1. Configure the producer client parameters and create the corresponding producer instance.
  2. Build the message to be sent.
  3. Send the message.
  4. Close the producer instance.
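The four steps above can be sketched as follows. This is a minimal illustration, not the original listing: the broker address, topic name, and client id values are assumptions made for the example.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerDemo {
    // Hypothetical values, for illustration only
    static final String BROKER_LIST = "localhost:9092";
    static final String TOPIC = "topic-demo";

    // Step 1a: configure the producer client parameters
    static Properties initConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", BROKER_LIST);
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("client.id", "producer.client.id.demo");
        return props;
    }

    public static void main(String[] args) {
        // Step 1b: create the producer instance
        KafkaProducer<String, String> producer = new KafkaProducer<>(initConfig());
        // Step 2: build the message to be sent
        ProducerRecord<String, String> record =
                new ProducerRecord<>(TOPIC, "Hello, Kafka!");
        try {
            // Step 3: send the message (fire-and-forget style)
            producer.send(record);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Step 4: close the producer instance, flushing buffered messages
            producer.close();
        }
    }
}
```

Running `main` requires a reachable broker; the configuration itself can be built offline.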

The message object built here is a ProducerRecord. It is not just the message itself: it contains multiple attributes, and the business-related message body to be sent is only its value attribute. For example, "Hello, Kafka!" is just one attribute of a ProducerRecord object. The ProducerRecord class is defined as follows (only the member variables are shown):
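The member variables of ProducerRecord in the kafka-clients library are (constructors and accessor methods omitted):

```java
import org.apache.kafka.common.header.Headers;

public class ProducerRecord<K, V> {
    private final String topic;      // topic the record is sent to
    private final Integer partition; // target partition number, may be null
    private final Headers headers;   // message headers
    private final K key;             // message key
    private final V value;           // message body
    private final Long timestamp;    // message timestamp
    // constructors and accessor methods omitted
}
```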

Here the topic and partition fields represent the topic and the partition number to which the message will be sent. The headers field holds the message headers, a property introduced in Kafka 0.11.x; it is mostly used to carry application-specific metadata and can be left unset if not needed. key is the key of the message. It is not merely extra information attached to the message: it can also be used to compute the partition number so that the message is sent to a specific partition. As mentioned earlier, messages are classified by topic; the key classifies them a second time, and all messages with the same key are routed to the same partition.
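The key-to-partition routing idea can be illustrated with a simplified sketch. Note the hedge: Kafka's real default partitioner hashes the *serialized* key with MurmurHash2; the hashCode-based version below only illustrates the principle that equal keys map to equal partitions.

```java
public class KeyPartitionSketch {
    // Simplified illustration: hash the key and take it modulo the number
    // of partitions, so the same key always maps to the same partition.
    // (Kafka's actual DefaultPartitioner uses murmur2 on the serialized key.)
    static int partitionFor(String key, int numPartitions) {
        // Math.floorMod avoids negative results for negative hash codes
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int p1 = partitionFor("order-42", 4);
        int p2 = partitionFor("order-42", 4);
        // The same key always lands in the same partition
        System.out.println(p1 == p2); // prints "true"
    }
}
```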

3|0 Required parameter settings

Before creating an actual producer instance, the corresponding parameters need to be configured, such as the address of the Kafka cluster to connect to. Referring to the initConfig() method in the client code above, the Kafka producer client KafkaProducer has 3 required parameters.

  • bootstrap.servers: specifies the list of broker addresses the producer client uses to connect to the Kafka cluster, in the format host1:port1,host2:port2. One or more addresses can be set, separated by commas; the default value is "". Note that the list does not need to contain all brokers, because the producer will discover the other brokers from any given one. However, it is recommended to set at least two broker addresses, so that the producer can still connect to the cluster when any one of them goes down.
  • key.serializer and value.serializer: messages received by the broker must exist in the form of byte arrays (byte[]). The generics <String, String> in KafkaProducer<String, String> and ProducerRecord<String, String> in code listing 3-1 correspond to the types of the message key and value; they make the client code more readable, but before a message is sent to the broker its key and value must be serialized into byte arrays. key.serializer and value.serializer specify the serializers that perform these serialization operations, and neither parameter has a default value.

The initConfig() method in the client code above also sets the client.id parameter, which specifies the client id of the KafkaProducer; its default value is "". If the client does not set it, KafkaProducer automatically generates a non-empty string such as "producer-1" or "producer-2", i.e. the prefix "producer-" combined with a number.
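The auto-generated ids follow a simple counter pattern, which can be sketched as follows (the class and method names below are illustrative, not Kafka's actual internals):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ClientIdSketch {
    // Illustrative only: a process-wide counter yielding
    // "producer-1", "producer-2", ... for producers without a client.id
    private static final AtomicInteger CLIENT_ID_SEQUENCE = new AtomicInteger(1);

    static String nextClientId() {
        return "producer-" + CLIENT_ID_SEQUENCE.getAndIncrement();
    }

    public static void main(String[] args) {
        System.out.println(nextClientId()); // first call prints "producer-1"
    }
}
```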

KafkaProducer has many more parameters; the example initConfig() method sets only 4 of them. Developers can modify the default values of these parameters according to the actual needs of their business applications, allowing flexible deployment. In general, ordinary developers cannot remember every parameter name and retain only a general impression of them.

In actual use, strings such as "key.serializer", "max.request.size", and "interceptor.classes" are easily misspelled. As a preventive measure, we can use the org.apache.kafka.clients.producer.ProducerConfig class directly: every parameter has a corresponding constant name in ProducerConfig. Taking the initConfig() method in code listing 3-1 as an example, introducing ProducerConfig gives the following result:
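A sketch of initConfig() rewritten with the ProducerConfig constants (the broker address and client id are illustrative assumptions, not values from the original listing):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class ProducerConfigDemo {
    static final String BROKER_LIST = "localhost:9092"; // illustrative value

    static Properties initConfig() {
        Properties props = new Properties();
        // ProducerConfig constants replace the error-prone raw strings
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BROKER_LIST);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "producer.client.id.demo");
        return props;
    }
}
```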

Notice that in the code above, the fully qualified class names assigned to the key.serializer and value.serializer parameters are quite long and still easy to get wrong. This can be improved further with a Java trick, obtaining the class name programmatically, as follows:
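A sketch of the improved version (illustrative broker address and client id again), replacing the long string literals with StringSerializer.class.getName():

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerConfigDemo2 {
    static final String BROKER_LIST = "localhost:9092"; // illustrative value

    static Properties initConfig() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BROKER_LIST);
        // getName() yields the fully qualified class name with no typo risk
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                StringSerializer.class.getName());
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "producer.client.id.demo");
        return props;
    }
}
```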

This makes the code much simpler and further reduces the chance of human error. Once the parameters are configured, we can use them to create a producer instance, for example:
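Creating the instance is then a single constructor call. A minimal self-contained sketch (the broker address is an illustrative assumption; no broker connection is needed just to construct the instance):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class CreateProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        // minimal required configuration (illustrative broker address)
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // create the producer instance from the configured parameters
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}
```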

KafkaProducer is thread safe: multiple threads can share a single KafkaProducer instance, or KafkaProducer instances can be pooled for other threads to use.

KafkaProducer has several constructors. For example, if the key.serializer and value.serializer configuration parameters are not set when creating the KafkaProducer instance, the corresponding serializers can be passed to the constructor instead, as follows:
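A sketch using the constructor overload that accepts serializer instances (the broker address is an illustrative assumption):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.serialization.StringSerializer;

public class SerializerConstructorDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative address
        // key.serializer / value.serializer are deliberately NOT set here;
        // the serializer instances are passed to the constructor instead
        KafkaProducer<String, String> producer = new KafkaProducer<>(
                props, new StringSerializer(), new StringSerializer());
        producer.close();
    }
}
```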
