In-depth interpretation: Kafka abandons ZooKeeper, and a second revolution in messaging systems begins

InfoQ 2021-04-16 14:26:48


Author | Tina
Interviewees | Han Xin, Wang Guozhang

> "I'm very excited about this release, but the nature of our business means we can't stop the world to upgrade..."

On March 30, Confluent, the company behind Kafka, announced on its blog that in the upcoming 2.8 release, users will be able to run Kafka without needing ZooKeeper at all. The release replaces the ZooKeeper-dependent controller with a Quorum controller based on Kafka Raft.
In earlier versions, Kafka simply would not run without ZooKeeper. But managing and deploying two different systems not only doubles the operational complexity, it also makes Kafka heavyweight and limits its use in lightweight environments; at the same time, ZooKeeper's design caps the number of partitions a Kafka cluster can carry.

Starting in 2019, Confluent began planning to replace ZooKeeper. This was a fairly large project; after more than nine months of development, the early-access code for KIP-500 has been committed to trunk.

For the first time, users can run Kafka without ZooKeeper.

This is a major architectural upgrade: the perennially "heavyweight" Kafka has finally become simple.
A lightweight, single-process deployment can serve as an alternative to systems such as ActiveMQ or RabbitMQ, and it also suits edge scenarios and deployments on lightweight hardware.

## Why abandon ZooKeeper after ten years of use

ZooKeeper is a Hadoop sub-project, generally used to manage large, complex server clusters, with its own configuration-file syntax, management tools, and deployment patterns. Kafka was originally developed at LinkedIn and open-sourced in early 2011; in 2014 its main creators founded Confluent.

Brokers are the backbone of a Kafka cluster, responsible for receiving, storing, and delivering messages on the path from producers to consumers.
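The broker's core contract described above, accepting records from producers, storing them, and serving them to consumers, can be sketched as a minimal in-memory commit log. This is an illustrative toy, not Kafka's implementation: a real broker persists segments to disk, replicates them, and serves many partitions.

```python
class PartitionLog:
    """Toy stand-in for one Kafka partition on a broker:
    an append-only sequence of records addressed by offset."""

    def __init__(self):
        self._records = []

    def append(self, record: bytes) -> int:
        """Producer path: append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def fetch(self, offset: int, max_records: int = 100) -> list[bytes]:
        """Consumer path: read records starting at a given offset."""
        return self._records[offset : offset + max_records]


log = PartitionLog()
log.append(b"order-created")
log.append(b"order-paid")
print(log.fetch(0))  # consumers pull by offset, so re-reading is cheap
```

Because consumers address data by offset rather than by acknowledgement, the same stored records can be replayed by many independent consumer groups.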
Under the current architecture, a Kafka process has to register information such as its BrokerId with the ZooKeeper cluster in order to form a cluster. ZooKeeper provides Kafka with reliable metadata storage: topic/partition metadata, broker data, ACL information, and so on.

At the same time, ZooKeeper acts as the Kafka cluster's coordinator, recording topology changes; through the notifications ZooKeeper provides, producers and consumers discover whether a new broker has appeared in the cluster or an existing broker has failed.
Most o & M operations , For example, capacity expansion 、 Partition migration and so on , Both need and "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"ZooKeeper"},{"type":"text","text":" Interaction ."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" in other words ,Kafka A large part of the code base is responsible for implementing multiple Broker Partition between ( It's a journal )、 Assign leadership 、 The function of distributed system such as fault handling . And has already been widely used and verified by the industry ZooKeeper Is a key part of distributed code work ."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" Suppose there is no ZooKeeper Words ,Kafka Can't even start the process . 
Han Xin, technical director of the microservice product center at Tencent Cloud Middleware, told InfoQ: "In previous releases, ZooKeeper could fairly be called the soul of a Kafka cluster."

But the heavy dependence on ZooKeeper also shackles Kafka. Throughout Kafka's history, two topics have been unavoidable: the operational complexity of clusters, and the number of partitions a single cluster can carry. Tencent Cloud Kafka, for example, maintains Kafka clusters with tens of thousands of nodes, Han Xin said, and these are its two main problems.

First, from the cluster-operations perspective: Kafka is itself a distributed system, yet it depends on another open-source distributed system, and that system sits at the very core of Kafka.
This requires the people who build and maintain the cluster to understand both open-source systems at once, with enough knowledge and hands-on experience of each system's operating principles and day-to-day operations (parameter configuration, scaling in and out, monitoring and alerting, and so on). Otherwise, when the cluster has a problem it cannot be recovered, which is unacceptable. ZooKeeper therefore raises the cost of operations.

Second, from the cluster-scale perspective: a core metric limiting the size of a Kafka cluster is the number of partitions it can carry.
The partition count affects the cluster in two main ways: the amount of metadata stored in ZooKeeper, and the efficiency of controller changes.

A Kafka cluster relies on a single Controller node to handle the vast majority of ZooKeeper reads, writes, and operational tasks, and that node caches all of the ZooKeeper metadata locally. As the partition count grows, so does the metadata ZooKeeper must store, which increases the load on the ZooKeeper cluster and can cause Watch notifications to be delayed or lost.

When the Controller node changes, the cluster must perform leader switches, Controller re-election, and similar work, and the more partitions there are, the more ZooKeeper operations are needed. For example, when a Kafka node goes down, the Controller has to write to ZooKeeper to migrate every leader partition on that node to other nodes; and when a new Controller starts, it must first read all of the metadata in ZooKeeper into its local cache. The more partitions, the more data, and the longer recovery takes.

The number of partitions a single Kafka cluster can carry matters greatly, even critically, for some businesses. Han Xin added an example: "Tencent Cloud Kafka mainly serves public-cloud users and the company's internal business. We've met many users who need support for millions of partitions, such as Tencent Cloud Serverless, Tencent Cloud's CLS log service, and other customers on the cloud. The scenario they face is that each customer needs a topic for its business logic; when the user count reaches the millions or tens of millions, the topic growth is frightening. Under the current architecture, a single Kafka cluster cannot stably carry millions of partitions. That is also why the new KIP-500 release excites me so much."

## Kafka after removing ZooKeeper

![](https://static001.geekbang.org/infoq/55/559144e8a57729a28dcc5ba34b11077e.png)

To improve Kafka, Confluent began last year to rewrite ZooKeeper's functionality and fold that code into Kafka itself.
They call the new version "Kafka on Kafka", meaning that metadata is stored in Kafka itself rather than in an external system such as ZooKeeper. The Quorum controller uses the new KRaft protocol to ensure that metadata is accurately replicated across the quorum. The protocol is similar in many ways to ZooKeeper's ZAB protocol and to Raft. It means the quorum controller never needs to load state from ZooKeeper: when leadership changes, the new active controller already has every committed metadata record in memory.

With ZooKeeper removed, the operational complexity of a Kafka cluster is immediately halved.

Before this architectural improvement, a minimal Kafka cluster meeting the distributed-consensus requirements needed six heterogeneous nodes: three ZooKeeper nodes and three Kafka nodes. Even the simplest Quickstart demo required starting a ZooKeeper process before starting a Kafka process.
In the new KIP-500 version, a distributed Kafka cluster needs only three nodes, and the Quickstart demo needs just a single Kafka process.

The improvement also raises the cluster's scalability, greatly increasing the number of partitions a single Kafka cluster can carry.

Before this, metadata management had always been the main bottleneck limiting cluster scale. Especially for larger clusters, operations such as the Controller elections triggered by node failures, leader-partition migration, and reading all ZooKeeper metadata into the local cache were all constrained by the read/write bandwidth of a single Controller.
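That contrast matters most at failover time: a ZooKeeper-backed controller must re-read the whole metadata tree when it takes over, whereas in the quorum design every controller continuously applies committed records from the shared metadata log, so a successor already holds the state in memory. Here is a deliberately simplified model of the quorum side (an illustration of the idea, not the actual KRaft implementation):

```python
class QuorumController:
    """Toy model of one controller in a KRaft-style quorum: each member
    keeps a materialized view built from committed metadata records."""

    def __init__(self, name: str):
        self.name = name
        self.metadata: dict[str, dict] = {}  # e.g. partition -> leader info

    def apply(self, record: tuple[str, dict]) -> None:
        key, value = record
        self.metadata[key] = value


def replicate(log: list, quorum: list) -> None:
    """Commit records and apply them on every member; this stands in
    for Raft replication of the metadata log."""
    for record in log:
        for controller in quorum:
            controller.apply(record)


quorum = [QuorumController(n) for n in ("c1", "c2", "c3")]
committed = [("orders-0", {"leader": 1}), ("orders-1", {"leader": 2})]
replicate(committed, quorum)

# Failover: any follower can take over with no bulk reload,
# because its materialized view is already up to date.
leader, successor = quorum[0], quorum[1]
assert successor.metadata == leader.metadata
```

In the ZooKeeper design the equivalent of `successor.metadata` starts empty and must be filled by reading the external store, which is exactly the step whose cost grows with the partition count.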
As a result, the total number of partitions one Kafka cluster can manage is also limited by the efficiency of that single Controller.

Wang Guozhang, chief engineer for streaming data at Confluent, explains: "In KIP-500 we replace the single Controller that talks to ZooKeeper with a Quorum Controller. Every Controller node in the quorum replicates all metadata through the Raft mechanism, and when the leader among them writes new metadata, it can use batch writes to the Raft log to improve efficiency. Our [experiments show](https://t.co/SNQNtJ4xqb?amp=1) that in a cluster managing two million partitions, the Quorum Controller can cut the migration process from several minutes down to 30 seconds."

## Does the upgrade require downtime?

Reducing dependencies and expanding single-cluster capacity is unquestionably a positive direction. The current version has not yet been tested under heavy traffic and may have stability problems, which worries some developers; but in the long run, KIP-500 is a great boon for the community and its users.

Once the version is stable, the conversation eventually turns to "upgrading". How to upgrade, though, is a new problem: in many Kafka scenarios, business downtime is simply not allowed. Han Xin offered Tencent Cloud's business as an example: "The message pipeline of WeChat's security services runs on Tencent Cloud Kafka. If there were an outage, some automated security services would be affected, which would seriously hurt the customer experience."
"In our experience, 'downtime upgrade' is a very sensitive phrase at Tencent Cloud. Tencent Cloud Kafka has been running for six or seven years and has served hundreds of internal and external customers, and there has never yet been a downtime upgrade. If an upgrade required downtime, it would certainly affect customers' businesses, with the scope of impact depending on how critical each business is. From a cloud-service perspective, the continuity of every customer's business is paramount and cannot be compromised."

For Confluent, then, a smooth upgrade path with no downtime is a must.

According to Wang Guozhang: "The current design is that, after 2.8, version 3.0 will be a special bridge release, in which the Quorum Controller coexists with the old ZooKeeper-based Controller; we will remove the old Controller module in later versions."

For users, this means that to go from 2.8 to a post-3.0 release, say 3.1, you perform two "hops" through 3.0: first upgrade to 3.0, then do another smooth online upgrade to 3.1.
Throughout the process, the Kafka servers can keep talking to clients of all older versions, without forcing clients to upgrade.

Companies like Tencent, meanwhile, will keep validating upgrades through gray releases. Han Xin said: "Generally we don't perform large-scale upgrades on an existing cluster. Instead we build a new cluster, let the businesses with the most urgent needs cut over first for gray-release verification, and then, while making sure online business stays stable, gradually grow the new cluster, upgrading and migrating business onto the new architecture step by step."

## A second revolution in messaging systems?
After Apache Kafka appeared, it quickly outpaced other messaging systems and became the mainstream choice. Launched in 2011, it has had ten years of development and large-scale deployment, so why decide only now to replace ZooKeeper with a Raft-based protocol? Wang Guozhang told InfoQ that Raft is a consensus algorithm that has become popular only in recent years; when Kafka was first designed (in 2011), not only was there no Raft, there was no mature, general-purpose consensus library at all. The most direct design at the time was to build on a highly available coordination service like ZooKeeper.

Over these ten years, much of the software in the Hadoop ecosystem has been abandoned. Now, as a member of that ecosystem, is ZooKeeper itself becoming obsolete?
Han Xin rejected the idea: "Every architecture or piece of software has scenarios it suits; I don't endorse the word 'obsolete'."

From a technical point of view, the wheel of history keeps rolling forward: the theoretical foundations in academia and industry keep evolving, and technology must keep evolving to serve continuously innovating businesses. Undeniably, some software will be replaced by newer software, or newer software will fit certain scenarios better; consider the evolution of Storm, Spark, and Flink in the stream-computing field. But the right component will always find its right place, and putting it there is the work and responsibility of architects and engineers.

Kafka's architecture has kept improving to meet users' needs, for example by introducing features such as automatic scaling and multi-tenancy; but this particular overhaul still has much to prove, and a user on Hacker News raised a soul-searching question: "If you were designing a new system today, what would be the reason to choose Kafka instead of Pulsar?"

Colin McCabe of Confluent replied to the controversy: at the very least, getting rid of ZooKeeper is a hard challenge that Pulsar will also have to face. Dropping the ZooKeeper dependency is a big selling point for Kafka, because it leaves Kafka with a single component, the broker, whereas Pulsar needs ZooKeeper, bookies, brokers (or proxies), and other components. Yet it is precisely because Pulsar pulls storage out into a separate layer (the bookies) that this rising star enjoys a natural architectural advantage in "compute-storage separation".

Overall, as enterprises accelerate their move to the cloud, messaging systems, whether Kafka or Pulsar, must adapt to the cloud-native trend, and separating compute from storage is also Kafka's next strategic step. In KIP-405, Confluent proposes a [tiered storage mode](https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage) that exploits the multiple storage media available in cloud architectures: it separates the compute layer from the storage layer, and "cold data" from "hot data", making partition expansion, shrinkage, and migration more efficient and cheaper, while also letting Kafka retain data streams for, in theory, arbitrarily long periods.

Looking at the larger trend, cloud-native technology has a profound impact on messaging systems. Containerization and large-scale cloud disks offer new ways past the resource bottlenecks that have capped single-cluster performance. Containerized brokers, massive cloud disks for storing messages, and KRaft's removal of the ZooKeeper bound on metadata management: do these add up to a second revolution in messaging systems? Will containerized messaging systems bring more operational automation? Is Serverless the future of messaging?
The development of the cloud and of cloud-native technology has given the traditional messaging system new wings, and it opens up a new space for imagination."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":" Interviewed guests:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":" Han Xin"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", technical director of the Microservice Product Center in Tencent Cloud Middleware, in charge of products such as the microservice platform TSF, the message queues CKafka \/ TDMQ, and the microservice observability platform TSW. Responsible for the planning, architecture, and implementation of middleware products, with more than 13 years of R&D and architecture experience.
He currently focuses on cloud computing middleware, committed to integrating PaaS technical resources and building a microservice-based technology platform that provides foundational support for the digital transformation of enterprises."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":" Wang Guozhang"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", currently a principal engineer in the streaming data department at Confluent, a member of the Apache Kafka Project Management Committee (PMC), and the author of Kafka Streams. He received his bachelor's degree from the computer science department of Fudan University and his doctorate from the computer science department of Cornell University, with research focused on database management and distributed data systems. He previously worked at LinkedIn as a senior engineer in the data architecture group, mainly responsible for the real-time data processing platform, including the development and maintenance of the Apache Kafka and Apache Samza systems."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]}]}
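To make the headline change concrete, here is a minimal sketch of what a ZooKeeper-less (KRaft) broker configuration looks like, based on the early-access material shipped with Kafka 2.8. The node id, ports, and log directory are illustrative assumptions; exact property names and defaults should be checked against the official 2.8 release notes.

```properties
# One node playing both roles: broker and Raft quorum controller
process.roles=broker,controller
node.id=1
# The controller quorum that replaces the ZooKeeper ensemble
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
inter.broker.listener.name=PLAINTEXT
controller.listener.names=CONTROLLER
log.dirs=/tmp/kraft-combined-logs
```

Note what is absent: there is no `zookeeper.connect` setting anywhere in the file. Unlike a ZooKeeper-based deployment, the storage directory must be formatted with a cluster ID before the first start (in the 2.8 early access this is done with the `kafka-storage.sh` tool).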
Copyright notice
This article was created by [InfoQ]. Please include a link to the original when reposting. Thanks.
https://javamana.com/2021/04/20210416130941166i.html
