[Big Data Beeps 20210124] Someone asked me about Kafka Leader election? I didn't panic

Big data is fun 2021-02-23 16:45:33
big data beeps kafka leader

A message is considered committed only after every Follower in the ISR has copied it from the Leader. This prevents data loss in the case where a message has been written to the Leader but the Leader goes down before any Follower has replicated it. The Producer, for its part, can choose whether to wait for a message to be committed; this is set through request.required.acks (named acks in the newer Java producer). This mechanism ensures that as long as the ISR contains one or more Followers, a committed message will not be lost.
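One way to picture the commit rule is as a minimum over the ISR: an offset counts as committed once every in-sync replica holds it. The sketch below is only an illustration of that idea. The function `committed_offset` and the sample offsets are hypothetical, not Kafka APIs, and "log end offset" here means the next offset a replica would write.

```python
def committed_offset(log_end_offsets, isr):
    """Return the offset boundary that every ISR member has replicated.

    log_end_offsets: dict mapping replica id -> next offset that replica will write
    isr: iterable of replica ids currently in sync (the Leader included)
    Messages with an offset below the returned value count as committed.
    """
    isr = list(isr)
    if not isr:
        raise ValueError("ISR is empty; nothing can be committed")
    return min(log_end_offsets[r] for r in isr)


# The Leader (replica 0) has written 10 messages; the Followers lag behind.
offsets = {0: 10, 1: 8, 2: 5}

# Replica 2 has dropped out of the ISR, so it does not hold back the commit point.
print(committed_offset(offsets, isr=[0, 1]))     # 8
print(committed_offset(offsets, isr=[0, 1, 2]))  # 5
```

A slow replica that stays in the ISR drags the commit point down, which is why removing laggards from the ISR keeps commits moving.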

What is the ISR? See: [Big Data Beeps 20210123] Don't ask; if you ask, the answer is Kafka high reliability

A very important question is how to elect a new Leader from among the Followers when the Leader goes down. Because a Follower may be far behind or may have crashed outright, the election must pick the most up-to-date Follower as the new Leader. One basic principle is that if the Leader is gone, the new Leader must hold every message the original Leader committed. This calls for a trade-off: if the Leader waits for more Followers to acknowledge a message before marking it committed, then more Followers are eligible to become the new Leader after it goes down, but throughput drops as well.

A very common way to elect a Leader is "majority vote" (the minority defers to the majority), but Kafka does not use it. In this mode, if we have 2f+1 replicas, then f+1 replicas must have copied a message before it is committed, and to guarantee that a new Leader can be elected correctly, no more than f replicas may fail. This approach has one big advantage: the latency of the system depends on the fastest machines. For example, with 3 replicas the latency is determined by the fastest Follower, not the slowest one. Majority vote also has disadvantages. To keep Leader election working, the number of Follower failures it can tolerate is relatively small: tolerating 1 failed Follower requires at least 3 replicas, and tolerating 2 failed Followers requires at least 5. In other words, achieving high fault tolerance in production requires many replicas, and many replicas cause performance to drop sharply when data volumes are large. This is why the algorithm is mostly used in systems such as Zookeeper that hold shared cluster configuration, and rarely in systems that must store large amounts of data. The HA feature of HDFS is also based on majority vote, but its data storage is not.
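The latency argument can be made concrete with a small calculation. The following is a minimal sketch, not Kafka or Zookeeper code; `majority_quorum` and `commit_latency_ms` are hypothetical helpers, and the acknowledgement times are made-up numbers.

```python
def majority_quorum(replicas):
    """Copies that must hold a message before commit under majority vote."""
    return replicas // 2 + 1


def commit_latency_ms(follower_ack_ms, copies_required):
    """Time until `copies_required` copies exist, counting the Leader's own copy.

    follower_ack_ms: acknowledgement time of each Follower, in milliseconds
    """
    followers_needed = copies_required - 1  # the Leader already holds the message
    if followers_needed <= 0:
        return 0
    if followers_needed > len(follower_ack_ms):
        raise ValueError("not enough Followers to reach the required copies")
    # Only the fastest `followers_needed` Followers have to answer.
    return sorted(follower_ack_ms)[followers_needed - 1]


acks = [5, 40]            # one fast Follower, one slow Follower
replicas = len(acks) + 1  # 3 replicas including the Leader

# Majority vote: 2 of 3 copies, so only the fastest Follower matters.
print(commit_latency_ms(acks, majority_quorum(replicas)))  # 5

# Waiting for every replica: the slowest Follower sets the latency.
print(commit_latency_ms(acks, replicas))                   # 40
```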

In fact, there are many Leader election algorithms, such as Zookeeper's Zab, Raft, and Viewstamped Replication. The Leader election algorithm Kafka uses is closer to Microsoft's PacificA algorithm.

Kafka dynamically maintains an ISR in Zookeeper for each Partition. Every replica in this ISR has caught up with the Leader, and only members of the ISR are eligible to be elected Leader (with unclean.leader.election.enable=false). In this mode, with f+1 replicas, a Kafka Topic can tolerate f failed replicas without losing any committed message, which is very beneficial in most usage scenarios. In fact, to tolerate f failed replicas, majority vote and ISR wait for the same number of replicas before committing, but the total number of replicas ISR requires is almost half of what majority vote requires.
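The replica counts behind that comparison follow directly from the 2f+1 and f+1 figures in the text. The short sketch below just tabulates them; the function names are illustrative only.

```python
def replicas_for_majority(f):
    """Total replicas majority vote needs to tolerate f failures."""
    return 2 * f + 1


def replicas_for_isr(f):
    """Total replicas the ISR approach needs to tolerate f failures."""
    return f + 1


def copies_before_commit(f):
    """Copies both schemes wait for before a message is committed."""
    return f + 1


print("f  majority  isr  copies_before_commit")
for f in (1, 2, 3):
    print(f, replicas_for_majority(f), replicas_for_isr(f), copies_before_commit(f))
# 1 3 2 2
# 2 5 3 3
# 3 7 4 4
```

The price the ISR approach pays is latency: it waits for every in-sync replica rather than only the fastest ones.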

As mentioned above, as long as the ISR contains at least one Follower, Kafka can guarantee that committed data is not lost. But if every replica of a Partition goes down, there is no longer any guarantee against data loss. In this case there are two possible approaches:

  • Wait for any replica in the ISR to come back to life, and elect it as Leader
  • Elect the first replica that comes back to life (not necessarily one in the ISR) as Leader

This is a straightforward trade-off between availability and consistency. If you insist on waiting for a replica in the ISR to come back, the period of unavailability may be relatively long, and if none of the ISR replicas can come back, or their data is lost, the Partition will never be available again. If you elect the first replica that comes back and that replica is not in the ISR, it becomes Leader and serves as the data source for Consumers even though it is not guaranteed to contain every committed message. Older Kafka versions used the second strategy by default (unclean.leader.election.enable=true); since version 0.11.0.0 the default is false, which is the first strategy. Either way, the parameter can be set explicitly to choose a strategy.
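The two strategies can be sketched as one small selection function. This is a simplified model rather than the controller's actual code: `elect_leader` is a hypothetical name, and preferring replicas in assignment order is an assumption made for the example.

```python
def elect_leader(assigned, isr, alive, unclean_election_enabled):
    """Pick a new Leader for one Partition, or None if nobody is eligible.

    assigned: replica ids in assignment order (the AR list)
    isr: set of replica ids that were in sync when the Leader failed
    alive: set of replica ids whose brokers are currently up
    unclean_election_enabled: models unclean.leader.election.enable
    """
    # Strategy 1: only a live ISR member may lead, so committed data is kept.
    for replica in assigned:
        if replica in isr and replica in alive:
            return replica
    # Strategy 2: fall back to any live replica, accepting possible data loss.
    if unclean_election_enabled:
        for replica in assigned:
            if replica in alive:
                return replica
    return None


assigned, isr = [0, 1, 2], {0, 1}

# A live ISR member exists, so the flag makes no difference.
print(elect_leader(assigned, isr, alive={1, 2}, unclean_election_enabled=False))  # 1

# Only the out-of-sync replica 2 is alive.
print(elect_leader(assigned, isr, alive={2}, unclean_election_enabled=False))     # None
print(elect_leader(assigned, isr, alive={2}, unclean_election_enabled=True))      # 2
```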

The unclean.leader.election.enable parameter has a crucial impact on Leader election, system availability, and data reliability. Let's analyze several typical scenarios.

As shown in the figure above, suppose a Partition has 3 replicas, replica-0, replica-1, and replica-2, stored on Broker0, Broker1, and Broker2 respectively. AR=(0,1,2), ISR=(0,1). The settings are request.required.acks=-1, min.insync.replicas=2, unclean.leader.election.enable=false. Here the replica on Broker0 is also referred to simply as Broker0. Initially Broker0 is the Leader and Broker1 is a Follower.

  • When replica-0 in the ISR crashes, Broker1 is elected as the new Leader [ISR=(1)]. Because of min.insync.replicas=2, writes cannot be served, but reads continue to be served normally. The recovery plan in this case:

  1. Try to recover (restart) replica-0. If it comes back up, the system returns to normal;
  2. If replica-0 cannot be recovered, set min.insync.replicas to 1 to restore writes.
  • When replica-0 in the ISR crashes and then replica-1 crashes as well, the state is [ISR=(1),leader=-1] and the Partition cannot serve external requests. The recovery plan in this case:

  1. Try to recover replica-0 and replica-1. If both come back up, the system returns to normal;
  2. If replica-0 comes back up but replica-1 does not, a Leader still cannot be elected. With unclean.leader.election.enable=false, the Leader can only be elected from the ISR, and once every replica in the ISR has failed, the last replica to fail must recover before a Leader can be elected. Since replica-0 failed first and replica-1 failed afterwards, replica-1 must recover before an election can happen. A conservative suggestion is to set unclean.leader.election.enable=true, accepting that data will be lost; this restores reads. min.insync.replicas likewise needs to be set to 1 to restore writes;
  3. If replica-1 recovers but replica-0 does not, this is the situation already covered above: reads are available, and min.insync.replicas needs to be set to 1 to restore writes;
  4. If neither replica-0 nor replica-1 can recover, handle it as in case 2.
  • When replica-0 and replica-1 in the ISR go down at the same time, the state is [ISR=(0,1)] and the Partition cannot serve external requests. The recovery plan in this case: try to recover replica-0 and replica-1. As soon as either replica returns to normal, reads can be served. Writes are restored only once both replicas are back to normal, or once min.insync.replicas is set to 1.
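The read and write availability in these scenarios follows one simple rule, which the sketch below tries to capture. It is a simplified model assuming a Producer that uses request.required.acks=-1; `partition_status` is a hypothetical helper, and `leader=None` stands for the leader=-1 state.

```python
def partition_status(isr, leader, min_insync_replicas):
    """Return (can_read, can_write) for one Partition.

    isr: set of replica ids currently in the ISR
    leader: id of the current Leader, or None when there is no Leader
    min_insync_replicas: models min.insync.replicas
    """
    # Reads only need a Leader to exist.
    can_read = leader is not None
    # Writes with acks=-1 also need enough in-sync replicas.
    can_write = can_read and len(isr) >= min_insync_replicas
    return can_read, can_write


# Normal operation: ISR=(0,1), Broker0 is the Leader.
print(partition_status({0, 1}, leader=0, min_insync_replicas=2))  # (True, True)

# replica-0 crashed: ISR=(1), Broker1 is the Leader, writes are rejected.
print(partition_status({1}, leader=1, min_insync_replicas=2))     # (True, False)

# Same state after lowering min.insync.replicas to 1: writes work again.
print(partition_status({1}, leader=1, min_insync_replicas=1))     # (True, True)

# replica-1 crashed as well: no Leader, so no reads and no writes.
print(partition_status({1}, leader=None, min_insync_replicas=2))  # (False, False)
```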

This article comes from the WeChat official account Big data is fun (havefun_bigdata).


Original publication time: 2021-01-24


This article was written by [Big data is fun]. Please include a link to the original when reposting. Thank you.
