ZooKeeper Principles

Xixi 2021-01-23 11:05:32

1、ZooKeeper roles

  • Leader: initiates votes and decides their outcome, and updates the system state.
  • Learner: includes followers and observers. A follower accepts client requests, returns results to the client, and takes part in voting.
  • Observer: accepts client connections and forwards write requests to the leader, but does not take part in voting; it only synchronizes the leader's state. Observers exist to scale out the system and improve read speed.
  • Client: the originator of requests.

At the core of ZooKeeper is atomic broadcast, the mechanism that keeps the servers synchronized. The protocol that implements it is called Zab. Zab has two modes: recovery mode (leader election) and broadcast mode (synchronization). When the service starts, or after the leader crashes, Zab enters recovery mode. Recovery mode ends once a leader has been elected and a majority of servers have finished synchronizing their state with the leader. State synchronization ensures that the leader and the servers have the same system state.

To guarantee that transactions are ordered consistently, ZooKeeper identifies each transaction with an increasing transaction ID (zxid). Every proposal is stamped with a zxid when it is issued. In the implementation, a zxid is a 64-bit number. The high 32 bits are the epoch, which identifies changes of leadership: each time a leader is elected it gets a new epoch, marking the period in which that leader is in charge. The low 32 bits are an incrementing counter.
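The bit layout described above can be sketched in a few lines of Python. This is only an illustration of the 32/32 split; the names `make_zxid`, `split_zxid`, `COUNTER_BITS`, and `COUNTER_MASK` are made up for this example and are not part of ZooKeeper's API.

```python
COUNTER_BITS = 32                        # width of the low, incrementing part
COUNTER_MASK = (1 << COUNTER_BITS) - 1   # mask selecting the low 32 bits


def make_zxid(epoch, counter):
    """Pack an epoch (high 32 bits) and a counter (low 32 bits) into one zxid."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def split_zxid(zxid):
    """Return (epoch, counter) for a 64-bit zxid."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK


print(hex(make_zxid(2, 5)))      # 0x200000005
print(split_zxid(0x200000005))   # (2, 5)
```

Because the epoch sits in the high bits, any zxid issued under a newer leader compares greater than every zxid issued under an older one, regardless of the counter value.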

• Each server is in one of three states while it runs:
LOOKING: the server does not know who the leader is and is searching for one
LEADING: the server is the elected leader
FOLLOWING: a leader has been elected and the server is synchronizing with it

 For further details, see: [https://www.cnblogs.com/lpshou/archive/2013/06/14/3136738.html](https://www.cnblogs.com/lpshou/archive/2013/06/14/3136738.html)

2、ZooKeeper's read and write mechanism

* ZooKeeper is a cluster made up of multiple servers
* One leader, multiple followers
* Every server keeps a copy of the data
* Data is globally consistent
* Reads and writes are distributed
* Update requests are forwarded to the leader, which carries them out

3、ZooKeeper guarantees

* Sequential updates: update requests from the same client are executed in the order in which they were sent.
* Atomic updates: a data update either succeeds or fails; there are no partial results.
* Single data view: a client sees the same view of the data no matter which server it connects to.
* Timeliness: within a bounded period of time, a client can read the latest data.

4、ZooKeeper node data operation flow

Notes: 1. The client sends a write request to a follower

2. The follower forwards the request to the leader

3. On receiving it, the leader initiates a vote and notifies the followers to vote

4. Each follower sends its vote to the leader

5. The leader tallies the results; if the write should go ahead, it starts the write, notifies the followers, and then commits

6. The follower returns the result of the request to the client
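The six steps above can be imitated with a toy in-memory model. This is a rough sketch, not ZooKeeper's implementation: the `Server` class and `leader_write` function are hypothetical, networking is replaced by direct method calls, and it assumes the leader counts its own vote and that a write is committed once more than half of the ensemble has acknowledged it.

```python
class Server:
    """Toy server: holds a copy of the data and may be up or down."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.data = {}

    def ack(self, proposal):
        # A server that is down never answers the proposal.
        return self.alive

    def commit(self, key, value):
        if self.alive:
            self.data[key] = value


def leader_write(leader, followers, key, value):
    """Steps 3-5: propose, collect votes, commit if a majority agreed."""
    proposal = (key, value)
    # The leader counts its own vote plus every follower that acknowledges.
    acks = 1 + sum(1 for f in followers if f.ack(proposal))
    ensemble_size = 1 + len(followers)
    if acks * 2 <= ensemble_size:
        return False  # no majority, so the write is not applied anywhere
    leader.commit(key, value)
    for f in followers:
        f.commit(key, value)
    return True


leader = Server("s1")
followers = [Server("s2"), Server("s3", alive=False)]
ok = leader_write(leader, followers, "/config", "v1")
print(ok, followers[0].data)  # True {'/config': 'v1'}
```

With three servers and one of them down, two acknowledgements out of three are still a majority, so the write goes through on the servers that are up.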

• A follower has four main functions:
• 1. Send requests to the leader (PING, REQUEST, ACK, and REVALIDATE messages);
• 2. Receive and process messages from the leader;
• 3. Receive client requests and, if a request is a write, send it to the leader for a vote;
• 4. Return results to the client.
• The follower's message loop handles the following messages from the leader:
• 1. PING: heartbeat message;
• 2. PROPOSAL: a proposal initiated by the leader, which the follower must vote on;
• 3. COMMIT: information about the latest committed proposal on the server side;
• 4. UPTODATE: indicates that synchronization is complete;
• 5. REVALIDATE: depending on the leader's REVALIDATE result, either close the session being revalidated or allow it to receive messages;
• 6. SYNC: return the SYNC result to the client. This message is originally initiated by the client to force an update to the latest state.

5、ZooKeeper leader election

• Majority rule (more than half)
– 3 machines, 1 down: 2 > 3/2, so a majority remains
– 4 machines, 2 down: 2 is not greater than 4/2, so no majority remains
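The two cases above come down to one strict inequality. A minimal sketch (the function name `has_quorum` is made up for this example):

```python
def has_quorum(alive, total):
    """True when strictly more than half of the servers are still up."""
    return alive * 2 > total


print(has_quorum(2, 3))  # True:  2 > 3/2
print(has_quorum(2, 4))  # False: 2 is not greater than 4/2
```

The comparison uses `alive * 2 > total` instead of a division so that it stays in integer arithmetic.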

• A proposes: "I want to elect myself. B, do you agree? C, do you agree?" B says, "I agree to elect A." C says, "I agree to elect A." (Note that A already has more than half of the votes here; in the real world the election would already be over. But the computer world is strict, and to understand the algorithm we continue the simulation.)
• Next, B proposes: "I want to elect myself. A, do you agree?" A says, "More than half have already agreed to elect me, so your proposal is invalid." C says, "More than half have already agreed to elect A, so B's proposal is invalid."
• Next, C proposes: "I want to elect myself. A, do you agree?" A says, "More than half have already agreed to elect me, so your proposal is invalid." B says, "More than half have already agreed to elect A, so C's proposal is invalid."
• The leader has now been elected. Everyone else is a follower and can only obey the leader's commands. One small detail: in practice, whoever starts first ends up in charge.

6、zxid
• The state information of a znode contains czxid. So what is a zxid?
• Every change to ZooKeeper's state corresponds to an increasing transaction ID, called the zxid. Because zxids only increase, if zxid1 is less than zxid2, then zxid1 must have happened before zxid2.

Creating any node, updating the data of any node, or deleting any node changes ZooKeeper's state and therefore increases the zxid.

7、How ZooKeeper works
» At the core of ZooKeeper is atomic broadcast, the mechanism that keeps the servers synchronized. The protocol that implements it is called Zab. Zab has two modes: recovery mode and broadcast mode.
When the service starts, or after the leader crashes, Zab enters recovery mode. Recovery mode ends once a leader has been elected and a majority of servers have synchronized their state with the leader.
State synchronization ensures that the leader and the servers have the same system state.
» Once the leader has synchronized state with a majority of followers, it can start broadcasting messages, that is, it enters broadcast mode. If a server joins the ZooKeeper service at this point, it starts in recovery mode,
finds the leader, and synchronizes its state with the leader. When synchronization finishes, it too takes part in message broadcast. The ZooKeeper service stays in broadcast mode until the leader crashes or loses the support of a majority of followers.

» Broadcast mode must guarantee that proposals are processed in order, so ZooKeeper uses an increasing transaction ID (zxid) to enforce this. Every proposal is stamped with a zxid when it is issued.
In the implementation, a zxid is a 64-bit number. The high 32 bits are the epoch, which identifies changes of leadership: each time a leader is elected it gets a new epoch. The low 32 bits are an incrementing counter.
» When the leader crashes or loses a majority of followers, ZooKeeper enters recovery mode. Recovery mode must elect a new leader and bring all servers back to a correct state.
» After startup, each server asks the other servers whom they intend to vote for.
» When asked by another server, a server replies, according to its own state, with the ID of the leader it recommends and the zxid of its last transaction (at system startup, every server recommends itself).
» After receiving all the replies, the server works out which server has the largest zxid and sets that server as its choice for the next round of voting.
» The server with the most votes in this process is the winner. If the winner has more than half of the votes, it is elected leader. Otherwise the process repeats until a leader is elected.
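One round of the election described above might be sketched as follows. This is a simplified illustration, not ZooKeeper's election code: `pick_vote` and `tally` are hypothetical helpers, replies are modelled as `(server_id, last_zxid)` tuples, and breaking zxid ties in favour of the larger server ID is an assumption added here because the text does not say what happens on a tie.

```python
from collections import Counter


def pick_vote(replies):
    """Choose the server with the largest zxid among (server_id, last_zxid) replies.

    Ties on zxid are broken by the larger server ID (an assumption of this sketch).
    """
    best_id, _ = max(replies, key=lambda r: (r[1], r[0]))
    return best_id


def tally(votes, ensemble_size):
    """Return the winner if it holds more than half of the votes, else None."""
    winner, count = Counter(votes).most_common(1)[0]
    return winner if count * 2 > ensemble_size else None


replies = [(1, 0x100000005), (2, 0x100000007), (3, 0x100000007)]
print(pick_vote(replies))     # 3
print(tally([3, 3, 1], 3))    # 3
print(tally([1, 2, 3], 3))    # None, so another round is needed
```

When `tally` returns `None`, no server holds a majority and the servers would repeat the round, as the last step in the list says.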

» The leader starts waiting for servers to connect
» Each follower connects to the leader and sends it its largest zxid
» The leader determines the synchronization point from the follower's zxid
» When synchronization completes, the leader notifies the follower that it is now up to date
» After receiving the UPTODATE message, the follower can again accept client requests and serve them

8、Data consistency and the Paxos algorithm

• The Paxos algorithm is said to be as famously hard to understand as it is popular. So let's first look at how to keep data consistent. Here is the principle:
• In a distributed database system, if every node starts in the same initial state and every node executes the same sequence of operations, then they all end up in the same consistent state.
• The problem Paxos solves is precisely how to guarantee that every node executes the same sequence of operations. That sounds easy: have a master maintain a global write queue, and require every write to take a number from that queue. Then no matter how many nodes we write to, as long as the writes are numbered, consistency is guaranteed. That is correct, but what if the master goes down?
• Paxos uses voting to number write operations globally. At any given moment only one write is approved; concurrent writes must compete for votes, and only a write that wins more than half of the votes is approved (so only one write is ever approved at a time). Writes that lose must start another round of voting. In this way, round after round, all writes are strictly numbered and ordered. The numbers are strictly increasing: when a node has accepted write number 100 and then receives write number 99 (for unforeseeable reasons such as network delay), it immediately realizes that its data is inconsistent, automatically stops serving requests, and restarts the synchronization process. The failure of any single node does not affect the data consistency of the whole cluster (with 2n+1 machines in total, unless more than n fail).
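The principle above (same initial state plus the same numbered sequence of writes gives the same final state) and the out-of-order check can be shown with a toy replica. This sketch does not implement Paxos voting; it only assumes that writes arrive already numbered, and the `Replica` class and its fields are invented for the example.

```python
class Replica:
    """Toy replica that applies numbered writes strictly in increasing order."""

    def __init__(self):
        self.state = {}
        self.last_applied = 0
        self.in_sync = True

    def apply(self, number, key, value):
        if not self.in_sync:
            return False  # the replica has stopped serving until it resyncs
        if number <= self.last_applied:
            # An older number arriving after a newer one signals inconsistency.
            self.in_sync = False
            return False
        self.state[key] = value
        self.last_applied = number
        return True


log = [(1, "x", "a"), (2, "y", "b"), (3, "x", "c")]
r1, r2 = Replica(), Replica()
for entry in log:
    r1.apply(*entry)
    r2.apply(*entry)
print(r1.state == r2.state)  # True

late = Replica()
late.apply(100, "x", "new")
print(late.apply(99, "x", "old"), late.in_sync)  # False False
```

A replica that goes out of sync here simply refuses further writes; in the text's description it would then restart the synchronization process before serving again.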

 Recommended book: "From Paxos to ZooKeeper: Principles and Practice of Distributed Consistency"

9、Observer
• ZooKeeper must guarantee high availability and strong consistency;
• To support more clients, more servers need to be added;
• As the number of servers grows, the latency of the voting phase increases, which hurts performance;
• To balance scalability against high throughput, observers were introduced
• Observers do not vote;
• Observers accept client connections and forward write requests to the leader node;
• Adding more observer nodes improves scalability without reducing throughput

10、Why is the number of servers in a ZooKeeper cluster usually odd?
• Leader election uses a Paxos-style majority protocol;
• The core idea of Paxos: once a majority of servers have written successfully, the data is considered written. With 3 servers, 2 successful writes are enough; with 4 or 5 servers, 3 successful writes are enough.
• The number of servers is usually odd (3, 5, 7). With 3 servers, at most 1 server may fail; with 4 servers, still at most 1 server may fail.

  From this we can see that 3 servers and 4 servers have the same fault tolerance, so to save server resources we usually deploy an odd number of servers.
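The arithmetic behind this argument is easy to check. In the sketch below, `quorum_size` and `max_failures` are names made up for the example; a majority is taken to mean strictly more than half of the servers.

```python
def quorum_size(n):
    """Smallest number of servers that is strictly more than half of n."""
    return n // 2 + 1


def max_failures(n):
    """How many servers can fail while a majority is still available."""
    return n - quorum_size(n)


for n in (3, 4, 5, 6, 7):
    print(n, quorum_size(n), max_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
# 6 4 2
# 7 4 3
```

Each even-sized cluster tolerates exactly as many failures as the odd-sized cluster one server smaller, which is why the extra server buys no additional fault tolerance.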

11、ZooKeeper data model
» A hierarchical directory structure whose naming follows general file system conventions
» Each node in ZooKeeper is called a znode and is identified by a unique path
» A znode can contain both data and child nodes, but a node of type EPHEMERAL cannot have children
» The data in a znode is versioned: each update to a znode's data increases its data version, and update and delete operations on that path can specify the version they expect
» Client applications can set watches on nodes
» Znodes do not support partial reads or writes; data is read and written in full, in a single operation

12、ZooKeeper nodes
» There are two types of znode: ephemeral and persistent
» A znode's type is fixed at creation time and cannot be changed afterwards
» When the client session that created an ephemeral znode ends, ZooKeeper deletes that znode. Ephemeral znodes cannot have children
» A persistent znode does not depend on the client session; it is deleted only when a client explicitly deletes it
» Znodes come in four forms of directory node
» PERSISTENT (persistent)
» EPHEMERAL (ephemeral)
» PERSISTENT_SEQUENTIAL (persistent, with a sequence number appended to the node name)
» EPHEMERAL_SEQUENTIAL (ephemeral, with a sequence number appended to the node name)
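The behaviour of the four node forms can be imitated with a small in-memory model. This is a toy, not the ZooKeeper client API: the `ToyZk` class, its methods, and the integer session IDs are all hypothetical, parent existence is not checked, and the per-parent counter with a 10-digit zero-padded suffix is a simplification of how sequential names are generated.

```python
class ToyZk:
    """In-memory imitation of persistent, ephemeral, and sequential znodes."""

    def __init__(self):
        self.nodes = {}     # path -> owning session ID, or None if persistent
        self.counters = {}  # parent path -> next sequence number

    def create(self, path, session=None, ephemeral=False, sequential=False):
        if ephemeral and session is None:
            raise ValueError("an ephemeral node needs an owning session")
        parent = path.rsplit("/", 1)[0] or "/"
        if self.nodes.get(parent) is not None:
            raise ValueError("ephemeral nodes cannot have children")
        if sequential:
            n = self.counters.get(parent, 0)
            self.counters[parent] = n + 1
            path = f"{path}{n:010d}"  # append a zero-padded sequence number
        self.nodes[path] = session if ephemeral else None
        return path

    def close_session(self, session):
        # Ephemeral nodes owned by the session disappear; persistent ones stay.
        self.nodes = {p: s for p, s in self.nodes.items()
                      if s is None or s != session}


zk = ToyZk()
zk.create("/app")
print(zk.create("/app/lock-", session=1, ephemeral=True, sequential=True))
# /app/lock-0000000000
print(zk.create("/app/lock-", session=1, ephemeral=True, sequential=True))
# /app/lock-0000000001
zk.close_session(1)
print(sorted(zk.nodes))  # ['/app']
```

After the session closes, only the persistent node `/app` is left, which mirrors the lifetime rules listed above.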

Copyright notice
This article was written by [Xixi]. Please include a link to the original when reposting. Thank you.
https://javamana.com/2021/01/20210123110432110i.html
