What You May Not Know About ZooKeeper

HelloGitHub 2021-04-16 19:19:07


Author: HelloGitHub - Lao Xun

Hi, this is HelloGitHub with the HelloZooKeeper series: a free, open-source, fun, entry-level ZooKeeper course for beginners who already have some programming background.

Project address: https://github.com/HelloGitHub-Team/HelloZooKeeper

Today I'm going to introduce some of ZK's lesser-known features. Enough chatter, let's get started ~

1. JMX

JMX (Java Management Extensions) is a framework for building management functionality into applications, devices, and systems. It works across heterogeneous operating systems, system architectures, and network transport protocols, which makes it possible to develop seamlessly integrated system, network, and service management applications.

Didn't follow that? If you have never developed with or used JMX, here is the short version: JMX is a standard provided by Java for exposing the runtime state of selected Java objects (these objects are called MBeans). It is generally used to monitor, or modify, configuration information at runtime.

Since this series is about ZK, here is a concrete problem: I have a simple ZK cluster running, and I want to know which node is the Leader. What do I do?

ZK registers some of its objects as MBeans, and we can inspect them with jconsole, a tool that ships with Java. Let me demonstrate on a simple local ZK cluster:

$ jps
5266 QuorumPeerMain
10438 Jps
4550 Launcher
5286 QuorumPeerMain
5767 JConsole
5388 QuorumPeerMain
9215 Launcher

You can see three QuorumPeerMain processes, one for each of the three nodes in the cluster.

Now open jconsole (it comes with the JDK, so if the JDK is installed you already have it):

$ jconsole

Pick any one of the ZK processes and click Connect.


With JMX you can not only view an object's attributes but also invoke some of its methods, for example:


Every operation shown in the figure can be invoked with a button on the right (you can also pass arguments). I won't go into more detail about JMX in ZK for now; I'll cover it separately when I get the chance. All you need to know is that JMX exposes ordinary Java objects so that tools can view their attributes or call their methods through a standard interface.
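What jconsole does when you click an attribute can also be done in code. The sketch below is only an illustration of the JMX lookup itself: it runs in-process and reads the JVM's own built-in beans (java.lang:type=Runtime), not ZK's. The class name JmxPeek is made up for this example. To read ZK's beans you would connect to the ZK process remotely and query its own bean names; I believe they live under the org.apache.ZooKeeperService domain, but treat that as an assumption and check the exact names in jconsole.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper: reads MBeans from the current JVM's platform MBean server.
final class JmxPeek {
    /** Lists the names of registered MBeans matching a pattern such as "java.lang:*". */
    static Set<ObjectName> listBeans(String pattern) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return server.queryNames(new ObjectName(pattern), null);
        } catch (Exception e) {
            throw new IllegalStateException("JMX query failed: " + pattern, e);
        }
    }

    /** Reads a single attribute of a single MBean. */
    static Object readAttribute(String objectName, String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return server.getAttribute(new ObjectName(objectName), attribute);
        } catch (Exception e) {
            throw new IllegalStateException("JMX read failed: " + objectName, e);
        }
    }

    public static void main(String[] args) {
        // Print the JVM name, then every bean in the java.lang domain.
        System.out.println(readAttribute("java.lang:type=Runtime", "VmName"));
        for (ObjectName name : listBeans("java.lang:*")) {
            System.out.println(name);
        }
    }
}
```

The same two calls, queryNames and getAttribute, are available on a remote connection as well, which is how a tool would look for the Leader without a GUI.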

2. Four-letter commands and Admin Server

Checking things through JMX like this is cumbersome. So far we have only connected to local processes; with a remote JVM process, jconsole is even more trouble. Is there a simpler way? Of course there is! ZK supports a set of four-letter commands (4lw) for interacting with the server.

Here is a quick demonstration. The client ports of my local cluster are 2181, 2182, and 2183. I connect to any one of these nodes with telnet:

$ telnet localhost 2181
Trying ::1...
Connected to localhost.
Escape character is '^]'.

You are now in interactive mode. Type srvr and press Enter, and you get the following output:

$ telnet localhost 2181
Trying ::1...
Connected to localhost.
Escape character is '^]'.
Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715, built on 09/04/2020 12:44 GMT
Latency min/avg/max: 0/3.8261/44
Received: 30
Sent: 29
Connections: 1
Outstanding: 0
Zxid: 0x100000009
Mode: leader
Node count: 6
Proposal sizes last/min/max: 48/48/94
Connection closed by foreign host.

The srvr command shows the status of a server node. The Mode line of the output tells us that the node listening on port 2181 is the Leader. Let's try another node, 2182, whose Mode is follower.

$ telnet localhost 2182
Trying ::1...
Connected to localhost.
Escape character is '^]'.
Zookeeper version: 3.6.2--803c7f1a12f85978cb049af5e4ef23bd8b688715, built on 09/04/2020 12:44 GMT
Latency min/avg/max: 0/0.0/0
Received: 3
Sent: 3
Connections: 1
Outstanding: 0
Zxid: 0x100000009
Mode: follower
Node count: 6
Connection closed by foreign host.
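The telnet exchange above can also be scripted. The following is a minimal sketch, not ZK's own client code: it opens a plain TCP socket, writes the four-letter command, reads until the server closes the connection, and picks the Mode line out of the reply. The class and method names (FourLetterWord, send, parseMode) and the 3-second timeouts are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for sending a four-letter command over a raw socket.
final class FourLetterWord {
    /** Sends a command such as "srvr" and returns everything the server writes back. */
    static String send(String host, int port, String command) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 3000);
            socket.setSoTimeout(3000);
            OutputStream out = socket.getOutputStream();
            out.write(command.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            socket.shutdownOutput(); // nothing more to send
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) { // the server closes the connection when done
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Extracts the value of the "Mode:" line from srvr output, or null if there is none. */
    static String parseMode(String srvrOutput) {
        for (String line : srvrOutput.split("\\R")) {
            if (line.startsWith("Mode:")) {
                return line.substring("Mode:".length()).trim();
            }
        }
        return null;
    }
}
```

With the local cluster from this article, FourLetterWord.parseMode(FourLetterWord.send("localhost", 2181, "srvr")) would be expected to give leader, matching the telnet output.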

Note that not all four-letter commands are enabled by default. To enable all of them, set the Java system property zookeeper.4lw.commands.whitelist=*; to enable only some, list the specific commands separated by commas. Let's try another command, envi, which prints the environment parameters of the current node:

os.name=Mac OS X

All the four-letter commands are listed below. I won't demonstrate any more of them; that is left to the reader. The list comes from the official website. A four-letter command is simply an alias of the longer command name to its right, and the two behave exactly the same. (Commands marked * are still TODO; I'll fill them in later.)

Command                                 Data returned
conf / configuration                    Configuration information (commonly dataDir, clientPort, etc.)
cons / connections                      Connection information
crst / connection_stat_reset            Resets connection statistics
dump                                    Session information and ephemeral nodes
envi / environment                      Environment information (os.name, user.home, etc.)
ruok                                    Whether the server is running
srst / stat_reset                       Resets statistics
srvr / server_stats                     Overview of server information
stat                                    Server statistics
wchs / watch_summary                    Summary of registered watchers
wchc / watches                          Registered watches, aggregated by session
dirs                                    Size in bytes of the log and snapshot files
wchp / watches_by_path                  Registered watches, by path
mntr / monitor                          All monitoring information
isro / is_read_only                     Whether the current node is read-only
hash                                    Digest
gtmk / get_trace_mask*                  Gets the trace mask
stmk / set_trace_mask*                  Sets the trace mask
lsnp / last_snapshot                    Information about the last snapshot
icfg / initial_configuration            Initial configuration at server startup
orst / observer_connection_stat_reset   Resets Observer connection statistics
obsr / observers                        Information about Observers
sysp / system_properties                System properties; unlike envi, it also returns the custom configuration whose names start with zookeeper
lead / leader                           Whether the current node is the Leader
voting_view                             Cluster voting information
zabstate                                Summary of ZAB state

Note ruok: it checks whether the server node has started (a successful reply does not mean the node is able to serve requests).


Besides calling the four-letter commands over telnet, ZK offers a friendlier way: Admin Server.

After ZK starts, it listens on local port 8080 by default and starts a Jetty container as a web server. Visiting the /commands path gives you:


Following these hyperlinks has the same effect as the four-letter commands above. You can also go straight to the URL ip:port/commands/<commandName>.

Taking mntr as an example, http://localhost:8080/commands/mntr and http://localhost:8080/commands/monitor return the same result.
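Because Admin Server is plain HTTP, any HTTP client can call it. Here is a minimal sketch using only the JDK; the class name AdminServerClient, its method names, and the timeouts are made up for the example, and the host and port are the defaults described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for calling Admin Server commands over HTTP.
final class AdminServerClient {
    /** Builds the URL ip:port/commands/<commandName> described above. */
    static String commandUrl(String host, int port, String command) {
        return "http://" + host + ":" + port + "/commands/" + command;
    }

    /** Performs a GET on the command URL and returns the response body as text. */
    static String fetch(String host, int port, String command) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create(commandUrl(host, port, command)).toURL().openConnection();
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

For instance, AdminServerClient.fetch("localhost", 8080, "mntr") requests the same page as the mntr link, which makes it easy to poll from a monitoring job.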


Because Admin Server is enabled by default and accepts requests from any IP, for security reasons you can set the Java system property zookeeper.admin.serverAddress= to restrict the address it listens on, or disable Admin Server entirely with zookeeper.admin.enableServer=false.

ZK officially recommends using Admin Server instead of the four-letter commands on the command line. These official interfaces are a good basis for building a ZK monitoring platform.

3. Dynamic configuration

Normally a ZK cluster's configuration is read from the configuration file at startup and does not change afterwards. If I want to add a new node to the cluster, I have to edit the configuration file and restart for the change to take effect.

Since 3.5.0, however, ZK supports dynamic configuration. Reconfiguring the cluster no longer requires downtime: you can change the configuration at runtime, add or remove nodes, change their roles, and even change the cluster's vote-counting rule! Pretty awesome, right?


3.1 Vote-counting rules

First, let me introduce the two vote-counting rules that ZK supports.

3.1.1 Majority mechanism

This is ZK's default vote-counting rule, used in every scenario where the server cluster has to collect ACKs. Suppose the current configuration looks like this:


Because an Observer does not count in the vote, the nodes that actually take part are the first three: 1, 2, and 3.

No matter who the Leader is, the default rule requires successful ACKs (or whatever other messages are being counted) from at least two of the three nodes before the election (or proposal) can proceed. That is the majority mechanism.
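The majority rule is small enough to write down. This is a sketch of the rule as described above, not ZK's actual implementation; the class name MajorityQuorum and the use of node 4 as an Observer are assumptions for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the majority vote-counting rule.
final class MajorityQuorum {
    /** True when strictly more than half of the voting members have ACKed. */
    static boolean hasMajority(Set<Long> voters, Set<Long> acks) {
        int counted = 0;
        for (Long id : acks) {
            if (voters.contains(id)) {
                counted++; // ACKs from non-voters (Observers) are ignored
            }
        }
        return counted > voters.size() / 2;
    }

    public static void main(String[] args) {
        Set<Long> voters = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        // Two of the three voters ACKed.
        System.out.println(hasMajority(voters, new HashSet<>(Arrays.asList(1L, 3L)))); // true
        // Node 4 is assumed to be an Observer, so only one voter ACKed.
        System.out.println(hasMajority(voters, new HashSet<>(Arrays.asList(2L, 4L)))); // false
    }
}
```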

3.1.2 Group weights

ZK also provides another vote-counting rule. It lets you divide the nodes into groups (a single group is allowed), and nodes within the same group can be given different weights. Here is an example:



Lines starting with group and weight are the group and weight configuration respectively. The rules are:

  • group has the format group.<groupId>=<serverId>:<serverId>...
  • weight has the format weight.<serverId>=<weight>
  • serverId is the number configured in each server node's myid
  • Each node can belong to only one group

Continuing with my current configuration, there are three groups, and their weights add up as follows:

group1 weight sum = server1 weight + server2 weight + server3 weight = 1 + 1 + 1 = 3
group2 weight sum = server4 weight + server5 weight + server6 weight + server7 weight + server8 weight = 1 + 1 + 1 + 1 + 1 = 5
group3 weight sum = server9 weight = 1

Suppose the server nodes that ACK successfully are 1, 4, 5, 8, and 9, and the votes are counted under this configuration:

  • In group1, only server1 replied with an ACK. Its weight is 1, which is not more than half of group1's weight sum 3, so group1 counts as a failed ACK
  • In group2, nodes 4, 5, and 8 replied with an ACK. Their weights total 3, which is more than half of group2's weight sum 5 (3 > 5/2), so group2 ACKs successfully
  • group3 has only node 9, which replied with an ACK, so it also exceeds half of group3's weight sum 1 (1 > 1/2), and group3 ACKs successfully
  • Finally, check whether the number of groups that ACKed successfully is more than half of the total number of groups. 2 groups ACKed successfully (2 > 3/2), so the ACK passes overall

Looks just like the default majority mechanism? No, no, no. It only looks similar because I set every weight to 1. What if the weights are set to other numbers?


In the scenario above only group1's ACK failed, because server1 was the only node that replied. But suppose I change the weights in group1 (the other two groups are omitted):



Now group1's weight sum is 3 + 1 + 1 = 5 and server1's weight is 3. Even if it is the only node that replies, it exceeds half of group1's weight sum (3 > 5/2), so with this weight configuration group1 also ACKs successfully.

One more point must be made: when ZK reads this configuration it computes the weight sum of every group, and any group whose weight sum is 0 is removed from the vote-counting rule.

Unlike the default majority mechanism, the weight configuration can let Observers take part in the count.
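The group-weight walk-through above can be condensed into one function. This is a sketch of the rule as the article describes it, not ZK's source code. The class name GroupWeightQuorum is made up, and treating a server without a weight entry as weight 1 is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the group-weight vote-counting rule.
final class GroupWeightQuorum {
    /**
     * groupOf maps serverId to groupId, weightOf maps serverId to weight.
     * A group passes when its ACKed weight is more than half of its weight sum;
     * the vote passes when more than half of the groups pass.
     */
    static boolean hasQuorum(Map<Long, Long> groupOf, Map<Long, Long> weightOf, Set<Long> acks) {
        Map<Long, Long> total = new HashMap<>(); // groupId -> weight sum of all members
        Map<Long, Long> acked = new HashMap<>(); // groupId -> weight sum of members that ACKed
        for (Map.Entry<Long, Long> entry : groupOf.entrySet()) {
            long server = entry.getKey();
            long group = entry.getValue();
            long weight = weightOf.getOrDefault(server, 1L); // assumption: missing weight means 1
            total.merge(group, weight, Long::sum);
            if (acks.contains(server)) {
                acked.merge(group, weight, Long::sum);
            }
        }
        int groups = 0;
        int passed = 0;
        for (Map.Entry<Long, Long> entry : total.entrySet()) {
            if (entry.getValue() == 0) {
                continue; // a group whose weight sum is 0 is left out of the count
            }
            groups++;
            if (2 * acked.getOrDefault(entry.getKey(), 0L) > entry.getValue()) {
                passed++;
            }
        }
        return 2 * passed > groups;
    }

    public static void main(String[] args) {
        Map<Long, Long> groupOf = new HashMap<>();
        for (long s = 1; s <= 3; s++) groupOf.put(s, 1L);
        for (long s = 4; s <= 8; s++) groupOf.put(s, 2L);
        groupOf.put(9L, 3L);
        Set<Long> acks = new HashSet<>(Arrays.asList(1L, 4L, 5L, 8L, 9L));
        // group1 fails, group2 and group3 pass: 2 of 3 groups.
        System.out.println(hasQuorum(groupOf, new HashMap<>(), acks)); // true
    }
}
```

Comparing 2 * acked against the group sum avoids the integer-division pitfall of writing acked > sum / 2 for odd sums.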

3.2 Recommended configuration

After all that, how do you enable the weighted vote-counting rule?

  • (Recommended) In zoo.cfg, set the dynamicConfigFile option to the path of the dynamic configuration file, and move all the initial server, group, and weight configuration into the file at that path.
  • Put all the initial server, group, and weight configuration directly in zoo.cfg.

As long as ZK finds configuration entries starting with group or weight in the configuration file, the weighted vote-counting rule is enabled; otherwise the default majority mechanism is used.

Our earlier server-prefixed configuration looked like this:


Actually, the full server configuration format is as follows; the client port can be included in the server entry (after the final semicolon):


With this form, zoo.cfg no longer needs the clientPort option.

So, following the recommended setup, zoo.cfg looks like this (adjust the paths for your own machine):


And in my /Users/junjiexun/develop/zk/zk01/conf/zoo.dyn.cfg I configure:




More details can be found in the official documentation.
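To make the split between the two files concrete, here is a sketch of what such a pair might look like. It is not the author's actual configuration: every path, host, and port below is a placeholder, and the reconfigEnabled line is my assumption about what a recent 3.5+ release needs before it accepts reconfig requests.

```
# zoo.cfg (static part); all values are placeholders
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/path/to/zk01/data
reconfigEnabled=true
dynamicConfigFile=/path/to/zk01/conf/zoo.dyn.cfg

# zoo.dyn.cfg (dynamic part); the client port follows the final semicolon
server.1=localhost:2888:3888:participant;2181
server.2=localhost:2889:3889:participant;2182
server.3=localhost:2890:3890:participant;2183
group.1=1:2:3
weight.1=1
weight.2=1
weight.3=1
```

Leave out the group and weight lines to stay on the default majority mechanism.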

3.3 Dynamic modification

So here comes the question: with this configuration in place, how do we dynamically add or remove nodes in the cluster?

The Java client provides a getConfig method:

// Assumed connect string: one of the local nodes from the earlier demo
ZooKeeper client = new ZooKeeper("localhost:2181", 3000, null);
byte[] config = client.getConfig(false, null);
System.out.println(new String(config));

The printed result is:


This looks a lot like the zoo.dyn.cfg we configured, with slight differences: ZK fills in the rest of the format for us automatically. And where does ZK store the returned data? At startup ZK creates the following nodes under the root path by default:


The data getConfig returns is actually the data of the /zookeeper/config node, whose permission is read-only. If you don't believe it, try getData; the data returned is the same:

// Assumed connect string: one of the local nodes from the earlier demo
ZooKeeper client = new ZooKeeper("localhost:2181", 3000, null);
byte[] config = client.getData("/zookeeper/config", false, null);
System.out.println(new String(config));

Now that we can read this configuration, how do we modify it? ZK offers two ways: the command line and the Java API.

If you use ZK's built-in command-line tool, one of the supported commands is reconfig, with these parameters:

reconfig [-s] [-v version] [[-file path] | [-members serverID=host:port1:port2;port3[,...]*]] | [-add serverId=host:port1:port2;port3[,...]]* [-remove serverId[,...]*]

The other way is Java client code. So far we have been using the ZooKeeper class; it has a subclass called ZooKeeperAdmin, whose reconfigure method can modify the configuration. I'll demonstrate it below, but first I need to explain how dynamic configuration changes behave:

  • Dynamic configuration changes come in two forms: incremental and non-incremental
  • What actually gets modified is the data of the /zookeeper/config node. By default this node is read-only, so either make the change with administrator permissions, or set the Java system property zookeeper.skipACL=yes to skip the ACL check
  • When modifying the configuration incrementally, the cluster's vote-counting rule must be the majority mechanism!
  • When modifying the configuration non-incrementally, either mechanism works
  • Removing a Follower from the cluster configuration only amounts to demoting it to an Observer; it can still serve clients and still receive messages from the Leader
  • Removing the Leader from the cluster configuration has a significant performance impact: the whole cluster cannot serve requests until a new Leader is elected, so avoid it if you can
  • Adding a node is much simpler; the new node synchronizes its data with the Leader automatically

3.3.1 Incrementally removing a node

Suppose I have 3 nodes in total under the majority mechanism (it has to be), with IDs 10000000000, 2, and 3. Let's remove the node with ID 3. Here I set skipACL directly to skip permission checks (the same applies below):

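// Assumed connect string: one of the local nodes from the earlier demo
ZooKeeperAdmin client = new ZooKeeperAdmin("localhost:2181", 3000, null);
List<String> leavingServers = new ArrayList<>();
leavingServers.add("3"); // ID of the node to remove
byte[] reconfigure = client.reconfigure(null, leavingServers, null, -1, null);
System.out.println(new String(reconfigure));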

This interface returns the contents of /zookeeper/config after the change. In the new configuration the entry for node 3 is gone:


What is this version=400000004? ZK also version-controls modified configurations by default. After a successful startup it automatically generates a file under the path you configured in dynamicConfigFile; mine is zoo.cfg.dynamic.300000000, and yours may differ. That 300000000 is the version number. After I removed the node with ID 3, ZK automatically generated another file, zoo.cfg.dynamic.400000004, where 400000004 is the new version number. If a change should only apply to a specific version of the current cluster configuration, pass that version number as the fourth argument of reconfigure. The -1 in my example means the version number is ignored, the same idea as the version argument of delete and setData.

3.3.2 Incrementally adding a node

Now let's add node 3 back:

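// Assumed connect string: one of the local nodes from the earlier demo
ZooKeeperAdmin client = new ZooKeeperAdmin("localhost:2181", 3000, null);
List<String> joiningServers = new ArrayList<>();
// Add node 3's entry here, in the form server.3=host:port1:port2;port3
byte[] reconfigure = client.reconfigure(joiningServers, null, null, -1, null);
System.out.println(new String(reconfigure));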

The new configuration is:


Node 3 is back, the version number has changed, and there is one more file, zoo.cfg.dynamic.400000013.
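Each entry passed in joiningServers follows the same serverId=host:port1:port2;port3 shape that the reconfig command uses for -add. The helper below is a made-up convenience for assembling such an entry; the class name ServerSpec and the host and ports in main are placeholders, not values from this cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that formats one joining-server entry.
final class ServerSpec {
    /** Formats server.<id>=<host>:<port1>:<port2>;<port3>, where port3 is the client port. */
    static String joining(long id, String host, int port1, int port2, int port3) {
        return "server." + id + "=" + host + ":" + port1 + ":" + port2 + ";" + port3;
    }

    public static void main(String[] args) {
        List<String> joiningServers = new ArrayList<>();
        joiningServers.add(joining(3, "localhost", 2890, 3890, 2183)); // placeholder address
        System.out.println(joiningServers); // [server.3=localhost:2890:3890;2183]
    }
}
```

The resulting list is what you would hand to reconfigure as its first argument.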

3.3.3 Non-incremental

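// Assumed connect string: one of the local nodes from the earlier demo
ZooKeeperAdmin client = new ZooKeeperAdmin("localhost:2181", 3000, null);
List<String> newMembers = new ArrayList<>();
// List the complete new membership here, one server.<id>=... entry per remaining node
byte[] reconfigure = client.reconfigure(null, null, newMembers, -1, null);
System.out.println(new String(reconfigure));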

This is equivalent to removing the node with ID 2 (you can of course add new nodes as well; I won't demonstrate that here).

3.4 Summary

The three approaches correspond to three parameters of the Java API: joiningServers, leavingServers, and newMembers. Besides List, the Java API also accepts a String (comma-separated) to the same effect. Adding and removing server nodes dynamically lets us adjust the serving capacity of the whole ZK cluster (I find dynamic addition the more useful of the two).

The group-weight vote-counting rule provides a new voting strategy, and combined with dynamic configuration the weights can be changed at runtime. Overall, though, I find the group-weight rule rather underwhelming, and I don't know what scenarios it suits. (Giving high-performance machines more say? The catch is that everything runs on cloud services now, containerized and virtualized; machine specs can be adjusted dynamically and are usually identical anyway, so I can't think of a use.)

If you have ideas about how to use group weights, please share them ~ For more about dynamic configuration, see the official documentation.

4. ZK monitoring

ZK 3.6 added Metrics, monitoring indicators that users can query. The official website also says they can be used together with Prometheus and Grafana. What? You have never used them, never even heard of them? What a coincidence, I hadn't played with either of them before, so let's take this opportunity to learn together and build a Hello World. Since this series is mainly about ZK, the setup is kept simple. Let's go!

4.1 Prometheus

Installation on a Mac is very easy; for other platforms, download the archive from the official website.

$ brew install prometheus

My default installation path is /usr/local/Cellar/prometheus/2.23.0

Before going further, add two lines to each ZK node's zoo.cfg:


I run all three nodes locally, so the other two nodes need 7001 and 7002 instead, because the ports must not clash.
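For reference, the Prometheus metrics provider described in the ZooKeeper monitoring documentation is switched on with a pair of settings like the following. The port 7000 is the one assumed for the first node in this walk-through.

```
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
metricsProvider.httpPort=7000
```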

Next, modify Prometheus's default configuration. On my computer the path is /usr/local/etc/prometheus.yml, and I changed it as follows:

global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "test-zk"
    static_configs:
      - targets: ["localhost:7000", "localhost:7001", "localhost:7002"]

job_name can be anything you like. What matters is targets, the addresses to scrape, and scrape_interval, the scrape interval. After making the change, start Prometheus:

$ cd /usr/local/Cellar/prometheus/2.23.0
$ ./bin/prometheus --config.file=/usr/local/etc/prometheus.yml

Then visit localhost:9090 and you will see the following interface:


Tick Enable autocomplete and start typing in the input box; suggestions appear right away. I'll enter a few parameters here:


That's it for the quick demonstration; the rest is left to the reader ~
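Prometheus gets these numbers by scraping a plain-text page from each target, so you can read the same text yourself. The parser below is a deliberately simplified sketch of that text format. It assumes one sample per line with no trailing timestamp, skips comment lines, and the metric names in the usage note are invented. The class name MetricsText is also made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified parser for Prometheus-style text metrics.
final class MetricsText {
    /** Maps each sample name (including any {labels}) to its numeric value. */
    static Map<String, Double> parse(String body) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String raw : body.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank lines and HELP/TYPE comments carry no sample
            }
            int cut = line.lastIndexOf(' ');
            if (cut < 0) {
                continue; // not a "name value" pair
            }
            try {
                double value = Double.parseDouble(line.substring(cut + 1));
                samples.put(line.substring(0, cut).trim(), value);
            } catch (NumberFormatException ignored) {
                // skip values this simple parser does not understand
            }
        }
        return samples;
    }
}
```

Feeding it the two lines "# TYPE demo_total counter" and "demo_total 6.0" yields a single entry, demo_total mapped to 6.0. In practice the input would be the body served on a node's metrics port, which I assume is available under the /metrics path.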

4.2 Grafana

Installing Grafana on a Mac is just as simple:

$ brew install grafana

Once installed, it can be started with this command:

$ grafana-server --config=/usr/local/etc/grafana/grafana.ini --homepath /usr/local/share/grafana --packaging=brew cfg:default.paths.logs=/usr/local/var/log/grafana cfg:default.paths.data=/usr/local/var/lib/grafana cfg:default.paths.plugins=/usr/local/var/lib/grafana/plugins

Grafana's default port is 3000. Visit localhost:3000; the default username and password are both admin, and you will see the home page:


Grafana supports Prometheus as a data source natively, so you can add it directly:


In the default configuration, only the URL needs changing (the default is localhost:9090).

Then you need to add a dashboard template. The ZK project provides one for us. Sweet!


Where does the number 10465 come from? It is the ID of the template given in the official documentation.

All done! It looks much nicer than Prometheus ~


That's all about ZK monitoring. As in traditional kung fu, we stop at a light touch ~

5. Open-source ZK visualization projects

Working with ZK from the command line is tedious, so a visual tool is a must. Here are some good visualization clients; some are desktop clients and some are web services. Take your pick ~

  • PrettyZoo: https://github.com/vran-dev/PrettyZoo, a visual GUI client with installers for every platform. Very handy to have around when you need to connect to a ZK service
  • zkdash: https://github.com/ireaderlab/zkdash, a visual web client in JavaScript + Python that runs directly as a web service. The downside is that it was developed with Python 2.7; if you don't need to develop on top of it, that won't be a problem
  • zoonavigator-web: https://github.com/elkozmon/zoonavigator-web, a visual web client written in TypeScript that runs directly as a web service
  • visual-zookeeper: https://github.com/ghostg00/visual-zookeeper, a client written with Electron + React

6. Finally

As usual, if you have questions about the article, suggestions, or questions about how ZK works, you are welcome to open an issue in the repository, or click Read the original to join the discussion.

