We know that master-slave replication is the cornerstone of high availability. When a slave goes down, requests can still be sent to the master or to the other slaves. When the master goes down, however, the system can only serve reads; write requests can no longer be executed.
So the master-slave architecture has a serious problem: once the master is down, writes fail, and no slave is automatically promoted to master. In other words, there is no automatic failover.
Late at night with my girlfriend... (10,000 words omitted here), the master suddenly goes down. You can't just pull up your pants, jump out of bed, switch master and slave by hand, and then tell the other programmers to point production at the new master's address.
After all that fuss I would have been switched from boyfriend to ex-boyfriend. That won't do. So we need a high-availability solution, and Redis officially provides one: Sentinel.
How the Redis Sentinel cluster works
“Technology iterates quickly, but the way of thinking distilled from it pays off for a lifetime. So don't worry about a midlife crisis; the people who worry about one are usually those who struggle to keep growing. As long as we keep growing and keep breaking through the limits of our understanding, there is nothing to worry about. The world always needs such people.”
“65 Brother: Ma Ge, I don't have a girlfriend, but better safe than sorry. I want to master Sentinel mode so that nobody disturbs me and my girlfriend in the middle of the night. Tell me how Sentinel works.”
The setup uses three sentinels forming a cluster and three data nodes (one master and two slaves), as shown in the figure below:
Redis Sentinel cluster
Building the sentinel cluster is not demonstrated again here.
65 Brother, you have heard of Zhang Sanfeng, founder of the Wudang sect? The Redis master-slave architecture is like Wudang, and the sect leader is the master. If the leader dies, a capable successor has to be chosen from among the Seven Heroes of Wudang. That calls for a department that monitors whether the leader and the other Wudang disciples are alive, that can hold a vote among the disciples to elect a capable new leader, and that then holds a press conference to announce the new leader to the world. That "department" is Sentinel.
Electing a new leader raises several problems for the sentinels, and they map onto the department's main tasks: monitoring the whole of Wudang, choosing a new leader, and notifying all of Wudang and the whole martial arts world.
Sentinel is a running mode of Redis. It focuses on monitoring the running state of Redis instances (master and slave nodes). When the master fails, it uses a series of mechanisms to select a new master and switch over, achieving failover and keeping the whole Redis system available. According to the official Redis documentation (https://redis.io/topics/sentinel), Sentinel has the following capabilities: monitoring, automatic failover (selecting and switching to a new master), and notification.
A sentinel is also a Redis process; it simply does not serve external reads and writes. The number of sentinels should normally be odd. Why? Let Code Byte (MageByte) walk you through it.
“65 Brother: So how does this mysterious "Sentinel" department actually deliver these three capabilities?”
Let's first look at Sentinel as a whole to get a rough picture of the entire process, and then analyze each task in detail. We start with monitoring.
Sentinel is simply a special department among the Wudang disciples. By default, a sentinel sends a PING command by carrier pigeon once per second to every Wudang disciple, to the leader, and to its fellow sentinels (that is, to the master, the slaves, and the other sentinels). If a slave does not answer the sentinel's PING within the specified time, the sentinel assumes the fellow may have kicked the bucket and records it as "offline".
If the master does not answer the sentinel's PING within the specified time, the sentinel judges the leader to be offline and starts the "automatic master switch" process.
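There are two kinds of reply to a PING. A valid reply is any of +PONG, -LOADING, or -MASTERDOWN. An invalid reply is anything else, or no reply at all within the specified time.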
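As a rough sketch of this classification (not Sentinel's actual implementation), the check can be written in a few lines of Python. The set of valid replies follows the Redis Sentinel documentation; the function name `is_valid_ping_reply` is a hypothetical helper introduced only for illustration.

```python
from typing import Optional

# Replies that count as proof the instance is reachable,
# as listed in the Redis Sentinel documentation.
VALID_PING_REPLIES = {"+PONG", "-LOADING", "-MASTERDOWN"}


def is_valid_ping_reply(reply: Optional[str]) -> bool:
    """Classify a raw PING reply line (hypothetical helper).

    `reply` is the raw reply line, or None when nothing arrived
    within the timeout.
    """
    if reply is None:
        return False
    # Only the first token matters; error replies carry extra text after it.
    first_token = reply.strip().split(" ", 1)[0]
    return first_token in VALID_PING_REPLIES
```

Note that an error reply such as -LOADING still counts as valid: the instance answered, so it is alive, even if it is not yet ready to serve data.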
“65 Brother: How does a sentinel decide that the leader has really kicked the bucket? What if the leader is only playing dead?”
To guard against the leader "playing dead", Sentinel defines two signals: "subjectively offline" and "objectively offline".
A sentinel uses PING to check whether the leader and the slaves are alive. If the reply is invalid, the sentinel marks the instance as "subjectively offline". If the instance is an ordinary Wudang disciple, that is, a slave, it is simply marked "subjectively offline" and nothing more happens.
Because the master is still alive, a slave kicking the bucket has little impact on Wudang. The sect can still hold its meetings, practise its swordsmanship, and eat and drink well.
If the instance is the master, though, the sentinel cannot simply mark it "subjectively offline" and open the election for a new leader.
The sentinel may have misjudged: the leader may not be dead at all. Once a leader switch starts, the election, the press conference, and the slaves' data synchronization with the new master all take time and consume a lot of resources.
So Sentinel needs to reduce the probability of misjudgment. Misjudgment usually happens when the cluster network is under heavy load or congested, or when the master itself is under heavy load.
Since one person alone misjudges easily, let everyone vote. Sentinel works the same way: it is deployed as a cluster of several instances, the sentinel cluster. With several sentinel instances judging together, a single sentinel with a bad network connection cannot wrongly declare the master offline on its own.
In addition, the chance that several sentinels all have an unstable network at the same time is small, so deciding together lowers the misjudgment rate.
Whether the master is down cannot be decided by a single sentinel. Only when more than half of the sentinels (strictly, at least the configured quorum) judge the master "subjectively offline" can it be marked "objectively offline". At that point it is an objective fact: the leader really is dead, and even Hua Tuo reborn could not save him.
Only when the master is judged "objectively offline" does the sentinel go on to start the master-slave switch.
Objective offline
Simply put, "subjectively offline" means that one sentinel thinks the node is down. "Objectively offline" means that, after that sentinel has talked to the others, a certain number of sentinels agree the node is down.
That "certain number" is the quorum, which is set in the sentinel's monitor configuration:
# sentinel monitor <master-name> <master-host> <master-port> <quorum>
# For example:
sentinel monitor mymaster 127.0.0.1 6379 2
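This directive tells the sentinel which master to monitor: master-name is a name you choose for the master, master-host and master-port are its address, and quorum is the number of sentinels that must agree before the master is marked "objectively offline".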
The rule for "objectively offline" is that at least quorum sentinel instances must judge the master "subjectively offline" before it is finally declared "objectively offline". With N sentinel instances, quorum is usually set to N/2 + 1, which makes this a majority mechanism.
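The arithmetic is small enough to sketch directly. The following is a minimal illustration, not Sentinel's code; the function names are hypothetical, and the N/2 + 1 default reflects the convention described above rather than anything Redis enforces.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of sentinels that is more than half of the total."""
    return total_sentinels // 2 + 1


def is_objectively_down(sdown_votes: int, quorum: int) -> bool:
    """True when enough sentinels agree the master is subjectively offline.

    `sdown_votes` counts every sentinel that reports the master as
    subjectively offline, including the sentinel doing the counting.
    """
    return sdown_votes >= quorum


# With 3 sentinels and the conventional quorum of N/2 + 1 = 2:
quorum = majority(3)
print(is_objectively_down(1, quorum))  # one sentinel alone is not enough
print(is_objectively_down(2, quorum))  # two agreeing sentinels are enough
```

Setting quorum lower than the majority makes failure detection more sensitive; setting it higher makes it more conservative.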
“65 Brother: Now that the master has been judged offline, it is time to choose a new leader.”
The sentinel's second task is to select the new master. A new leader has to be chosen from among the Wudang disciples according to certain rules, and once chosen, the new master leads all the disciples on to the good life.
A "filter" plus "scoring" strategy picks the "strongest king" as leader: an audition first filters out the unfit candidates, then everyone who passed is scored and ranked, and the highest scorer becomes the new master.
As shown in the figure:
Selecting the new master
A candidate who keeps dropping off the network is a poor choice, however good he looks. Even if he became master, the network would soon fail again and a new master would have to be chosen all over again. That is no joke, so such candidates must be ruled out!
“65 Brother: What are the screening criteria?”
The key figure is down-after-milliseconds * 10. If a slave's link to the master has been down for longer than ten times down-after-milliseconds (plus the time the master itself has been seen as down), we have reason to believe that the slave's network is poor, and it is screened out.
After the unsuitable slaves have been filtered out, scoring begins. There are three rounds of scoring, one rule per round. If a round produces a clear winner, that slave becomes the new master; if the candidates tie, the next rule applies.
First, slave priority: the slave with the highest configured priority wins.
Second, replication progress: the gap between the slave's slave_repl_offset and the old master's master_repl_offset. The closer a disciple's martial arts are to the previous leader's, the stronger he is, so the slave with the smallest gap wins.
Third, run ID: if everything else is equal, the slave with the smallest run ID wins.
“65 Brother: Why hold a press conference?”
Electing a new master is a big deal, so of course the world has to be told. Besides, the slaves need to know who the new leader is so that they can follow him on to the good life.
For this last task, the sentinel sends the new master's connection information to the other slaves (the Wudang disciples) and has them run the replicaof command to connect to the new master and replicate its data, learning all of the new leader's martial arts.
The sentinel also has to pass the new leader's connection information on to the whole martial arts world (the clients), so that anyone who wants to visit or seek advice can find the new leader and matters can be handed to him for decision (read and write requests move to the new master).
Sentinel tasks and their goals
The sentinel department does not work alone. Several sentinels work together as a "sentinel cluster". Even if Lao Wang takes out some of the sentinels, the rest can still carry on monitoring, electing a new leader, and notifying the slaves, the master, and everyone in the martial arts world (the clients).
When a sentinel cluster is deployed, each sentinel's configuration sets only the IP and port of the master to monitor; it contains no connection information for the other sentinels.
sentinel monitor <master-name> <ip> <redis-port> <quorum>
How do the sentinels get to know each other? How do they learn about the slaves and monitor them? Which sentinel performs the master-slave switch?
With these questions in mind, follow Code Byte (MageByte) back to the source and into the heart of the sentinel cluster.
“65 Brother: How do the sentinels get to know each other?”
Sentinels can talk to each other, arrange to meet, and get things done mainly thanks to Redis's pub/sub (publish/subscribe) mechanism.
A sentinel connects to the master and uses the publish/subscribe mechanism the master provides to publish its own information: height and weight, whether it is single, IP, port, and so on.
The master has a dedicated channel, __sentinel__:hello, which sentinels use to publish and subscribe to each other's messages. Think of __sentinel__:hello as a WeChat group: sentinels use the group set up on the master to post their own news and to follow the news posted by the other sentinels.
Redis pub/sub mechanism
Once several sentinels have all published and subscribed on the master, they know each other's IP addresses and ports, and can discover and connect to one another.
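To make the discovery step concrete, here is a small sketch that parses hello messages and builds a peer table. The eight-field comma-separated layout is an assumption based on the comments in Redis's sentinel.c as I understand them, so check it against your Redis version; `SentinelHello`, `parse_hello`, and `discover_peers` are hypothetical names, and no network access is involved.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class SentinelHello:
    """One message seen on __sentinel__:hello (assumed field layout)."""
    sentinel_ip: str
    sentinel_port: int
    sentinel_runid: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str) -> SentinelHello:
    """Split a hello payload into its eight comma-separated fields."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}")
    ip, port, runid, epoch, m_name, m_ip, m_port, m_epoch = parts
    return SentinelHello(ip, int(port), runid, int(epoch),
                         m_name, m_ip, int(m_port), int(m_epoch))


def discover_peers(payloads: Iterable[str],
                   my_runid: str) -> Dict[str, Tuple[str, int]]:
    """Map run ID -> (ip, port) for every other sentinel heard on the channel."""
    peers: Dict[str, Tuple[str, int]] = {}
    for payload in payloads:
        hello = parse_hello(payload)
        if hello.sentinel_runid != my_runid:  # skip our own announcements
            peers[hello.sentinel_runid] = (hello.sentinel_ip, hello.sentinel_port)
    return peers
```

Keying the table by run ID rather than by address means a sentinel that restarts on a new port simply overwrites its old entry.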
Redis organizes messages by channel, and the channels here are really just different WeChat groups.
“65 Brother: The sentinels are connected to each other, but they also have to connect to the slaves, otherwise they cannot monitor them. How do they learn about the slaves and monitor them?”
Exactly. Connecting the sentinels into a cluster is not enough. They also have to connect to the slaves, otherwise they cannot monitor them or run heartbeat checks on master and slaves.
Besides, after a master-slave switch the slaves have to be told to connect to the new master and synchronize their data. The principle of data synchronization in the master-slave architecture is covered in 《Redis High Availability: You Call This Master-Slave Data Consistency Synchronization》.
The key, again, is the master. A sentinel sends the INFO command to the master. The leader naturally knows all the slaves under him, so on receiving the command the master returns its slave list to the sentinel.
Using the slave list in the master's response, the sentinel establishes a connection to every slave and monitors each one continuously over that connection.
As shown in the figure, sentinel 2 sends the INFO command to the master, the master returns the slave list to sentinel 2, and sentinel 2 uses the connection information in that list to connect to each slave and monitor it continuously over that connection.
The other sentinels monitor the slaves in the same way.
Getting slave information with the INFO command
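As an illustration of what the sentinel extracts from that response, the sketch below parses the slave entries out of the replication section of INFO output. The `slaveN:ip=...,port=...` line format is what Redis prints for a master; the sample text and the function name `parse_slaves` are made up for this example.

```python
from typing import Dict, List


def parse_slaves(info_text: str) -> List[Dict[str, str]]:
    """Extract one dict per `slaveN:` line from INFO replication output."""
    slaves: List[Dict[str, str]] = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Keep only keys of the form slave0, slave1, ... and skip others
        # such as connected_slaves or slave_repl_offset.
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(item.split("=", 1) for item in value.split(",") if "=" in item)
        slaves.append(fields)
    return slaves


SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=1234,lag=0
slave1:ip=127.0.0.1,port=6381,state=online,offset=1200,lag=1
master_repl_offset:1234
"""

for slave in parse_slaves(SAMPLE_INFO):
    print(slave["ip"], slave["port"], slave["state"])
```

Each resulting dict carries the address the sentinel needs to open its own monitoring connection to that slave.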
“65 Brother: After the master kicks the bucket, with so many sentinels around, which one carries out the switch to the new master?”
As with the judgment that the master is "objectively offline", this is also decided by vote.
Any sentinel that judges the master "subjectively offline" sends the is-master-down-by-addr command to its fellow sentinels. Each of them replies Y or N according to the state of its own connection to the master: Y is a vote in favour, N is a vote against.
Once a sentinel has collected enough votes in favour, it can mark the master "objectively offline". The number of votes required is set by the quorum item in the sentinel configuration file.
sentinel monitor <master-name> <ip> <redis-port> <quorum>
For example, with 3 sentinels in the cluster, quorum can be set to 2. When a sentinel has 2 votes in favour, it can mark the master "objectively offline". This count includes its own vote.
A sentinel that has collected enough votes can then send a command to the other sentinels stating that it wants to carry out the master-slave switch, and ask them to vote. This voting process is called "leader election".
Becoming the leader is not that simple; it takes some real ability. Two conditions must both be met: the sentinel must win more than half of the votes, and the number of votes must also be greater than or equal to the quorum value in the sentinel configuration file.
If the sentinel cluster has only 2 instances, a sentinel that wants to become leader must win 2 votes, not 1. So if one sentinel goes down, the cluster can no longer switch master and slave. That is why we usually configure at least 3 sentinel instances.
This is also why sentinel clusters are deployed with an odd number of instances: an even number is unnecessary and wasteful. A cluster of 4 needs 3 votes and so survives only 1 failure, exactly the same as a cluster of 3.
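The vote counting behind this reasoning can be checked with a few lines of Python. This is only a sketch of the two conditions described above (a majority of all sentinels, and at least quorum votes); the function names are hypothetical.

```python
def can_become_leader(votes: int, total_sentinels: int, quorum: int) -> bool:
    """A candidate needs a majority of all sentinels AND at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return votes >= majority and votes >= quorum


def tolerated_failures(total_sentinels: int) -> int:
    """How many sentinels can be lost while a majority is still reachable."""
    return total_sentinels - (total_sentinels // 2 + 1)


for n in (2, 3, 4, 5):
    print(n, "sentinels tolerate", tolerated_failures(n), "failure(s)")
```

Going from 3 to 4 sentinels adds a process to run without raising the number of failures the cluster survives, which is the practical argument for odd-sized clusters.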
The election process is shown in the figure below:
Redis Sentinel performing the master-slave switch
“65 Brother: The new master has been chosen. How is it announced to the world?”
With a press conference, of course: invite the media to report and spread the news. Anyone interested will subscribe to the relevant events and act on them.
It is similar in Redis. Different events are published through the pub/sub mechanism so that clients can subscribe to them. A client can subscribe to the sentinel's messages; the sentinel offers many channels, and different channels carry the key events of different stages of the master-slave switch.
In other words, different events are published in different "WeChat groups", and whoever cares about an event joins that group.
+switch-master: the master's address has changed.
Once these channels are known, a client can subscribe to messages from the sentinel. After reading the sentinel's configuration file, the client has the sentinel's address and port and can connect to it.
We can then run subscription commands on the client to receive the different event messages.
For example, the following command subscribes to the event raised when an instance enters the objectively offline state:
SUBSCRIBE +odown
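Once subscribed, the client receives each event as a plain string and has to interpret it. As a sketch for the +switch-master event, the payload is assumed here to be five space-separated fields (master name, old IP, old port, new IP, new port), which matches the format given in the Redis Sentinel documentation; `SwitchMaster` and `parse_switch_master` are hypothetical names, and the sample payload is invented.

```python
from typing import NamedTuple


class SwitchMaster(NamedTuple):
    """Decoded +switch-master event."""
    master_name: str
    old_ip: str
    old_port: int
    new_ip: str
    new_port: int


def parse_switch_master(payload: str) -> SwitchMaster:
    """Parse '<master-name> <old-ip> <old-port> <new-ip> <new-port>'."""
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    name, old_ip, old_port, new_ip, new_port = parts
    return SwitchMaster(name, old_ip, int(old_port), new_ip, int(new_port))


event = parse_switch_master("mymaster 127.0.0.1 6379 127.0.0.1 6380")
print(f"{event.master_name} moved to {event.new_ip}:{event.new_port}")
```

In a real client, this parser would be fed the message body of each +switch-master notification, and the client would then reconnect its write connection to the new address.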
Have you noticed how important Redis's pub/sub (publish/subscribe) mechanism is? With pub/sub, sentinels can find and connect to one another and clients can follow the sentinels, and the various events are published through it as well. Together with the INFO command, which gives the sentinels the slave list, it ties the whole system together.
The down-after-milliseconds option in the Sentinel configuration file specifies how long an instance must be unresponsive before Sentinel judges it subjectively offline. If an instance keeps returning invalid replies to Sentinel for down-after-milliseconds milliseconds in a row, Sentinel updates its record for that instance to show that it has entered the subjectively offline state.
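One way to picture this rule is a timer that is reset by every valid reply. The class below is a simplified sketch under that assumption, not Sentinel's real bookkeeping; `InstanceState` and its methods are hypothetical, and timestamps are passed in explicitly (in milliseconds) so the logic stays deterministic.

```python
class InstanceState:
    """Tracks when a monitored instance last gave a valid PING reply."""

    def __init__(self, down_after_ms: int, now_ms: int) -> None:
        self.down_after_ms = down_after_ms
        self.last_valid_reply_ms = now_ms

    def record_valid_reply(self, now_ms: int) -> None:
        """Call whenever a valid reply (for example +PONG) arrives."""
        self.last_valid_reply_ms = now_ms

    def is_subjectively_down(self, now_ms: int) -> bool:
        """True once no valid reply has been seen for more than down_after_ms."""
        return now_ms - self.last_valid_reply_ms > self.down_after_ms


state = InstanceState(down_after_ms=30_000, now_ms=0)
print(state.is_subjectively_down(now_ms=10_000))  # still within the window
print(state.is_subjectively_down(now_ms=30_001))  # silent for too long
```

A single valid reply resets the timer, so an instance that is merely slow recovers from the subjectively offline state as soon as it answers again.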
Make sure the configuration is consistent across all sentinel instances, especially the subjective-offline threshold down-after-milliseconds. If this value differs between sentinels, the cluster may fail to reach agreement on a failed master, the switch does not happen in time, and the cluster service ends up unstable.
down-after-milliseconds is also the maximum timeout we use to decide that the link between master and slave is broken: if master and slave have had no network connection for down-after-milliseconds milliseconds, we consider them disconnected. If a slave has been disconnected for more than 10 times that long, its network condition is poor and it is not suitable as the new master.
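Putting this filter together with the scoring rounds, the selection can be sketched as a filter followed by a sort. This sketch assumes the behaviour described in the Redis Sentinel documentation: a slave is skipped when it has been disconnected for more than down-after-milliseconds * 10 plus the time the master has been down, a priority of 0 means "never promote", a lower priority number is preferred, then the larger replication offset, then the lexicographically smaller run ID. The `Replica` type and `select_new_master` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Replica:
    run_id: str
    priority: int          # configured priority; 0 means never promote
    repl_offset: int       # how much of the master's data it has replicated
    disconnected_ms: int   # how long its link to the old master has been down


def select_new_master(replicas: List[Replica],
                      down_after_ms: int,
                      master_down_ms: int = 0) -> Optional[Replica]:
    """Filter out unsuitable replicas, then rank the rest."""
    max_disconnect = down_after_ms * 10 + master_down_ms
    candidates = [
        r for r in replicas
        if r.priority != 0 and r.disconnected_ms <= max_disconnect
    ]
    if not candidates:
        return None
    # Lower priority number first, then larger offset, then smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))
```

Expressing the three rounds as one sort key keeps the tie-breaking order explicit: a later rule only matters when every earlier rule ties.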
The Sentinel's main tasks
The Redis Sentinel mechanism is one of the ways Redis achieves uninterrupted, highly available service. Data synchronization in the master-slave architecture is the basic guarantee of data reliability; automatic master-slave switching when the master goes down is the key to uninterrupted service.
With Sentinel switching master and slave automatically, I no longer have to fear the master going down while I am with my girlfriend.
How the Sentinel cluster works
The sentinel cluster was introduced to avoid a failed switch when a single sentinel dies and to reduce misjudgment. The cluster relies on a few mechanisms to work: pub/sub on the master lets the sentinels find and connect to one another, the INFO command gives them the slave list so that they can connect to and monitor each slave, and pub/sub on the sentinels lets clients subscribe to events.
The master-slave switch is not carried out by an arbitrary sentinel. The sentinels vote to elect a leader, and that leader is responsible for the switch.
This article comes from the WeChat official account Code Byte (MageByte).
Originally published: 2021-04-01