Being independent is fine, and fitting in is fine too. What matters is working out what kind of life you want to live and what price you are willing to pay for it.
We usually use Redis as a cache to speed up reads. If Redis goes down, all the data in memory is lost, and if requests then go straight to the database, a flood of traffic hitting MySQL can cause even more serious problems.
In addition, reloading data slowly from the database into Redis is bound to be slower than reading it from Redis directly, so responses slow down as well.
To recover quickly and stop fearing downtime, Redis has two trump cards: the AOF (Append Only File) log and the RDB snapshot.
When we learn a technology, we usually only touch scattered points and never build a complete knowledge framework or architectural picture in our heads, so we have no system view. That makes learning hard: everything looks familiar while you read it, then you forget it and are left confused.
Follow 「MageByte」 and digest Redis together, mastering its core principles and practical skills in depth. Build a complete knowledge framework and learn to organize the whole body of knowledge from a global perspective.
This article is hard-core. I suggest you bookmark it, settle down, and read it carefully; I believe you will get a lot out of it.
The previous article, 《Redis Core: The Secret of Unbeatable Speed》, analyzed Redis's core data structures, I/O model, and threading model, and how it picks a suitable encoding for different data, to explain the real reasons it is fast.
This article focuses on the following points:
- How do you recover quickly after a crash?
- How does Redis avoid losing data when it goes down?
- What is an RDB memory snapshot?
- How is the AOF log implemented?
- What is copy-on-write?
The knowledge points involved are shown in the figure:
The panorama can be expanded along two dimensions:
Application dimension: cache usage, cluster applications, clever use of data structures
System dimension: can be summarized as the "three highs"
- High performance: threading model, network I/O model, data structures, persistence mechanism;
- High availability: master-slave replication, Sentinel cluster, Cluster sharding;
- High scalability: load balancing
The Redis series is organized around the mind map below. This time we explore the secrets of Redis's high performance and persistence.
With the panorama in hand, you can master the system view.
The system view is crucial. To some extent, having one means you can locate and solve problems methodically and on a sound basis.
RDB memory snapshots: fast recovery after a crash
Brother 65: Redis went down for some reason, so all the traffic hit the back-end MySQL. I restarted Redis right away, but its data lives in memory, so after the restart there was still no data. How do I avoid losing data on restart?
Don't worry, Brother 65. 「MageByte」 will take you step by step through how Redis recovers quickly after a crash.
Redis keeps its data in memory, so can we write the in-memory data to disk? Then when Redis restarts, it can quickly load the data saved on disk back into memory and serve requests normally again.
Brother 65: I have a plan: every time a write command runs, write to disk at the same time as writing to memory.
That plan has a fatal flaw: every write command would write to both memory and disk, and disk is far slower than memory, so Redis performance would drop sharply.
Brother 65: Then how do we avoid writing to both at once?
We usually use Redis as a cache, so even if Redis does not hold all the data, the missing data can still be fetched from the database. Redis therefore does not need to persist every piece of data as it changes; its persistence uses 「RDB data snapshots」 to achieve fast recovery after a crash.
Brother 65: What is an RDB memory snapshot?
While Redis executes write commands, the data in memory changes constantly. A memory snapshot is the state of the in-memory data at one particular moment.
It is like freezing time: when we take a photo, a single moment is captured in full.
Redis does something similar. It takes the data as of one moment and writes it to disk as a file. This snapshot file is called the RDB file; RDB is short for Redis DataBase.
By taking RDB memory snapshots periodically, Redis does not have to write to disk on every write command; it only writes to disk when a snapshot is taken. This keeps Redis unbeatably fast while still providing persistence and fast recovery after a crash.
To recover, Redis simply reads the RDB file back into memory.
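The snapshot-and-restore idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the dataset is a plain dict, JSON stands in for the real binary RDB format, and the names `save_snapshot` and `load_snapshot` are made up for illustration. Writing to a temporary file and renaming it is a common pattern chosen here so that a reader never sees a half-written snapshot.

```python
import json
import os
import tempfile


def save_snapshot(data: dict, path: str) -> None:
    """Write the whole dataset to a temp file, then atomically rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the bytes to disk
    os.replace(tmp_path, path)  # swap the finished file into place


def load_snapshot(path: str) -> dict:
    """Recovery: read the snapshot file straight back into memory."""
    with open(path) as f:
        return json.load(f)


snapshot_path = os.path.join(tempfile.mkdtemp(), "dump.json")
memory = {"user:1": "alice", "counter": 42}

save_snapshot(memory, snapshot_path)       # taken at some moment in time
restored = load_snapshot(snapshot_path)    # what a restart would load
print(restored == memory)
```

Anything written to `memory` after `save_snapshot` returns would be missing from `restored`, which is exactly the data-loss window that the snapshot frequency controls.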
Brother 65: Which data should be snapshotted? And how often? Both affect how efficiently the snapshot runs.
Not bad, Brother 65, you are starting to think about efficiency. From 《Redis Core: The Secret of Unbeatable Speed》 we know that Redis's single-threaded model means we must avoid blocking the main thread wherever possible, so generating the RDB file must not block it either.
RDB generation strategy
Redis provides two commands for generating an RDB file:
- save: executed by the main thread, so it blocks;
- bgsave: calls the glibc function fork to create a child process that writes the RDB file. Snapshot persistence is left entirely to the child while the parent keeps serving client requests. This is the default way RDB files are generated.
Brother 65: While a snapshot of the in-memory data is being taken, can the data still be modified? In other words, can write commands be processed normally?
First, be clear that avoiding blocking and being able to handle writes during RDB generation are not the same thing. Even though the main thread is not blocked, if we kept the snapshot consistent simply by freezing the data, the main thread could only serve reads and could not modify data that is being snapshotted.
Obviously, pausing writes just to generate an RDB file is something Redis cannot accept.
Brother 65: Then how does Redis handle write requests and generate the RDB file at the same time?
Redis uses the operating system's multi-process copy-on-write (COW) mechanism to implement snapshot persistence. The mechanism is interesting and not widely known; understanding multi-process COW is also a good indicator of the breadth of a programmer's knowledge.
When persisting, Redis calls the glibc function fork to create a child process. Snapshot persistence is left entirely to the child, and the parent keeps serving client requests.
When the child has just been created, it shares the code and data segments in memory with the parent. You can picture parent and child as conjoined twins sharing one body.
This is a mechanism of the Linux operating system: to save memory, pages are shared as much as possible. At the moment the processes split, memory usage barely grows.
The bgsave child shares all of the main thread's in-memory data; it reads that data and writes it to the RDB file.
When the SAVE or BGSAVE command creates a new RDB file, the program checks the keys in the database, and keys that have already expired are not saved to the new RDB file.
When the main thread executes a write command that modifies data, the affected data is copied first. The bgsave child keeps reading the version as it was when the snapshot started and writes that to the RDB file, while the main thread modifies its own version.
This guarantees the integrity of the snapshot and lets the main thread modify data at the same time, so normal business is not affected.
In short, Redis uses bgsave to snapshot all the data currently in memory. The work is done by a child process in the background, which lets the main thread keep modifying data.
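The effect of fork plus copy-on-write can be observed with a small Python experiment. It is a sketch rather than how Redis is implemented: it assumes a POSIX system where `os.fork` is available, uses a dict and JSON in place of Redis's data structures and the RDB format, and `fork_snapshot` is a hypothetical helper name.

```python
import json
import os
import tempfile


def fork_snapshot(data: dict, path: str) -> int:
    """Fork a child that dumps `data` as it looked at fork time; return the child's pid."""
    pid = os.fork()
    if pid == 0:  # child: works on a frozen view of the parent's memory
        status = 1
        try:
            with open(path, "w") as f:
                json.dump(data, f)
            status = 0
        finally:
            os._exit(status)  # exit at once, skipping the parent's cleanup handlers
    return pid  # parent: returns immediately and can keep serving writes


memory = {"k1": "old", "k2": "old"}
snapshot_path = os.path.join(tempfile.mkdtemp(), "dump.json")

child_pid = fork_snapshot(memory, snapshot_path)
memory["k1"] = "new"        # the parent modifies data while the child is dumping
os.waitpid(child_pid, 0)    # wait only so the demo can inspect the file

with open(snapshot_path) as f:
    snapshot = json.load(f)
print("snapshot:", snapshot)
print("memory:  ", memory)
```

The parent's write to `k1` after the fork does not leak into the file, because the child's view of memory was fixed at the moment of the fork. The parent never had to pause its writes to get that consistency.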
Brother 65: Then can we take an RDB snapshot every second? That way, even if Redis goes down, at most one second of data is lost.
Taking full snapshots too often has two serious performance costs:
- Writing RDB files to disk frequently puts too much pressure on the disk. The previous RDB may not have finished when the next one starts, creating a vicious circle.
- Forking the bgsave child blocks the main thread, and the more memory the main process uses, the longer the block lasts.
Advantages and disadvantages
Snapshot recovery is fast, but the right frequency for generating RDB files is hard to pin down: too infrequent and a crash loses more data; too frequent and it adds extra overhead.
RDB is written to disk as compressed binary, so the file is small and data recovery is fast.
Besides full RDB snapshots, Redis also has the AOF write-after log, which we discuss next.
AOF write-after log: avoiding data loss on a crash
The AOF log stores the sequence of commands the Redis server has executed, and it records only commands that modify memory.
Suppose the AOF log records every modifying command since the Redis instance was created. Then we can execute all those commands in order against an empty Redis instance, that is, "replay" them, to restore the in-memory data structures of the current instance.
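Replay is easy to picture with a toy interpreter. The sketch below assumes a dict as the store and supports only three commands (SET, DEL, LPUSH); the function name `replay` and the tuple-based log are illustrative choices, not Redis internals.

```python
def replay(commands):
    """Rebuild state by executing logged write commands, in order, on an empty store."""
    store = {}
    for name, *args in commands:
        op = name.upper()
        if op == "SET":
            key, value = args
            store[key] = value
        elif op == "DEL":
            for key in args:
                store.pop(key, None)  # deleting a missing key is a no-op
        elif op == "LPUSH":
            key, *values = args
            items = store.setdefault(key, [])
            for value in values:
                items.insert(0, value)  # LPUSH adds at the head of the list
        else:
            raise ValueError(f"unsupported command: {name}")
    return store


aof_log = [
    ("SET", "name", "MageByte"),
    ("LPUSH", "queue", "a"),
    ("LPUSH", "queue", "b"),
    ("SET", "name", "Redis"),
    ("DEL", "tmp"),
]
state = replay(aof_log)
print(state)
```

Order matters: replaying the same commands in a different order could leave `name` with the older value, which is why the log is strictly append-only.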
Write-ahead log vs. write-after log
Write-ahead log (WAL): before the data is actually written, the change is first written to a log file, which guarantees recoverability.
For example, the redo log in MySQL's InnoDB storage engine records data changes: the change is logged first, and only then is the data modified.
Write-after log: the write command is executed first and the data is written to memory, and only then is it logged.
When Redis receives the command 「set key MageByte」 and writes the data to memory, it appends the command to the AOF file in the following format:
- 「*3」: the command has three parts. Each part starts with 「$ + number」, followed by the actual command, key, or value.
- 「number」: the size in bytes of that part's command, key, or value. For example, 「$3」 means the part is 3 bytes long, which here is the 「set」 command.
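To make the 「*3」 and 「$ + number」 layout concrete, here is a small encoder for that format. The helper name `encode_command` is an assumption made for this example, and the sketch covers only the layout described above (a real AOF file may contain other entries as well).

```python
def encode_command(*parts: str) -> bytes:
    """Encode a command as *<part count>, then $<byte length> and the payload for each part."""
    chunks = [b"*%d\r\n" % len(parts)]
    for part in parts:
        raw = part.encode("utf-8")
        chunks.append(b"$%d\r\n%s\r\n" % (len(raw), raw))  # length is in bytes, not characters
    return b"".join(chunks)


entry = encode_command("set", "key", "MageByte")
print(entry.decode("utf-8").replace("\r\n", "\n"))
```

For 「set key MageByte」 this yields `*3`, then `$3` and `set`, `$3` and `key`, and `$8` and `MageByte`, each terminated by a carriage return and line feed.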
Brother 65: Why does Redis use a write-after log?
A write-after log avoids extra checking overhead: because only commands that executed successfully are logged, there is no need to syntax-check them before logging. With a write-ahead log you would have to check the syntax first; otherwise a bad command could be recorded and cause an error during recovery.
In addition, logging after the write does not block the current write command.
Brother 65: So with AOF we are completely safe?
Silly boy, it's not that simple. If Redis crashes right after executing a command but before logging it, the data for that command may be lost.
Also, AOF avoids blocking the current command but may block the next one. The AOF log is written by the main thread, and if the disk is under heavy pressure the write will be slow, which blocks subsequent write commands.
Notice that both problems are about writing back to disk. If we can sensibly control when the AOF log is written back to disk after a write command executes, both problems can be managed.
Write-back policy
To make file writes more efficient, when a program calls the write function to write data to a file, the operating system usually holds the data in a memory buffer and only writes it to disk when the buffer fills up or a time limit is exceeded.
This improves efficiency but makes written data less safe: if the machine goes down, the data still sitting in the memory buffer is lost.
For this reason, the system provides two synchronization functions, fsync and fdatasync, which force the operating system to write the buffered data to disk immediately and so keep the written data safe.
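The difference between a buffered write and a forced flush looks like this from Python, whose `os.write`, `os.fsync`, and `os.fdatasync` wrap the system calls of the same names. The file name and the logged text are made up for the example, and `os.fdatasync` is not available on every platform, so the sketch falls back to `os.fsync`.

```python
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "appendonly.log")
fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
try:
    # After write() returns, the data may still sit only in the OS buffer.
    os.write(fd, b"set key MageByte\n")

    # fdatasync can skip some metadata updates; use fsync where it is missing.
    sync = getattr(os, "fdatasync", os.fsync)
    sync(fd)  # blocks until the OS has pushed the buffered data to the disk
finally:
    os.close(fd)

with open(log_path, "rb") as f:
    written = f.read()
print(written)
```

The `sync(fd)` call is the expensive step. How often it runs is the whole trade-off between performance and durability.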
Redis provides the AOF configuration item appendfsync. Its write-back policy directly determines the efficiency and safety of AOF persistence.
- always: synchronous write-back. As soon as a write command finishes, the contents of the aof_buf buffer are flushed to the AOF file on disk.
- everysec: write back every second. After a write command finishes, the log entry is only written to the AOF file buffer, and the buffer is synced to disk once per second.
- no: controlled by the operating system. After a write command finishes, the log entry is written to the AOF file's memory buffer, and the operating system decides when to write it to disk.
None of the three policies gives you the best of both worlds; you have to trade off performance against reliability.
always writes back synchronously and can prevent data loss, but every write command has to reach the disk, so it has the worst performance.
everysec writes back once per second, avoiding the cost of synchronous write-back. On a crash, about one second of writes that had not yet reached disk may be lost. It is a compromise between performance and reliability.
no leaves it to the operating system. After a write command runs, the entry goes to the AOF file buffer and the next write command can proceed at once. It performs best, but a crash may lose a lot of data.
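The three policies can be compared with a toy model that counts how many commands would be lost if the machine lost power. Everything here is an assumption made for illustration: the class `AofWriter`, the millisecond timestamps passed in by hand, and the simplification that everysec syncs inline on the next write (not in the background). The no policy is modelled at its worst case, where the operating system has not flushed anything yet.

```python
class AofWriter:
    """Toy model: `buffered` was handed to the OS, `durable` has been synced to disk."""

    def __init__(self, policy: str):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.buffered = []     # would be lost on a power failure
        self.durable = []      # survived a sync
        self.last_sync_ms = 0

    def _sync(self, now_ms: int) -> None:
        self.durable.extend(self.buffered)
        self.buffered.clear()
        self.last_sync_ms = now_ms

    def append(self, command: str, now_ms: int) -> None:
        self.buffered.append(command)
        if self.policy == "always":
            self._sync(now_ms)
        elif self.policy == "everysec" and now_ms - self.last_sync_ms >= 1000:
            self._sync(now_ms)
        # "no": never sync here; the OS decides when to flush


def lost_on_crash(policy: str) -> int:
    """Log one command every 300 ms, ten in total, then crash; count the unsynced ones."""
    writer = AofWriter(policy)
    for i in range(10):
        writer.append(f"set k{i} v", now_ms=i * 300)
    return len(writer.buffered)


losses = {policy: lost_on_crash(policy) for policy in ("always", "everysec", "no")}
print(losses)
```

In this run always loses nothing, everysec loses only the command written since the last sync, and no loses all ten in the worst case, which mirrors the ordering described above.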
Brother 65: So how do I choose a policy?
Choose the write-back policy according to how much performance and reliability the system needs. In summary: for the highest performance, choose no; for the strongest reliability guarantee, choose always; if you can tolerate losing a little data but do not want performance to suffer much, choose everysec.
Advantages and disadvantages
Advantages: commands are logged only after they execute successfully, which avoids the cost of syntax-checking them, and logging does not block the current write command.
Disadvantages: because AOF records the content of every command (see the log format above), each command has to be executed again during recovery. If the log file is large, the whole recovery process is slow.
In addition, file systems limit file size, so an overly large file cannot be stored, and the larger the file grows, the less efficient appending becomes.
When the log grows too large: the AOF rewrite mechanism
Brother 65: What if the AOF log file gets too big?
The AOF write-after log records every write command. It does not suffer the performance cost of a full RDB snapshot, but recovery from it is slower than from RDB, and an oversized log file also causes performance problems. For Redis, a real man who lives by being unbeatably fast, the problems caused by an oversized log are intolerable.
So Redis designed another trump card, the 「AOF rewrite mechanism」, and provides the bgrewriteaof command to slim down the AOF.
It works by forking a child process that traverses memory, converts the data into a series of Redis commands, and serializes them into a new AOF log file. Once serialization is done, the incremental AOF entries produced during the operation are appended to the new file, which then immediately replaces the old AOF log file, and the slimming is complete.
Brother 65: Why does the AOF rewrite mechanism shrink the log file?
Rewriting has a "many become one" effect: several commands in the old log become a single command after the rewrite.
As shown below:
Three LPUSH commands become one after an AOF rewrite. For keys that have been modified many times, the reduction is even more pronounced.
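The "many become one" effect can be sketched by generating commands from the current state and ignoring the history. The function `rewrite_aof` and the dict-based store are assumptions for this example; it follows the LPUSH illustration in the text and is not a description of which commands Redis itself emits when rewriting.

```python
def rewrite_aof(store: dict) -> list:
    """Emit one command per key that recreates the current state."""
    commands = []
    for key, value in store.items():
        if isinstance(value, list):
            # LPUSH prepends, so push in reverse to rebuild the same order.
            commands.append(("LPUSH", key, *reversed(value)))
        else:
            commands.append(("SET", key, value))
    return commands


old_log = [
    ("LPUSH", "list", "a"),
    ("LPUSH", "list", "b"),
    ("LPUSH", "list", "c"),
]
store = {"list": ["c", "b", "a"]}  # state after replaying the three commands above

new_log = rewrite_aof(store)
print(len(old_log), "->", len(new_log), new_log)
```

The rewrite never reads the old log at all. It only looks at what is in memory now, which is why a key overwritten a thousand times still costs a single command in the new file.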
Brother 65: After rewriting, the AOF log is smaller, and what ends up on disk is a log of operations that reproduces the latest data of the whole database. Does rewriting block the main thread?
As 「MageByte」 said above, the AOF log is written back by the main thread. The rewrite, however, is performed by the background child process bgrewriteaof, precisely to avoid blocking the main thread and degrading database performance.
In short: two logs and one copy of the in-memory data. The two logs are the old AOF log and the new AOF rewrite log, and the copy is the copy of Redis's data.
Write commands that Redis receives during the rewrite are recorded in both the old AOF buffer and the AOF rewrite buffer, so the rewrite log also captures the latest operations. Once all the records for the copied data have been rewritten, the latest operations held in the rewrite buffer are written to the new AOF file as well.
On every AOF rewrite, Redis first makes a memory copy, which is traversed to generate the rewrite records. Two logs are used so that data written during the rewrite is not lost and the data stays consistent.
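The "two logs, one copy" bookkeeping can be modelled in a single process. This is a simplified sketch: the class `Server` and its method names are hypothetical, only SET is supported, and a plain `dict` copy stands in for the memory copy that the forked child would see.

```python
class Server:
    def __init__(self):
        self.store = {}
        self.aof = []               # the log currently in use (the "old" AOF)
        self.rewrite_buffer = None  # exists only while a rewrite is running

    def set(self, key, value):
        self.store[key] = value
        command = ("SET", key, value)
        self.aof.append(command)                 # log 1: the old AOF stays complete
        if self.rewrite_buffer is not None:
            self.rewrite_buffer.append(command)  # log 2: feeds the new AOF

    def start_rewrite(self) -> dict:
        self.rewrite_buffer = []
        return dict(self.store)  # stand-in for the child's copy of memory

    def finish_rewrite(self, snapshot: dict) -> None:
        new_aof = [("SET", k, v) for k, v in snapshot.items()]
        new_aof.extend(self.rewrite_buffer)  # catch up on writes made meanwhile
        self.aof, self.rewrite_buffer = new_aof, None  # swap in the new log


server = Server()
server.set("a", "1")
server.set("a", "2")
server.set("b", "1")

snapshot = server.start_rewrite()
server.set("a", "3")             # arrives while the rewrite is in progress
server.finish_rewrite(snapshot)
print(server.aof)
```

If the rewrite were abandoned before `finish_rewrite`, `server.aof` would still hold every command, so nothing is lost either way.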
Brother 65: AOF rewriting has its own rewrite log. Why doesn't it share the existing AOF log?
Good question. There are two reasons:
- When parent and child write to the same file, they inevitably contend for it, and managing that contention hurts the parent's performance.
- If the AOF rewrite fails, the original AOF file would effectively be corrupted and could not be used for recovery. So Redis writes the rewrite to a new file. If the rewrite fails, it simply deletes that file and the original AOF file is unaffected; when the rewrite completes, the new file replaces the old one.
Redis 4.0 hybrid log model
When restarting Redis, we rarely use RDB alone to restore the in-memory state, because a lot of data would be lost. We usually replay the AOF log instead, but AOF replay is much slower than loading an RDB file, so when a Redis instance is large, startup takes a long time.
To solve this, Redis 4.0 introduced a new persistence option: hybrid persistence. It stores the RDB content and the incremental AOF log together in one file. The AOF part is no longer the full log, only the incremental log covering the period from the start of persistence to its end, and that part is usually small.
So when Redis restarts, it first loads the RDB content and then replays the incremental AOF log, which completely replaces the old full AOF replay and greatly improves restart efficiency.
In other words, RDB memory snapshots are taken at a somewhat lower frequency, and between two RDB snapshots the AOF log records every write that happens.
This way snapshots need not be taken frequently, and AOF only has to record the write commands between two snapshots, not every operation, which keeps the file from growing too large.
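The load order of the hybrid file (snapshot first, then the short incremental tail) can be sketched as follows. The layout is a text stand-in invented for this example: the first line holds a JSON snapshot in place of the binary RDB content, and the remaining lines are space-separated commands in place of the incremental AOF entries. `load_hybrid` is a hypothetical name.

```python
import json


def load_hybrid(blob: str) -> dict:
    """First line: full snapshot. Remaining lines: incremental write commands."""
    snapshot_line, *aof_lines = blob.splitlines()
    store = json.loads(snapshot_line)   # bulk load, no command execution needed
    for line in aof_lines:              # replay only the short tail
        op, key, *rest = line.split()
        if op == "SET":
            store[key] = rest[0]
        elif op == "DEL":
            store.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {op}")
    return store


hybrid_file = '{"a": "1", "b": "2"}\nSET a 3\nDEL b\nSET c 4'
state = load_hybrid(hybrid_file)
print(state)
```

The startup cost is dominated by the bulk load, and the replay loop runs only over the writes made since the snapshot, not the whole history.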
Redis designed bgsave and copy-on-write to minimize the impact on read and write commands while a snapshot runs, but frequent snapshots still put pressure on the disk, and fork blocks the main thread.
Redis designed two trump cards to achieve fast recovery after a crash without losing data.
To keep the log from growing too large, it provides the AOF rewrite mechanism, which generates a new log of write commands from the latest state of the data in the database and does so in the background without blocking the main thread.
Combining AOF and RDB, Redis 4.0 offers a new persistence strategy, the hybrid log model. When Redis restarts, it first loads the RDB content and then replays the incremental AOF log, which completely replaces the old full AOF replay and greatly improves restart efficiency.
Finally, on choosing between AOF and RDB, 「MageByte」 has three suggestions:
- When data must not be lost, using memory snapshots and AOF together is a good choice;
- If losing a few minutes of data is acceptable, you can use RDB alone;
- If you use only AOF, prefer the everysec option, because it balances reliability and performance.
After these two articles in the Redis series, readers should have an overall understanding of Redis.
Next, 「MageByte」 will bring something hands-on: 《Redis High Availability: The Mystery of the Master-Slave Architecture》, covering both practice and principles!
Coming soon...