Understanding the data consistency problem between Redis and MySQL with 8 diagrams

Bald brother 2020-11-10 17:23:55


For web applications, growth in users and traffic drives changes in a project's technology and architecture to some extent. A project may go through stages like the following:

  1. Page concurrency and traffic are low, and MySQL is enough to support the business logic. In this case there is no real need to add a cache; at most, cache static pages.
  2. Page concurrency grows significantly and the database comes under pressure, while some data is updated rarely but queried repeatedly, or is slow to query. At this point, consider caching: store frequently hit objects in Redis as key-value pairs, so that a cache hit skips the slower database and is served from the faster Redis.
  3. Of course, other techniques can also improve system concurrency, such as static page caching, CDN acceleration, and load balancing. They are not covered here.


Caching ideas are everywhere

We start with an algorithm problem to understand the meaning of caching .

Problem 1:

  • Given a number n (n < 20), compute n!.

Analysis 1

  • Considering only the algorithm and ignoring numeric overflow, we know that n! = n * (n-1) * (n-2) * ... * 1 = n * (n-1)!, so a recursive function solves the problem.
static long jiecheng(int n) {
    // Base case: 0! = 1! = 1
    if (n == 1 || n == 0) return 1;
    // Recursive case: n! = n * (n-1)!
    return n * jiecheng(n - 1);
}

This way, each request takes about n recursive calls.

Problem 2:

  • Given t groups of input (possibly hundreds), each containing a number xi (xi < 20), compute xi!.

Analysis 2

  • With recursion, for t groups of input where each input is xi, the total number of recursive calls is x1 + x2 + ... + xt. If each xi is large or t is large, this becomes a heavy burden, with time complexity O(n^2).
  • So can we change our approach? Yes: build a table. Table lookup is common in ACM-style algorithm problems, where it is used for multiple groups of input and output, graph search results, and path storage. For this factorial problem, we only need to allocate an array, fill it from front to back with the required values, and then output array values directly. The idea is clear:
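import java.util.Scanner;

public class test {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        int t = sc.nextInt();
        // Precompute the table once: jiecheng[i] holds i!
        long jiecheng[] = new long[21];
        jiecheng[0] = 1;
        for (int i = 1; i < 21; i++) {
            jiecheng[i] = jiecheng[i - 1] * i;
        }
        // Each query is now a single array lookup
        for (int i = 0; i < t; i++) {
            int x = sc.nextInt();
            System.out.println(jiecheng[x]);
        }
    }
}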
  • The time complexity is O(n). The idea here is much the same as caching: compute the data once and store it in the jiecheng[21] array. Every later access simply reads a value from the static array, which is an O(1) operation.

Cache application scenarios

Caching suits high-concurrency scenarios and increases service capacity. The main idea is to move frequently accessed data, or data that is expensive to query, from a slow medium to a faster one, for instance from hard disk to memory. Most relational databases read and write on disk, so their efficiency and resources are limited, whereas Redis is memory-based, and the difference in read/write speed is large. When concurrency is high enough that the relational database becomes the bottleneck, you can selectively place frequently accessed data in Redis to improve system throughput and concurrency.

For common websites and scenarios, a relational database may be slow in two places:

  • Disk read/write I/O performance is poor
  • A piece of data may require a large amount of computation to obtain

So caching reduces both the number of disk I/O operations and the amount of computation in the relational database. The cache's read speed comes from two things:

  • It is memory-based, so reads and writes are faster
  • It uses a hash to locate the result directly, with no further computation

So for any website of a reasonable size, caching is very much necessary, and Redis is undoubtedly one of the best choices.


Things to be aware of

Improper use of a cache causes many problems, so some details need careful thought and design. The hardest one, data consistency, is analyzed separately below.

Whether to use cache

A project should not use caching just for the sake of it. Caching does not fit every scenario: if consistency requirements are extremely strict, if data changes frequently but is rarely queried, or if there is no concurrency at all and queries are simple, a cache is not necessarily needed. It may waste resources and make the project bloated and hard to maintain, and using a Redis cache will more or less run into data consistency problems that must be considered.

Whether the cache design is reasonable

When designing a cache, you will likely run into multi-table queries. For cached multi-table query results, think carefully about the key-value design: should the result be split into separate entries or stored together? If there are many possible combinations but only a few occur often, you can cache those few directly. The specific design should follow the project's business needs; there is no absolute standard.

Expiration strategy options

  • The cache holds relatively hot, frequently used data, and Redis resources are limited, so we need a reasonable policy for expiring and evicting cached entries. From operating systems courses we know several cache replacement algorithms: first in, first out (FIFO); least recently used (LRU); optimal replacement (OPT); least frequently used (LFU); and so on. You can borrow from these replacement algorithms when designing a Redis cache. Time-based FIFO is the easiest to implement, and Redis supports an expiration policy on keys globally.
  • The expiration time should also be set according to the system. With better hardware it can be a bit longer, but an expiration time that is too long or too short is not good: too short and the cache hit rate may be low; too long and a lot of cold data may sit in Redis without being released.
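
To make the LRU idea concrete, here is a minimal in-process sketch built on Java's LinkedHashMap. The class name LruCache and the capacity of 2 are assumptions for illustration; this is not how Redis itself implements eviction, only a small model of the policy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-process cache with LRU eviction (illustration only, not Redis internals)
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: entries are ordered from least to most recently accessed
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the least recently used entry
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a", so "b" becomes the least recently used key
        cache.put("c", "3"); // capacity exceeded, "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

A FIFO variant would only need accessOrder set to false, which is one reason FIFO is the simplest policy to implement.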

Data consistency issues *

The data consistency problem was already mentioned above: caching is not recommended when consistency requirements are very high. Let's sort out how cached data is handled. Data consistency problems come up often with a Redis cache. For a cache, the cases are as follows:


Read: read from Redis first; if Redis has no value, fetch it from MySQL and update the Redis cache. The flowchart describes this general scenario, and it is not controversial.
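
The read path can be sketched as follows (this pattern is often called cache-aside). The two maps are stand-ins for Redis and MySQL, and the names ReadPathDemo, read, and dbReads are assumptions made for this example; a real project would use an actual Redis client and database access layer.

```java
import java.util.HashMap;
import java.util.Map;

class ReadPathDemo {
    static final Map<String, String> redis = new HashMap<>(); // stand-in for Redis
    static final Map<String, String> mysql = new HashMap<>(); // stand-in for MySQL
    static int dbReads = 0; // counts how often we fall back to the database

    static String read(String key) {
        String value = redis.get(key);
        if (value != null) {
            return value;          // cache hit: no database access
        }
        dbReads++;
        value = mysql.get(key);    // cache miss: load from the database
        if (value != null) {
            redis.put(key, value); // populate the cache for later reads
        }
        return value;
    }

    public static void main(String[] args) {
        mysql.put("user:1", "Alice");
        System.out.println(read("user:1")); // miss, loaded from the database
        System.out.println(read("user:1")); // hit, served from the cache
        System.out.println("database reads: " + dbReads);
    }
}
```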


Write 1: update the database first, then update the cache (common, low concurrency)

Update the database, then update the Redis cache. This is the usual practice: the cache is derived from the database and follows it.

But problems can occur. For example, if the cache update fails (due to downtime or similar), the database and Redis become inconsistent: the database holds the new data while the cache holds the old data.
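
A small sketch of this failure mode, assuming in-memory maps as stand-ins for Redis and MySQL. The cacheAvailable flag is a hypothetical switch that simulates a crash or network failure between the two steps.

```java
import java.util.HashMap;
import java.util.Map;

class WriteThenUpdateCacheDemo {
    static final Map<String, String> redis = new HashMap<>(); // stand-in for Redis
    static final Map<String, String> mysql = new HashMap<>(); // stand-in for MySQL

    // cacheAvailable = false simulates a failure before the cache write
    static void write(String key, String value, boolean cacheAvailable) {
        mysql.put(key, value);     // step 1: update the database
        if (!cacheAvailable) {
            return;                // step 2 never happens
        }
        redis.put(key, value);     // step 2: update the cache
    }

    public static void main(String[] args) {
        write("stock", "10", true);
        write("stock", "9", false); // the cache update is lost
        System.out.println("mysql=" + mysql.get("stock") + " redis=" + redis.get("stock"));
        // prints mysql=9 redis=10
    }
}
```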

Write 2: delete the cache first, then write to the database (low-concurrency optimization)


Problem solved

This approach effectively avoids the Write 1 problem of a failed Redis write, because the cache is deleted as part of the update. Ideally, the next access finds nothing in Redis, goes to MySQL for the latest value, and puts it into the cache. However, this only works in low-concurrency scenarios, not high-concurrency ones.

The problem

Write 2 appears to solve the problem of a failed Redis write. It looks like a better solution, but it has problems under high concurrency. In Write 1 we discussed that if the database update succeeds but the cache update fails, the cache holds dirty data. Our ideal is to delete the cache so that the next thread to access the key refreshes it. The problem is: what if that next thread arrives too early, at just the wrong moment? For example, a read thread may arrive after the cache is deleted but before MySQL is updated; it misses the cache, reads the old value from MySQL, and writes it back to Redis. [Figure: image-20201106191042265]

Because you cannot know which thread comes first or which is faster, the Redis cache can end up inconsistent with MySQL, as the figure shows. You could lock the key, but a lock is a heavyweight tool with a large impact on concurrency; avoid locks where you can. In this situation the cache still holds old data while the database holds new data, and if the cache entry never expires, the problem persists.
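
The unlucky interleaving can be replayed deterministically in a single thread. This is only a sketch: the maps stand in for Redis and MySQL, and the step labels A (writer) and B (reader) are names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

class DeleteThenWriteRaceDemo {
    static final Map<String, String> redis = new HashMap<>(); // stand-in for Redis
    static final Map<String, String> mysql = new HashMap<>(); // stand-in for MySQL

    // Replays one unlucky interleaving of writer A and reader B, step by step
    static void replay() {
        mysql.put("stock", "10");
        redis.put("stock", "10");

        redis.remove("stock");               // A1: writer deletes the cache
        String seenByB = redis.get("stock"); // B1: reader misses the cache
        if (seenByB == null) {
            seenByB = mysql.get("stock");    // B2: reader loads the OLD value "10"
        }
        mysql.put("stock", "9");             // A2: writer updates the database
        redis.put("stock", seenByB);         // B3: reader caches the stale value
    }

    public static void main(String[] args) {
        replay();
        System.out.println("mysql=" + mysql.get("stock") + " redis=" + redis.get("stock"));
        // prints mysql=9 redis=10
    }
}
```

In a real system steps A and B run on different threads or servers, so this ordering is only one of several possibilities, but it is enough to leave stale data in the cache.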

Write 3: delayed double-delete strategy


This is the delayed double-delete strategy. It mitigates the Write 2 problem, where a read thread arriving during the MySQL update makes the Redis cache inconsistent with MySQL. The steps are: delete the cache -> update the database -> wait (a few hundred ms, asynchronously) -> delete the cache again. Even if the Write 2 problem occurs during the update and causes inconsistency, the delayed second delete (the delay depends on the business, usually a few hundred ms) resolves it quickly.
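
A minimal sketch of the delayed double delete, assuming in-memory maps as stand-ins for Redis and MySQL and a ScheduledExecutorService for the asynchronous second delete. The class name, the write method, and the 200 ms delay are illustrative assumptions, not a production implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class DelayedDoubleDeleteDemo {
    static final Map<String, String> redis = new ConcurrentHashMap<>(); // stand-in for Redis
    static final Map<String, String> mysql = new ConcurrentHashMap<>(); // stand-in for MySQL

    // Daemon thread so the scheduler never keeps the JVM alive on its own
    static final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "second-delete");
                t.setDaemon(true);
                return t;
            });

    static ScheduledFuture<?> write(String key, String value, long delayMillis) {
        redis.remove(key);     // 1. delete the cache
        mysql.put(key, value); // 2. update the database
        // 3. delete the cache again after a delay, asynchronously
        return scheduler.schedule(() -> { redis.remove(key); }, delayMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        mysql.put("stock", "10");
        ScheduledFuture<?> secondDelete = write("stock", "9", 200);
        redis.put("stock", "10"); // a racing reader re-caches the old value
        secondDelete.get();       // wait until the delayed delete has run
        System.out.println("redis=" + redis.get("stock") + " mysql=" + mysql.get("stock"));
        // prints redis=null mysql=9
    }
}
```

After the second delete, the next read misses the cache and reloads the current value from the database. The window of inconsistency is bounded by the delay rather than eliminated.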

But this scheme has gaps too: for example, the second delete may fail, and under heavy concurrent reads and writes MySQL comes under access pressure. You can of course use a message queue (MQ) to handle this asynchronously. In practice it is very hard for any solution to be foolproof, and even experts get criticized for oversights in their designs. As a novice, the author will not embarrass himself here; readers are welcome to share the solutions used in their own projects.

Write 4: operate on the cache directly and write to SQL periodically (suitable for high concurrency)

When a flood of concurrent writes arrives, even the previous schemes with asynchronous message queues struggle to give users a smooth experience, and large volumes of SQL operations put heavy pressure on the system. So another approach is to operate on the cache directly and write the cached data to SQL periodically. Redis, a non-relational database operating on in-memory key-value data, is much faster than a traditional relational database.


The above suits business designs under high concurrency. Here Redis data is primary and MySQL data is secondary, written periodically (like a backup database). Of course, under this kind of high concurrency the business may have different requirements on the order of reads and writes, so message queues and locks may be needed to guarantee data and ordering against the uncertainty and instability of high concurrency and multithreading, improving business reliability.
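
This idea (often called write-behind) can be sketched as follows. The maps again stand in for Redis and MySQL; the dirtyKeys set and the flush method are assumptions made for this example, and a real system would call flush from a timer or background job and handle failures.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WriteBehindDemo {
    static final Map<String, String> redis = new HashMap<>(); // stand-in for Redis (primary)
    static final Map<String, String> mysql = new HashMap<>(); // stand-in for MySQL (backup)
    static final Set<String> dirtyKeys = new HashSet<>();     // keys changed since the last flush

    static synchronized void write(String key, String value) {
        redis.put(key, value); // writes touch only the cache
        dirtyKeys.add(key);    // remember that this key must be persisted later
    }

    static synchronized String read(String key) {
        return redis.get(key); // reads are served from the cache
    }

    // Meant to be called periodically; returns how many keys were written to the database
    static synchronized int flush() {
        int flushed = 0;
        for (String key : dirtyKeys) {
            mysql.put(key, redis.get(key)); // only the latest value is persisted
            flushed++;
        }
        dirtyKeys.clear();
        return flushed;
    }

    public static void main(String[] args) {
        write("stock", "10");
        write("stock", "9");
        write("stock", "8");
        int flushed = flush(); // three cache writes collapse into one database write
        System.out.println("flushed=" + flushed + " mysql=" + mysql.get("stock"));
        // prints flushed=1 mysql=8
    }
}
```

The trade-off is that data written since the last flush exists only in the cache, so a cache failure in that window loses it.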

All in all, the higher the concurrency and the stricter the data consistency requirements, the more complex the consistency scheme becomes and the more it has to take into account. The above is the author's own study and free-form exploration of the Redis data consistency problem. You are welcome to join group 973961276 to discuss technology, and please point out anything that is explained incorrectly!

This article was written by [Bald brother]. Please include a link to the original when reposting. Thank you.
