Preface
About the web: to some extent, growth in users and traffic is what drives a project's technology and architecture forward. The progression often looks like this:
- Page concurrency and traffic are low, and MySQL alone is enough to support the business logic. A cache isn't really needed; at most you cache static pages.
- Page concurrency rises noticeably and the database comes under pressure. Some data is updated rarely but queried repeatedly, or some queries are slow. Then caching is worth considering: keep frequently hit data as key-value pairs in Redis, so that a cache hit skips the inefficient DB entirely and the data comes from the fast Redis instead.
- Other problems may appear as well; static page caching, CDN acceleration, and even load balancing can also raise a system's concurrency, but they are not covered here.
Caching ideas are everywhere
Let's start with an algorithm problem to understand the idea behind caching.
Problem 1:
- Given a number n (n < 20), compute n!.
Analysis 1:
- Consider only the algorithm; numeric overflow is ignored here. Since n! = n × (n-1) × (n-2) × … × 1 = n × (n-1)!, a recursive function solves it:
```java
static long jiecheng(int n) {
    if (n == 1 || n == 0) return 1; // base case: 0! = 1! = 1
    return n * jiecheng(n - 1);     // n! = n * (n-1)!
}
```
This way, every input request takes n multiplications to answer.
Problem 2:
- Given t groups of input (possibly hundreds), each containing one number xi (xi < 20), compute xi!.
Analysis 2:
- With plain recursion, t groups of input mean the recursion runs from scratch for every xi, so a large xi or a large t both pile on the work; across all queries the cost is on the order of O(n²).
- So change the approach: build a lookup table. Tables appear constantly in ACM-style algorithms: handling multiple groups of input and output, memoizing graph-search results, storing paths, and so on. For factorials we only need one array, filled front to back with the required values; after that every query is answered by reading the array directly. The idea is straightforward:
```java
import java.util.Scanner;

public class test {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        int t = sc.nextInt();
        long[] jiecheng = new long[21];
        jiecheng[0] = 1;
        for (int i = 1; i < 21; i++) {
            jiecheng[i] = jiecheng[i - 1] * i; // precompute i! once
        }
        for (int i = 0; i < t; i++) {
            int x = sc.nextInt();
            System.out.println(jiecheng[x]); // each query is an O(1) lookup
        }
    }
}
```
- Time complexity: O(n) for the one-time precomputation, then O(1) per query. This is essentially the caching idea: compute the data once into the jiecheng[21] array, and every later access is just a static array read.
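The same table idea can also be filled lazily, which is the cache pattern in miniature. The sketch below (class and field names are just illustrative) stores each factorial in a HashMap the first time it is computed and serves repeat requests straight from the map:

```java
import java.util.HashMap;
import java.util.Map;

// Memoization sketch: results are cached in a map on first computation
// and served from the map afterwards -- a cache hit avoids all recomputation.
public class MemoFactorial {
    static final Map<Integer, Long> memo = new HashMap<>();

    static long factorial(int n) {
        if (n <= 1) return 1L;
        Long cached = memo.get(n);          // cache hit: no recomputation
        if (cached != null) return cached;
        long result = n * factorial(n - 1); // cache miss: compute once
        memo.put(n, result);                // store for later requests
        return result;
    }
}
```

Unlike the precomputed array, nothing is calculated until it is first asked for; repeated queries then cost one map lookup.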
Cache application scenarios
Caching suits high-concurrency scenarios and increases a service's capacity. The core idea is moving frequently accessed or expensive-to-compute data from slow media to faster media, for instance from hard disk to memory. Most relational databases read and write against the hard disk, so their efficiency and resources are limited, while Redis is memory-based and reads and writes far faster. When high concurrency pushes the relational database to its performance bottleneck, strategically placing frequently accessed data in Redis raises system throughput and concurrency.
For typical websites and scenarios, a relational database can be slow in two places:
- disk IO performance is poor
- a single piece of data may require a large amount of computation to produce

Caching therefore reduces both the number of disk IOs and the amount of computation done in the relational database. The speed gain likewise comes from two sides:
- memory-based storage reads and writes faster
- a hash lookup locates the result directly, with no recomputation

So for any reasonably sized website, caching is well worth having, and Redis is undoubtedly one of the best choices for it.
Things to be aware of
Improper use of a cache causes many problems, so several details need careful thought and design. The hardest of them, data consistency, is analyzed separately below.
Whether to use a cache at all
A project should not add a cache just for the sake of it; caching does not fit every scenario. If data consistency requirements are extreme, or data changes frequently but is rarely queried, or there is simply no concurrency and the queries are trivial, a cache may only waste resources and make the project bloated and hard to maintain, and introducing a Redis cache brings data consistency problems that then have to be handled.
Whether the cache design is reasonable
Cache design often runs into multi-table queries. When the cached key-value pair covers a multi-table query, think carefully about whether to split it up or store it together. If there are many possible combinations but only a few occur often, caching those few directly is also fine. The concrete design should follow the project's business needs; there is no absolute standard.
Expiration strategy options
- The cache contains relatively hot and commonly used data ,Redis Resources are also limited , We need to choose a reasonable policy to let the cache expire and delete . We have learned
operating system
We also know that there is a FIFO algorithm in the implementation of computer cache (FIFO); Least recently used algorithm (LRU); The best elimination algorithm (OPT); Minimum access page algorithm (LFR) Wait for the disk scheduling algorithm . Design Redis You can also learn from it when caching . According to the time FIFO It's the best way to achieve . And Redis stayoverall situation key
Support expiration strategy . - And the expiration time should also be set according to the system situation , If the hardware is better, it can be a little longer now , But too long or too short an expiration date may not be good , If it is too short, the cache hit rate may not be high , And too long may cause a lot of cold data stored in Redis No release .
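As a small illustration of the LRU idea (not Redis's actual implementation), Java's LinkedHashMap in access-order mode makes a minimal LRU cache: once capacity is exceeded, the least recently used entry is evicted, similar in spirit to an allkeys-lru policy. The capacity of 2 used below is just for demonstration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache sketch: LinkedHashMap with accessOrder = true keeps
// entries ordered by most recent access, and removeEldestEntry evicts the
// least recently used one once the capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true -> LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict when over capacity
    }
}
```

A `get` counts as a "use", so recently read keys survive while cold keys are evicted first.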
Data consistency issues *
In fact, the problem of data consistency is mentioned above . Caching is not recommended if there is a high requirement for consistency . Let's sort out the cached data . stay Redis Data consistency problems are often encountered in caching . For a cache , Here's a list of things :
read
read
: from Redis Read from , If Redis There is no , Then from MySQL Get updates Redis cache . The following flowchart describes the general scenario , It's not controversial :
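The read flow can be sketched as follows; plain HashMaps stand in for Redis and MySQL here so the logic runs without external services (in a real project `cache` would be a Redis client and `db` a DAO query; the names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Cache-aside read sketch: try the cache, fall back to the DB on a miss,
// and backfill the cache so the next read hits.
public class ReadThrough {
    static final Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    static final Map<String, String> db = new HashMap<>();    // stand-in for MySQL

    static String read(String key) {
        String value = cache.get(key);   // 1. try the cache first
        if (value == null) {
            value = db.get(key);         // 2. cache miss: go to the database
            if (value != null) {
                cache.put(key, value);   // 3. backfill the cache for next time
            }
        }
        return value;
    }
}
```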
Write 1: update the database first, then update the cache (common at low concurrency)
Update the database record, then update the Redis cache. This is the natural approach: the cache is derived from the database and follows it.
But problems are possible. If the cache update fails (a crash, for example), the database and Redis fall out of sync: the DB holds the new data while the cache still holds the old data.
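A minimal sketch of Write 1, again with maps standing in for MySQL and Redis (names illustrative). The failure window described above sits between the two steps: if step 2 is lost to a crash or network error, the cache keeps serving the old value:

```java
import java.util.HashMap;
import java.util.Map;

// "Write 1" sketch: persist to the database first, then refresh the cache.
// If the cache refresh fails, the cache is stale until the entry expires.
public class WriteThrough {
    static final Map<String, String> db = new HashMap<>();    // stand-in for MySQL
    static final Map<String, String> cache = new HashMap<>(); // stand-in for Redis

    static void write(String key, String value) {
        db.put(key, value);    // 1. update the database
        cache.put(key, value); // 2. then update the cache (may fail -> stale cache)
    }
}
```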
Write 2: delete the cache first, then write the database (a low-concurrency optimization)
What it solves
This approach avoids Write 1's failed-Redis-write problem: instead of updating the cache, delete it before updating. Ideally the next access misses Redis, fetches the latest value from MySQL, and fills the cache. But this only holds at low concurrency; it does not survive high concurrency.
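The delete-then-write order looks like this in miniature (maps stand in for Redis and MySQL; the read helper is the same cache-aside read as before):

```java
import java.util.HashMap;
import java.util.Map;

// "Write 2" sketch: invalidate the cache, then write the database.
// The next read misses the cache and reloads the fresh value from the DB.
public class DeleteThenWrite {
    static final Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    static final Map<String, String> db = new HashMap<>();    // stand-in for MySQL

    static void write(String key, String value) {
        cache.remove(key);  // 1. delete the cached entry (no failed-update problem)
        db.put(key, value); // 2. then update the database
    }

    static String read(String key) { // cache-aside read for the next access
        String v = cache.get(key);
        if (v == null && (v = db.get(key)) != null) cache.put(key, v);
        return v;
    }
}
```

Single-threaded this is always consistent; the trouble starts when a reader interleaves between the two steps, as discussed next.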
What it doesn't solve
Write 2 does fix the failed-cache-write anomaly and looks like the better scheme, but it breaks under high concurrency. In Write 1 we discussed how a successful DB update plus a failed cache update leaves dirty data; the hope here is that deleting the cache lets the next thread reload a fresh value. The question is: what if that next thread arrives too early, at exactly the wrong moment?
Under concurrency you cannot control who runs first and who runs slow, so as above Redis can still end up inconsistent with MySQL: a reader can miss the cache, read the old value from MySQL before the write lands, and then backfill that old value. You could lock the key, but a lock is so heavyweight that it badly hurts concurrency; avoid one if you can. The result is again old data in the cache and new data in the DB, and if the cache entry never expires the inconsistency persists.
Write 3: delayed double delete
The delayed double delete strategy mitigates the Write 2 problem of a reader slipping in during the MySQL update and leaving Redis inconsistent with MySQL. The sequence is: delete the cache -> update the database -> wait a short delay (a few hundred ms, asynchronously) -> delete the cache again. Even if a reader triggers the Write 2 inconsistency mid-update, the delayed second delete (the delay depends on the business, usually a few hundred ms) quickly clears the stale entry.
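A runnable sketch of the delayed double delete, with ConcurrentHashMaps standing in for Redis and MySQL and a ScheduledExecutorService issuing the asynchronous second delete. The 200 ms delay is an illustrative number; tune it to how long your read path can take:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Delayed double delete sketch: delete, update DB, then delete again after a
// short delay so a stale value backfilled by a concurrent reader during the
// update window is also removed.
public class DelayedDoubleDelete {
    static final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    static final Map<String, String> db = new ConcurrentHashMap<>();    // stand-in for MySQL
    static final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    static void write(String key, String value) {
        cache.remove(key);            // 1. first delete
        db.put(key, value);           // 2. update the database
        scheduler.schedule(           // 3. asynchronous second delete
                () -> cache.remove(key), 200, TimeUnit.MILLISECONDS);
    }
}
```

In production the second delete is often handed to a message queue instead of an in-process scheduler so it survives a crash and can be retried.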
But the scheme still has holes: the second delete can itself fail, and under write-heavy plus read-heavy concurrency MySQL still bears heavy access pressure, among other issues. You can of course push the work through a message queue (MQ) and resolve it asynchronously. In practice it is very hard for any scheme to be foolproof, which is why even experienced engineers get criticized over design details; as a relative novice the author won't pretend to have the final answer here, and everyone is welcome to share their own project's approach.
Write 4: operate the cache directly, write SQL periodically (suits high concurrency)
When a large burst of concurrent writes lands, even the earlier schemes backed by asynchronous message queues struggle to give users a smooth experience, and large-scale SQL operations put heavy pressure on the system. So another solution is to operate directly on the cache and flush the cache to SQL periodically: a memory-based key-value store like Redis is far faster at this than a traditional relational database.
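A minimal write-behind sketch of this idea, with maps standing in for Redis and MySQL (names illustrative). Writes touch only the cache; a periodic job flushes the dirty keys to the database in a batch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// "Write 4" sketch (write-behind): writes go to the cache only, and flush()
// persists accumulated dirty keys to the database in one batch. In a real
// system flush() would run on a timer or as a message-queue consumer.
public class WriteBehind {
    static final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    static final Map<String, String> db = new ConcurrentHashMap<>();    // stand-in for MySQL
    static final Map<String, String> dirty = new ConcurrentHashMap<>(); // pending DB writes

    static void write(String key, String value) {
        cache.put(key, value); // fast path: memory only
        dirty.put(key, value); // remember what still needs persisting
    }

    static void flush() { // called periodically
        db.putAll(dirty);  // batch-write to the database
        dirty.clear();
    }
}
```

The trade-off is durability: anything written since the last flush is lost if the cache node dies, which is why Redis persistence and/or a message queue usually backs this pattern.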
This suits business design under high concurrency: Redis becomes the primary data store and MySQL the auxiliary one, written periodically (more like a data backup). Such high-concurrency businesses often have ordering requirements on reads and writes, so message queues and locks may be needed to guarantee data and ordering for the business and, given the uncertainty and instability of high concurrency and multithreading, to improve reliability.
All in all: the higher the concurrency and the stricter the data consistency requirements, the more cases a data consistency design has to weigh and the more complex it becomes. The above is the author's own study of (and riffing on) Redis data consistency problems. Welcome to join group 973961276 to talk tech, and if any explanation is unreasonable, please point it out!