Typical UNIX All systems support one process to create multiple threads (thread). stay Linux Process foundation I mentioned ,Linux Organize operations by process ,Linux Threads in are also process based . Although the implementation is different from others UNIX System , but Linux There is no difference between real multithreading and real multithreading in logic and usage .

 

Multithreading

Let's take a look at multithreading first . stay Linux From program to process in , We see a representation of a program in memory . The whole process of running this program , There is only one right of control . When a function is called , This function takes control , Become active (active) function , Then run the instructions in the function . meanwhile , The other functions are out of field , Not running . As shown in the figure below :

Linux From program to process

 

We see , The squares are connected by arrows . It's like the functions are connected on a single line , The computer performs the operations defined in each function like a pipeline . Such a program is called a single threaded program .


Multithreading is to allow multiple control rights in a process , So that multiple functions can be activated at the same time , So that the operation of multiple functions can run at the same time . Even single CPU The computer , You can also switch between instructions in different threads , This results in the effect of multithreading running at the same time . As shown in the figure below , It's a multithreaded process :

main() To func3() Until then main() Form a thread , Besides func1() and func2() Make up the other two threads . Operating systems generally have system calls that allow you to run a function into a new thread .

 

Remember when we were Linux From program to process The functions and uses of the stack mentioned in . A stack , Only the bottom frame can be read and written . Corresponding , Only the function corresponding to the frame is activated , In working condition . To achieve multithreading , We have to get around the limitations of the stack . So , When creating a new thread , Let's build a new stack for this thread . Each stack corresponds to a thread . When a stack is executed to all ejects , The corresponding thread completes the task , And call it a day . therefore , Multithreaded processes have multiple stacks in memory . Multiple stacks are separated by a certain space , In case of stack growth . Each thread can call the parameters and variables in the bottom frame of its own stack , And share memory with other threads Text,heap and global data Area . Corresponding to the above example , We need to have... In our process space 3 Stack .

( It should be noted that , For multithreading , Because there are multiple stacks in the same process space , Any blank area that is filled will lead to stack overflow The problem of .)

 

Concurrent

Multithreading is equivalent to a concurrent (concunrrency) System . Concurrent systems generally perform multiple tasks at the same time . If multiple tasks can share resources , Especially when writing a variable at the same time , We need to solve the problem of synchronization . for instance , We have a multi-threaded train ticketing system , With global variables i Store the remaining votes . Multiple threads are constantly selling tickets (i = i - 1), Until the remaining votes are 0. So each one needs to do the following :

 

/*mu is a global mutex*/
while (1) {                        /*infinite loop*/
    if (i != 0) i = i -1
    else {
      printf("no more tickets");
      exit();
    }
}

 

If only one thread executes the above program ( It's like selling tickets at a window ), No problem . But if multiple threads execute the above program ( It's equivalent to selling tickets at multiple windows ), We'll have problems . We'll see , The fundamental reason is that each thread that happens at the same time can be used to i Read and write .

We have if The structure will give CPU Two instructions , One is to judge whether there are any remaining tickets (i != 0), One is selling tickets (i = i -1). A thread will first determine whether there is a ticket ( For example, at this time i by 1), But there is a time window between the two instructions , Other threads may perform ticket selling operations in this time window (i = i -1), The condition that causes this thread to sell tickets no longer holds . But because the thread has already executed the judgment instruction , So there's no way to know i There is a change , So go ahead with the ticket order , To sell tickets that don't exist (i Become negative ). For a real ticketing system , It's going to be a serious mistake ( Sold too many tickets , The train is full ).

In the case of concurrency , The order of instruction execution is determined by the kernel . Inside the same thread , Instructions are executed in sequence , But it's hard to say which of the instructions between different threads will be executed first . If the result of running depends on the sequence of different threads , Then there will be competitive conditions (race condition), In such a situation , The result of the computer is hard to predict . We should try our best to avoid the formation of competitive conditions . The most common way to solve the contention condition is to make two separated instructions form an atomic operation that cannot be separated (atomic operation), Other tasks cannot be inserted into atomic operations .

 

Multithreading synchronization

For multithreaded programs , Sync (synchronization) It means that only one thread is allowed to access a resource in a certain period of time . And in this time , No other threads are allowed to access this resource . We can use mutex (mutex), Condition variables, (condition variable) And read-write lock (reader-writer lock) To synchronize resources .

 

1) The mutex

Mutex is a special variable , It's locked (lock) And open (unlock) Two states . Mutexes are generally set as global variables . An open mutex can be obtained by a thread . Once you get , This mutex will lock , After that, only the thread has the right to open . Other threads that want to get mutexes , Will wait until the mutex is opened again . We can think of mutex as a toilet that can hold only one person , When someone enters the bathroom , You can lock the bathroom from the inside . Others can only wait outside the mutex for that person to come out , To get in . The people waiting outside are not in line , Who first saw that the bathroom was empty , You can rush in first .

The above problem is easy to solve with mutex , The program for each thread can be changed to :

 

/*mu is a global mutex*/
while (1) {                /*infinite loop*/
  mutex_lock(mu);           /*aquire mutex and lock it, if cannot, wait until mutex is unblocked*/
  if (i != 0) i = i - 1;
  else {
    printf("no more tickets");
    exit();
  }
  mutex_unlock(mu);         /*release mutex, make it unblocked*/
}

 

First execution mutex_lock() The thread will get mu. Others want to get mu The thread of must wait , Until the first thread executes to mutex_unlock() Release mu, To get mu, And continue executing the thread . So the thread is mutex_lock() and mutex_unlock() In the operation between , Not affected by other threads , It's an atomic operation .

When you need attention , If there is a thread still using the original program ( That is, not trying to get mu, And directly modify i), Mutex does not prevent the program from being modified i, Mutexes lose the meaning of protecting resources . therefore , Mutex mechanism requires programmers to write perfect programs to realize the function of mutex . The same is true of the other mechanisms we will talk about below .

 

2) Condition variables,

Conditional variable is another common variable . It is also often saved as a global variable , And work with mutexes .

 

Suppose such a situation : Yes 100 One worker , Each person is responsible for decorating a room . When there is 10 When the decoration of a room is finished , The boss told the corresponding ten workers to have a beer together .

How can we achieve ? The boss asked the workers to decorate the room , Check the number of rooms that have been decorated . But multithreading , There is a risk of competitive conditions . in other words , Other workers are likely to finish the work between the worker's house decoration and inspection . Solve it in the following way :

 

/*mu: global mutex, cond: global codition variable, num: global int*/
mutex_lock(mu)
num = num + 1;                      /*worker build the room*/
if (num <= 10) {                     /*worker is within the first 10 to finish*/
    cond_wait(mu, cond);            /*wait*/
    printf("drink beer");
}
else if (num = 11) {                /*workder is the 11th to finish*/
  cond_broadcast(mu, cond);         /*inform the other 9 to wake up*/
}
mutex_unlock(mu);

 

Conditional variables are used above . In addition to working with mutexes, conditional variables , You also need to work with another global variable ( there num, That's the number of decorated rooms ). This global variable is used to form the conditions .

 

The idea is as follows . We let the workers decorate the room (num = num + 1) after , Check the number of rooms that have been decorated ( num < 10 ). because mu It's locked , So there won't be any other workers decorating the room in the meantime ( change num Value ). If the worker is in the top ten , So we call cond_wait() function .
cond_wait() Do two things , One is to release mu, So that other workers can build houses . The other is waiting , until cond The notice of . In this case , Qualified threads start waiting .

When there is a notice ( The tenth room has been built ) On arrival ,condwait() It'll lock up again mu. Recovery of threads , Let's do the next sentence prinft("drink beer") ( Drink a beer !). So let's start here , until mutex_unlock(), It forms another mutex structure .

that , The first ten calls cond_wait() How does the thread get the notification ? We noticed that elif if, That is to say, building the second 11 People in a room , Is responsible for calling cond_broadcast(). This function will give all calls cond_wait() Thread broadcast notification for , To get those threads back running .

 

Condition variables are especially suitable for multiple threads waiting for a condition to occur . If you don't use conditional variables , Then each thread needs to constantly try to get the mutex and check whether the condition occurs , This greatly wastes the resources of the system .

 

3) Read-write lock

Read write locks are very similar to mutexes .r、RW lock There are three states :  Shared read lock (shared-read),  Mutex write lock (exclusive-write lock),  open (unlock). The latter two states are exactly the same as the former two states of mutex .

One unlock Of RW lock Can be obtained by a thread R Lock or W lock .

If obtained by a thread R lock ,RW lock It can be continuously obtained by other threads R lock , Instead of waiting for the thread to release R lock . however , If other threads want to get W lock , It has to wait until all threads holding the shared read lock release their own R lock .

If a lock is acquired by a thread W lock , So other threads , Whether you want to get R Lock or W lock , Must wait for the thread to release W lock .

such , Multiple threads can read shared resources at the same time . Dangerous write operations are protected by mutex .

 

We need to synchronize concurrent systems , This makes programming difficult for programmers . But multithreading system can solve many problems IO The bottleneck problem . For example, we monitor network ports . If we had only one thread , So we have to monitor , Receiving request , Handle , reply , Monitor again . If we use a multithreaded system , You can let multiple threads listen . When one of our threads is processing , We can also have other threads continue to listen , such , It greatly improves the utilization of the system . As the data gets bigger and bigger , Server read and write operation is more and more today , It's quite significant . Multithreading can also make more efficient use of multithreading CPU Environment .

( It's like cooking , Constantly switch to deal with different dishes .)

 

The program in this paper adopts pseudo C Writing . Different languages have different function names ( such as mutex_lock). The focus here is on logical concepts , Rather than concrete implementations and language specifications .

 

summary

multiple threads, multiple stacks

race condition

mutex, condition variable, RW lock