For an application, that is, an operating system process, the virtual address space contains both kernel space (shared with other processes) and user space (private to the process). A user process cannot access kernel space directly; it can only access user space, so to perform IO, data is copied from user space into kernel space and then processed there.


What is IO?


Before discussing IO models, what exactly is IO?

     In a computer system, I/O stands for input (Input) and output (Output). Depending on the object being operated on, it can be divided into disk I/O, network I/O, memory-mapped I/O, direct I/O, database I/O, and so on. Any system that takes input and produces output can be regarded as an I/O system. In other words, I/O is the channel through which the operating system exchanges data and interacts with the outside world. The concept is independent of the development language you choose; it is a general one.


   I/O occupies a very important position in today's systems. A system may have to handle a large number of files and a large number of database operations, all of which depend on the operating system's I/O performance; a system's bottleneck is often its I/O performance. To work around slow disk I/O, caches are added to the system architecture to improve response speed, and some high-end servers go further at the hardware level, replacing traditional mechanical hard disks with solid-state drives (SSDs). The room for optimizing a system usually lies on an inefficient I/O path; it is rare to see CPU or memory performance become the bottleneck of an entire system.

So where does the data come in (Input) from, and where does it go out (Output) to?

Input brings data into memory; Output sends data to an IO device (a disk, the network, or any other device that needs to exchange data with memory).

An area of main memory (usually DRAM) is used to cache the contents of the file system, including both data and metadata.


IO Interface


Data is transferred between an IO device and memory through an IO interface. The operating system encapsulates these IO interfaces as system calls, which we can use directly when programming.

In other words, if you want to communicate with a peripheral, you just go through these system calls.


Cache everywhere

  1. When a program calls the various file-operation functions, user data (User Data) flows to disk (Disk) as shown in the figure. The figure depicts the layering of the Linux file-operation functions and the location of each memory cache layer; the solid black line in the middle is the boundary between user mode and kernel mode.

  2. Reading the figure from the top down: first come the file-operation functions defined by the C stdio library, which are cross-platform wrapper functions implemented in user mode. The stdio file functions maintain their own stdio buffer, a cache implemented in user space. The reason for caching here is simple: system calls are always expensive. If user code reads or writes a file in small sizes repeatedly, the stdio library aggregates those reads or writes through the buffer, improving the program's efficiency. The stdio library also provides fflush(3) to actively flush the buffer, immediately invoking the underlying system call to push out the buffered data. In particular, setbuf(3) can be used to configure the stdio library's user-space buffer, or even to disable buffering entirely.


  3. Between the read(2)/write(2) system calls and the real disk reads and writes there is also a layer of buffering; the term kernel buffer cache is used here to refer to it. Under Linux, the cache for file contents is customarily called the Page Cache, and the cache for lower-level devices is called the Buffer Cache. The two concepts are easily confused, so here is a brief distinction: the Page Cache caches the contents of files and is tied to the file system; a file's contents must be mapped to actual physical disk blocks, and this mapping is done by the file system. The Buffer Cache caches the data of storage-device blocks (for example, disk sectors) and does not care whether there is a file system on the device (file-system metadata is cached in the Buffer Cache).


  4. To sum up: since we are discussing IO operations under Linux, we naturally skip the user-mode stdio library and talk directly at the system-call level; readers interested in stdio IO can study it on their own. The kernel-level cache of file contents lives in the file system's Page Cache, so the discussion below is essentially about the IO-related system calls and some mechanisms of the file system's Page Cache.



Linux IO Stack

Although we can read data from a peripheral with a simple system call, this actually relies on Linux's complete IO stack architecture.




As the figure shows, going down from the system-call interface, the Linux IO stack has three levels:

  1. The file system layer. Taking write(2) as an example, the kernel copies the buffer passed to write(2) into the file system's cache and synchronizes it to the lower level in due time.

  2. The block layer, which manages the IO queues of block devices and merges and sorts IO requests (remember the IO scheduling algorithms from your operating systems course?).

  3. The device layer, which interacts with memory directly via DMA and completes the exchange of data with the specific device.

With this picture in mind, think about the Buffered IO, mmap(2), and Direct IO used in Linux system programming.


How do these mechanisms map onto the Linux IO stack?

The figure above is a little complicated, so here is a simplified sketch with the location of each mechanism added:


What is the flow of a conventional Buffered IO read of a file with read(2)?

Suppose you are about to read a cold file (one not present in the cache). open(2) opens the file, and the kernel sets up a series of data structures. Next you call read(2). The request reaches the file system layer, which finds that the Page Cache has no mapping for that location in the file, so it creates the corresponding Page Cache pages and associates them with the relevant sectors. The request then continues down to the block device layer, queues in the IO queue, and after a round of scheduling reaches the device driver layer. There, typically using DMA, the corresponding disk sectors are read into the cache; finally, read(2) copies the data into the user-space buffer supplied by the caller (a parameter of read(2)).


How many copies happen in the whole process?

From disk to Page Cache is the first copy; from Page Cache to the user-space buffer is the second.


And what does mmap(2) do?

mmap(2) maps the Page Cache directly into the user-space address space, so reading a file through mmap(2) involves no second copy.


Then what does Direct IO do?

It is a far more hard-core mechanism: it lets user space talk directly to the block IO layer, bypassing the Page Cache entirely, so data is copied directly between the disk and user space.


What is the benefit?

For writes, the process's buffer is mapped directly to disk sectors and the data is transferred by DMA, eliminating the copy into the Page Cache layer and thus improving write efficiency.


For reads, the first read is certainly faster than the traditional path, but subsequent reads are worse, because there is no cache to hit (of course, you can build your own cache in user space, which is exactly what some commercial databases do).

Beyond this, while traditional Buffered IO lets you read and write a file freely with any offset and length, mmap(2) and Direct IO require data to be page-aligned, and Direct IO further restricts reads and writes to integer multiples of the underlying storage block size (on Linux 2.4, even multiples of the file system's logical block size). So the lower the interface, the more the apparent efficiency gain must be paid for with extra work at the application level. To use these advanced features well, you need not only a deep understanding of the mechanisms behind them but also careful system design.



Blocking / Non-blocking and Synchronous / Asynchronous

Now that the concept of IO is clear, let's talk about blocking, non-blocking, synchronous, and asynchronous.

Blocking / Non-blocking

These terms describe the situation of the caller itself.


Blocking

Means that after calling a function, the caller waits for the function's return value, and the thread is suspended.

Non-blocking

Means that after calling a function, the caller does not wait for the function's return value; the thread continues to run other code (performing other operations, or repeatedly checking whether the function has returned a value).


Synchronous / Asynchronous

These terms describe the callee.


Synchronous

Means that after being called, the callee performs all the actions the function involves before returning the return value.


Asynchronous

Means that after being called, the callee returns the return value first and performs the function's remaining actions afterwards.


Five IO Models

Let's take the recvfrom/recv functions as an example. Both are operating-system calls used to receive data on a (connected) socket; recvfrom can additionally capture the address of the data's source.
The recv function prototype:

ssize_t recv(int sockfd, void *buff, size_t nbytes, int flags);
  sockfd: the receiver's socket descriptor
  buff:   the buffer that stores the data received by recv
  nbytes: the length of buff
  flags:  usually set to 0

Network IO is in essence reading from a socket. In Linux a socket is abstracted as a stream, and IO can be understood as operations on that stream. For a single IO access (take a read as an example), the data is first copied into the operating system kernel's buffer and then copied from the kernel buffer into the application's address space.

Therefore, when a recv operation occurs, it goes through two phases:


Phase 1: waiting for the data to be ready (Waiting for the data to be ready).
Phase 2: copying the data from the kernel into the process (Copying the data from the kernel to the process).

For a socket stream:

Step 1: this usually involves waiting for data packets to arrive over the network and be copied into a buffer in the kernel.
Step 2: the data is copied from the kernel buffer into the application buffer.


Blocking IO (BIO)

The caller, after calling a function, waits for the function's return value, and the thread is suspended. It is like going to a fitting room at the mall: someone is inside, so you simply wait outside the door. (The whole process is blocked.)

BIO program flow

When the user process makes the recv()/recvfrom() system call, the kernel begins the first phase of IO: preparing the data. (For network IO, the data often has not arrived yet at the beginning; for example, a complete UDP packet has not been received. The kernel then simply waits for enough data to arrive.) This phase requires waiting, that is, the data must be copied into the operating system kernel's buffer. On the user-process side, the whole process is blocked (and of course, it is the process's own choice to block).

In the second phase, once the kernel has the data ready, it copies the data from the kernel into user memory and returns the result; the user process then leaves the blocked state and runs again.

So the defining characteristic of blocking IO is that the process is blocked in both phases of the IO.

Advantages:

1.      Data is returned as soon as it is available, with no added latency;

2.      It is easy for kernel developers to implement;

Disadvantages:

     For the user, the waiting itself is the performance cost;


Non-blocking IO

The caller, after calling a function, does not wait for the function's return value; the thread continues to run other code (performing other operations, or polling for the function's result). It is like wanting a drink of water while the water has not boiled yet: you walk over to the dispenser every so often until it has boiled. (It still blocks while the data is being copied.)


Non-blocking IO program flow

When the user process issues a read operation and the kernel's data is not ready, the kernel does not block the user process but immediately returns an error. From the user process's perspective, after it issues a read it does not need to wait: it gets a result immediately. When the user process sees that the result is an error, it knows the data is not ready yet, so it can issue the read again. Once the kernel's data is ready and it receives another system call from the user process, it immediately copies the data into user memory and returns.

So the defining characteristic of non-blocking IO is that the user process keeps actively asking the kernel whether the data is ready.


Synchronous non-blocking mode compared with synchronous blocking mode:

Advantages:

Other work can be done while waiting for the task to complete (including submitting other tasks; that is, multiple tasks can be executing "in the background" at the same time).


Disadvantages:

The response latency for task completion increases, because the read is only polled every so often, and the task may complete at any moment between two polls. This reduces overall data throughput.

IO Multiplexing

Here I/O means network I/O; "multi" refers to multiple TCP connections (i.e. sockets or channels); "plexing" refers to reusing one thread or a group of threads. Together it means one thread (or group of threads) handling multiple connections. It is like students raising their hands as they finish their in-class assignment, and the teacher walking over to check each one. (For a single IO port this still means two calls and two returns, with no advantage over blocking IO; the key point is that multiple IO ports can be monitored at once: an IO function polls multiple readable/writable descriptors at the same time, and only when data is readable or writable is the real IO function invoked.)

IO multiplexing program flow

This model is actually almost identical to BIO, and it still blocks, but it inserts a proxy layer, select, above the sockets. select can monitor whether multiple sockets have data, and performance improves in this way.
Once it detects that one or more file descriptors have data arriving, select returns; recv is then called (this call also blocks), the data is copied from kernel space to user space, and recv returns.

Multiplexing is characterized by a mechanism that lets one process wait on multiple IO file descriptors at the same time. The kernel monitors these file descriptors (socket descriptors), and as soon as any one of them enters the read-ready state, the select/poll/epoll call returns. By the way the descriptors are watched, this splits into three mechanisms: select, poll, and epoll.


IO multiplexing blocks on system calls such as select and epoll, rather than blocking on the real I/O system calls such as recvfrom.


In I/O programming, when multiple client requests must be handled at the same time, you can use multithreading or I/O multiplexing. I/O multiplexing works by multiplexing several I/O blocking points onto a single select blocking point, enabling the system to handle multiple client requests at once with a single thread. Compared with the traditional multi-thread/multi-process model, the biggest advantage of I/O multiplexing is low system overhead: the system does not need to create extra processes or threads, nor maintain their execution, which reduces maintenance work and saves system resources. The main application scenarios of I/O multiplexing are:

1. A server needs to handle multiple listening or connected sockets at the same time.

2. A server needs to handle sockets for multiple network protocols at the same time.


When a user process issues a system call, the models differ in how they wait for the data to arrive: simply waiting, polling, or polling via select or poll. Across the two-phase process:

In the first phase, some models block, some do not, and some may do either.

The second phase always blocks.

Seen over the whole IO process, these models all proceed sequentially, so they can be classified as synchronous (synchronous) models: it is the process itself that actively waits and checks the state with the kernel.

Signal-driven IO

Signal-driven IO program flow

In user mode, install a SIGIO signal-handling function (using the sigaction or signal function to install a custom handler) that will call recv. The user-mode program can then go on with other work without being blocked.
Once data arrives, the operating system notifies the user-mode program with a signal, and the program jumps to the custom signal handler.
The handler calls recv to receive the data; after the data has been copied from kernel space to user space, recv returns. recv is not blocked waiting for data to arrive.
This approach makes asynchronous processing possible; signals are the foundation of asynchronous processing.


In Linux, the notification is delivered by a signal:

If the process is busy doing something else in user mode, it is forcibly interrupted and a pre-registered signal-handling function is invoked; that function can decide when and how to handle the asynchronous task. Because the signal handler barges in suddenly, it is like an interrupt handler: there are many things it cannot do. So to be safe, it usually just "registers" the event by dropping it into a queue, and then returns to whatever the process was doing.

If the process is busy doing something else in kernel mode, for example reading or writing the disk in synchronous blocking mode, the notification has to be held pending; when the kernel finishes its work and is about to return to user mode, the signal notification is triggered.

If the process is currently suspended, asleep with nothing to do, the kernel wakes the process up; the next time the CPU is free, the process will be scheduled and the signal notification triggered.

An asynchronous API is easy to describe but hard to implement, though this mainly concerns the implementers of the API. Linux's asynchronous IO (AIO) support was introduced in 2.6.22, and many system calls do not support asynchronous IO. Linux's asynchronous IO was originally designed for databases, so reads and writes done through it are not cached or buffered, which means they cannot take advantage of the operating system's caching mechanisms.

Many people take Linux's O_NONBLOCK to be asynchronous, but in fact it is the synchronous non-blocking approach described above. It is worth pointing out that although the IO APIs on Linux are a bit rough, every programming framework has its own encapsulated asynchronous IO implementation. The operating system doing less and leaving more freedom to the user is part of the UNIX design philosophy, and is also one of the reasons why Linux programming frameworks flourish.


Asynchronous IO

Asynchronous IO program flow

Asynchronous IO is the most efficient model.

    Asynchronous IO is implemented via the aio_read function: aio_read submits a request together with a user-space buffer. Even if no data has arrived in the kernel, aio_read returns immediately and the application can go on with other things.
    When the data arrives, the operating system automatically copies it from kernel space into the user-space buffer submitted through aio_read. Once the copy completes, the user-mode program is notified with a signal, and after receiving the data it can carry out the follow-up work.

How does asynchronous IO differ from signal-driven IO?
    The difference is where the data is when the signal notifies the user program. With asynchronous IO, the data has already been copied from kernel space into user space; with signal-driven IO, the data is still in kernel space, waiting for a recv call to copy it into user-space memory.
    Asynchronous IO actively copies, that is, pushes, the data into user space; no recv call is needed to pull it from kernel space. Asynchronous IO is a push mechanism, which is more efficient than the pull mechanism of signal-driven IO:
   pushing delivers the data directly, while pulling requires calling recv, and the function call adds extra overhead, so it is less efficient.


  A few questions are left for the reader to think about:

1. How should the size of IO reads and writes be designed to improve IO efficiency?

2. How should random IO and sequential IO be understood?

3. How can the efficiency and concurrent-processing capability of high-concurrency IO be improved?