Framework: Linux high-performance network I/O and the Reactor model

cscw 2021-06-23 22:14:29
framework linux high performance network


Network I/O can be understood as the flow of data over a network. We usually use a socket to set up a TCP or UDP channel to a remote peer and then read from and write to it. A single socket can be handled efficiently by one thread; but with 10K socket connections or more, how do we process them with high performance?

  • Introduction to basic concepts
  • The network I/O read and write process
  • The five network I/O models on Linux
  • A closer look at I/O multiplexing
  • The Reactor model
  • The Proactor model

Introduction to basic concepts

  • Process (thread) switching

    * Every operating system can schedule processes: it can suspend the currently running process and resume one that was suspended earlier
  • Process (thread) blocking

    * A running process sometimes has to wait for another event to complete, such as acquiring a lock or finishing an I/O request. The system then blocks the waiting process automatically, and a blocked process does not occupy the CPU
  • File descriptor

    * On Linux, a file descriptor is an abstraction that refers to an open file. It is a non-negative integer. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process
  • Linux signal handling

    * A running Linux process can receive signals from the system or from other processes and then run the handler function registered for that signal value. A signal is a software simulation of a hardware interrupt

User space, kernel space, and buffers are covered in the chapter on the zero-copy mechanism and are omitted here

The network I/O read and write process

  • When a user-space process starts a read on a socket, a context switch occurs and the process blocks. (R1) It waits for the network data to arrive and be copied from the NIC into the kernel buffer; (R2) the data is then copied from the kernel buffer into the user process buffer. The process is then switched back in and handles the data it received
  • We give the two stages of a socket read aliases: the first stage is R1 and the second stage is R2
  • When a user-space process starts a send on a socket, a context switch occurs and the process blocks while the data is copied from the user process buffer into the kernel buffer. Once the copy completes, the process is switched back in

The five network I/O models on Linux

Blocking I/O (blocking IO)

ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *from, socklen_t *fromlen);

  • The most basic I/O model is the blocking I/O model, which is also the simplest. All operations are performed sequentially
  • In the blocking I/O model, a user-space application issues a system call (recvfrom) and is blocked until the data is ready in the kernel buffer and has been copied from the kernel into the user process. The system then wakes the process so it can handle the data
  • The process is blocked for the whole of the two consecutive stages R1 and R2
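The blocking model above can be sketched in a few lines. This is a minimal illustration, not a server: it assumes a local AF_UNIX datagram socketpair instead of a real network connection, and the function name blocking_roundtrip is made up for the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sends `msg` over a local datagram socket pair and reads it back with a
 * blocking recvfrom. Returns the number of bytes received, or -1 on error.
 * (Illustrative helper; the name is not from any library.) */
ssize_t blocking_roundtrip(const char *msg, char *out, size_t out_len)
{
    int sv[2];
    ssize_t n;

    assert(msg != NULL && out != NULL && out_len > 0);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    if (send(sv[1], msg, strlen(msg), 0) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* The caller is parked here through R1 (wait for data) and R2 (copy
     * from the kernel buffer into `out`). The sender address is not
     * needed, so the last two arguments are NULL. */
    n = recvfrom(sv[0], out, out_len, 0, NULL, NULL);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Calling blocking_roundtrip("ping", buf, sizeof buf) from main should return 4 once the datagram has been copied into buf. In this sketch the data is already queued, so the call returns at once; with a real remote peer the thread would sleep inside recvfrom until a packet arrived.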

Non-blocking I/O (nonblocking IO)

  • Non-blocking I/O is also a kind of synchronous I/O. It is built on a polling mechanism. In this model the socket is opened in non-blocking mode, so an I/O operation that cannot complete immediately returns at once with an error code (EWOULDBLOCK) indicating that the operation has not completed
  • The process polls the kernel for data. If the data is not ready, the call returns EWOULDBLOCK. The process keeps calling recvfrom, and between calls it can pause to do something else
  • Once the kernel data is ready, it is copied to user space, the call returns the data instead of an error code, and the process handles it. Note that the process is still blocked for the whole time the data is being copied
  • The process blocks in stage R2. It is not blocked in stage R1, but it has to keep polling
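A minimal sketch of the polling loop described above, assuming a local AF_UNIX stream socketpair in place of a network peer. The helper names set_nonblocking and nonblocking_demo are hypothetical; the demo uses the gap after the first empty poll to make the peer send, which stands in for data arriving from the network.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Switches `fd` to non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Polls a non-blocking socket until data shows up. Returns the number of
 * bytes read (or -1 on error) and stores in *empty_polls how many recv
 * calls came back with EWOULDBLOCK/EAGAIN. */
ssize_t nonblocking_demo(int *empty_polls)
{
    int sv[2];
    char buf[16];
    ssize_t n = -1;

    assert(empty_polls != NULL);
    *empty_polls = 0;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (set_nonblocking(sv[0]) == 0) {
        for (;;) {
            n = recv(sv[0], buf, sizeof buf, 0);
            if (n >= 0)
                break;          /* R2 done: data is now in user space */
            if (errno != EWOULDBLOCK && errno != EAGAIN)
                break;          /* a real error, stop polling */

            /* R1 is not finished. The process is free to do other work
             * here; the demo uses the first gap to let the peer send. */
            if ((*empty_polls)++ == 0)
                send(sv[1], "ping", 4, 0);
        }
    }

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

A real program would normally sleep or do useful work between polls rather than spin, since a tight loop burns CPU while R1 is pending.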

I/O multiplexing (IO multiplexing)

  • There are usually a large number of socket connections. If we can query the read and write status of many sockets in one call and handle whichever one is ready, efficiency improves a lot. This is "I/O multiplexing": "multi" refers to the multiple sockets, and "plexing" (reuse) refers to reusing the same process
  • Linux provides select, poll, and epoll as I/O multiplexing implementations
  • select, poll, and epoll are all blocking calls
  • Unlike blocking I/O, select does not wait until data has arrived on every socket; it resumes the user process as soon as some of the sockets have data ready. How does it know that part of the data is ready in the kernel? Answer: that is left to the system
  • The process also blocks in stages R1 and R2. But there is a trick for stage R1: in a multi-process or multi-threaded program, we can assign just one process (thread) to block in select, which frees the other threads
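As a rough illustration of waiting on several sockets with one call, the sketch below watches three local socket pairs with select and reports which ones became readable. The function name select_demo and the use of AF_UNIX socket pairs are assumptions made for the example.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

#define NPAIRS 3

/* Watches NPAIRS sockets with a single select call. Returns the number of
 * readable sockets (or -1 on error) and sets bit i of *ready_mask when
 * socket i has data. */
int select_demo(unsigned *ready_mask)
{
    int sv[NPAIRS][2];
    fd_set rfds;
    struct timeval tv = { 1, 0 };   /* give up after one second */
    int created = 0, maxfd = -1, nready = -1, i;

    assert(ready_mask != NULL);
    *ready_mask = 0;

    while (created < NPAIRS &&
           socketpair(AF_UNIX, SOCK_STREAM, 0, sv[created]) == 0)
        created++;

    if (created == NPAIRS) {
        /* Only sockets 0 and 2 receive data. */
        send(sv[0][1], "a", 1, 0);
        send(sv[2][1], "c", 1, 0);

        FD_ZERO(&rfds);
        for (i = 0; i < NPAIRS; i++) {
            FD_SET(sv[i][0], &rfds);
            if (sv[i][0] > maxfd)
                maxfd = sv[i][0];
        }

        /* One blocking call covers R1 for every registered socket. */
        nready = select(maxfd + 1, &rfds, NULL, NULL, &tv);

        for (i = 0; nready > 0 && i < NPAIRS; i++)
            if (FD_ISSET(sv[i][0], &rfds))
                *ready_mask |= 1u << i;
    }

    for (i = 0; i < created; i++) {
        close(sv[i][0]);
        close(sv[i][1]);
    }
    return nready;
}
```

After select returns, the caller still has to loop over its own descriptors with FD_ISSET to find the ready ones, and the fd_set must be rebuilt before the next call because select overwrites it.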

Signal-driven I/O (SIGIO)

  • The process provides a signal handler function and associates it with the socket. Once the sigaction call that installs the handler has returned, the process is free to do other things
  • When the data is ready in the kernel, the process receives a SIGIO signal. It is interrupted to run the signal handler, which calls recvfrom to read the data from the kernel into user space, and then the data is processed
  • The user process is not blocked in stage R1, but in R2 it still blocks and waits
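The setup for signal-driven I/O might look like the following sketch. It assumes Linux, a local AF_UNIX socket pair, and a single-threaded process; the names on_sigio and sigio_demo are hypothetical. Unlike the description above, the handler here only sets a flag and the read happens outside the handler, which keeps the handler trivially async-signal-safe.

```c
#define _GNU_SOURCE             /* exposes O_ASYNC under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static volatile sig_atomic_t sigio_seen = 0;

/* Signal handler: only records that SIGIO arrived. */
static void on_sigio(int signo)
{
    (void)signo;
    sigio_seen = 1;
}

/* Enables SIGIO on one end of a socket pair, lets the peer write, and
 * reads after the signal has arrived. Returns bytes read, or -1. */
ssize_t sigio_demo(char *buf, size_t len)
{
    int sv[2], flags, tries;
    struct sigaction sa;
    ssize_t n = -1;

    assert(buf != NULL && len > 0);
    sigio_seen = 0;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);

    flags = fcntl(sv[0], F_GETFL, 0);
    if (sigaction(SIGIO, &sa, NULL) == 0 &&
        fcntl(sv[0], F_SETOWN, getpid()) == 0 &&    /* who gets SIGIO */
        flags >= 0 &&
        fcntl(sv[0], F_SETFL, flags | O_ASYNC | O_NONBLOCK) == 0) {

        send(sv[1], "ping", 4, 0);      /* data arrives: kernel raises SIGIO */

        /* R1: the process is not blocked on the socket. The short sleeps
         * stand in for other work, bounded to about one second. */
        for (tries = 0; tries < 1000 && !sigio_seen; tries++)
            usleep(1000);

        if (sigio_seen)
            n = recv(sv[0], buf, len, 0);   /* R2: copy to user space */
    }

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

In practice SIGIO is awkward with many sockets because the plain signal does not say which descriptor is ready, which is one reason multiplexing is usually preferred.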

Asynchronous I/O (the POSIX aio_ family of functions)

  • In contrast to synchronous I/O, after the user process issues an asynchronous read (aio_read) system call, the current process is not blocked whether or not the kernel buffer data is ready. Once the aio_read system call returns, the process can go on with other logic
  • When the socket data is ready in the kernel, the system copies it directly from the kernel to user space and then notifies the user process with a signal
  • The process is non-blocking in both stages R1 and R2

A closer look at I/O multiplexing


int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
  • 1) Use copy_from_user to copy fd_set from user space into kernel space
  • 2) Register the callback function __pollwait
  • 3) Traverse all fds and call each one's poll method (for a socket, this poll method is sock_poll, which calls tcp_poll, udp_poll, or datagram_poll)
  • 4) Taking tcp_poll as an example, its core is __pollwait, the callback function registered above
  • 5) The main job of __pollwait is to add current (the current process) to the device's wait queue. Different devices have different wait queues; for tcp_poll, the wait queue is sk->sk_sleep (note that putting a process on a wait queue does not mean the process is already asleep). After the device receives a message (network device) or finishes filling in file data (disk device), it wakes the processes sleeping on its wait queue, and current is woken up
  • 6) The poll method returns a mask that describes whether read and write operations are ready, and fd_set is assigned according to this mask
  • 7) If no fd has returned a readable or writable mask after all fds have been traversed, schedule_timeout is called to put the process that called select (that is, current) to sleep
  • 8) When a device driver finds its own resource readable or writable, it wakes the processes sleeping on its wait queue. If nobody wakes the process within the time limit (specified by timeout), the process that called select is woken up anyway, gets the CPU again, and traverses the fds once more to check whether any fd is ready
  • 9) Copy fd_set from kernel space back to user space

The shortcomings of select

  • Every call to select has to copy the fd set from user space to kernel space, which is expensive when there are many fds
  • Every call to select also has to traverse, inside the kernel, all the fds that were passed in, which is likewise expensive when there are many fds
  • select supports too few file descriptors; the default limit is 1024


int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event *events,int maxevents, int timeout); 
  • Calling epoll_create builds a red-black tree in the kernel cache to store the sockets later passed in through epoll_ctl, and also creates a doubly linked list, rdllist, to store the events that are ready. When epoll_wait is called, it only needs to check whether this rdllist contains data
  • When epoll_ctl adds, modifies, or deletes an event on the epoll object, it operates on the rbr red-black tree, which is very fast
  • Each event added to epoll sets up a callback relationship with its device (such as the network card). When the corresponding event occurs on the device, the callback is invoked and adds the event to the rdllist doubly linked list. In the kernel this callback is called ep_poll_callback
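From user space, the three calls above fit together roughly as follows. This is a sketch under the same assumptions as before (local AF_UNIX socket pairs, the hypothetical function name epoll_demo); each socket is tagged with its index through the data.u32 field so the ready events identify themselves.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define NPAIRS 3

/* Registers NPAIRS sockets with epoll, makes one of them readable, and
 * returns the number of ready events (or -1 on error). Bit i of
 * *ready_mask is set when socket i was reported. */
int epoll_demo(unsigned *ready_mask)
{
    int sv[NPAIRS][2];
    struct epoll_event ev, events[NPAIRS];
    int created = 0, nready = -1, ok, epfd, i;

    assert(ready_mask != NULL);
    *ready_mask = 0;

    epfd = epoll_create(1);     /* size is only a hint, must be > 0 */
    if (epfd < 0)
        return -1;

    while (created < NPAIRS &&
           socketpair(AF_UNIX, SOCK_STREAM, 0, sv[created]) == 0)
        created++;
    ok = (created == NPAIRS);

    /* Each fd is handed to the kernel once, via EPOLL_CTL_ADD. */
    for (i = 0; ok && i < NPAIRS; i++) {
        ev.events = EPOLLIN;
        ev.data.u32 = (unsigned)i;
        ok = (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[i][0], &ev) == 0);
    }

    if (ok) {
        send(sv[1][1], "b", 1, 0);      /* only socket 1 gets data */

        /* epoll_wait returns just the ready entries. */
        nready = epoll_wait(epfd, events, NPAIRS, 1000);
        for (i = 0; i < nready; i++)
            *ready_mask |= 1u << events[i].data.u32;
    }

    for (i = 0; i < created; i++) {
        close(sv[i][0]);
        close(sv[i][1]);
    }
    close(epfd);
    return nready;
}
```

Compared with the select version, the interest set is registered once and the result array contains only ready descriptors, so the caller does not scan the sockets that have nothing to report.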

The two trigger modes of epoll

  • epoll has two trigger modes: level-triggered (LT) and edge-triggered (ET, selected with the EPOLLET flag). LT is the default mode; ET is the "high-speed" mode (it should only be used with non-blocking sockets)

    * In LT (level-triggered) mode, as long as the file descriptor still has data to read, **every epoll_wait call triggers its read event**
    * In ET (edge-triggered) mode, when an I/O event is detected, the epoll_wait call returns the file descriptors that have event notifications. For a readable file descriptor, it must be read until it is empty (that is, until the read returns EWOULDBLOCK), **otherwise the next epoll_wait will not trigger this event again**
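The "read until EWOULDBLOCK" rule for ET mode can be sketched as a drain loop. The helper names drain_fd and edge_trigger_demo are hypothetical, and the 3-byte chunk is deliberately tiny so that several reads are needed for one notification.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Reads a non-blocking fd until it would block. Returns the number of
 * bytes consumed, or -1 on a real error. */
ssize_t drain_fd(int fd)
{
    char chunk[3];
    ssize_t total = 0, n;

    for (;;) {
        n = read(fd, chunk, sizeof chunk);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;       /* peer closed the connection */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;       /* empty: safe to call epoll_wait again */
        if (errno == EINTR)
            continue;
        return -1;
    }
}

/* Registers a socket with EPOLLET, drains it after the first event, and
 * returns how many events a second, non-waiting epoll_wait reports. */
int edge_trigger_demo(ssize_t *drained)
{
    int sv[2], epfd, n2 = -1;
    struct epoll_event ev, out;

    assert(drained != NULL);
    *drained = -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    epfd = epoll_create(1);
    if (epfd >= 0) {
        /* ET mode needs a non-blocking fd, or the drain loop would hang. */
        fcntl(sv[0], F_SETFL, fcntl(sv[0], F_GETFL, 0) | O_NONBLOCK);

        ev.events = EPOLLIN | EPOLLET;
        ev.data.fd = sv[0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
            send(sv[1], "12345678", 8, 0);
            if (epoll_wait(epfd, &out, 1, 1000) == 1) {
                *drained = drain_fd(out.data.fd);
                n2 = epoll_wait(epfd, &out, 1, 0);  /* nothing new */
            }
        }
        close(epfd);
    }
    close(sv[0]);
    close(sv[1]);
    return n2;
}
```

If the loop stopped after the first 3-byte read, the remaining bytes would sit in the kernel buffer with no further notification until the peer sent more data, which is the failure mode the ET bullet warns about.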

The advantages of epoll over select

  • It addresses the three shortcomings of select

    * **For the first shortcoming**: epoll's answer is the epoll_ctl function. Each time a new event is registered on the epoll handle (by specifying EPOLL_CTL_ADD in epoll_ctl), that fd is copied into the kernel at that point, rather than being copied again on every epoll_wait. epoll guarantees that each fd is copied only once over the whole process (epoll_wait does not copy it again)
    * **For the second shortcoming**: epoll assigns a callback function to each fd. When the device becomes ready and wakes the waiters on its wait queue, this callback is invoked and adds the ready fd to a ready list. epoll_wait only has to check whether the ready list contains any ready fd (no traversal required)
    * **For the third shortcoming**: epoll has no such limit. The upper bound on the fds it supports is the maximum number of files that can be opened, which is generally far greater than 2048. For example, it is around 100,000 on a machine with 1 GB of memory; in general this number depends heavily on system memory
  • Why epoll performs well

    * epoll keeps the file descriptor events to be monitored in a red-black tree, so the add, delete, and modify operations of epoll_ctl are fast
    * epoll does not need to traverse the fds to find the ready ones; it returns the ready list directly
    * It is often claimed that since Linux 2.6 epoll uses mmap so that data no longer has to be copied from the kernel to user space (zero copy). The mainline kernel's epoll does not do this: epoll_wait copies the ready events into the array supplied by the caller, and the application still reads the socket data itself

Is epoll's I/O model synchronous or asynchronous?

  • Concept definitions

    * Synchronous I/O operation: causes the requesting process to block until the I/O operation completes
    * Asynchronous I/O operation: does not cause the requesting process to block. The process only handles the notification that the I/O operation has completed and does not read or write the data itself; the system kernel performs the read and write
    * Blocking, non-blocking: whether the process/thread has to wait when the data it wants to access is not yet ready
  • The idea of asynchronous I/O is to require I/O calls that do not block. As introduced earlier, an I/O operation has two stages: R1 waits for the data to be ready, and R2 copies the data from the kernel to the process. epoll only reports readiness: the process is still blocked in epoll_wait during R1, and in R2 it still has to call read or recv itself to copy the data. So epoll is classified as synchronous I/O

The Reactor model

The central idea of Reactor is to register all the I/O events to be handled on one central I/O multiplexer, while the main thread/process blocks on that multiplexer. As soon as an I/O event arrives or becomes ready, the multiplexer returns and dispatches the I/O events that were registered in advance to their corresponding handlers

Introduction to related concepts:

  • Event: a state. For example, a read-ready event is the state in which we can read data from the kernel
  • Event demultiplexer: waiting for events to happen is usually handed over to epoll or select. Events arrive randomly and asynchronously, so epoll has to be called in a loop. The module that wraps this in a framework is the event demultiplexer (put simply, it is a wrapper around epoll)
  • Event handler: after an event occurs, a process or thread has to handle it. That handler is the event handler, and it generally runs on a different thread from the event demultiplexer

    The general flow of Reactor

  • 1) The application registers a read/write-ready event and its read/write-ready event handler with the event demultiplexer
  • 2) The event demultiplexer waits for the read/write-ready event to occur
  • 3) The read/write-ready event occurs and activates the event demultiplexer, which calls the read/write-ready event handler
  • 4) The event handler first reads the data from the kernel into user space and then processes it
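The four steps above can be sketched as a very small single-threaded Reactor on top of epoll. Everything here is illustrative: the struct and function names (reactor, handler, reactor_register, reactor_run_once, count_bytes, reactor_demo) are invented for the example, and a real framework would add write events, deregistration, and error handling.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

typedef void (*event_cb)(int fd, void *ctx);

struct handler {            /* a read-ready event plus its handler */
    int fd;
    event_cb on_readable;
    void *ctx;
};

struct reactor {            /* the event demultiplexer: wraps epoll */
    int epfd;
};

/* Step 1: register the event and its handler with the demultiplexer. */
int reactor_register(struct reactor *r, struct handler *h)
{
    struct epoll_event ev;

    assert(r != NULL && h != NULL && h->on_readable != NULL);
    ev.events = EPOLLIN;
    ev.data.ptr = h;
    return epoll_ctl(r->epfd, EPOLL_CTL_ADD, h->fd, &ev);
}

/* Steps 2 and 3: wait for ready events, then dispatch each one to its
 * handler. Returns the number of handlers invoked, or -1 on error. */
int reactor_run_once(struct reactor *r, int timeout_ms)
{
    struct epoll_event events[16];
    int i, n = epoll_wait(r->epfd, events, 16, timeout_ms);

    for (i = 0; i < n; i++) {
        struct handler *h = events[i].data.ptr;
        h->on_readable(h->fd, h->ctx);
    }
    return n;
}

/* Step 4: the handler copies data into user space, then processes it
 * (here "processing" just adds the byte count to a running total). */
void count_bytes(int fd, void *ctx)
{
    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);

    if (n > 0)
        *(long *)ctx += n;
}

/* Wires the pieces together on a local socket pair; returns bytes seen. */
long reactor_demo(void)
{
    struct reactor r;
    struct handler h;
    int sv[2];
    long total = 0;

    r.epfd = epoll_create(1);
    if (r.epfd < 0)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        close(r.epfd);
        return -1;
    }

    h.fd = sv[0];
    h.on_readable = count_bytes;
    h.ctx = &total;
    if (reactor_register(&r, &h) == 0) {
        send(sv[1], "hello", 5, 0);
        reactor_run_once(&r, 1000);
    }

    close(sv[0]);
    close(sv[1]);
    close(r.epfd);
    return total;
}
```

A long-running server would call reactor_run_once in a loop. Storing the handler pointer in epoll_event.data.ptr is one common way to get from a ready event back to its callback without a lookup table; handing the callback to a worker thread instead of running it inline gives the multi-threaded variants.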

Single thread + Reactor

Multithreading + Reactor

Multithreading + Multiple Reactor

The general flow of the Proactor model

  • 1) The application registers a read-completion event and its read-completion event handler with the event demultiplexer, and sends an asynchronous read request to the system
  • 2) The event demultiplexer waits for the read-completion event
  • 3) While the demultiplexer waits, a kernel thread performs the actual read, copies the data into the process buffer, and finally notifies the event demultiplexer that the read has completed
  • 4) The event demultiplexer detects the read-completion event and activates the read-completion event handler
  • 5) The read-completion event handler processes the data in the user process buffer directly

    The difference between Proactor and Reactor

  • Proactor is built on the concept of asynchronous I/O, while Reactor is generally built on the concept of I/O multiplexing
  • With Proactor, the application does not copy the data from the kernel to user space itself; the system does that step
Corrections to any mistakes in this article are welcome
