The idea behind the Nagle algorithm is sound: avoid flooding the network with small packets and improve network utilization. But when the Nagle algorithm meets delayed ACK, tragedy strikes. Delayed ACK was originally meant to improve TCP performance: the ACK can piggyback on response data, silly window syndrome is avoided, and a single ACK can acknowledge multiple segments to save overhead.

   The tragedy happens in the following situation. Suppose one end sends data and then waits for the other end to respond. The protocol message consists of a header and a body, and unfortunately the programmer chose write-write-read: send the header first, then the body, and finally wait for the response. The sender's pseudocode looks like this:

write(head);  
write(body);  
read(response);

The receiver's processing code looks something like this:

read(request);  
process(request);  
write(response);

   Assume head and body are both fairly small, the Nagle algorithm is enabled (the default), and this is the first send. Under the Nagle algorithm, the first segment, head, can be sent immediately because no earlier data is awaiting acknowledgment. The receiver gets head, but the request is incomplete, so it keeps waiting for body and delays its ACK. The sender then writes body, and now the Nagle algorithm kicks in: because head has not been ACKed yet, body is held back. The result is that the sender and the receiver are each waiting on the other: the sender waits for the receiver to ACK head before it sends body, while the receiver waits for body and delays the ACK. It is a sorry state of affairs. Nothing moves until one side times out; in practice the receiver's delayed-ACK timer fires, the ACK goes out, and only then can the sender send body.
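The send decision described above can be sketched in a few lines of Java. This is a simplified model of the Nagle rule, not kernel code: the class name, the method name canSendNow, and the MSS value of 1460 are assumptions made for illustration only.

```java
class NagleSketch {
    /**
     * Simplified Nagle rule: a segment may be sent right away if it is
     * full-sized (at least one MSS) or if no previously sent data is
     * still waiting for an ACK. Otherwise it is held back.
     */
    static boolean canSendNow(int segmentLength, int mss, int unackedBytes) {
        return segmentLength >= mss || unackedBytes == 0;
    }

    public static void main(String[] args) {
        int mss = 1460; // assumed MSS, typical for an Ethernet path

        // write(head): nothing is in flight yet, so head goes out immediately.
        System.out.println("head sent immediately: " + canSendNow(6, mss, 0));

        // write(body): head (6 bytes) is still unacknowledged, so body waits.
        System.out.println("body sent immediately: " + canSendNow(7, mss, 6));
    }
}
```

In this model the small body segment is released only once the unacknowledged byte count drops back to zero, which is exactly the ACK that the receiver is delaying.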

   It is precisely the interaction of the Nagle algorithm and delayed ACK, combined with this write-write-read style of programming, that has led so many people online to ask why their network programs perform so poorly. Many replies in those threads then suggest disabling the Nagle algorithm: setting TCP_NODELAY to true turns it off. But is that really the only, or the best, way to solve the problem?

   In fact, the problem is not the Nagle algorithm itself but the write-write-read style of application programming. Disabling the Nagle algorithm makes the symptom go away, but it carries a significant cost: the network fills with small packets, utilization drops, and in extreme cases the flood of small packets can cause congestion or even collapse. So it is better not to disable it if you can avoid doing so; later I will discuss the cases where it does need to be disabled. Most applications follow a continuous request-response model: every request has a response, so the ACK for the request can be delayed and sent along with the response. In that case you only need to avoid the write-write-read call pattern to avoid the delay. Use writev to do a gathering write, or put head and body into a single write and then read, so that the call sequence becomes write-read-write-read. Then there is no need to disable the Nagle algorithm, and there is no delay either.
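One way to turn write-write-read into write-read without changing the calling code much is to put a BufferedOutputStream in front of the socket stream and flush once per request. The sketch below is only an illustration: CoalescedWrite, CountingStream, sendSplit and sendCoalesced are hypothetical names, and CountingStream stands in for the socket's OutputStream so that the number of writes reaching it can be counted.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class CoalescedWrite {
    /** Stand-in for the socket stream; counts the write calls that reach it. */
    static class CountingStream extends OutputStream {
        final ByteArrayOutputStream data = new ByteArrayOutputStream();
        int writeCalls = 0;

        @Override
        public void write(int b) {
            writeCalls++;
            data.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
            data.write(b, off, len);
        }
    }

    /** write-write: head and body reach the underlying stream separately. */
    static void sendSplit(OutputStream raw, byte[] head, byte[] body) throws IOException {
        raw.write(head);
        raw.write(body);
    }

    /** Buffered: head and body are collected and handed over in one flush. */
    static void sendCoalesced(OutputStream raw, byte[] head, byte[] body) throws IOException {
        BufferedOutputStream out = new BufferedOutputStream(raw);
        out.write(head);
        out.write(body);
        out.flush(); // flush once per request, before waiting for the response
    }

    public static void main(String[] args) throws IOException {
        byte[] head = "hello ".getBytes();
        byte[] body = "world\r\n".getBytes();

        CountingStream split = new CountingStream();
        sendSplit(split, head, body);
        System.out.println("split write calls: " + split.writeCalls);

        CountingStream coalesced = new CountingStream();
        sendCoalesced(coalesced, head, body);
        System.out.println("coalesced write calls: " + coalesced.writeCalls);
    }
}
```

With a real socket, each call that reaches the underlying stream is a candidate for its own small segment, so collapsing the two calls into one removes the second small write that the Nagle algorithm would otherwise hold back.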

   writev is a system call; in Java, a gathering write is done with GatheringByteChannel.write(ByteBuffer[] srcs, int offset, int length). One point is worth noting here: if you read the source of Java NIO frameworks, you will find that almost none of them use this writev-style call, and there is a reason. Java's write keeps a temporary buffer cache for the ByteBuffer, while writev has no such cache, and in testing write turned out to be more efficient than writev. So the usual recommendation is to put head and body into the same buffer and avoid calling writev.
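The two options just described can be put side by side. The following is a minimal sketch, not taken from any framework: the class and method names are hypothetical, and an in-process java.nio.channels.Pipe stands in for a SocketChannel so that the example runs without a server. It compares only what arrives, not performance.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

class GatheringWriteSketch {
    static ByteBuffer ascii(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.US_ASCII));
    }

    /** Option 1: pass head and body as two buffers in one gathering write. */
    static void writeGathered(GatheringByteChannel ch, ByteBuffer head, ByteBuffer body)
            throws IOException {
        ByteBuffer[] parts = { head, body };
        while (head.hasRemaining() || body.hasRemaining()) {
            ch.write(parts, 0, parts.length);
        }
    }

    /** Option 2: copy head and body into one buffer and issue a plain write. */
    static void writeCombined(WritableByteChannel ch, ByteBuffer head, ByteBuffer body)
            throws IOException {
        ByteBuffer one = ByteBuffer.allocate(head.remaining() + body.remaining());
        one.put(head).put(body);
        one.flip();
        while (one.hasRemaining()) {
            ch.write(one);
        }
    }

    /** Reads up to length bytes from the channel and decodes them as ASCII. */
    static String readBack(ReadableByteChannel ch, int length) throws IOException {
        ByteBuffer in = ByteBuffer.allocate(length);
        while (in.hasRemaining()) {
            if (ch.read(in) < 0) {
                break; // end of stream
            }
        }
        in.flip();
        return StandardCharsets.US_ASCII.decode(in).toString();
    }

    /** Sends head and body through an in-process pipe and returns what arrives. */
    static String roundTrip(boolean gathered) throws IOException {
        Pipe pipe = Pipe.open();
        try (Pipe.SinkChannel sink = pipe.sink(); Pipe.SourceChannel source = pipe.source()) {
            ByteBuffer head = ascii("hello ");
            ByteBuffer body = ascii("world\r\n");
            int length = head.remaining() + body.remaining();
            if (gathered) {
                writeGathered(sink, head, body);
            } else {
                writeCombined(sink, head, body);
            }
            return readBack(source, length);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("same bytes received: " + roundTrip(true).equals(roundTrip(false)));
    }
}
```

Both paths deliver the same bytes to the peer; the difference lies only in whether the application hands over one buffer or an array of buffers.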

   Now let's finish the discussion with an actual code test. The example is very simple: the client sends a line of data to the server, and the server simply echoes the line back. The client can send the line in either two writes or one. Two writes is write-write-read; one write is write-read-write-read. We can then compare the latency of the two forms. Note that if you test the code below on Windows, the client and the server must run on two separate machines; Winsock seems to handle loopback connections differently.

   Server source code:

 

package net.fnil.nagle;  
  
import java.io.BufferedReader;  
import java.io.InputStream;  
import java.io.InputStreamReader;  
import java.io.OutputStream;  
import java.net.InetSocketAddress;  
import java.net.ServerSocket;  
import java.net.Socket;  
  
  
public class Server {  
    public static void main(String[] args) throws Exception {  
        ServerSocket serverSocket = new ServerSocket();  
        serverSocket.bind(new InetSocketAddress(8000));  
        System.out.println("Server startup at 8000");  
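        for (;;) {
            // Serve one client at a time; the socket is closed when the client disconnects.
            try (Socket socket = serverSocket.accept()) {
                InputStream in = socket.getInputStream();
                OutputStream out = socket.getOutputStream();
                // Create the reader once per connection so that buffered bytes are not lost.
                BufferedReader reader = new BufferedReader(new InputStreamReader(in));

                String line;
                // readLine() returns null once the client has closed the connection.
                while ((line = reader.readLine()) != null) {
                    out.write((line + "\r\n").getBytes());
                }
            }
            catch (Exception e) {
                // Connection error: drop this client and accept the next one.
            }
        }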
    }  
}

 

The server binds to local port 8000 and listens for connections. Once a client connects, it blocks reading a line of data and writes the line back to the client.

Client code:

 

package net.fnil.nagle;  
  
import java.io.BufferedReader;  
import java.io.InputStream;  
import java.io.InputStreamReader;  
import java.io.OutputStream;  
import java.net.InetSocketAddress;  
import java.net.Socket;  
  
  
public class Client {  
  
    public static void main(String[] args) throws Exception {  
        // Whether to write head and body in two separate write calls
        boolean writeSplit = false;  
        String host = "localhost";  
        if (args.length >= 1) {  
            host = args[0];  
        }  
        if (args.length >= 2) {  
            writeSplit = Boolean.valueOf(args[1]);  
        }  
  
        System.out.println("WriteSplit:" + writeSplit);  
  
        Socket socket = new Socket();  
  
        socket.connect(new InetSocketAddress(host, 8000));  
        InputStream in = socket.getInputStream();  
        OutputStream out = socket.getOutputStream();  
  
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));  
  
        String head = "hello ";  
        String body = "world\r\n";  
        for (int i = 0; i < 10; i++) {  
            long label = System.currentTimeMillis();  
            if (writeSplit) {  
                out.write(head.getBytes());  
                out.write(body.getBytes());  
            }  
            else {  
                out.write((head + body).getBytes());  
            }  
            String line = reader.readLine();  
            System.out.println("RTT:" + (System.currentTimeMillis() - label) + " ,receive:" + line);  
        }  
        in.close();  
        out.close();  
        socket.close();  
    }  
  
}

 

   The client uses a writeSplit variable to control whether head and body are written separately. If it is true, head is written first and then body; otherwise head and body are concatenated and written in one call. The client logic is also simple: connect to the server, send a line, wait for the reply and print the RTT, and close the connection after 10 iterations.


   First, we set writeSplit to true, so that each line is written in two calls. These are the results from my local test; my machine runs Ubuntu 11.10:

 

WriteSplit:true  
RTT:8 ,receive:hello world  
RTT:40 ,receive:hello world  
RTT:40 ,receive:hello world  
RTT:40 ,receive:hello world  
RTT:39 ,receive:hello world  
RTT:40 ,receive:hello world  
RTT:40 ,receive:hello world  
RTT:40 ,receive:hello world  
RTT:40 ,receive:hello world  
RTT:40 ,receive:hello world

 

   As you can see, the interval between each request and its response is 40ms, except for the first one. The delayed ACK timeout on Linux is 40ms, not the 200ms I had expected. The first request is ACKed immediately, which seems to be related to Linux's quickack mode. I am not entirely clear on this point, so if anyone knows the details, please enlighten me.

   Next, we keep writeSplit set to true but disable the Nagle algorithm on the client, that is, we add one line to the client code before connect:

Socket socket = new Socket();  
socket.setTcpNoDelay(true);  
socket.connect(new InetSocketAddress(host, 8000));

   Run the test again:

 

WriteSplit:true  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:1 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world

 

   This looks much more normal: almost every RTT is under 1 millisecond. Sure enough, disabling the Nagle algorithm solves the delay problem.
   Now suppose we leave the Nagle algorithm enabled and set writeSplit to false, so that head and body are written in one call, and run the test again (remember to delete the setTcpNoDelay line):

 

WriteSplit:false  
RTT:7 ,receive:hello world  
RTT:1 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world  
RTT:0 ,receive:hello world

 

 

   The result is about the same as with the Nagle algorithm disabled. Given that, what reason is left to disable the Nagle algorithm? In my stress tests of xmemcached, enabling the Nagle algorithm even gave a certain efficiency advantage for small-data access; the memcached protocol itself is a continuous request-response model. If you run the tests above on Windows, you will find that the largest RTT exceeds 200ms, which shows that Winsock's delayed ACK timeout is 200ms.

   One last question: when should the Nagle algorithm be disabled? When your application does not follow this continuous request-response model, but instead needs to send many small pieces of data in one direction in real time, or when requests arrive at intervals, you should disable the Nagle algorithm to improve responsiveness. The most obvious example is a telnet application: you always want each line you type to be sent to the server immediately and to see the response right away, rather than having to type many commands or wait 200ms before seeing a response.

   That is my understanding and testing of the Nagle algorithm and delayed ACK. If anything is wrong, please let me know.