This article covers the HTTP/2 protocol in detail. The topics are as follows:

  • HTTP/2 connection establishment

  • The relationship between frames and streams in HTTP/2

  • The secret of HTTP/2's traffic savings: the HPACK algorithm

  • HTTP/2 Server Push

  • Why does HTTP/2 need flow control?

  • Problems with the HTTP/2 protocol

1. HTTP/2 connection establishment

Contrary to a common assumption, the HTTP/2 protocol itself does not require TLS/SSL. An HTTP/2 connection can also be established over a plain TCP connection. For security reasons, however, all mainstream browsers only support HTTP/2 over TLS/SSL. In short, HTTP/2 over plain TCP is called h2c, and HTTP/2 over TLS/SSL is called h2.

Run the following command to start a packet capture:

tcpdump -i eth0 port 80 and host -w h2c.pcap &

Then use curl to access an HTTP/2 site over plain TCP, that is, on port 80 (a browser cannot be used for this, because browsers do not allow it):

curl --http2 -v

The log output alone already gives a general picture of how the connection is established:

Copy the pcap file produced by tcpdump to the local machine and open it in Wireshark to reconstruct the whole HTTP/2 connection establishment exchange:

First, HTTP/1.1 is upgraded to HTTP/2:

Then the client also needs to send a "magic frame" (the connection preface):

Finally, a SETTINGS frame is sent:
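
The three steps above can be sketched in Go. This is only a minimal illustration, assuming the h2c upgrade mechanism described in RFC 7540. The host name example.com and the function names are hypothetical, and the SETTINGS payload is left empty for brevity.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// clientPreface is the fixed 24-byte "magic" that an HTTP/2 client sends
// first on a new connection (RFC 7540, section 3.5).
const clientPreface = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

// upgradeRequest builds the HTTP/1.1 request that asks the server to switch
// to h2c. The host is a placeholder chosen for this sketch.
func upgradeRequest(host string, settingsPayload []byte) string {
	return "GET / HTTP/1.1\r\n" +
		"Host: " + host + "\r\n" +
		"Connection: Upgrade, HTTP2-Settings\r\n" +
		"Upgrade: h2c\r\n" +
		"HTTP2-Settings: " + base64.RawURLEncoding.EncodeToString(settingsPayload) + "\r\n\r\n"
}

// emptySettingsFrame returns a SETTINGS frame with no parameters:
// 24-bit length 0, type 0x4, flags 0, stream ID 0.
func emptySettingsFrame() []byte {
	return []byte{0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00}
}

func main() {
	fmt.Print(upgradeRequest("example.com", nil)) // step 1: upgrade request
	fmt.Printf("%q\n", clientPreface)             // step 2: magic
	fmt.Printf("% x\n", emptySettingsFrame())     // step 3: SETTINGS
}
```

In a real exchange the client waits for the server's 101 Switching Protocols response before sending the preface and the SETTINGS frame.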

Next, let's look at how an HTTP/2 connection is established over TLS. Because encryption is involved, some preparation is needed in advance. First, a plug-in can be downloaded in Chrome.

Then open any web page: if the lightning icon is blue, the site supports HTTP/2; otherwise it does not. As shown in the figure:

To have Chrome write its TLS/SSL key information to a log file, an additional system environment variable (commonly SSLKEYLOGFILE) must be configured, as shown in the figure:

Then configure the SSL-related settings in Wireshark as well.

With this in place, whenever the browser performs a TLS handshake, the relevant key material is written to the log file, and Wireshark uses the information in that file to decrypt the TLS messages.

With this groundwork done, we can start analyzing HTTP/2 over a TLS connection. For example, visit the Tmall site and then open Wireshark.

From the annotations you can see that after the TLS connection is established, the magic frame and the SETTINGS frames are still sent, and only then is the HTTP/2 connection truly established. Let's look at the TLS Client Hello message:

Its ALPN extension lists the two protocols that the client can accept. The Server Hello message then states clearly that the h2 protocol will be used.

This is also one of the most important advantages of HTTP/2 over SPDY: SPDY depends strictly on TLS/SSL, so the server has no choice. With HTTP/2, the client carries the ALPN extension when it makes a request. In other words, the client tells the server which protocols it supports, so that the server can choose which one to use.
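
The negotiation can be sketched in Go as follows. The field tls.Config.NextProtos is the standard way for a Go client to advertise its ALPN list. The selectProtocol function is a hypothetical helper written for this sketch to mimic the server's choice, not the actual logic inside any TLS library.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// selectProtocol mimics what a server does with the ALPN list from the
// Client Hello: it walks its own preference list and picks the first
// protocol the client also offered. An empty result means no overlap.
func selectProtocol(serverPrefs, clientOffers []string) string {
	for _, s := range serverPrefs {
		for _, c := range clientOffers {
			if s == c {
				return s
			}
		}
	}
	return ""
}

func main() {
	// A client advertises its protocols through tls.Config.NextProtos.
	clientCfg := &tls.Config{NextProtos: []string{"h2", "http/1.1"}}
	serverPrefs := []string{"h2", "http/1.1"}
	fmt.Println(selectProtocol(serverPrefs, clientCfg.NextProtos))
}
```

After a real handshake, the negotiated value can be read from the connection state (ConnectionState().NegotiatedProtocol on a tls.Conn).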

2. The relationship between frames and streams in HTTP/2

Simply put, HTTP/2 emulates the transport-layer TCP concept of a "stream" at the application layer, which solves the head-of-line blocking problem of HTTP/1.x. In HTTP/1.x, the protocol is made up of messages: on the same TCP connection, until the response to the previous message comes back, subsequent messages cannot be sent. HTTP/2 removes this restriction and redefines what used to be a "message" as a "stream". Streams may be interleaved in any order, but the frames within a stream must stay in order. As shown in the figure:

In other words, a single TCP connection can carry multiple streams, each made up of individual frames. There is no ordering between streams, but the frames within each stream are ordered. The numbers 1, 3 and 5 in the figure are stream IDs. WebSocket also has the concept of a frame, but because WebSocket has no stream ID, it cannot multiplex. HTTP/2 can multiplex precisely because it has stream IDs. One TCP connection can carry n streams, which means the server can process n requests concurrently and send the responses back over the same TCP connection. Of course, the number of streams that one TCP connection can carry is limited. When the HTTP/2 connection is established, this limit is included in the SETTINGS frame. For example, when visiting the Tmall site, the SETTINGS frame sent by the browser indicates that the browser, as an HTTP/2 client, supports at most 1000 concurrent streams.

Likewise, when the Tmall server returns its SETTINGS frame, it tells the browser that the maximum number of concurrent streams it supports is 128.
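
To show where these numbers live on the wire, here is a small Go sketch that decodes a SETTINGS payload. It assumes the layout from RFC 7540, in which each setting is a 16-bit identifier followed by a 32-bit value and SETTINGS_MAX_CONCURRENT_STREAMS has identifier 0x3. The payload bytes are constructed by hand for illustration and are not taken from the Tmall capture.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// settingMaxConcurrentStreams is the identifier of
// SETTINGS_MAX_CONCURRENT_STREAMS in RFC 7540.
const settingMaxConcurrentStreams uint16 = 0x3

// parseSettings decodes a SETTINGS frame payload into identifier/value
// pairs. Each entry is 6 bytes; trailing partial entries are ignored here.
func parseSettings(payload []byte) map[uint16]uint32 {
	out := make(map[uint16]uint32)
	for i := 0; i+6 <= len(payload); i += 6 {
		id := binary.BigEndian.Uint16(payload[i : i+2])
		out[id] = binary.BigEndian.Uint32(payload[i+2 : i+6])
	}
	return out
}

func main() {
	// Hand-built payloads carrying only the concurrency limit, using the
	// values quoted in the text: 1000 (browser) and 128 (server).
	browser := []byte{0x00, 0x03, 0x00, 0x00, 0x03, 0xe8}
	server := []byte{0x00, 0x03, 0x00, 0x00, 0x00, 0x80}
	fmt.Println("browser limit:", parseSettings(browser)[settingMaxConcurrentStreams])
	fmt.Println("server limit:", parseSettings(server)[settingMaxConcurrentStreams])
}
```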

We also need to know that in HTTP/2, an odd stream ID denotes a stream initiated by the client, and an even stream ID denotes a stream initiated by the server (which can be understood as server push).
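
The stream ID sits in the 9-byte header that precedes every frame. The following Go sketch parses that header and applies the odd/even rule. It assumes the frame layout from RFC 7540 (24-bit length, 8-bit type, 8-bit flags, 1 reserved bit and a 31-bit stream ID). The type and function names are made up for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameHeader holds the fields of the fixed 9-byte HTTP/2 frame header.
type frameHeader struct {
	Length   uint32 // payload length, 24 bits on the wire
	Type     uint8
	Flags    uint8
	StreamID uint32 // 31 bits; the top bit is reserved
}

// parseFrameHeader decodes the first 9 bytes of a frame.
func parseFrameHeader(b []byte) (frameHeader, error) {
	if len(b) < 9 {
		return frameHeader{}, errors.New("frame header needs 9 bytes")
	}
	return frameHeader{
		Length:   uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2]),
		Type:     b[3],
		Flags:    b[4],
		StreamID: binary.BigEndian.Uint32(b[5:9]) & 0x7fffffff,
	}, nil
}

// clientInitiated reports whether a stream was opened by the client.
// Stream 0 is the connection itself and is reported as false.
func clientInitiated(streamID uint32) bool {
	return streamID%2 == 1
}

func main() {
	// Hand-built header: 16-byte HEADERS frame (type 0x1) on stream 13.
	raw := []byte{0x00, 0x00, 0x10, 0x01, 0x05, 0x00, 0x00, 0x00, 0x0d}
	h, err := parseFrameHeader(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v client-initiated=%v\n", h, clientInitiated(h.StreamID))
}
```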

3. The secret of HTTP/2's traffic savings: the HPACK algorithm

Compared with HTTP/1.x, HTTP/2 also greatly reduces traffic consumption. The improvement consists of three parts: a static dictionary, a dynamic dictionary, and Huffman coding. The following tool can be installed to measure the traffic savings:

apt-get install nghttp2-client

You can then probe some sites that have HTTP/2 enabled. The traffic saved generally starts at 25%, and it is even higher for sites you visit frequently:

As for traffic consumption, the biggest improvement of HTTP/2 over HTTP/1.x is that HTTP/2 can compress HTTP headers. In HTTP/1.x, gzip and similar schemes could not compress headers, even though for the vast majority of requests the headers actually make up the largest share.

Let's first look at the static dictionary, as shown in the figure:

This is not hard to understand: commonly used HTTP headers are represented by fixed numbers, which naturally saves traffic. Note that some headers have values too varied to be listed in the static dictionary. For example, the cache-control field has too many possible values to be handled by the static dictionary, so its value has to rely on Huffman coding. The figure below shows how the HPACK compression algorithm saves traffic:

For example, look at header 62, user-agent, which identifies the browser. This header generally does not change between our requests, so after HPACK optimization, later transmissions only need to send the number 62 to convey it.

Another example is shown in the figure below:

Similarly, when multiple requests are sent in succession, most of the time only the path changes and the remaining header fields stay the same. In this scenario, only the path header ends up being transmitted.

Finally, let's look at the core of HPACK: Huffman coding. The core idea of Huffman coding is that symbols that occur more frequently get shorter codes, while symbols that occur less frequently get longer codes (SPDY, the predecessor of HTTP/2, used dynamic Huffman coding, whereas HTTP/2 uses static Huffman coding).

Let's look at a few examples:

Take this HEADERS frame and note the header :method: GET. Its index in the static index table is 2. For a header whose key and value are both in the index table, a single byte (8 bits) is used: the first bit is fixed at 1, and the remaining 7 bits hold the index in the table. Here the index of :method: GET is 2, so the byte is 1000 0010, which is 0x82 in hexadecimal.
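
The arithmetic above fits in a few lines of Go. This sketch covers only indexes that fit in the 7-bit prefix. Larger indexes need HPACK's multi-byte integer encoding, which is deliberately left out. The function name is made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// encodeIndexed encodes an "indexed header field": the top bit is 1 and the
// low 7 bits carry the table index. Only indexes 1..126 fit in one byte;
// anything larger would need the multi-byte integer form, not handled here.
func encodeIndexed(index int) (byte, error) {
	if index < 1 || index > 126 {
		return 0, errors.New("index outside the single-byte range handled here")
	}
	return 0x80 | byte(index), nil
}

func main() {
	b, err := encodeIndexed(2) // :method: GET is entry 2 in the static table
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("binary %08b, hex 0x%02x\n", b, b)
}
```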

Now look at another case: a header whose key is in the index table but whose value is not.

When the key is in the index table but the value is not, the first byte starts with the fixed bits 01, and the following 6 bits hold the static index (111010 is 58 in decimal). The index of user-agent in the table is 58; prefixed with the two bits 01, it becomes 01111010 in binary, which is 0x7a in hexadecimal. Now look at the second byte, 0xd4. In binary, 0xd4 is 1 101 0100. The first bit indicates that the value is Huffman-coded, and the following 7 bits give the number of bytes needed for the value of this key-value pair. Here 101 0100 is 84 in decimal, so the user-agent value takes 84 bytes. Counting the bytes in the figure below gives 16*5 plus the 4 bytes after d4 in the first row, which is exactly 84 bytes.
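
The same decoding can be checked with a short Go sketch. It handles only the single-byte forms used in this example (an index below 63 and a length below 127) and reports an error for anything that would need HPACK's multi-byte integer encoding. The function name is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// decodeLiteralPrefix decodes the first two bytes of a "literal header field
// with incremental indexing" whose name comes from the index table.
// first:  01xxxxxx, where xxxxxx is the name index.
// second: Hxxxxxxx, where H is the Huffman flag and xxxxxxx the value length.
func decodeLiteralPrefix(first, second byte) (index int, huffman bool, valueLen int, err error) {
	if first&0xc0 != 0x40 {
		return 0, false, 0, errors.New("first byte does not start with bits 01")
	}
	index = int(first & 0x3f)
	valueLen = int(second & 0x7f)
	if index == 0x3f || valueLen == 0x7f {
		return 0, false, 0, errors.New("multi-byte integers are not handled in this sketch")
	}
	return index, second&0x80 != 0, valueLen, nil
}

func main() {
	index, huffman, valueLen, err := decodeLiteralPrefix(0x7a, 0xd4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("name index=%d huffman=%v value length=%d bytes\n", index, huffman, valueLen)
}
```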

Finally, here is an example in which neither the key nor the value is in the index table.

4. HTTP/2 Server Push

As mentioned earlier, the biggest improvement of H2 over HTTP/1.x is that H2 can carry n streams at the same time over a single TCP connection, which avoids the head-of-line blocking problem of HTTP/1.x. For most front-end pages, we can also use the Server Push capability of H2 to further improve page load speed. Normally, when a browser requests an HTML page, only after the HTML page has been returned and the browser engine has parsed it and found the CSS or JS it references does the browser send the corresponding CSS or JS requests. Only after the CSS and JS come back can the browser continue rendering. This process usually leaves the browser showing a blank screen for a while, which hurts the user experience. With H2, when a browser's request for an HTML page reaches the server, the server can proactively push the corresponding CSS and JS content to the browser, so the browser no longer needs to send separate CSS and JS requests.

Some people misunderstand Server Push to some degree. They think this technology lets the server send "notifications" to the browser, and even compare it with WebSocket. That is not the case. Server Push merely saves the browser the step of sending a request. The browser uses the pushed content only when it would have requested that resource anyway had it not been pushed. If the browser would not request a resource on its own, pushing it only wastes bandwidth. Of course, if the peer talking to the server is a custom client rather than a browser, HTTP/2 can naturally be used to implement real push functionality. So even when both use HTTP/2, there are some functional differences depending on whether a custom client or a browser is communicating with the server.

To demonstrate this process, let's write some code. Since browsers can only access HTTP/2 sites over a TLS connection, we first need to generate the corresponding certificate and private key.

Then enable HTTP/2 and, on receiving the HTML request, proactively push the CSS file referenced in the HTML.

package main

import (
	"fmt"
	"net/http"

	// Assumed import path: the original import was lost, but the echo.New and
	// echo.Context calls below match the labstack/echo web framework.
	"github.com/labstack/echo/v4"
)

func main() {
	e := echo.New()
	e.Static("/", "html")

	// Used mainly to verify that HTTP/2 has been enabled successfully.
	e.GET("/request", func(c echo.Context) error {
		req := c.Request()
		format := `
Protocol: %s<br>
Host: %s<br>
Remote Address: %s<br>
Method: %s<br>
Path: %s<br>
`
		return c.HTML(http.StatusOK, fmt.Sprintf(format, req.Proto, req.Host, req.RemoteAddr, req.Method, req.URL.Path))
	})

	// On receiving the HTML request, proactively push the CSS file referenced
	// in the HTML, without waiting for the browser to request it.
	e.GET("/h2.html", func(c echo.Context) (err error) {
		pusher, ok := c.Response().Writer.(http.Pusher)
		if ok {
			if err = pusher.Push("/app.css", nil); err != nil {
				println("error push")
			}
		}
		return c.File("html/h2.html")
	})

	e.Logger.Fatal(e.StartTLS(":1323", "cert.pem", "key.pem"))
}

Then, when Chrome visits this page, look at the Network panel:

You can see that this CSS file was proactively pushed by us. Now let's take a look at Wireshark.

You can see that the stream with ID 13 is a request initiated by the client, because the ID is odd. This stream also contains a PUSH_PROMISE frame, which the server sends to the browser. Take a look at its details.

You can see that this frame tells the browser which resource the server is proactively pushing, and that the stream ID for this resource is 6. In the figure we also see a DATA frame being transmitted on stream ID 6; this is the CSS file that the server pushed. At this point, one complete Server Push interaction is finished.
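
The promised stream ID seen in Wireshark is carried at the start of the PUSH_PROMISE payload. Here is a minimal Go sketch of reading it, assuming the layout from RFC 7540: an optional Pad Length byte when the PADDED flag (0x8) is set, then 1 reserved bit and a 31-bit promised stream ID. The payload bytes are hand-built and the function name is made up.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// flagPadded is the PADDED flag of a PUSH_PROMISE frame.
const flagPadded = 0x8

// promisedStreamID extracts the promised stream ID from a PUSH_PROMISE
// payload. The header block fragment that follows it is ignored here.
func promisedStreamID(flags uint8, payload []byte) (uint32, error) {
	if flags&flagPadded != 0 {
		if len(payload) < 1 {
			return 0, errors.New("missing Pad Length byte")
		}
		payload = payload[1:] // skip the Pad Length byte
	}
	if len(payload) < 4 {
		return 0, errors.New("payload too short for a promised stream ID")
	}
	return binary.BigEndian.Uint32(payload[:4]) & 0x7fffffff, nil
}

func main() {
	// Hand-built payload: promised stream 6 followed by one header byte.
	id, err := promisedStreamID(0x4, []byte{0x00, 0x00, 0x00, 0x06, 0x82})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("promised stream ID:", id, "even:", id%2 == 0)
}
```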

However, using Server Push in a real production application is far more complex than in our demo. First, most CDN providers (unless you build your own CDN) have limited support for Server Push. We cannot have every resource request go directly to our origin server; most static resources sit on a CDN in front of it. Second, for static resources we also have to consider the effect of caching. If the browser itself issues the request for a static resource, it can decide from its cache state whether it really needs to request it. Server Push, however, is initiated by the server, which in most cases does not know whether the browser's cached copy of the resource has expired. The browser can, after receiving the PUSH_PROMISE frame, check its own cache state and send a RST_STREAM frame to tell the server that the resource is already cached and need not be sent. But there is no guarantee that the DATA frames being pushed have not already left the server by the time RST_STREAM arrives, so some bandwidth is still wasted. Overall, Server Push is still a very effective way to improve the front-end user experience. After adopting Server Push, the browser's idle performance metric generally improves by 3 to 5 times (after all, the browser no longer has to wait until it has parsed the HTML before requesting the CSS and JS).

5. Why does HTTP/2 need flow control?

Many people do not understand why HTTP/2 needs flow control at the application layer when TCP already implements flow control at the transport layer. Let's take a look at a figure.

Because HTTP/2 supports multiplexing, multiple streams can be sent over the same TCP connection. In the figure above, each color represents one stream; there are 4 streams, and each stream in turn has n frames. This is risky: with multiplexing at the application layer, n frames can be sent to the target server continuously at the same time. When traffic peaks, this triggers TCP congestion control, so that the subsequent frames are blocked and the server responds too slowly. HTTP/1.x does not have this problem because it does not support multiplexing. Moreover, as mentioned several times before, a request from the client to the server passes through many proxy servers, whose memory size and network conditions may differ. It is therefore necessary to apply flow control at the application layer to avoid triggering TCP flow control as much as possible. The flow control strategy in HTTP/2 follows these principles:

  1. Both the client and the server have flow control capability.

  2. The sender and the receiver can set their flow control independently.

  3. Only DATA frames are subject to flow control; other frames, such as HEADERS or PUSH_PROMISE frames, are not.

  4. Flow control applies only between the two ends of a TCP connection; even if there is a proxy server in the middle, it is not passed through to the origin server.
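
The principles above boil down to credit-based accounting on the sender side, which can be sketched in Go as follows. The sketch assumes the default initial window of 65535 bytes from RFC 7540 and tracks a single window only, whereas real HTTP/2 keeps one window per stream plus one for the whole connection. All names are made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultInitialWindow is the initial flow-control window size defined by
// RFC 7540 when no SETTINGS value overrides it.
const defaultInitialWindow = 65535

var errWindowExhausted = errors.New("flow-control window exhausted, wait for WINDOW_UPDATE")

// sendWindow tracks how many DATA payload bytes the sender may still send.
type sendWindow struct {
	available int64
}

func newSendWindow() *sendWindow {
	return &sendWindow{available: defaultInitialWindow}
}

// consume reserves n bytes (assumed non-negative) for a DATA frame.
// Frames of other types never call it, matching principle 3.
func (w *sendWindow) consume(n int64) error {
	if n > w.available {
		return errWindowExhausted
	}
	w.available -= n
	return nil
}

// update applies the increment carried by a WINDOW_UPDATE frame.
func (w *sendWindow) update(increment int64) {
	w.available += increment
}

func main() {
	w := newSendWindow()
	fmt.Println(w.consume(60000)) // fits in the 65535-byte window
	fmt.Println(w.consume(10000)) // does not fit, the sender must wait
	w.update(10000)               // the receiver grants more credit
	fmt.Println(w.consume(10000))
	fmt.Println("bytes still available:", w.available)
}
```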

Visit the Zhihu site and take a look at the packet capture.

The frames labeled WINDOW_UPDATE are the flow control frames. Open any one of them and you can see the window size increment that the flow control frame announces.
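
For reference, the value Wireshark displays comes from a 4-byte payload. A minimal Go sketch of decoding it, assuming the WINDOW_UPDATE layout from RFC 7540 (1 reserved bit and a 31-bit increment, where an increment of 0 is an error), might look like this. The sample bytes are hand-built and are not taken from the Zhihu capture.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// parseWindowUpdate decodes the payload of a WINDOW_UPDATE frame and
// returns the window size increment in bytes.
func parseWindowUpdate(payload []byte) (uint32, error) {
	if len(payload) != 4 {
		return 0, errors.New("WINDOW_UPDATE payload must be exactly 4 bytes")
	}
	increment := binary.BigEndian.Uint32(payload) & 0x7fffffff
	if increment == 0 {
		return 0, errors.New("an increment of 0 is a protocol error")
	}
	return increment, nil
}

func main() {
	increment, err := parseWindowUpdate([]byte{0x00, 0x00, 0x80, 0x00})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("window size increment:", increment)
}
```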

As you may have guessed, since HTTP/2 can do flow control, it can also do prioritization. For example, in HTTP/1.x, when we visit an HTML page there are JS, CSS, images and other resources. We send these requests at the same time, but they have no notion of priority, and it is unknown which goes out first and which comes back first (you do not know whether the CSS and JS requests are on the same TCP connection, and since they are spread across different TCP connections, which one is faster is uncertain). From a user-experience perspective, however, CSS should clearly have the highest priority, followed by JS, and finally images, which can greatly reduce the browser's blank-screen time. HTTP/2 provides this capability. For example, visit the Sina site and capture the packets:

Take a look at the priority of this CSS frame:

The priority of the JS:

And finally the priority of the GIF image; you can see that it is the lowest.

With the weight field identifying the priority, the server knows which requests need to be answered first and which responses can be sent later. In this way, the overall experience the browser provides improves.
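
As a rough illustration of how a server might use the weight, the Go sketch below simply orders pending responses from highest to lowest weight. This is a deliberate simplification: in RFC 7540, weights (1 to 256) describe proportional shares among streams in a dependency tree rather than a strict ordering. The paths and weight values are invented for the example and are not taken from the Sina capture.

```go
package main

import (
	"fmt"
	"sort"
)

// pending is a response waiting to be sent on some stream.
type pending struct {
	Path   string
	Weight int // 1..256, a higher weight means a higher priority
}

// byWeight returns a copy of reqs ordered from highest to lowest weight.
// The sort is stable, so equal weights keep their arrival order.
func byWeight(reqs []pending) []pending {
	out := append([]pending(nil), reqs...)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Weight > out[j].Weight
	})
	return out
}

func main() {
	queue := []pending{
		{Path: "/logo.gif", Weight: 110},
		{Path: "/app.css", Weight: 256},
		{Path: "/app.js", Weight: 220},
	}
	for _, p := range byWeight(queue) {
		fmt.Println(p.Weight, p.Path)
	}
}
```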

6. Problems with the HTTP/2 protocol

HTTP/2 over TCP or over TCP+TLS still has many problems. For example, the handshake takes too long: HTTP/2 over TCP needs at least the three-way handshake, and HTTP/2 over TCP+TLS must go through the TLS handshake in addition to the TCP handshake (TLS 1.3 can complete it in a single round trip). Each handshake step has to send a message and receive its ACK before the next step can proceed, so in a weak network environment you can imagine that establishing the connection is extremely inefficient. In addition, the inherent head-of-line blocking problem of TCP has always plagued both HTTP/1.x and HTTP/2. Let's look at a promotional figure from Google's SPDY to understand the nature of this blocking more precisely:
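
To make the handshake cost concrete, here is a back-of-the-envelope sketch in Go. It assumes the commonly cited figures of 1 round trip for the TCP handshake, 2 for a full TLS 1.2 handshake and 1 for TLS 1.3, and it ignores session resumption and packet loss. The function name and the 100 ms round-trip time are made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// setupTime estimates the time spent before the first HTTP/2 byte can be
// sent: one round trip for the TCP handshake plus tlsRoundTrips for TLS.
func setupTime(rtt time.Duration, tlsRoundTrips int) time.Duration {
	return rtt * time.Duration(1+tlsRoundTrips)
}

func main() {
	rtt := 100 * time.Millisecond // assumed weak-network round-trip time
	fmt.Println("h2c (TCP only):  ", setupTime(rtt, 0))
	fmt.Println("h2 with TLS 1.2: ", setupTime(rtt, 2))
	fmt.Println("h2 with TLS 1.3: ", setupTime(rtt, 1))
}
```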

Figure 1 is easy to understand: with multiplexing we send 3 streams at the same time, they pass through the TCP/IP stack to the server, and TCP on the server side hands the packets up to the application layer. Note the condition here: packets must be delivered in the same order in which they were sent. In the upper figure the order of the blocks is the same on both sides. But in the situation in the lower figure, suppose the first red packet happens to be lost. Then even though the subsequent packets have already arrived at the server machine, their data cannot be delivered to the application-layer protocol right away, because TCP requires that the receiving order match the sending order. Since the red packet is missing, the subsequent packets can only wait on the server until the red packet is retransmitted and arrives, and only then are all of these packets passed to the application layer.
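
The blocking described above can be reproduced with a toy model in Go. It is a simplification that numbers whole packets instead of bytes and is not how a real TCP stack is implemented. The function name is hypothetical.

```go
package main

import "fmt"

// deliverInOrder returns the packets that can be handed to the application
// layer: the contiguous run of arrived packets starting at next. Packets
// that arrived after a gap stay buffered until the gap is filled.
func deliverInOrder(next int, arrived map[int]bool) []int {
	var out []int
	for arrived[next] {
		out = append(out, next)
		next++
	}
	return out
}

func main() {
	// Packet 1 (the "red" packet) was lost; packets 2 to 4 have arrived.
	arrived := map[int]bool{2: true, 3: true, 4: true}
	fmt.Println("before retransmission:", deliverInOrder(1, arrived))

	// The retransmitted packet 1 arrives and unblocks everything behind it.
	arrived[1] = true
	fmt.Println("after retransmission: ", deliverInOrder(1, arrived))
}
```

Even if packets 2 to 4 belong to other HTTP/2 streams, they are held back, because TCP knows nothing about the streams multiplexed on top of it.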

Beyond the defects mentioned above, TCP has another problem: the TCP protocol is implemented at the operating system level. The socket programming interfaces (APIs) exposed by any language, including Java, C, C++, Go and so on, are ultimately implemented by the operating system itself. Getting operating systems to upgrade their TCP implementation is very difficult, and upgrading TCP across all the devices on the Internet as a whole is unrealistic (this is also one reason the IPv6 rollout has been so slow). Because of these problems, Google built the QUIC protocol on top of UDP (in fact, many application-layer protocols based on UDP partially implement some of TCP's functions at the application layer) to replace the TCP used by HTTP/1.x and HTTP/2.

Turn on the QUIC protocol switch in Chrome:

Then visit YouTube (in China, Bilibili in fact also supports it).

You can see that QUIC is now supported. Why is this option off by default in Chrome? That is easy to understand: QUIC was created by Google itself and has not yet been officially incorporated into HTTP/3; everything is still at the draft stage. So this option is off by default. What has QUIC improved compared with TCP? In essence, the original queued transmission of messages is changed to unqueued transmission, so head-of-line blocking naturally no longer occurs.

In addition, HTTP/3 provides the ability to keep reusing an existing connection even after the port number or IP address changes. I understand this feature to be aimed mainly at the Internet of Things, where many devices have IP addresses that may change all the time. Reusing the previous connection greatly improves network transmission efficiency, and it avoids the current situation in which, after a disconnection, at least 1 to 3 RTTs are needed before data transmission can resume.

Last but not least, in an extremely weak network environment HTTP/2 may perform worse than HTTP/1.x. Because HTTP/2 uses only one TCP connection, a very high packet loss rate on a weak network keeps triggering timeout retransmissions at the TCP layer, causing a backlog of TCP messages that cannot be delivered to the application layer above. In HTTP/1.x, multiple TCP connections can be used, so to some extent the backlog is not as severe as in HTTP/2. I think this is the only respect in which HTTP/2 is inferior to HTTP/1.x. Of course, the blame lies with TCP, not with HTTP/2 itself.

Author: vivo Internet - WuYue
