High-frequency trading
High-frequency trading (HFT) is computerized trading that seeks to profit from extremely short-lived market changes that humans cannot exploit, such as a small change in the spread between the bid and ask price of a security, or a tiny difference in the price of a stock across exchanges. In high-frequency trading, automated applications process hundreds of millions of market signals a day and send tens of millions of orders to exchanges around the world. To stay competitive, response times must consistently remain in the microsecond range, especially during the peaks caused by black swan events.
A typical high-frequency trading system is structured as follows. Financial market signals arrive over various transports (TCP, UDP, and so on) and in a variety of formats (binary, SBE, JSON, FIX, and others), and are converted into an internal market data format. These normalized messages are then sent to algorithm servers, statistics engines, UIs, log servers, and various databases (in-memory, file-based, or distributed). Any delay is costly, for example when a decision is made on a stale price or an order arrives too late. To gain a few microseconds, most trading participants invest in expensive hardware: pools of liquid-cooled CPU servers (in 2020 you could buy servers with 56 cores at 5.6 GHz and 1 TB of memory) hosted in the major exchange data centers, high-end nanosecond network switches, dedicated transoceanic lines, and even microwave networks.
High-frequency trading systems commonly run a highly customized Linux kernel with operating-system bypass, so that data can "jump" directly from the network card to the application, and rely on IPC (inter-process communication) and even FPGAs (programmable single-purpose chips). As for programming languages, C++ is usually the first that comes to mind, and it is a natural choice in this field. Its biggest advantages are that it is fast, sits close to machine code, and compiles directly for the target platform, which makes it efficient and stable.
Using Java instead of C++
We made a different choice. For the past 14 years, we have developed in Java and used inexpensive hardware instead of expensive high-end equipment.
For a small team with limited resources and a shortage of skilled developers, Java means we can iterate on software quickly, because development in the Java ecosystem is faster than in the C family. An improvement can be discussed in the morning and implemented, tested, and released to production in the afternoon.
Large companies may need weeks or even months to ship a software update, so this is a key advantage. In this field a single bug can erase a year's profits in seconds, so quality cannot be compromised. We use many open-source libraries and projects and run a strict agile development environment, including Jenkins, Maven, unit tests, nightly builds, and Jira. With Java, developers can focus on business logic rather than debugging core dumps or dealing with pointers as in C++. And because of Java's robust memory management, junior programmers can contribute code right away with controlled risk.
With good design patterns and clean coding habits, Java can reach C++-level latency. As is well known, what makes Java development powerful and convenient is also the main source of its drawbacks: the Java virtual machine (JVM).
Java compiles code just in time (with a just-in-time, or JIT, compiler), which means there may be a compilation delay the first time some code is encountered. Java manages memory by allocating blocks in heap space. Every so often it cleans up that space, deleting old objects to make room for new ones. The main problem is that, to get an accurate count, application threads need to be momentarily "frozen". This process is called garbage collection (GC). GC is the main reason low-latency application developers give up on Java.
The most common and standard Java virtual machine on the market is the Oracle Hotspot JVM, which is widely used in the Java community, mostly for historical reasons. For very demanding applications, Azul Systems offers a powerful alternative called Zing, which addresses both GC pauses and JIT compilation problems.
Let's look at the problems inherent in using Java and their possible solutions.
Understanding the Java just-in-time compiler
Languages such as C++ are called compiled languages because the delivered code is entirely binary and can be executed directly on the CPU. PHP and Perl are called interpreted languages because an interpreter (installed on the target machine) translates each line of code as it runs.
Java sits in between: it compiles source code into what is called Java bytecode, which the JVM can then compile to binary when it sees fit. The reason Java does not compile everything at startup is long-term performance optimization. By observing the running application and analyzing method calls and class initializations in real time, Java can compile the parts of the code that are called frequently. It may even make assumptions based on what it has observed (for example, that a section of code is never called, or that an object is always a String).
As a result, the compiled code is very fast, but there are three drawbacks.
1. A method must be called a certain number of times to reach the compilation threshold before it can be optimized and compiled (the limit is configurable, but is typically around 10,000 calls). Until then, the unoptimized code does not run at "full speed". Java trades off faster compilation against higher-quality compilation (and if an assumption proves wrong, there is the cost of recompiling).
2. When a Java application restarts, it is back to square one and must wait until the threshold is reached again.
3. Some applications (like ours) have infrequent but critical methods that are called only a few times, yet must be extremely fast when they are called (think of a risk or stop-loss function that is invoked only in an emergency).
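To make the compilation threshold concrete, here is a minimal sketch of a manual warm-up loop, a generic workaround on a stock JVM and not necessarily what the authors did. The class, the method names, and the stop-loss rule are hypothetical and chosen only for illustration.

```java
// Hypothetical example: names and the stop-loss rule are illustrative only.
class WarmUpDemo {
    // A rarely used but latency-critical check: true when the loss limit is breached.
    static boolean stopLossBreached(long entryPriceTicks, long lastPriceTicks, long maxLossTicks) {
        return entryPriceTicks - lastPriceTicks >= maxLossTicks;
    }

    // Calls the critical method more often than the typical compile threshold
    // (around 10,000 invocations) so the JIT is likely to have compiled it
    // before the first real emergency. Returns the number of synthetic
    // breaches so the loop has an observable result.
    static int warmUp(int iterations) {
        int breaches = 0;
        for (int i = 0; i < iterations; i++) {
            // Synthetic prices: the last price cycles between 90 and 109 ticks.
            if (stopLossBreached(100, 90 + (i % 20), 5)) {
                breaches++;
            }
        }
        return breaches;
    }

    public static void main(String[] args) {
        int breaches = warmUp(20_000);
        System.out.println("warm-up done, synthetic breaches: " + breaches);
    }
}
```

A real warm-up would have to exercise the same code paths and types as production traffic, because the JIT optimizes for what it observes and recompiles when those assumptions turn out to be wrong.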
Azul Zing solves these problems by having its JVM "save" the state of compiled methods and classes in what it calls a profile. This unique feature, called ReadyNow!, means a Java application always runs at optimal speed, even after a restart. When you restart an application with an existing profile, the Azul JVM immediately recalls its previous decisions and compiles the recorded methods directly, which solves the Java warm-up problem.
In addition, you can build a profile in a development environment that simulates production behavior, and then deploy the optimized profile to production, knowing that all critical paths are compiled and optimized. Zing's latency is quite stable over time. The percentile distribution shows that, 1% of the time, the Hotspot JVM incurs a latency 16 times that of the Zing JVM.
Solving the garbage collection (GC) pause problem
During garbage collection, the entire application may freeze for anywhere from a few milliseconds to a few seconds (the delay grows with code complexity and heap size), and worse, you cannot control when this happens. Pausing an application for a few milliseconds or even seconds may be acceptable for many Java applications, but it is a disaster for low-latency applications, whether in automotive, aerospace, healthcare, or finance.
The impact of GC is a big topic among Java developers; a full garbage collection is often called "stop-the-world" because it freezes the entire application.
Over the years, many GC algorithms have tried to strike a trade-off between throughput (how much CPU time goes to actual application logic rather than to garbage collection) and GC pause times.
Since Java 9, the G1 collector has been the default GC. Its main idea is to slice GC pauses according to a pause-time target supplied by the user. It usually delivers short pauses, but at the cost of lower throughput. In addition, pause time grows with heap size. Java offers a large number of settings for tuning garbage collection (and the JVM in general), from heap size to collection algorithm to the number of threads allocated to GC. It is therefore common to see Java applications configured with a long list of custom options.
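Before tuning any of these options, it helps to confirm which collectors the JVM is actually running and how much time they have accumulated. The following is a minimal sketch using the standard java.lang.management API; the class and method names (GcStatsDemo, collectorNames, totalGcMillis) are illustrative and not from the article.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcStatsDemo {
    // Names of the collectors reported by the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Approximate accumulated collection time, in milliseconds, across all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Create short-lived garbage so the collectors have something to do.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] junk = new byte[256];
            junk[0] = (byte) i;
            checksum += junk[0];
        }
        System.out.println("collectors: " + collectorNames());
        System.out.println("accumulated GC time (ms): " + totalGcMillis());
        System.out.println("checksum (keeps the loop from being optimized away): " + checksum);
    }
}
```

The reported figure is accumulated collection time rather than a per-pause measurement, so a GC log is still needed to see individual pauses.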
Many developers have turned to various techniques to avoid GC altogether. The main idea is that if you create fewer objects, fewer objects need to be cleaned up. One old technique is to use an object pool of reusable objects. For example, a database connection pool holds references to 10 open connections, ready to be used when needed.
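The object pool idea can be sketched as follows. This is a minimal single-threaded version under stated assumptions: Order and OrderPool are hypothetical names, and a production pool would also need a policy for thread safety and for exhaustion.

```java
import java.util.ArrayDeque;

class ObjectPoolDemo {
    // Hypothetical mutable message object that is reused instead of reallocated.
    static final class Order {
        long price;
        long quantity;

        void reset() {
            price = 0;
            quantity = 0;
        }
    }

    // Single-threaded pool: all objects are allocated up front.
    static final class OrderPool {
        private final ArrayDeque<Order> free = new ArrayDeque<>();
        private int allocated;

        OrderPool(int size) {
            for (int i = 0; i < size; i++) {
                free.push(new Order());
                allocated++;
            }
        }

        // Hands out a pooled object; allocates only if the pool is exhausted.
        Order acquire() {
            Order order = free.poll();
            if (order == null) {
                order = new Order();
                allocated++;
            }
            return order;
        }

        // Clears the object and returns it to the pool for reuse.
        void release(Order order) {
            order.reset();
            free.push(order);
        }

        int allocatedCount() {
            return allocated;
        }
    }

    public static void main(String[] args) {
        OrderPool pool = new OrderPool(2);
        for (int i = 0; i < 1_000; i++) {
            Order order = pool.acquire();
            order.price = 100 + i;
            order.quantity = 10;
            pool.release(order);
        }
        System.out.println("objects allocated: " + pool.allocatedCount());
    }
}
```

Because every acquire is matched by a release, the loop reuses the same instances and creates no new garbage after construction.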
Multithreading usually requires locks, which can cause synchronization delays and pauses (especially when threads share resources). A popular design is a ring buffer queue in which many threads write and read in a lock-free setup. Some experts even choose to bypass Java memory management entirely and manage memory allocation themselves, which solves one problem but brings more complexity and risk. Given all this, it became clear that we should consider other JVMs, so we decided to try the Azul Zing JVM. We soon achieved very high throughput with negligible pauses.
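As an illustration of the lock-free ring buffer design mentioned above, here is a minimal sketch restricted to one producer thread and one consumer thread, which is simpler than the many-writer, many-reader case. All names are hypothetical, and Long.MIN_VALUE is used as an "empty" sentinel purely for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

class RingBufferDemo {
    // Single-producer, single-consumer queue of long values, with no locks.
    static final class SpscRingBuffer {
        private final long[] slots;
        private final int mask;
        private final AtomicLong head = new AtomicLong(); // next slot to read
        private final AtomicLong tail = new AtomicLong(); // next slot to write

        SpscRingBuffer(int capacity) {
            if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
                throw new IllegalArgumentException("capacity must be a power of two");
            }
            slots = new long[capacity];
            mask = capacity - 1;
        }

        // Producer thread only. Returns false when the buffer is full.
        boolean offer(long value) {
            long t = tail.get();
            if (t - head.get() == slots.length) {
                return false;
            }
            slots[(int) (t & mask)] = value;
            tail.set(t + 1); // volatile write publishes the slot to the consumer
            return true;
        }

        // Consumer thread only. Returns Long.MIN_VALUE when the buffer is empty.
        long poll() {
            long h = head.get();
            if (h == tail.get()) {
                return Long.MIN_VALUE;
            }
            long value = slots[(int) (h & mask)];
            head.set(h + 1); // volatile write hands the slot back to the producer
            return value;
        }
    }

    // Sends 1..count through the buffer from a producer thread and sums them here.
    static long sumThroughBuffer(int count) {
        SpscRingBuffer buffer = new SpscRingBuffer(1024);
        Thread producer = new Thread(() -> {
            for (long v = 1; v <= count; v++) {
                while (!buffer.offer(v)) {
                    Thread.onSpinWait(); // busy-wait instead of blocking on a lock
                }
            }
        });
        producer.start();
        long sum = 0;
        int received = 0;
        while (received < count) {
            long v = buffer.poll();
            if (v == Long.MIN_VALUE) {
                Thread.onSpinWait();
                continue;
            }
            sum += v;
            received++;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("sum received: " + sumThroughBuffer(100_000));
    }
}
```

The buffer is preallocated and stores primitives, so passing messages between the two threads creates no garbage and takes no locks; supporting several producers or consumers would require compare-and-set on the indices.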
Zing's pauses are negligible because it uses a unique collector called C4 (Continuously Concurrent Compacting Collector), which allows garbage collection without pauses, regardless of the size of the Java heap (up to 8 TB). It does this by concurrently mapping and compacting memory while the application is still running. Moreover, it requires no code changes, and the latency and speed improvements come out of the box, without lengthy configuration. Java programmers get the best of both worlds: the simplicity of Java (no need to be paranoid about creating new objects) and the underlying performance of Zing, which makes the latency of the whole system highly predictable.
Thanks to GCeasy, a universal GC log analyzer, we were able to quickly compare the two JVMs on a real-world automated trading application (in a simulated environment). In our high-frequency trading application, GC with Zing is about 180 times smaller than with the standard Oracle Hotspot JVM. Even more impressive, whereas GC pauses normally correspond to actual application pause time, Zing's smart GC usually runs in parallel with minimal or no actual pause.
Java can deliver high performance and low latency while retaining its simplicity and business-oriented features. C++ still has its place in specific low-level components such as drivers, databases, compilers, and operating systems, but most real-world applications can be developed in Java, including demanding ones like high-frequency trading.