Zeng Fansong, Senior Technical Expert, Alibaba Cloud Native Application Platform
Zhang Zhen, Senior Technical Expert, Alibaba Cloud Native Application Platform
Reading guide: This article describes Alibaba's technical evolution in container management, explains why K8s ultimately succeeded, and covers how K8s was applied inside Alibaba during this year's Double 11. It focuses on three capability upgrades from Alibaba's K8s-based cloud-native transformation, the technical solutions that emerged from each upgrade, and the business value those upgrades delivered.
Since Google took the lead in founding the CNCF in 2015, cloud-native technology has entered the public eye and developed rapidly. By 2018, major cloud providers including Google, AWS, Azure, and Alibaba Cloud had all joined the CNCF. Cloud-native technology has grown from application containerization into a field that spans containers, Service Mesh, microservices, immutable infrastructure, Serverless, and FaaS, and the CNCF hosts more and more open source projects.
Kubernetes, the CNCF's first project, has attracted attention from the very beginning. Google engineers designed it based on years of experience with Borg, Google's internal cluster management system, and redesigned it for the infrastructure of the cloud computing era. It aims to help enterprises solve container orchestration for large-scale IT infrastructure.
After Google open-sourced Kubernetes in June 2014, the joint efforts of vendors such as Red Hat, Microsoft, and Alibaba, together with many open source enthusiasts, made it the de facto standard for container orchestration and greatly advanced the cloud-native field.
Today we would like to share Alibaba Cloud's experience of running Kubernetes at scale: how Alibaba Cloud used Kubernetes to move Alibaba's application operation and maintenance (O&M) stack to cloud native, how it advanced Kubernetes itself, and how it tapped the dividends of the cloud-native era to significantly reduce Alibaba's IT cost for Double 11.
The development of containers at Alibaba
Before 2011, Alibaba used VM virtualization to split each physical machine into three virtual machines for deploying Taobao services. As Taobao's business grew rapidly, the VM-based solution could no longer provide the flexibility the business required.
Alibaba therefore began exploring container technology based on Linux LXC in 2011 to replace VM-based application deployment. By 2013 it had developed the LXC-based T4 container and the AI container orchestration system. This was a leading technical solution at the time, but the in-house container technology always had some compatibility problems with the O&M system inherited from the VM era.
When Docker's container image approach appeared in 2013, Alibaba engineers immediately saw the future of containers combined with Docker images and began investing in the area. By 2015, container orchestration systems such as Aliswarm, Zeus, and Hippo were flourishing, each serving part of Alibaba's business. These systems reduced the cost of business O&M, but they also brought duplicated construction. Alibaba's internal resources were also scattered, so different types of workloads could not be scheduled together to take advantage of their staggered peaks.
Against this background the Sigma system emerged. In 2017 it unified Alibaba's resource pool and scheduled all of Alibaba's core businesses. For the first time it supported running online services and offline jobs on the same physical machines, which greatly improved data center resource utilization and reduced Alibaba's IT cost.
As cloud-native technology developed rapidly, Alibaba recognized its potential and the inevitable trend of enterprise IT moving fully to the cloud. In 2018 it began migrating to Kubernetes, using Kubernetes extension mechanisms to offer the scheduling capabilities Sigma had accumulated over the years in a Kubernetes-native way.
In 2019 Alibaba announced that it was moving fully to the cloud and began to embrace Kubernetes completely, migrating the Sigma scheduling system to a Kubernetes-based scheduling system. That system was also the underlying infrastructure for this year's Double 11 e-commerce transaction system, the largest to date. It reliably supported hundreds of application changes before and after the promotion, delivered fast application release and scale-out, and safeguarded a smooth Double 11 shopping experience.
Why K8s succeeded at Alibaba
Kubernetes stands out among many technologies for three reasons.
- First, it was born for the cloud era, with a forward-looking vision and an advanced design philosophy. Designed by talented Google engineers on the basis of years of experience with the internal Borg system, it developed rapidly after its launch;
Later, outstanding engineers from Red Hat, IBM, Microsoft, VMware, Alibaba Cloud, and elsewhere invested heavily in it and built a thriving community and ecosystem, making it the first choice for enterprise container orchestration.
The Alibaba economy has many subsidiaries, and most of them arrive with their own container orchestration system. When they integrate into Alibaba's infrastructure, Kubernetes is the most standard solution and the one most readily accepted by customers inside and outside the economy.
- Second, the declarative API design that Kubernetes advocates matches the lessons Alibaba has learned in application O&M;
Traditional O&M systems are usually designed around procedures. With long system call chains, a procedural O&M system tends to become inefficient because exception handling is so complex.
Handling the many complex states in a large-scale O&M system is another major problem. A procedural design struggles to keep the system consistent, and handling these edge-case exceptions makes the O&M system ever more complex, until exceptions can only be resolved manually by O&M staff. In short, a procedural O&M system has difficulty managing applications at large scale. The declarative API that Kubernetes provides is a remedy for managing application O&M state transitions and a best-practice principle for improving end-to-end efficiency across the O&M stack.
- Third, the modular, extensible architecture of Kubernetes meets Alibaba's need for customization to support many business O&M scenarios.
Alibaba runs a large number of stateless core e-commerce systems, many stateful middleware systems such as caches and message queues, many retrieval systems that hold inverted index data, and a large number of AI computing tasks. Different application types place different requirements on the underlying container management platform.
A modular design that makes it easy to plug in custom application management strategies, together with an easily extended scheduling model, is therefore essential. It is the key to serving Alibaba's many internal application types on one unified container management infrastructure. Kubernetes largely provides these foundational capabilities, although many practical problems still arise in real-world use.
K8s applications at Alibaba
During Double 11 in 2019, Alibaba's core internal businesses ran mainly on three types of infrastructure resources: Shenlong, ECS, and ECI. All of them are managed uniformly through Kubernetes and offered to upper-layer applications as containers, which is how the core business was supported.
Unlike previous Double 11 events, this year's core e-commerce applications were deployed at scale on Shenlong bare metal servers. Readers who follow Alibaba Cloud technology will be familiar with the Shenlong server, a new generation of cloud server developed in-house by Alibaba Cloud. Through “software and hardware integration” it offloads the overhead of cloud virtualization to a low-cost hardware board, fully frees up CPU computing power, and for the first time achieves cloud virtualization with truly “zero” overhead.
Containers are themselves a lightweight virtualization solution. The combination of Shenlong, containers, and Kubernetes is the best match for the cloud-native era. It supported the largest Double 11 to date and will be a mainstream technical form in the future.
Alibaba also continues to use ECS as an underlying resource supply for Kubernetes. ECS, the traditional form of cloud virtualization, supports some of the group's internal businesses. It is combined with Elastic Container Instance (ECI) to absorb sudden traffic peaks, bringing the elasticity of the cloud to the business: resources are requested and released on demand, and the cost of planning resources ahead of time falls.
These resources are spread across hundreds of thousands of nodes at home and abroad, hosted by dozens of Kubernetes clusters, and run tens of thousands of Alibaba applications in more than a million containers, a scale without precedent. During this year's Double 11, Alibaba's largest Kubernetes cluster reached the order of 10,000 nodes. That is not a technical limit of Kubernetes; it reflects a balance we chose between data center resource efficiency and infrastructure disaster recovery. The number may grow in the future if needed.
Cloud-native transformation practice based on K8s
Kubernetes, as the representative of cloud-native technology, has become the de facto standard for container orchestration. Alibaba began exploring it in 2017 and in 2018 switched to using Kubernetes to manage production containers.
In rolling out K8s, we faced two major problems:
- First, the many business O&M platforms in the upper layer;
To support Alibaba's diverse business forms, several typical business O&M platforms had been developed internally. Their infrastructure, process control, and application release strategies all differ to some degree, and there was no unified application O&M standard. Upgrading the whole O&M system as scheduling and cluster management evolved, while keeping multiple business platforms and the businesses on them stable, was a huge engineering effort.
- Second, as the Alibaba economy moved fully to the cloud, the whole underlying infrastructure, including storage, networking, and basic O&M software, was also evolving rapidly. Scheduling and cluster management had to support that evolution and iterate on their own architecture while keeping the business stable.
The K8s-based cloud-native transformation was born in this context. By 2019 Kubernetes had been deployed internally at large scale, and all core businesses were running under the management of K8s clusters. Throughout these years of practice, one question stayed on engineers' minds. At Alibaba's scale and business complexity, with many entrenched O&M habits and an O&M system built to support them, rolling out Kubernetes is, in an internal metaphor, like changing the engine of an airplane in flight. What should we insist on, where should we compromise, and what must change?
In this section we share some of our thinking on this question from recent years. After the test of this year's Double 11 in particular, the answer has been broadly accepted by our engineers.
The architects responsible for the top-level design can finally breathe a sigh of relief: embracing Kubernetes is not an end in itself. The greatest value of this technical change lies in using Kubernetes to drive the cloud-native transformation of the business, in using its capabilities to cure the chronic ailments of the traditional O&M system, in truly unleashing the elasticity of the cloud, and in speeding up the delivery of business applications.
End-state-oriented upgrade
In the traditional O&M system, an application change is made by O&M staff creating a work order that starts a workflow, which then changes containers on the container platform batch by batch. For example, to upgrade a service with 3,000 instances, the work order computes multiple batches of subtasks in advance and calls the container platform's interface one by one to complete the change.
To keep the release work order running smoothly, the release of each container within each subtask is itself a workflow, including toggling monitoring, pulling the image, stopping and starting the container, registering the service, and pushing configuration. If all goes well, the process proceeds as expected.
In large-scale releases, however, failures such as host crashes, disk errors, I/O errors, network errors, and kernel errors are almost inevitable. When one step in the release process fails, the O&M platform usually has to retry according to some policy until the batch's timeout threshold is exceeded. This causes three problems, which we discuss in turn.
- The first is the inefficiency of retries;
The execution time of each subtask is dragged out by the long tail of releases within it. Suppose the 3,000 containers are split into 30 batches of 100 each (an illustration only, not a best practice). Whenever a container release fails in a batch, retries extend the release time of that whole batch.
- The second is the inconsistency caused by failures;
For containers whose release failed, once the work order has finished they can only be handled by out-of-band inspection. In practice that inspection depends on manual work by O&M staff, which brings high labor cost and uncertainty.
- The third is conflicts between concurrent application changes.
Suppose a scale-out request arrives while the application is being released, growing it from 3,000 to 3,200 instances. Should the 200 new instances use the old version or the new one? With the old version, it is unclear who is ultimately responsible for upgrading those 200 instances. With the new version, stability is at risk: if the new version has problems, the newly added instances make the impact larger.
Because of these complexities, most O&M systems simply reject concurrent application changes, which makes concurrent operations very inefficient.
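The long-tail effect in the first problem can be made concrete with a little arithmetic. The sketch below assumes that batches run one after another and that a batch finishes only when its slowest container does; the two durations are made-up numbers for illustration, not measurements from Alibaba.

```python
def release_time(batches):
    """Total time when batches run sequentially and each batch waits
    for its slowest container."""
    return sum(max(batch) for batch in batches)

NORMAL_SECONDS = 60   # assumed duration of a healthy container release
RETRY_SECONDS = 600   # assumed duration when a container is retried until timeout

# 3,000 containers in 30 batches of 100, all healthy.
healthy = [[NORMAL_SECONDS] * 100 for _ in range(30)]
# The same release with a single failing container in every batch.
long_tail = [[NORMAL_SECONDS] * 99 + [RETRY_SECONDS] for _ in range(30)]

healthy_total = release_time(healthy)
long_tail_total = release_time(long_tail)
print(healthy_total, long_tail_total)
```

Under these assumptions, 1% of the containers failing is enough to make every batch as slow as its retry timeout.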
The declarative API that K8s provides for application management solves all three problems at once. The user only describes the desired end state and the constraints to observe on the way there; all the complex operations needed to reach the end state are left to K8s.
During an application release, K8s typically limits the impact on the service by controlling concurrency and the maximum number of unavailable instances, and instances that fail during the release are resolved inside the system in an eventually consistent way. With this design, a user who initiates a service change only updates the application's desired state and does not need to wait for any task to finish. This addresses release efficiency, the consistency of online configuration, and concurrent change conflicts together.
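The idea of converging on a desired state can be illustrated with a toy reconcile loop. This is a simplified sketch, not how Kubernetes controllers are implemented: the `App` class, the `reconcile` function, and the rule that newly scaled-out instances start directly on the desired version are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class App:
    desired_replicas: int
    desired_version: str
    max_unavailable: int                            # constraint on each step
    instances: list = field(default_factory=list)   # version of each running instance

def reconcile(app):
    """Move the actual state one step toward the desired state.
    Returns True once this pass has brought the app to its end state."""
    # Scale: add or remove instances to match the desired replica count.
    while len(app.instances) < app.desired_replicas:
        app.instances.append(app.desired_version)
    del app.instances[app.desired_replicas:]
    # Upgrade: touch at most max_unavailable outdated instances per pass.
    outdated = [i for i, v in enumerate(app.instances) if v != app.desired_version]
    for i in outdated[:app.max_unavailable]:
        app.instances[i] = app.desired_version
    return len(outdated) <= app.max_unavailable

app = App(desired_replicas=6, desired_version="v1", max_unavailable=2,
          instances=["v1"] * 6)
app.desired_version = "v2"     # the user only edits the desired state
passes = 1
done = reconcile(app)
app.desired_replicas = 8       # a concurrent scale-out is just another edit
while not done:
    done = reconcile(app)
    passes += 1
print(passes, app.instances)
```

The release and the scale-out never conflict here because neither is a task with its own lifetime; both are edits to the same desired state, and each pass simply works from whatever that state currently says.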
Building on this end-state-oriented approach to application management, we developed the Advanced StatefulSet application management model which, as the name suggests, extends the official Kubernetes StatefulSet.
In the official workload model, an application is upgraded by a rolling update: new Pods are created and old-version Pods are deleted until the whole application has switched to the new version.
This approach is simple and direct, but it has efficiency problems. Every Pod of the application has to be rescheduled, which puts heavy pressure on the scheduler in large-scale releases. Because each new-version Pod is created from scratch, it must be assigned a new IP and have its remote volumes mounted again, which is a significant challenge for the cloud network and storage infrastructure. And because the containers are newly scheduled, the new application image has to be downloaded onto the machine again, which greatly reduces release efficiency.
To improve release efficiency and resource determinism, we developed this workload model. It supports in-place releases, in which an application's location stays the same before and after the release, and it offers rich release strategies such as concurrent updates and pausing on errors, which meet the release needs of Alibaba's internal e-commerce applications. Because the location does not change, the container image to be released can be downloaded and decompressed in advance during the gray (canary) release phase, which greatly improves release efficiency.
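The difference between recreating a Pod and updating it in place can be sketched with a small data model. This is only an illustration of the idea, not the Advanced StatefulSet implementation; the `Pod` class and the helper functions are hypothetical.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Pod:
    name: str
    node: str
    ip: str
    image: str

def recreate(pod, new_image, schedule, allocate_ip):
    """Rolling-update style: the old Pod is deleted and a new one is
    scheduled again and given a new IP."""
    return Pod(pod.name, schedule(), allocate_ip(), new_image)

def update_in_place(pod, new_image):
    """In-place style: only the image changes; node and IP are kept."""
    return replace(pod, image=new_image)

def prepull_nodes(pods):
    """Nodes that can fetch the new image ahead of time, which is only
    knowable because in-place updates keep every Pod where it is."""
    return sorted({p.node for p in pods})

pods = [Pod("web-0", "node-a", "10.0.0.1", "web:v1"),
        Pod("web-1", "node-b", "10.0.0.2", "web:v1")]
upgraded = [update_in_place(p, "web:v2") for p in pods]
recreated = recreate(pods[0], "web:v2",
                     schedule=lambda: "node-c",
                     allocate_ip=lambda: "10.0.0.9")
print(prepull_nodes(pods), upgraded[0], recreated)
```

In the in-place path there is no call to a scheduler or an IP allocator at all, which is where the savings described above come from.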
In end-state-oriented application management, the complex O&M process is implemented inside K8s. K8s computes the actions to perform from the user's desired state and the current state, and applies changes step by step until the end state is reached. This brings a large gain in O&M efficiency, but it also places higher demands on the system's engineering architecture.
K8s is internally a modular, distributed system. The O&M decisions that lead to the end state are spread across multiple internal modules, any of which may initiate O&M actions on containers: controllers, O&M Operators, reschedulers, and even the kubelet. In a highly automated system, an unexpected exception can have disastrous consequences for the business running on it. Because K8s decisions are spread across many modules, system risk becomes harder to control, which places high demands on the quality of the system design.
To control the risk of the whole system, as shown in the figure above, we instrumented key behaviors at key points in the K8s system and defined rate limiting and circuit breaking strategies, so that even under extreme errors the system protects the business running on it as far as possible.
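As a rough illustration of combining rate limiting with circuit breaking around destructive actions such as Pod deletion, consider the sketch below. The `DeletionBreaker` class, its thresholds, and the manual reset are assumptions for the example; the article does not describe the actual policies Alibaba uses.

```python
class DeletionBreaker:
    """Allows at most `limit` destructive actions per `window` seconds.
    Once the limit is hit, the breaker trips and rejects every request
    until an operator resets it."""

    def __init__(self, limit, window, clock):
        self.limit = limit
        self.window = window
        self.clock = clock      # injected so the example is deterministic
        self.events = []        # timestamps of recently allowed actions
        self.tripped = False

    def allow(self):
        if self.tripped:
            return False
        now = self.clock()
        # Keep only the actions that fall inside the sliding window.
        self.events = [t for t in self.events if now - t < self.window]
        if len(self.events) >= self.limit:
            self.tripped = True
            return False
        self.events.append(now)
        return True

    def reset(self):
        self.tripped = False
        self.events.clear()

now = [0.0]
breaker = DeletionBreaker(limit=3, window=60, clock=lambda: now[0])
burst = [breaker.allow() for _ in range(5)]
now[0] = 120.0
after_window = breaker.allow()   # still rejected: the breaker stays open
breaker.reset()
after_reset = breaker.allow()
print(burst, after_window, after_reset)
```

Requiring a manual reset is a deliberately conservative choice in this sketch: a burst of deletions is treated as a sign that some module is misbehaving, so automation stops until a person has looked.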
Self-healing capability upgrade
In Alibaba's traditional O&M system, the container platform only produces resources; application startup and service discovery are carried out by the O&M platform after the container starts. This layered approach gave the O&M systems maximum freedom and helped Alibaba's container ecosystem flourish after containerization.
But the approach has a serious problem. The container scheduling platform cannot trigger container scale-out or scale-in on its own and has to coordinate in complex ways with each O&M platform, while the upper-layer O&M systems have to be aware of the underlying infrastructure. This led to a great deal of duplicated work.
In engineering practice, these complexities made the result inefficient even after careful design and heavy investment. They seriously hindered self-healing when hosts failed or restarted and when processes in a container crashed or hung, and they made application elastic scaling complex and inefficient to implement.
We solved this with the container commands and lifecycle hooks that K8s provides. The process of starting the application and checking its startup status is built into the Pod, including interaction with infrastructure such as monitoring, VIP, the service center, and the configuration center, so that the Pod unifies the lifecycle of the container and the application instance.
The container platform no longer just produces resources; it delivers services that the business can use directly. Fault self-healing can then be closed as a loop inside the K8s system, which greatly simplifies building application self-healing and automatic elastic scaling. More efficient self-healing in turn gives the business better runtime stability and O&M efficiency.
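A Pod that carries the application lifecycle might look roughly like the manifest built below. The fields used (`command`, `lifecycle.postStart`, `lifecycle.preStop`, `readinessProbe`) are standard Kubernetes Pod fields, but the script paths, port, health endpoint, and image name are hypothetical placeholders, and the article does not show Alibaba's real manifests.

```python
import json

def build_pod(name, image):
    """Build a Pod manifest (as a dict) whose lifecycle covers the
    application, not just the container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [{
                "name": "app",
                "image": image,
                # Container command: start the application itself
                # (hypothetical script path).
                "command": ["/app/bin/start.sh"],
                "lifecycle": {
                    # Runs right after the container is created, for example
                    # to register with a service center (hypothetical script).
                    "postStart": {"exec": {"command": ["/app/bin/register.sh"]}},
                    # Runs before the container is stopped, for example to
                    # deregister first (hypothetical script).
                    "preStop": {"exec": {"command": ["/app/bin/deregister.sh"]}},
                },
                # The Pod is marked ready only once the application itself
                # answers its health check.
                "readinessProbe": {
                    "httpGet": {"path": "/health", "port": 8080},
                    "periodSeconds": 5,
                },
            }],
        },
    }

pod = build_pod("web", "registry.example.com/web:v2")
print(json.dumps(pod, indent=2))
```

With startup, registration, and health checking declared on the Pod, a replacement Pod created after a failure goes through the same steps without any O&M platform driving it.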
With the lifecycle of containers and application instances unified, we are building a unified controller programming framework: Operator Platform.
Operator Platform consists of a central control component, a sidecar framework container, and client code. It abstracts common controller capabilities, including event notification, gray release management, version control, caching, and command pipelines, and supports writing operators in multiple languages. Developers do not need to understand the many K8s interface details and error handling paths, which lowers the difficulty of building operator-based automated O&M. More and more O&M capabilities can then be captured as operators in the K8s ecosystem, more stateful applications can be deployed automatically, and the whole O&M system becomes more efficient.
In this way we built a system for self-healing from whole-machine failures that efficiently chains together machine locking, application eviction, taking the machine offline, and fault repair, which safeguards the online rate of cluster hosts and the availability of the services on them. In the future we expect standardized ways of writing operators to promote reuse of basic O&M capabilities across O&M platforms and to reduce the cost of duplicated construction.
The third important capability upgrade is immutable infrastructure.
Docker provides a unified form of application delivery: the application binary, configuration, and dependency files are packaged into an image at build time, so that an application is built once and delivered to multiple environments, avoiding many problems caused by inconsistent environments.
K8s goes a step further by assembling Docker containers with different purposes into a Pod. Upgrading a Pod normally means destroying and rebuilding it, which guarantees that the application's image, volumes, and resource specification stay consistent. In rolling out K8s we adhered to the design concept of immutable infrastructure: using the K8s Pod, we split the application and the basic O&M components that used to run together in a rich container into separate containers, and we upgrade the application by upgrading the container image.
One point needs clarifying: using K8s does not by itself mean practicing immutable infrastructure. Application O&M has to be done through image upgrades rather than by publishing files dynamically, yet for historical reasons dynamic file publishing remains common in the industry.
We do differ from K8s in one respect: we do not strictly insist that the Pod is immutable. We take a compromise and insist that the container is immutable.
The reason is that once the application container and the O&M infrastructure containers are separated, the O&M containers, which act as sidecars of the application container, follow a different version iteration strategy. Application containers are released by application O&M staff, with strategies that vary by application; for example, e-commerce applications are managed with StatefulSet while local-life applications use Deployment. Infrastructure containers are released by infrastructure O&M staff, and their release strategy differs from that of the application.
To solve this problem we developed SidecarSet, a management model for infrastructure containers. It uses one set to manage the O&M containers of many applications and decouples infrastructure changes from application container changes, which supports the rapid evolution of infrastructure. Once the definition of infrastructure containers is pulled out of the application Pod, application administrators no longer need to care about the startup parameters of the infrastructure containers. Infrastructure O&M staff configure a SidecarSet, and the O&M containers are injected into applications automatically, which simplifies application O&M.
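The injection step can be pictured as a label match followed by a merge of container lists. The sketch below is a simplified model with plain dicts and hypothetical names (`inject_sidecars`, the `log-agent` and `monitor-agent` containers); it is not the SidecarSet controller's actual code.

```python
import copy

def matches(selector, labels):
    """True when every key/value in the selector is present in the labels."""
    return all(labels.get(k) == v for k, v in selector.items())

def inject_sidecars(pod, sidecar_sets):
    """Return a copy of the pod with the containers of every matching
    sidecar set appended; the application's own definition is untouched."""
    result = copy.deepcopy(pod)
    existing = {c["name"] for c in result["containers"]}
    for sidecar_set in sidecar_sets:
        if not matches(sidecar_set["selector"], result["labels"]):
            continue
        for container in sidecar_set["containers"]:
            if container["name"] not in existing:
                result["containers"].append(copy.deepcopy(container))
                existing.add(container["name"])
    return result

# Owned by infrastructure O&M staff (hypothetical sets).
sidecar_sets = [
    {"selector": {"tier": "ecommerce"},
     "containers": [{"name": "log-agent", "image": "infra/log-agent:1.4"}]},
    {"selector": {"tier": "search"},
     "containers": [{"name": "monitor-agent", "image": "infra/monitor:2.0"}]},
]
# Owned by application O&M staff: no O&M containers are declared here.
app_pod = {"labels": {"app": "cart", "tier": "ecommerce"},
           "containers": [{"name": "app", "image": "shop/cart:v7"}]}

injected = inject_sidecars(app_pod, sidecar_sets)
print([c["name"] for c in injected["containers"]])
```

Upgrading the log agent then means editing one sidecar set rather than every application that uses it.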
This separation of concerns lightens the load on both application O&M and infrastructure O&M.
Summary and outlook
Alibaba Cloud's rollout of K8s has moved Alibaba's O&M system to cloud native and brought major breakthroughs in application release and management efficiency, service stability, and enterprise IT cost.
We have been thinking about how to bring Alibaba's application management experience to more scenarios and solve the application management problems of more customers, and about how, as enterprises move fully to the cloud, to reduce the complexity of managing applications across public cloud, private cloud, hybrid cloud, and multi-cloud.
Against this background, in November 2019 Alibaba Cloud and Microsoft jointly launched a standard specification for building and delivering cloud-native applications: the Open Application Model (OAM).
OAM proposes a general model that lets each platform expose its application deployment and O&M capabilities through a unified high-level abstraction, which addresses cross-platform application delivery. OAM also connects application developers, operators, and application infrastructure in a standardized way, making the delivery and management of cloud-native applications more consistent.
With standardized application delivery, we expect that deploying an application to the cloud will one day be as convenient and efficient as installing Taobao from an app store.
Finally, all the capability upgrades from Alibaba's cloud-native transformation described in this article have been or will be open-sourced in the OpenKruise project. You are welcome to follow the project and join the discussion!