In-depth analysis: a deep dive into the K8s Cluster Autoscaler module and its Huawei Cloud plug-in

Huawei cloud developer community 2020-11-09 22:20:34

Abstract: This article takes a deep dive into the architecture and code of the K8s Cluster Autoscaler module, and introduces the Huawei Cloud plug-in for K8s Cluster Autoscaler.

Background information

The Huawei Cloud plug-in for K8s Cluster Autoscaler was built while our business team (Cloud BU Application Platform) was developing a Serverless engine framework. The plug-in has since been contributed to the K8s open source community, as the picture below shows:

This article covers the following:

1. A deep dive into the architecture and code of the K8s Cluster Autoscaler module, with particular attention to the algorithms behind its core functions.

2. An introduction to the Huawei Cloud plug-in for K8s Cluster Autoscaler.

3. Some of the author's experience from taking part in the K8s open source project (for example, how to get information and help from the open source community, and what to watch out for when contributing to open source).

Let's go straight to the topic; the basic concepts of K8s are not repeated here.

What is K8s Cluster Autoscaler (CA)?

What is elastic scaling?

As the name implies, elastic scaling is a management service that automatically adjusts elastic computing resources according to users' business needs and policies. Its advantages are:

1. From the application developer's point of view: developers can focus on implementing business functions without having to think much about system-level resources.

2. From the system operator's point of view: it greatly reduces the operation and maintenance burden, and a well-designed system can even achieve "zero operations".

3. It is the cornerstone of a Serverless architecture, and also one of the main characteristics of Serverless.

Before explaining the concept of CA in detail, let's first look at the big picture: the elastic scaling methods that K8s supports (CA is just one of them).

Elastic scaling methods supported by K8s:
Note: For accuracy, each key concept below is introduced first with the official K8s definition, quoted as the authoritative reference :). The "In short" part is the author's own interpretation.

VPA (Vertical Pod Autoscaler)

A set of components that automatically adjust the amount of CPU and memory requested by Pods running in the Kubernetes Cluster. Current state - beta.

In short: for a single Pod, scale its resources up or down. (Since there are not many scenarios for it, it is not covered further here.)

HPA (Horizontal Pod Autoscaler) - Pod-level scaling

A component that scales the number of pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with beta support, on some other, application-provided metrics).

In short: for a given workload, add or remove Pods according to a preset scaling policy (for example, a threshold on CPU or memory utilization).

  • HPA scaling policy:
    HPA relies on the metrics-server component to collect metrics from Pods, and then decides whether to scale the Pods out or in according to the preset scaling policy (for example, CPU utilization greater than 50%). The CPU/memory utilization is computed as the average over all Pods. The K8s HPA documentation describes the calculation algorithm in detail.
    Note: by default, metrics-server only supports scaling policies based on the cpu and memory metrics.
  • HPA architecture diagram:

The Prometheus monitoring system and the K8s Prometheus Adapter component in the bottom half of the picture are introduced so that scaling policies can be set on custom metrics. Since that is not the focus of this article, it is not covered further here; the official K8s documentation has a walkthrough that helps you understand the module step by step in practice. If you only need cpu/memory-based scaling policies, deploying the default metrics-server component is enough (installing it is a one-time deployment on K8s and very convenient; the walkthrough contains the installation steps).
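The averaging step described above can be illustrated with a small Python sketch. It follows the formula given in the K8s HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), and skips scaling when the ratio is within a tolerance of 1.0. The function name and the sample numbers are made up for illustration, and the real controller handles more cases (missing metrics, not-yet-ready Pods, stabilization windows) that are left out here.

```python
import math
from typing import List


def desired_replicas(current_replicas: int,
                     pod_utilizations: List[float],
                     target_utilization: float,
                     tolerance: float = 0.1) -> int:
    """Return the replica count suggested by the HPA formula.

    pod_utilizations holds one utilization value (in percent) per Pod;
    the metric compared against the target is their average.
    """
    if not pod_utilizations:
        # No metrics available: leave the replica count unchanged.
        return current_replicas

    average = sum(pod_utilizations) / len(pod_utilizations)
    ratio = average / target_utilization

    # Close enough to the target: do not scale.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Four Pods averaging 85% CPU against a 50% target.
    print(desired_replicas(4, [80, 90, 100, 70], 50))
```

With the sample values, the average is 85%, the ratio is 1.7, and the suggested replica count is the ceiling of 6.8.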

CA (Cluster Autoscaler) - Node-level scaling

A component that automatically adjusts the size of a Kubernetes Cluster so that: all pods have a place to run and there are no unneeded nodes.

In short: for a K8s cluster, add or remove Nodes to scale the cluster out or in.

Kubernetes (K8s) Cluster Autoscaler (CA) module source code analysis:

With the groundwork laid, it is time to get to the main point of this article. Next, I will lift the veil on the CA module from two angles, architecture and code, and answer common questions in FAQ form.

CA overall architecture and its sub-modules

As the figure above shows, the CA module contains the following sub-modules (see the K8s CA module source code on GitHub):

  • autoscaler: the core module, containing the core Scale Up and Scale Down functions (the core package on GitHub).

1. When scaling up: its ScaleUp function calls the estimator module to evaluate the number of nodes required.

2. When scaling down: its ScaleDown function calls the simulator module to evaluate which nodes can be removed.

  • estimator: calculates how many Nodes a scale-up needs (the estimator package on GitHub).
  • simulator: simulates scheduling to work out which nodes to remove (the simulator package on GitHub).
  • expander: the algorithm that picks the most suitable Node when scaling up (the expander package on GitHub); you can add or customize your own algorithms.
  • cloudprovider: the interface module that CA offers to specific cloud providers (the cloudprovider package on GitHub). This sub-module is covered in more detail below, and it is also the extension point for the Huawei Cloud cloudprovider.

1. autoscaler talks to the specific cloud provider through this module (such as AWS and GCE in the box at the bottom right corner of the figure above) and can schedule the Nodes that each cloud provider offers.

2. cloudprovider predefines a series of interfaces for specific cloud providers to implement, so that their Nodes can be scheduled.

From this introduction to the architecture of the K8s CA module and the organization of its source code, I have summarized the following best practices that are worth learning from and can be applied in any programming language:

1. SOLID design principles are everywhere, specifically:

1. Each sub-module solves only one specific problem - the Single Responsibility Principle

2. Each sub-module reserves an extension point - the Open/Closed Principle

3. The interfaces of the sub-modules are cleanly isolated - the Interface Segregation Principle

2. Clear organization of sub-module packages

3. Plug-in extension point design

FAQs for CA module users

1. How does CA relate to the other elastic scaling methods in K8s?

1. VPA updates the resource requests of existing Pods.

2. HPA updates the number of replicas of existing Pods.

3. If, after a scaling event, there are not enough nodes to run the Pods, CA adds new Nodes to the cluster, and the Pods that were in the Pending state are scheduled onto the new nodes.

2. When does CA adjust the size of the K8s cluster?

  1. When to scale up: when resources are insufficient and Pod scheduling fails, that is, when there are Pods stuck in the Pending state (see the flow chart on the next page), CA adds Nodes to the cluster through the cloud provider.
  2. When to scale down: when a Node's resource utilization is low and the Pods on that Node can be rescheduled onto other Nodes.

3. How often does CA check the state of Pods?
CA checks every 10s for Pods in the Pending state.

4. How to keep a Node from being removed when CA scales down? Any of the following prevents removal:

1. A Pod on the Node is restricted by a PodDisruptionBudget (see PodDisruptionBudgetSpec).

2. The Node runs Pods in the kube-system namespace.

3. A Pod on the Node would have nowhere to go after eviction, that is, no other suitable Node can schedule it.

4. The Node has the annotation "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true".

5. The Node runs a Pod with the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict": "false".

For more, see the more complete list of FAQs and answers in the CA documentation.
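The conditions in question 4 can be read as a checklist that is applied to each Node before it is removed. The Python sketch below is only an illustration of that checklist, not CA's Go implementation: the two annotation keys come from the list above, while the Node and Pod classes and the blocked_by_pdb and has_other_home flags are hypothetical stand-ins for checks that CA performs against the real cluster state. The real rules also have exceptions (for example around kube-system Pods) that are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Annotation keys quoted from the FAQ above.
SCALE_DOWN_DISABLED = "cluster-autoscaler.kubernetes.io/scale-down-disabled"
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


@dataclass
class Pod:
    name: str
    namespace: str = "default"
    annotations: Dict[str, str] = field(default_factory=dict)
    blocked_by_pdb: bool = False   # hypothetical: a PodDisruptionBudget forbids eviction
    has_other_home: bool = True    # hypothetical: some other Node could run this Pod


@dataclass
class Node:
    name: str
    annotations: Dict[str, str] = field(default_factory=dict)
    pods: List[Pod] = field(default_factory=list)


def scale_down_blockers(node: Node) -> List[str]:
    """Return the reasons why this Node should not be removed (empty if none)."""
    reasons: List[str] = []
    if node.annotations.get(SCALE_DOWN_DISABLED) == "true":
        reasons.append("node: scale-down-disabled annotation")
    for pod in node.pods:
        if pod.blocked_by_pdb:
            reasons.append(f"{pod.name}: restricted by PodDisruptionBudget")
        if pod.namespace == "kube-system":
            reasons.append(f"{pod.name}: kube-system pod")
        if not pod.has_other_home:
            reasons.append(f"{pod.name}: no other node can schedule it")
        if pod.annotations.get(SAFE_TO_EVICT) == "false":
            reasons.append(f"{pod.name}: safe-to-evict is false")
    return reasons


if __name__ == "__main__":
    node = Node("node-1", pods=[Pod("dns", namespace="kube-system")])
    print(scale_down_blockers(node))
```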

CA module source code analysis

For reasons of space, only the core sub-modules are covered in depth; the other sub-modules are introduced through how the core ones coordinate with them.

CA module overall entry point

Program entry point: kubernetes/autoscaler/cluster-autoscaler/main.go

The autoscaler sub-module of CA

As the figure above shows, autoscaler.go is the interface and static_autoscaler.go is its default implementation, which calls the ScaleDown and ScaleUp functions in scale_down.go and scale_up.go to carry out scale-down and scale-up.

So the question is: when are the ScaleUp and ScaleDown methods called? Let's go through it step by step. Back at the CA entry point, there is a RunOnce method (in static_autoscaler.go, the default implementation of the autoscaler interface). A loop keeps running it to watch for Pods in the Pending state (i.e. Pods that need a Node), as the following code snippet (the RunOnce function in static_autoscaler.go) shows. Note that several if/else checks decide whether the conditions are met before ScaleUp is actually called:
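The control flow described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Go code in static_autoscaler.go: the function names, the scale_up_allowed flag (standing in for the if/else checks) and the return strings are all assumptions, and the sketch makes the simplifying assumption that scale-down is only attempted in iterations with no Pending Pods. The 10-second default interval matches the check frequency mentioned in the FAQ.

```python
import time
from typing import Callable, List


def run_once(list_pending_pods: Callable[[], List[str]],
             scale_up: Callable[[List[str]], bool],
             scale_down: Callable[[], bool],
             scale_up_allowed: bool = True) -> str:
    """One iteration: look for Pending Pods, then scale up or down."""
    pending = list_pending_pods()
    if not pending:
        # Nothing is waiting for a Node, so consider shrinking the cluster.
        return "scaled-down" if scale_down() else "no-op"
    if not scale_up_allowed:
        # Stand-in for the precondition checks made before ScaleUp is called.
        return "scale-up-skipped"
    return "scaled-up" if scale_up(pending) else "scale-up-failed"


def run_loop(run_once_fn: Callable[[], str],
             iterations: int,
             interval_seconds: float = 10.0,
             sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Call run_once_fn repeatedly, pausing between iterations."""
    results: List[str] = []
    for i in range(iterations):
        results.append(run_once_fn())
        if i + 1 < iterations:
            sleep(interval_seconds)
    return results


if __name__ == "__main__":
    outcome = run_once(lambda: ["pod-a"], lambda pods: True, lambda: False)
    print(outcome)
```

Passing the collaborators in as callables keeps the loop testable without a cluster, which mirrors how the real module separates the loop from the scale-up and scale-down logic.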

The ScaleDown function is called in the same way, also inside the RunOnce function. The main ScaleDown logic follows these steps:

1. Identify potentially under-utilized Nodes (the scaleDownCandidates array variable in the code).

2. Then find "new homes" for the Pods on those Nodes (the Nodes that can take them, the podDestinations array variable in the code).

3. Then, as in the screenshot below, several if/else checks decide whether the ScaleDown conditions are met before the TryToScaleDown function is executed.
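The three steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the logic in scale_down.go: it looks at CPU only, uses a hypothetical utilization threshold of 0.5 to pick candidates, and uses a plain first-fit placement to decide whether every Pod on a candidate Node has a destination. The Node class and the plan_scale_down name are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    cpu_capacity: int                                  # millicores
    pod_requests: List[int] = field(default_factory=list)

    def used(self) -> int:
        return sum(self.pod_requests)

    def utilization(self) -> float:
        return self.used() / self.cpu_capacity


def plan_scale_down(nodes: List[Node], threshold: float = 0.5) -> List[str]:
    """Return the names of Nodes whose Pods all fit on the remaining Nodes."""
    # Free CPU per Node; updated as Pods are (virtually) moved.
    free: Dict[str, int] = {n.name: n.cpu_capacity - n.used() for n in nodes}

    # Step 1: candidates are the under-utilized Nodes, emptiest first.
    candidates = sorted((n for n in nodes if n.utilization() < threshold),
                        key=lambda n: n.utilization())

    removable: List[str] = []
    for node in candidates:
        # Step 2: destinations are all Nodes still in the cluster except this one.
        trial = {name: cpu for name, cpu in free.items() if name != node.name}
        fits = True
        for request in sorted(node.pod_requests, reverse=True):
            target = next((name for name, cpu in trial.items() if cpu >= request), None)
            if target is None:
                fits = False
                break
            trial[target] -= request
        # Step 3: only Nodes whose Pods all found a destination are removed.
        if fits:
            removable.append(node.name)
            free = trial
    return removable


if __name__ == "__main__":
    cluster = [Node("a", 4000, [3000]), Node("b", 4000, [500]), Node("c", 4000, [1000, 500])]
    print(plan_scale_down(cluster))
```

In the sample cluster, Node b is a candidate and its single Pod fits on Node a; Node c is also a candidate, but once b's Pod has moved there is no room left for c's Pods, so only b is removable.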

With the introduction and code fragments above, we now know when the ScaleUp/ScaleDown functions are called. Next, let's look at what happens inside these two core functions when they are called.
Let's look at ScaleUp first:

From the code snippet above and the comments I added to it, you can see that the following happens:

1. Through the cloudprovider sub-module (introduced separately below), get the NodeGroups that can be scaled up from the specific cloud provider.

2. Group the unschedulable Pods by their scale-up requirements (the buildPodEquivalenceGroups function call in the code above).

3. Pass all the available NodeGroups from step 1 and the Pods awaiting allocation from step 2 as input to the bin packing algorithm of the estimator sub-module (the call happens inside computeExpansionOption), which yields several candidate Pod scheduling/allocation plans. The core of the estimator sub-module is the bin packing algorithm; the figure below shows its implementation, the Estimate function. There is a small implementation trick: before the algorithm starts, calculatePodScore is called to reduce the two-dimensional problem (a Pod's demand for CPU and memory) to a one-dimensional one; the rest is the traditional bin packing algorithm, with two for loops that find a suitable Node for each Pod. For how the dimension is reduced, see the source of the calculatePodScore function in binpacking_estimator.go.

4. Pass the candidate plans from step 3 to the expander sub-module to get the optimal allocation plan (the ExpanderStrategy.BestOption function call in the code fragment). expander provides the several strategies in the screenshot below, and users can implement the BestOption function of the expander interface to define their own expander strategy.
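Steps 3 and 4 can be illustrated with a compact Python sketch. It is a simplified model, not the Estimate function itself: the score used here (CPU request divided by node CPU plus memory request divided by node memory) is my understanding of the idea behind calculatePodScore and should be checked against the source, the packing is a plain first-fit over Pods sorted by descending score, and the selection rule in best_option (most Pods placed, then fewest Nodes) is just one possible expander strategy chosen for illustration. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Pod:
    name: str
    cpu: int       # millicores
    memory: int    # MiB


@dataclass(frozen=True)
class NodeTemplate:
    group: str     # name of the node group this template belongs to
    cpu: int
    memory: int


@dataclass(frozen=True)
class Option:
    group: str
    node_count: int
    pods_placed: int


def pod_score(pod: Pod, template: NodeTemplate) -> float:
    """Collapse the two dimensions (CPU, memory) into a single number."""
    return pod.cpu / template.cpu + pod.memory / template.memory


def estimate(pods: List[Pod], template: NodeTemplate) -> Tuple[int, int]:
    """First-fit packing; returns (nodes needed, pods placed)."""
    ordered = sorted(pods, key=lambda p: pod_score(p, template), reverse=True)
    bins: List[List[int]] = []          # each bin is [free_cpu, free_memory]
    placed = 0
    for pod in ordered:                 # outer loop: Pods, biggest score first
        if pod.cpu > template.cpu or pod.memory > template.memory:
            continue                    # can never fit on this node type
        for free in bins:               # inner loop: Nodes opened so far
            if free[0] >= pod.cpu and free[1] >= pod.memory:
                free[0] -= pod.cpu
                free[1] -= pod.memory
                break
        else:
            bins.append([template.cpu - pod.cpu, template.memory - pod.memory])
        placed += 1
    return len(bins), placed


def expansion_options(pods: List[Pod], templates: List[NodeTemplate]) -> List[Option]:
    return [Option(t.group, *estimate(pods, t)) for t in templates]


def best_option(options: List[Option]) -> Optional[Option]:
    """Illustrative strategy: most Pods placed, ties broken by fewer Nodes."""
    viable = [o for o in options if o.pods_placed > 0]
    if not viable:
        return None
    return max(viable, key=lambda o: (o.pods_placed, -o.node_count))


if __name__ == "__main__":
    pending = [Pod("p1", 2000, 4096), Pod("p2", 1000, 1024),
               Pod("p3", 1000, 1024), Pod("p4", 3000, 2048)]
    groups = [NodeTemplate("small", 2000, 4096), NodeTemplate("large", 4000, 8192)]
    print(best_option(expansion_options(pending, groups)))
```

Sorting by a single score is what turns the two-dimensional packing problem into the classic one-dimensional heuristic; the fit check itself still compares CPU and memory separately.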

The cloudprovider sub-module of CA

This sub-module connects to specific cloud providers (i.e. AWS, GCP, Azure, Huawei Cloud): on the corresponding cloud platform, Nodes are added to or removed from a Node Group (called a Node Pool on some cloud platforms) to scale the cluster up or down. Its code is in the package of the same name, cloudprovider (see the code on GitHub). Each cloud provider develops its own cloudprovider plug-in by extending CA in the way K8s prescribes, as the picture below shows:
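To make the idea of "a set of interfaces that each cloud provider implements" concrete, here is a Python sketch of such a contract with a toy in-memory implementation. The method names are loosely modeled on the kind of operations the Go cloudprovider package expects (node group sizes, increasing the size, deleting nodes), but this is a reduced, hypothetical version; consult the cloudprovider package for the actual interfaces and their full method sets.

```python
from abc import ABC, abstractmethod
from typing import List


class NodeGroup(ABC):
    """A pool of identical Nodes that can grow and shrink (simplified contract)."""

    @abstractmethod
    def id(self) -> str: ...

    @abstractmethod
    def min_size(self) -> int: ...

    @abstractmethod
    def max_size(self) -> int: ...

    @abstractmethod
    def target_size(self) -> int: ...

    @abstractmethod
    def increase_size(self, delta: int) -> None: ...

    @abstractmethod
    def delete_nodes(self, node_names: List[str]) -> None: ...


class CloudProvider(ABC):
    """Entry point that the autoscaler core talks to (simplified contract)."""

    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def node_groups(self) -> List[NodeGroup]: ...


class InMemoryNodeGroup(NodeGroup):
    """Toy implementation that keeps its Nodes in a list instead of calling a cloud API."""

    def __init__(self, group_id: str, min_size: int, max_size: int, nodes: List[str]):
        self._id, self._min, self._max = group_id, min_size, max_size
        self._nodes = list(nodes)
        self._created = 0

    def id(self) -> str:
        return self._id

    def min_size(self) -> int:
        return self._min

    def max_size(self) -> int:
        return self._max

    def target_size(self) -> int:
        return len(self._nodes)

    def increase_size(self, delta: int) -> None:
        if delta <= 0:
            raise ValueError("delta must be positive")
        if len(self._nodes) + delta > self._max:
            raise ValueError("size would exceed max_size")
        for _ in range(delta):
            self._created += 1
            self._nodes.append(f"{self._id}-new-{self._created}")

    def delete_nodes(self, node_names: List[str]) -> None:
        names = set(node_names)
        if not names.issubset(self._nodes):
            raise ValueError("unknown node in request")
        if len(self._nodes) - len(names) < self._min:
            raise ValueError("size would drop below min_size")
        self._nodes = [n for n in self._nodes if n not in names]


class InMemoryCloudProvider(CloudProvider):
    def __init__(self, groups: List[NodeGroup]):
        self._groups = list(groups)

    def name(self) -> str:
        return "in-memory"

    def node_groups(self) -> List[NodeGroup]:
        return list(self._groups)
```

Because the core only depends on the abstract classes, a new provider can be added without touching the scale-up or scale-down logic, which is the extension point the article refers to.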

The following section introduces how Huawei Cloud extends this module.

Huawei Cloud cloudprovider plug-in development and open source contribution experience

How was the Huawei Cloud cloudprovider plug-in extended and developed?

The picture below shows the general code structure of the Huawei Cloud cloudprovider plug-in. The SDK in the green box covers the operations that are needed on CCE (Cloud Container Engine), namely adding and removing Nodes in a Node Pool/Group. Ideally we would not have to write this part ourselves, but because the CCE team's SDK was not complete, we developed the SDK pieces required for those CCE operations. The key part is the code in the red box:

huaweicloud_cloud_provider.go is the entry point: it reads the configuration through huaweicloud_cloud_config.go and instantiates the huaweicloud_manager.go object. The huaweicloud_manager.go object calls the CCE SDK in the blue box to obtain the overall CCE information. Once that information is available, huaweicloud_node_group.go can be called to scale the Nodes of the Node Group/Pool bound to the CCE cluster up and down, and thereby scale the Nodes of the whole CCE cluster.
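The chain described above (entry point reads the config, creates a manager, the manager talks to the CCE SDK, and node groups resize pools through the manager) can be mimicked with a toy Python model. Everything in it is hypothetical and chosen only to show the wiring: the key=value config format, the class and method names, and the in-memory FakeCCEClient that stands in for the CCE SDK. None of it reflects the plug-in's actual Go types or the real CCE API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CloudConfig:
    """Stands in for what the config file provides (hypothetical fields)."""
    cluster_id: str
    region: str


def parse_config(text: str) -> CloudConfig:
    """Parse simple 'key = value' lines (an assumed format, for illustration)."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return CloudConfig(cluster_id=values["cluster_id"], region=values["region"])


class FakeCCEClient:
    """In-memory stand-in for the CCE SDK: pool name -> node count."""

    def __init__(self, pools: Dict[str, int]):
        self._pools = dict(pools)

    def list_node_pools(self, cluster_id: str) -> Dict[str, int]:
        return dict(self._pools)

    def resize_node_pool(self, cluster_id: str, pool: str, size: int) -> None:
        self._pools[pool] = size


class Manager:
    """Plays the role of the manager object: the only layer that calls the SDK."""

    def __init__(self, config: CloudConfig, client: FakeCCEClient):
        self._config, self._client = config, client

    def pool_sizes(self) -> Dict[str, int]:
        return self._client.list_node_pools(self._config.cluster_id)

    def set_pool_size(self, pool: str, size: int) -> None:
        self._client.resize_node_pool(self._config.cluster_id, pool, size)


class PoolNodeGroup:
    """Plays the role of a node group bound to one CCE node pool."""

    def __init__(self, manager: Manager, pool: str, max_size: int):
        self._manager, self._pool, self._max_size = manager, pool, max_size

    def target_size(self) -> int:
        return self._manager.pool_sizes()[self._pool]

    def increase_size(self, delta: int) -> None:
        if delta <= 0:
            raise ValueError("delta must be positive")
        new_size = self.target_size() + delta
        if new_size > self._max_size:
            raise ValueError("size would exceed max size")
        self._manager.set_pool_size(self._pool, new_size)


def build_node_groups(config_text: str, client: FakeCCEClient, max_size: int) -> List[PoolNodeGroup]:
    """Entry point: read config, create the manager, expose one group per pool."""
    manager = Manager(parse_config(config_text), client)
    return [PoolNodeGroup(manager, pool, max_size) for pool in manager.pool_sizes()]
```

Keeping all SDK calls inside the manager means the node group code can be exercised against a fake client, as in this sketch, without access to a real CCE cluster.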

How to get the resources you need from the open source community, and what to watch out for when contributing to open source?

When I first took on the project, I was at a loss and did not know where to start. The K8s documentation on this area is not very clear. Drawing on past experience and the information in the K8s GitHub README, I joined the K8s Slack organization, found the channel of the relevant special interest group (in my case the sig-autoscaling channel), and asked my question there (see the screenshot below). Given the size of the K8s code base, it is almost impossible to change or extend it without finding the right extension point.

Key point: almost every open source community now has a Slack workspace. Join it and find the relevant interest group; there are many experts in there, and when you ask a question someone will usually be happy to answer. Mailing lists work too, but I find Slack more efficient and real-time, and strongly recommend it. For the open source projects I deal with regularly, I usually join their Slack and ask whenever I have a question. Of course, many open source projects contributed from China communicate in WeChat groups :) For example, ServiceComb, the microservice framework open-sourced by Huawei, also has WeChat groups. In short, for any open source project, be sure to find an efficient way to communicate with the community.

In addition, when contributing code that uses third-party open source code, copyright and redistribution issues mean you should avoid including the third-party source directly. If you really need it, extend it instead, and attach Huawei's copyright information and disclaimer to the newly added files.


