Linux glibc memory retention ("standing guard") problems and solutions

Song Baohua 2021-06-24 05:26:36

About the author

Liu Dongyun

Graduated from Hangzhou Dianzi University in 2017 and now works at a network equipment company in Hangzhou. Since 2020 he has worked as a Linux kernel engineer, mainly responsible for operating system maintenance, with a focus on memory management.



For embedded devices, user-space memory management is a basic function. The mainstream user-space memory management libraries today include glibc, uClibc, tcmalloc, and jemalloc.

This article is based on an analysis of glibc 2.17. It discusses the principles of glibc memory allocation and the causes of the problem, and gives a solution for the case where glibc caches a large amount of memory (tens or even hundreds of GB) without releasing it.

The problem I encountered occurred on a 64-bit Linux system that uses glibc for memory management. The symptoms were as follows: the device has 32 GB of physical memory, and under heavy traffic the physical memory used by one user process surged to about 20 GB.

After the traffic stopped, the business module was observed to have freed most of its memory, but the process still occupied about 16 GB of physical memory and stayed at that level. System memory was tight, and adding other services on top would trigger OOM. A memory leak in this process had already been ruled out.


Basic principles of glibc memory allocation

glibc uses the ptmalloc memory management scheme; this article simply refers to it as glibc. glibc requests memory from arenas (allocation areas), which are divided into the main arena and non-main arenas. Each arena has a lock, which must be acquired before memory can be allocated from it.

Most processes are multithreaded. If there were only one arena, threads requesting memory at the same time would all contend for it, and efficiency would be too low.

To support multithreaded allocation and freeing, glibc creates a number of arenas based on the CPU count and assigns them to threads. If there are many threads, several threads will still compete for one arena; that case is not discussed here.

Basic principle of allocation: when the user calls malloc, glibc first checks whether it already holds cached (free) memory. If so, it serves the request from the cache and returns a block of the requested size.

If there is no cache, or the cache is insufficient, glibc requests memory from the operating system (via brk or mmap) and then carves off a block for the user, as shown in Figure 1.

Figure 1

Basic principle of freeing: when the business module calls free, glibc checks the state of the chunks immediately before and after the freed chunk in the virtual address space (fast bins excepted). If the previous chunk is free, the freed chunk is merged with it. If the next chunk is free, it is merged with that one as well. See Figure 2.

If the next chunk is the top chunk (which is always free), glibc checks whether the size of the top chunk exceeds a threshold and, if it does, releases memory to the OS, as shown in Figure 3.

Figure 2

Figure 3
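The merge-on-free rule can be made concrete with a toy model. The sketch below is not glibc code: the `toy_chunk` type and the `toy_free` and `toy_should_trim` functions are hypothetical names invented for illustration, and the last array element simply stands in for the top chunk. It only shows how a freed chunk merges with free neighbours and how the trim decision depends solely on the size of the final free chunk.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a heap: chunks laid out in address order.
 * Illustration only, not glibc's real chunk layout. */
typedef struct {
    size_t size;   /* chunk size in bytes */
    int in_use;    /* 1 = allocated, 0 = free */
} toy_chunk;

/* Free chunk `idx` and merge it with free neighbours.
 * Returns the new number of chunks in the array. */
size_t toy_free(toy_chunk *c, size_t n, size_t idx)
{
    assert(idx < n && c[idx].in_use);
    c[idx].in_use = 0;

    /* Merge with the next chunk if it is free. */
    if (idx + 1 < n && !c[idx + 1].in_use) {
        c[idx].size += c[idx + 1].size;
        for (size_t i = idx + 1; i + 1 < n; i++)
            c[i] = c[i + 1];
        n--;
    }
    /* Merge with the previous chunk if it is free. */
    if (idx > 0 && !c[idx - 1].in_use) {
        c[idx - 1].size += c[idx].size;
        for (size_t i = idx; i + 1 < n; i++)
            c[i] = c[i + 1];
        n--;
    }
    return n;
}

/* The last chunk plays the role of the top chunk. Trimming is
 * considered only when it is free and larger than the threshold. */
int toy_should_trim(const toy_chunk *c, size_t n, size_t threshold)
{
    return n > 0 && !c[n - 1].in_use && c[n - 1].size > threshold;
}
```

With three 64 KB chunks in use followed by a 32 KB top chunk and a 128 KB threshold, freeing the first chunk changes nothing at the top. Trimming becomes possible only once the chunks next to the top have been freed and merged into it.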


glibc memory retention and its causes

The concept of memory retention:

Memory retention (literally "memory standing guard") means that glibc obtains memory from the OS and hands it to the business module, the business module later frees it, but glibc does not release the free memory back to the OS. In other words, a large amount of idle memory cannot be returned to the system.

The cause of memory retention:

glibc was designed on the assumption that memory has a short life cycle. By design, memory is returned to the OS only when the size of the top chunk exceeds a threshold, at which point part of the top chunk is released. As long as the top chunk does not exceed the threshold, no memory is released to the OS.

Here is the problem: if the chunk adjacent to the top chunk stays in use, the top chunk will never exceed the threshold. Even if the business module frees a large amount of memory, tens or even hundreds of GB, glibc cannot return it to the OS.

glibc distinguishes between the main arena and non-main arenas. The main arena grows through sbrk. A non-main arena simulates the main arena with one or more mmap'd memory blocks (heaps) linked together in a list. To explain memory retention more clearly, Figure 4 shows an example for the main arena.

Figure 4

Blocks (a), (c), (e), and (g) are in use, so the free blocks (b), (d), and (f) cannot be joined with the top chunk into one larger free block. The top chunk therefore stays below glibc's trim threshold (128 KB by default on 64-bit systems), and although there is nearly 130 MB of free memory, none of it can be returned to the OS.
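The gap between "free" and "returnable" in this layout can be checked with a few lines of arithmetic. The following is a minimal sketch, not glibc code: the `arena_block` type, both functions, and all block sizes are assumptions chosen so that the free blocks add up to about 130 MB; the figure itself does not give sizes.

```c
#include <assert.h>
#include <stddef.h>

/* One block of the main arena, in address order (illustrative only). */
typedef struct {
    size_t size;
    int in_use;
} arena_block;

/* All free bytes in the arena, wherever they sit. */
size_t total_free_bytes(const arena_block *b, size_t n)
{
    size_t sum = 0;
    for (size_t i = 0; i < n; i++)
        if (!b[i].in_use)
            sum += b[i].size;
    return sum;
}

/* Bytes a top-chunk-only trim could give back: the free run at the end
 * of the arena, and only when it exceeds the threshold. Real glibc also
 * keeps some padding and rounds to pages; that is omitted here. */
size_t returnable_bytes(const arena_block *b, size_t n, size_t threshold)
{
    assert(b != NULL || n == 0);
    size_t tail = 0;
    while (n > 0 && !b[n - 1].in_use) {
        tail += b[n - 1].size;
        n--;
    }
    return tail > threshold ? tail : 0;
}

/* Hypothetical sizes for blocks (a)..(g) plus the top chunk. */
#define KB ((size_t)1024)
#define MB (1024 * KB)
const arena_block figure4_layout[] = {
    {  4 * KB, 1 },  /* (a) in use */
    { 40 * MB, 0 },  /* (b) free   */
    {  4 * KB, 1 },  /* (c) in use */
    { 50 * MB, 0 },  /* (d) free   */
    {  4 * KB, 1 },  /* (e) in use */
    { 40 * MB, 0 },  /* (f) free   */
    {  4 * KB, 1 },  /* (g) in use */
    { 64 * KB, 0 },  /* top chunk  */
};
```

Under these assumed sizes the arena holds a little over 130 MB of free memory, yet `returnable_bytes` with a 128 KB threshold reports zero, because only the 64 KB top chunk lies after the last in-use block.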

Next, consider memory retention in a non-main arena, as shown in Figure 5. A real non-main arena may have many heaps; here we assume there are only four.

Figure 5

While tracking down the problem, my colleagues and I discussed many times how to solve the retention problem. In one discussion Deng Hongjie proposed reducing the heap size (similar to what tcmalloc does). Testing showed that this alone had no effect, but it was an inspiration for the eventual solution.

Later, while reading the code, I found that this behavior is glibc's original mechanism. When examining the memory layout I also observed that a large number of heaps in non-main arenas were entirely free. The original mechanism releases the last heap (heap3) first. If any memory in heap3 is still in use, nothing can be released to the system, even if all the memory in heap0, heap1, and heap2 is free.
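The release order of the original mechanism can be sketched as a small model. This is an illustration under simplifying assumptions, not glibc's heap_trim: the function name `original_releasable` and the `used` array (bytes still in use per heap) are hypothetical, and the extra size checks that the real code performs are left out.

```c
#include <assert.h>
#include <stddef.h>

/* used[i] is the number of bytes still in use in heap i of one non-main
 * arena; heap 0 is the first heap, heap n-1 holds the top chunk. */

/* Count the heaps the original mechanism can release: it walks back from
 * the last heap and stops at the first heap that still has memory in use.
 * Heap 0 is never released because it holds the arena's own state. */
size_t original_releasable(const size_t *used, size_t n)
{
    assert(used != NULL || n == 0);
    size_t released = 0;
    while (n > 1 && used[n - 1] == 0) {
        released++;
        n--;
    }
    return released;
}
```

For the four-heap case described above, `used = {0, 0, 0, 4096}` yields zero releasable heaps: a single 4 KB allocation in heap3 pins the three free heaps in front of it.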

glibc has multiple arenas. With hundreds of MB of free memory in each arena, it is no surprise that the whole process occupies tens of GB.


glibc memory retention: solution and patch

Memory release follows different paths for the main arena and for non-main arenas. On our 64-bit system the process memory model is the classic layout, with the stack growing from high addresses to low addresses.

I have not encountered memory retention in the main arena. If it does occur there, one option is to call madvise on the page-aligned part of the free memory, though the effect may not be significant.
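As a rough illustration of the madvise idea, the sketch below shrinks a free range inward to whole pages and advises the kernel that those pages are not needed. It is an assumption-laden example, not part of the patch: `whole_pages_inside` and `drop_free_pages` are hypothetical helpers, and the caller is assumed to own the range and to know that nothing in it is live. Inside glibc the equivalent step would have to run on a free chunk with the arena lock held.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Shrink [start, start + len) inward to whole pages. Stores the start of
 * the page-aligned sub-range in *aligned and returns its length, or 0 if
 * the range contains no whole page. pagesz must be a power of two. */
size_t whole_pages_inside(uintptr_t start, size_t len, size_t pagesz,
                          uintptr_t *aligned)
{
    assert(pagesz != 0 && (pagesz & (pagesz - 1)) == 0);
    uintptr_t mask = ~(uintptr_t)(pagesz - 1);
    uintptr_t first = (start + pagesz - 1) & mask;  /* round start up */
    uintptr_t last = (start + len) & mask;          /* round end down */
    *aligned = first;
    return last > first ? (size_t)(last - first) : 0;
}

/* Advise the kernel that the whole pages inside a free range are not
 * needed. Returns the number of bytes advised, or 0 if nothing was done. */
size_t drop_free_pages(void *start, size_t len)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    uintptr_t first;
    size_t n = whole_pages_inside((uintptr_t)start, len, pagesz, &first);
    if (n == 0 || madvise((void *)first, n, MADV_DONTNEED) != 0)
        return 0;
    return n;
}
```

Only whole pages can be dropped, so a free range that is smaller than a page or straddles a page boundary without covering a full page yields nothing. This is one reason the effect may be limited when free memory is fragmented into small pieces.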

The other option is to create a thread and move the main thread's work into it. The main arena then no longer accumulates retained memory; the retention moves to a non-main arena, which is the main battlefield for the optimization that follows.

Two optimizations were made for non-main arenas:

a) If heap0, heap1, and heap2 are free, heap1 and heap2 can be released;

b) The default heap size is 64 MB; reduce the size of each heap (I set it to 512 KB).

Figure 6

Here is why heap0 and the last heap, heap3, are not released. The layout is shown in Figure 7: the heap on the left is the first heap, heap0, and the heap on the right is the last heap, heap3.

The figure shows that releasing heap0 would also release the struct malloc_state structure, which would crash the process. The heap on the right has memory in use, so it cannot be released. If all the memory in heap3 has been freed, glibc's native code handles it and the patch does nothing more.
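The release rule for non-main arenas described here can be summarized in a small model. It is a sketch only: `patched_can_release`, `patched_releasable`, and the `used` array (bytes still in use per heap) are hypothetical names, and the real patch works on heap_info structures and chunk headers rather than on a byte count.

```c
#include <assert.h>
#include <stddef.h>

/* used[i] is the number of bytes still in use in heap i of one non-main
 * arena; heap 0 is the first heap, heap n-1 is the last heap. */

/* A heap may be released on its own when it is completely free, is not
 * heap 0 (which holds struct malloc_state), and is not the last heap
 * (which is left to glibc's native trimming code). */
int patched_can_release(const size_t *used, size_t n, size_t i)
{
    assert(used != NULL && i < n);
    return i > 0 && i + 1 < n && used[i] == 0;
}

/* Number of heaps the rule above would hand back to the system. */
size_t patched_releasable(const size_t *used, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (patched_can_release(used, n, i))
            count++;
    return count;
}
```

With `used = {0, 0, 0, 4096}`, heap1 and heap2 qualify, while heap0 and heap3 are skipped, which matches optimization a).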

Figure 7

After modifying the glibc source to optimize its release mechanism, we ran a real traffic test.

After traffic reached its peak, the process was using 20 GB of memory. Within a few seconds of the traffic stopping, usage fell back to the pre-traffic level, and the memory occupied by the process was essentially all returned to the system. With that, the glibc memory retention problem was solved.

The above describes the principle of the solution on paper. Now let's look at the patch source code that implements it.

I have submitted the patch to the open source community for review. The patch submitted to the community does not change the heap size, out of caution: open source code is used in many scenarios, and users can decide the heap size themselves if needed.

The patch is based on the glibc 2.17 code:

Index: arena.c
===================================================================
--- arena.c (revision 2)
+++ arena.c (working copy)
@@ -652,7 +652,7 @@
 
 static int
 internal_function
-heap_trim(heap_info *heap, size_t pad)
+heap_trim(heap_info *heap, heap_info* free_heap, size_t pad)
 {
   mstate ar_ptr = heap->ar_ptr;
   unsigned long pagesz = GLRO(dl_pagesize);
@@ -659,7 +659,29 @@
   mchunkptr top_chunk = top(ar_ptr), p, bck, fwd;
   heap_info *prev_heap;
   long new_size, top_size, extra, prev_size, misalign;
+  heap_info *last_heap;
 
+  /* Release heap if possible */
+  last_heap = heap_for_ptr(top_chunk);
+  if ((NULL != free_heap->prev) && (last_heap != free_heap)){
+    p = chunk_at_offset(free_heap, sizeof(*free_heap));
+    if (!inuse(p)){
+      if (chunksize(p)+sizeof(*free_heap)+MINSIZE==free_heap->size){
+        while (last_heap){
+          if (last_heap->prev == free_heap){
+            last_heap->prev = free_heap->prev;
+            break;
+          }
+          last_heap = last_heap->prev;
+        }
+        ar_ptr->system_mem -= free_heap->size;
+        arena_mem -= free_heap->size;
+        unlink(p, bck, fwd);
+        delete_heap(free_heap);
+        return 1;
+      }
+    }
+  }
   /* Can this heap go away completely? */
   while(top_chunk == chunk_at_offset(heap, sizeof(*heap))) {
     prev_heap = heap->prev;
Index: malloc.c
===================================================================
--- malloc.c (revision 2)
+++ malloc.c (working copy)
@@ -915,7 +915,7 @@
 # if __WORDSIZE == 32
 # define DEFAULT_MMAP_THRESHOLD_MAX (512 * 1024)
 # else
-# define DEFAULT_MMAP_THRESHOLD_MAX (4 * 1024 * 1024 * sizeof(long))
+# define DEFAULT_MMAP_THRESHOLD_MAX (256 * 1024)
 # endif
 #endif
 
@@ -3984,7 +3984,7 @@
       heap_info *heap = heap_for_ptr(top(av));
 
       assert(heap->ar_ptr == av);
-      heap_trim(heap, mp_.top_pad);
+      heap_trim(heap, heap_for_ptr(p), mp_.top_pad);
     }
   }
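The malloc.c hunk does not mention the heap size directly, so it may help to spell out the arithmetic. In glibc's arena.c the heap size of non-main arenas is derived as HEAP_MAX_SIZE = 2 * DEFAULT_MMAP_THRESHOLD_MAX unless HEAP_MAX_SIZE is defined elsewhere. The helper functions below are hypothetical and exist only to reproduce the 64 MB and 512 KB figures, assuming sizeof(long) is 8 on the 64-bit target.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the arena.c definition:
 *   #define HEAP_MAX_SIZE (2 * DEFAULT_MMAP_THRESHOLD_MAX) */
size_t heap_max_size(size_t default_mmap_threshold_max)
{
    /* Guard against overflow when doubling. */
    assert(default_mmap_threshold_max <= (size_t)-1 / 2);
    return 2 * default_mmap_threshold_max;
}

/* Original 64-bit value: 4 MiB * sizeof(long), assuming sizeof(long) == 8. */
size_t original_threshold_max_64bit(void)
{
    return (size_t)4 * 1024 * 1024 * 8;
}

/* Patched value from the malloc.c hunk. */
size_t patched_threshold_max(void)
{
    return (size_t)256 * 1024;
}
```

The original value gives 2 * 32 MB = 64 MB per heap, and the patched value gives 2 * 256 KB = 512 KB. Note that lowering DEFAULT_MMAP_THRESHOLD_MAX also lowers the ceiling of the dynamic mmap threshold, so larger requests are more likely to be served directly by mmap.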



Different memory management approaches each have their strengths and weaknesses. For work reasons I have had the chance to study the memory management of glibc, tcmalloc, and uClibc. This article discussed a common problem in glibc memory management and gave a feasible solution.

For the memory retention problem, the usual practice is for the user to cache memory that will not be released for a long time. Another is simply to replace glibc with tcmalloc. Because tcmalloc's spans are relatively small, the probability of retention is very low, and even when it happens only a span's worth of memory is retained. If for some reason tcmalloc cannot be used in place of glibc, the solution above can be tried. This problem troubled us for a long time, and it took a great deal of time and effort to track down.

glibc 2.28 has the tcache feature, and in scenarios where the business process uses a lot of small allocations, memory retention is even more likely to occur. While writing this article I looked at glibc 2.33, and the open source community has not yet fixed this issue (perhaps the experts in the community consider it not a glibc problem but a matter of the user not freeing memory).

Given my limited ability, I cannot guarantee that everything said here about glibc is correct. If readers find mistakes, please do not be surprised; if convenient, you are welcome to write to me by email to discuss them.


This article was created by [Song Baohua]. Please include a link to the original when reposting. Thank you.
