Linux memory management (1): physical memory initialization


Series: Linux memory management

Keywords: user/kernel space split, Node/Zone/Page, memblock, PGD/PUD/PMD/PTE, lowmem/highmem, ZONE_DMA/ZONE_NORMAL/ZONE_HIGHMEM, watermark, MIGRATE_TYPES.


Physical memory is initialized as part of Linux kernel initialization. Memory management is also the foundation of many other kernel features and is tightly coupled with many kernel modules.

Before looking at initialization, it helps to understand the overall memory management framework so that you have a general picture in mind.

First, you need to know how the address space is split between user space and kernel space (3:1 or 2:2). Initialization then proceeds level by level, Node -> Zone -> Page, until memory is available for use.

The relationship between nodes, zones and pages is introduced in Figure 2.1 of ULVMM (Understanding the Linux Virtual Memory Manager). Although the zone_mem_map layer has since been replaced, the figure still reflects the hierarchical tree relationship between them.

A pg_data_t corresponds to one node, and its node_zones member contains the different zones. Each zone in turn defines a per_cpu_pageset, which binds pages to CPUs.

1. User space and kernel space partition

On 32-bit Linux, the virtual address space is 4 GB in total. It is divided into user space and kernel space, and three splits are available:

choice
    prompt "Memory split"
    depends on MMU
    default VMSPLIT_3G
    help
      Select the desired split between kernel and user memory.

      If you are not absolutely sure what you are doing, leave this
      option alone!

    config VMSPLIT_3G
        bool "3G/1G user/kernel split"
    config VMSPLIT_2G
        bool "2G/2G user/kernel split"
    config VMSPLIT_1G
        bool "1G/3G user/kernel split"
endchoice

config PAGE_OFFSET
    hex
    default PHYS_OFFSET if !MMU
    default 0x40000000 if VMSPLIT_1G
    default 0x80000000 if VMSPLIT_2G
    default 0xC0000000
With the default selection, the generated autoconf.h contains #define CONFIG_PAGE_OFFSET 0xC0000000.

In arch/arm/include/asm/memory.h you can see that PAGE_OFFSET is the boundary between user space and kernel space. It is also the starting address of the kernel's virtual address space.

/* PAGE_OFFSET - the virtual address of the start of the kernel image */
static inline phys_addr_t __virt_to_phys(unsigned long x)
{
    return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
}

static inline unsigned long __phys_to_virt(phys_addr_t x)
{
    return x - PHYS_OFFSET + PAGE_OFFSET;
}
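As a quick check of this linear mapping, the sketch below reimplements the two helpers in user space. It assumes PAGE_OFFSET = 0xC0000000 (the default 3G/1G split) and PHYS_OFFSET = 0x60000000 (the RAM base used in this article's DTS example). The lowercase function names are illustrative only and are not kernel APIs.

```c
#include <stdint.h>

/* Assumed values: default 3G/1G split, RAM starting at 0x60000000. */
#define PAGE_OFFSET 0xC0000000u
#define PHYS_OFFSET 0x60000000u

/* Illustrative user-space copy of __virt_to_phys() using 32-bit arithmetic. */
static uint32_t virt_to_phys(uint32_t virt)
{
    return virt - PAGE_OFFSET + PHYS_OFFSET;
}

/* Illustrative user-space copy of __phys_to_virt(). */
static uint32_t phys_to_virt(uint32_t phys)
{
    return phys - PHYS_OFFSET + PAGE_OFFSET;
}
```

Under these assumptions, the kernel virtual address 0xC0000000 corresponds to physical 0x60000000, and the physical address 0x8f800000 corresponds to virtual 0xef800000, which matches the addresses that appear in the mapping sections of this article.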
2. Get the physical memory size

All subsequent initialization and memory management is based on physical memory, so the first step is to obtain its base address and size.

The physical memory properties are read from the DTS, parsed, and then added to the memblock subsystem.

memory@60000000 {
    device_type = "memory";
    reg = <0x60000000 0x40000000>;
};
Given the DTS above, the call path is start_kernel-->setup_arch-->setup_machine_fdt-->early_init_dt_scan_nodes-->of_scan_flat_dt (walks the nodes)-->early_init_dt_scan_memory (handles a single memory node).

The base and size parsed from the DTS are 0x60000000 and 0x40000000 respectively.

int __init early_init_dt_scan_memory(unsigned long node, const char *uname,
                                     int depth, void *data)
{
    /* device_type = "memory" */
    const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
    const __be32 *reg, *endp;
    int l;
    ...
    reg = of_get_flat_dt_prop(node, "linux,usable-memory", &l);
    if (reg == NULL)
        reg = of_get_flat_dt_prop(node, "reg", &l);  /* reg = <0x60000000 0x40000000> */
    if (reg == NULL)
        return 0;

    endp = reg + (l / sizeof(__be32));

    pr_debug("memory scan node %s, reg size %d, data: %x %x %x %x,\n",
        uname, l, reg[0], reg[1], reg[2], reg[3]);

    while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) {
        u64 base, size;

        base = dt_mem_next_cell(dt_root_addr_cells, &reg);  /* 0x60000000 */
        size = dt_mem_next_cell(dt_root_size_cells, &reg);  /* 0x40000000 */
        ...
        /* checks base and size for validity, then registers the range */
        early_init_dt_add_memory_arch(base, size);
    }

    return 0;
}
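To make the cell decoding concrete, here is a minimal user-space sketch of how a reg property is turned into a (base, size) pair. Device tree cells are 32-bit big-endian values. The helpers read_be32, next_cells and parse_first_reg are hypothetical names for illustration; next_cells only mirrors the idea behind dt_mem_next_cell and is not the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 32-bit cell from raw device tree bytes. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Combine `cells` consecutive 32-bit cells into one value and advance
 * *cursor past them. Only 1 or 2 cells make sense for a 64-bit result.
 */
static uint64_t next_cells(int cells, const uint8_t **cursor)
{
    uint64_t value = 0;

    for (int i = 0; i < cells; i++) {
        value = (value << 32) | read_be32(*cursor);
        *cursor += 4;
    }
    return value;
}

/* Hypothetical helper: parse the first (base, size) pair of a reg property. */
static int parse_first_reg(const uint8_t *reg, size_t len,
                           int addr_cells, int size_cells,
                           uint64_t *base, uint64_t *size)
{
    if (len < (size_t)(addr_cells + size_cells) * 4)
        return -1;              /* property too short for one pair */
    *base = next_cells(addr_cells, &reg);
    *size = next_cells(size_cells, &reg);
    return 0;
}
```

With one address cell and one size cell, the eight bytes of reg = <0x60000000 0x40000000> decode to base 0x60000000 and size 0x40000000.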
Then, using the parsed base/size, early_init_dt_add_memory_arch-->memblock_add-->memblock_add_range adds the physical memory to the memblock subsystem.

struct memblock {
    bool bottom_up;  /* is bottom up direction? */
    phys_addr_t current_limit;
    struct memblock_type memory;    /* physical memory regions are added here */
    struct memblock_type reserved;  /* reserved memory regions are added here */
    struct memblock_type physmem;
};
memblock_add adds a region to memblock.memory. Many places in the kernel initialization phase (for example arm_memblock_init) use memblock_reserve to add a region to memblock.reserved.

memblock_remove removes a region from memblock.memory, and memblock_free removes a region from memblock.reserved.

All addresses here are physical addresses, and all of this information is kept in the global variable memblock.
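The idea of a region list can be sketched in a few lines of user-space C. This is a simplified model under stated assumptions, not the kernel's memblock code: the names region, region_list and region_add are hypothetical, the array has a fixed capacity, and flags and node ids are ignored. It keeps regions sorted by base address and merges regions that touch or overlap.

```c
#include <stdint.h>

#define MAX_REGIONS 8            /* arbitrary capacity for this sketch */

struct region { uint64_t base, size; };

struct region_list {
    int cnt;
    struct region regions[MAX_REGIONS];
};

/*
 * Insert [base, base + size) keeping the array sorted by base, then merge
 * neighbours that touch or overlap. Returns 0 on success, -1 if full.
 */
static int region_add(struct region_list *list, uint64_t base, uint64_t size)
{
    int i, pos = 0;

    if (size == 0)
        return 0;
    if (list->cnt == MAX_REGIONS)
        return -1;

    /* Find the insertion point. */
    while (pos < list->cnt && list->regions[pos].base < base)
        pos++;
    /* Shift the tail up by one slot and insert the new region. */
    for (i = list->cnt; i > pos; i--)
        list->regions[i] = list->regions[i - 1];
    list->regions[pos].base = base;
    list->regions[pos].size = size;
    list->cnt++;

    /* Merge pass: fold each region into its predecessor when they meet. */
    for (i = 0; i + 1 < list->cnt; ) {
        struct region *cur = &list->regions[i];
        struct region *next = &list->regions[i + 1];
        uint64_t cur_end = cur->base + cur->size;

        if (next->base <= cur_end) {
            uint64_t next_end = next->base + next->size;
            int j;

            if (next_end > cur_end)
                cur->size = next_end - cur->base;
            for (j = i + 1; j + 1 < list->cnt; j++)
                list->regions[j] = list->regions[j + 1];
            list->cnt--;
        } else {
            i++;
        }
    }
    return 0;
}
```

A "memory" list and a "reserved" list would simply be two separate region_list instances, in the same way that memblock.memory and memblock.reserved are two separate memblock_type members.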

int __init_memblock memblock_add_range(struct memblock_type *type,
                phys_addr_t base, phys_addr_t size,
                int nid, unsigned long flags)
{
    bool insert = false;
    phys_addr_t obase = base;
    phys_addr_t end = base + memblock_cap_size(base, &size);
    int i, nr_new;

    if (!size)
        return 0;

    /* special case for empty array */
    if (type->regions[0].size == 0) {
        WARN_ON(type->cnt != 1 || type->total_size);
        type->regions[0].base = base;
        type->regions[0].size = size;
        type->regions[0].flags = flags;
        memblock_set_region_node(&type->regions[0], nid);
        type->total_size = size;
        return 0;
    }
repeat:
    /*
     * The following is executed twice.  Once with %false @insert and
     * then with %true.  The first counts the number of regions needed
     * to accommodate the new area.  The second actually inserts them.
     */
    ...
}
Memory also has to be managed during the kernel boot phase, but the buddy system is not yet initialized at that point. Early kernels used the bootmem mechanism as the memory allocator for the initialization phase.

Later kernels use memblock as the initialization-phase memory allocator, to allocate and free memory.

CONFIG_NO_BOOTMEM decides whether bootmem is used. Vexpress enables this option, so memblock serves as the memory allocator during initialization.

Because bootmem and memblock are API compatible, callers do not notice the difference. When memblock is used, mm/nobootmem.c is compiled, and it calls the allocator interfaces in memblock.c.


3. Physical memory mapping

Because CONFIG_ARM_LPAE is not enabled, the Linux page table uses a two-level mapping. The middle levels PUD and PMD of PGD->PUD->PMD->PTE are therefore folded away, and the return value of pmd_off_k is effectively that of pgd_offset_k.

static inline pmd_t *pmd_off_k(unsigned long virt)
{
    return pmd_offset(pud_offset(pgd_offset_k(virt), virt), virt);
}

#define pgd_index(addr)         ((addr) >> PGDIR_SHIFT)
#define pgd_offset(mm, addr)    ((mm)->pgd + pgd_index(addr))

/* to find an entry in a kernel page-table-directory */
/*
 * Note: in effect addr is shifted right by PGDIR_SHIFT bits and the result
 * is used as an offset from init_mm.pgd, i.e. swapper_pg_dir, which is
 * where the kernel page tables are stored.
 */
#define pgd_offset_k(addr)      pgd_offset(&init_mm, addr)
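The index arithmetic can be tried out in user space. The sketch below assumes the classic ARM two-level layout without LPAE, where PGDIR_SHIFT is 21 (each entry covers 2 MB) and each pgd_t holds two 32-bit hardware entries, so it is 8 bytes wide. pgd_byte_offset is a hypothetical helper added only for illustration.

```c
#include <stdint.h>

/* Assumed constants for ARM two-level page tables (no LPAE). */
#define PGDIR_SHIFT     21      /* each PGD entry covers 2 MB */
#define PGD_ENTRY_BYTES 8       /* a pgd_t holds two 32-bit hardware entries */

/* Index of the PGD entry that covers addr. */
static uint32_t pgd_index(uint32_t addr)
{
    return addr >> PGDIR_SHIFT;
}

/* Byte offset of that entry from the start of swapper_pg_dir. */
static uint32_t pgd_byte_offset(uint32_t addr)
{
    return pgd_index(addr) * PGD_ENTRY_BYTES;
}
```

Under these assumptions the first kernel address 0xC0000000 has index 0x600, which places its entry 0x3000 bytes into swapper_pg_dir, and the vectors page at 0xffff0000 uses the last entry, index 0x7ff.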


prepare_page_table clears page table entries. It actually clears the entries of three address ranges: 0~MODULES_VADDR, MODULES_VADDR~PAGE_OFFSET, and 0xef800000~VMALLOC_START.


static inline void prepare_page_table(void)
{
    unsigned long addr;
    phys_addr_t end;

    /*
     * Clear out all the mappings below the kernel image.
     */
    /* clears the first-level entries for 0~MODULES_VADDR */
    for (addr = 0; addr < MODULES_VADDR; addr += PMD_SIZE)
        pmd_clear(pmd_off_k(addr));

#ifdef CONFIG_XIP_KERNEL
    /* The XIP kernel is mapped in the module area -- skip over it */
    addr = ((unsigned long)_etext + PMD_SIZE - 1) & PMD_MASK;
#endif
    /* clears the first-level entries for MODULES_VADDR~PAGE_OFFSET */
    for ( ; addr < PAGE_OFFSET; addr += PMD_SIZE)
        pmd_clear(pmd_off_k(addr));

    /*
     * Find the end of the first block of lowmem.
     */
    end = memblock.memory.regions[0].base + memblock.memory.regions[0].size;
    /* end = 0x60000000 + 0x40000000, arm_lowmem_limit = 0x8f800000 */
    if (end >= arm_lowmem_limit)
        end = arm_lowmem_limit;

    /*
     * Clear out all the kernel space mappings, except for the first
     * memory bank, up to the vmalloc region.
     */
    /*
     * end is 0x8f800000 here, i.e. virtual address 0xef800000, so this
     * clears the first-level entries for 0xef800000~VMALLOC_START.
     */
    for (addr = __phys_to_virt(end);
         addr < VMALLOC_START; addr += PMD_SIZE)
        pmd_clear(pmd_off_k(addr));
}


The page tables are actually created in map_lowmem, which sets up two mapped ranges: range one is 0x60000000~0x60800000 (virtual 0xc0000000~0xc0800000) and range two is 0x60800000~0x8f800000 (virtual 0xc0800000~0xef800000).

Range one: readable, writable and executable. It mainly holds the kernel code and data segments, and also includes the contents of swapper_pg_dir.

Range two: readable and writable, but not executable. This is the Normal memory part.

The virtual-to-physical mapping of both ranges is linear, but two special pages at the end are not linearly mapped.

static void __init map_lowmem(void)
{
    struct memblock_region *reg;
    phys_addr_t kernel_x_start = round_down(__pa(_stext), SECTION_SIZE);
    phys_addr_t kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE);
    /* kernel_x_start = 0x60000000, kernel_x_end = 0x60800000 */

    /* Map all the lowmem memory banks. */
    for_each_memblock(memory, reg) {
        phys_addr_t start = reg->base;
        phys_addr_t end = start + reg->size;  /* start = 0x60000000, end = 0xa0000000 before clamping */
        struct map_desc map;

        if (end > arm_lowmem_limit)
            end = arm_lowmem_limit;  /* arm_lowmem_limit = 0x8f800000, so end = 0x8f800000 */
        if (start >= end)
            break;

        if (end < kernel_x_start) {
            map.pfn = __phys_to_pfn(start);
            map.virtual = __phys_to_virt(start);
            map.length = end - start;
            map.type = MT_MEMORY_RWX;

            create_mapping(&map);
        } else if (start >= kernel_x_end) {
            map.pfn = __phys_to_pfn(start);
            map.virtual = __phys_to_virt(start);
            map.length = end - start;
            map.type = MT_MEMORY_RW;

            create_mapping(&map);
        } else {
            /* This better cover the entire kernel */
            if (start < kernel_x_start) {
                map.pfn = __phys_to_pfn(start);
                map.virtual = __phys_to_virt(start);
                map.length = kernel_x_start - start;
                map.type = MT_MEMORY_RW;

                create_mapping(&map);
            }

            map.pfn = __phys_to_pfn(kernel_x_start);
            map.virtual = __phys_to_virt(kernel_x_start);
            map.length = kernel_x_end - kernel_x_start;
            map.type = MT_MEMORY_RWX;

            /* virtual 0xc0000000 - 0xc0800000 -> physical 0x60000000 - 0x60800000, type MT_MEMORY_RWX */
            create_mapping(&map);

            if (kernel_x_end < end) {
                map.pfn = __phys_to_pfn(kernel_x_end);
                map.virtual = __phys_to_virt(kernel_x_end);
                map.length = end - kernel_x_end;
                map.type = MT_MEMORY_RW;

                /* virtual 0xc0800000 - 0xef800000 -> physical 0x60800000 - 0x8f800000, type MT_MEMORY_RW */
                create_mapping(&map);
            }
        }
    }
}
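The three cases in map_lowmem can be modelled as a small pure function. The sketch below is a user-space illustration, not kernel code: split_lowmem, struct range, and the MAP_RW/MAP_RWX enumerators are hypothetical stand-ins for the logic around MT_MEMORY_RW and MT_MEMORY_RWX.

```c
#include <stdint.h>

/* Stand-ins for MT_MEMORY_RW and MT_MEMORY_RWX. */
enum map_type { MAP_RW, MAP_RWX };

struct range {
    uint32_t start, end;         /* physical addresses, end exclusive */
    enum map_type type;
};

/*
 * Split [start, end) around the kernel text window [kx_start, kx_end),
 * following the same three cases as map_lowmem(). Writes up to three
 * ranges into out[] and returns how many were written.
 */
static int split_lowmem(uint32_t start, uint32_t end,
                        uint32_t kx_start, uint32_t kx_end,
                        struct range out[3])
{
    int n = 0;

    if (start >= end)
        return 0;
    if (end < kx_start) {
        /* Entirely below the kernel window. */
        out[n++] = (struct range){ start, end, MAP_RWX };
    } else if (start >= kx_end) {
        /* Entirely above the kernel window. */
        out[n++] = (struct range){ start, end, MAP_RW };
    } else {
        /* Overlaps the kernel window: before, window, after. */
        if (start < kx_start)
            out[n++] = (struct range){ start, kx_start, MAP_RW };
        out[n++] = (struct range){ kx_start, kx_end, MAP_RWX };
        if (kx_end < end)
            out[n++] = (struct range){ kx_end, end, MAP_RW };
    }
    return n;
}
```

For the values in this article (lowmem 0x60000000 to 0x8f800000, kernel window 0x60000000 to 0x60800000) the function yields exactly two ranges, one RWX and one RW, which correspond to the two create_mapping calls annotated above.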


Part of memory is also mapped in devicemaps_init, namely the vectors mappings:

MT_HIGH_VECTORS: virtual address 0xffff0000~0xffff1000, corresponding physical address 0x8f7fe000~0x8f7ff000.

MT_LOW_VECTORS: virtual address 0xffff1000~0xffff2000, corresponding physical address 0x8f7ff000~0x8f800000.


static void __init devicemaps_init(const struct machine_desc *mdesc)
{
    struct map_desc map;
    unsigned long addr;
    void *vectors;

    printk("%s\n", __func__);
    /*
     * Allocate the vector page early.
     */
    vectors = early_alloc(PAGE_SIZE * 2);

    early_trap_init(vectors);

    for (addr = VMALLOC_START; addr; addr += PMD_SIZE)
        pmd_clear(pmd_off_k(addr));

    /*
     * Map the kernel if it is XIP.
     * It is always first in the modulearea.
     */
#ifdef CONFIG_XIP_KERNEL
    map.pfn = __phys_to_pfn(CONFIG_XIP_PHYS_ADDR & SECTION_MASK);
    map.virtual = MODULES_VADDR;
    map.length = ((unsigned long)_etext - map.virtual + ~SECTION_MASK) & SECTION_MASK;
    map.type = MT_ROM;
    create_mapping(&map);
#endif

    /*
     * Map the cache flushing regions.
     */
#ifdef FLUSH_BASE
    map.pfn = __phys_to_pfn(FLUSH_BASE_PHYS);
    map.virtual = FLUSH_BASE;
    map.length = SZ_1M;
    map.type = MT_CACHECLEAN;
    create_mapping(&map);
#endif
#ifdef FLUSH_BASE_MINICACHE
    map.pfn = __phys_to_pfn(FLUSH_BASE_PHYS + SZ_1M);
    map.virtual = FLUSH_BASE_MINICACHE;
    map.length = SZ_1M;
    map.type = MT_MINICLEAN;
    create_mapping(&map);
#endif

    /*
     * Create a mapping for the machine vectors at the high-vectors
     * location (0xffff0000).  If we aren't using high-vectors, also
     * create a mapping at the low-vectors virtual address.
     */
    map.pfn = __phys_to_pfn(virt_to_phys(vectors));
    map.virtual = 0xffff0000;
    map.length = PAGE_SIZE;
#ifdef CONFIG_KUSER_HELPERS
    map.type = MT_HIGH_VECTORS;
#else
    map.type = MT_LOW_VECTORS;
#endif
    /* virtual 0xffff0000 - 0xffff1000 -> physical 0x8f7fe000 - 0x8f7ff000, type MT_HIGH_VECTORS */
    create_mapping(&map);

    if (!vectors_high()) {
        map.virtual = 0;
        map.length = PAGE_SIZE * 2;
        map.type = MT_LOW_VECTORS;
        create_mapping(&map);
    }

    /* Now create a kernel read-only mapping */
    map.pfn += 1;
    map.virtual = 0xffff0000 + PAGE_SIZE;
    map.length = PAGE_SIZE;
    map.type = MT_LOW_VECTORS;
    /* virtual 0xffff1000 - 0xffff2000 -> physical 0x8f7ff000 - 0x8f800000, type MT_LOW_VECTORS */
    create_mapping(&map);

    /*
     * Ask the machine support to map in the statically mapped devices.
     */
    if (mdesc->map_io)
        mdesc->map_io();
    ...
    /* Reserve fixed i/o space in VMALLOC region */
    pci_reserve_io();

    /*
     * Finally flush the caches and tlb to ensure that we're in a
     * consistent state wrt the writebuffer.  This also ensures that
     * any write-allocated cache lines in the vector page are written
     * back.  After this point, we can start to touch devices again.
     */
    local_flush_tlb_all();
    flush_cache_all();
}
void __init sanity_check_meminfo(void)

Open question: how is it guaranteed that these pages are not used for other purposes?

4. zone initialization

Memory management divides a memory node into several zones. The zone types are defined in enum zone_type.

Vexpress defines only two of them, NORMAL and HIGHMEM. Zones are initialized in bootmem_init, which uses find_limits to find the starting page frame number of physical memory (min_low_pfn), the ending page frame number (max_pfn), and the ending page frame number of the NORMAL zone (max_low_pfn).

void __init bootmem_init(void)
{
    unsigned long min, max_low, max_high;
    ...
    max_low = max_high = 0;

    /* reads the global memblock: min_low_pfn=0x60000, max_low_pfn=0x8f800, max_pfn=0xa0000 */
    find_limits(&min, &max_low, &max_high);
    ...
    /* min_low_pfn..max_low_pfn is ZONE_NORMAL, max_low_pfn..max_pfn is ZONE_HIGHMEM */
    zone_sizes_init(min, max_low, max_high);

    /*
     * This doesn't seem to be used by the Linux memory manager any
     * more, but is used by ll_rw_block.  If we can get rid of it, we
     * also get rid of some of the stuff above as well.
     */
    min_low_pfn = min;
    max_low_pfn = max_low;
    max_pfn = max_high;
}
zone_sizes_init calculates the size of each zone and the holes between zones, then calls free_area_init_node to create the zones of the memory node.

void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
        unsigned long node_start_pfn, unsigned long *zholes_size)
{
    pg_data_t *pgdat = NODE_DATA(nid);  /* the node data structure for nid */
    unsigned long start_pfn = 0;
    unsigned long end_pfn = 0;

    /* pg_data_t should be reset to zero when it's allocated */
    WARN_ON(pgdat->nr_zones || pgdat->classzone_idx);

    pgdat->node_id = nid;
    pgdat->node_start_pfn = node_start_pfn;
    ...
    /* computes the number of pages in the node: 1 GB / 4 KB = 262144 */
    calculate_node_totalpages(pgdat, start_pfn, end_pfn,
                  zones_size, zholes_size);

    alloc_node_mem_map(pgdat);
    printk(KERN_DEBUG "free_area_init_node: node %d, pgdat %08lx, node_mem_map %08lx\n",
        nid, (unsigned long)pgdat,
        (unsigned long)pgdat->node_mem_map);

    /* initializes the zones of the node one by one */
    free_area_init_core(pgdat, start_pfn, end_pfn,
                zones_size, zholes_size);
}

static void __paginginit free_area_init_core(struct pglist_data *pgdat,
        unsigned long node_start_pfn, unsigned long node_end_pfn,
        unsigned long *zones_size, unsigned long *zholes_size)
{
    enum zone_type j;
    int nid = pgdat->node_id;
    unsigned long zone_start_pfn = pgdat->node_start_pfn;
    int ret;
    ...
    pgdat->numabalancing_migrate_nr_pages = 0;
    pgdat->numabalancing_migrate_next_window = jiffies;
    ...
    for (j = 0; j < MAX_NR_ZONES; j++) {
        struct zone *zone = pgdat->node_zones + j;
        unsigned long size, realsize, freesize, memmap_pages;

        size = zone_spanned_pages_in_node(nid, j, node_start_pfn,
                          node_end_pfn, zones_size);
        realsize = freesize = size - zone_absent_pages_in_node(nid, j,
                                node_start_pfn,
                                node_end_pfn,
                                zholes_size);

        /*
         * Adjust freesize so that it accounts for how much memory
         * is used by this zone for memmap. This affects the watermark
         * and per-cpu initialisations
         */
        /* pages consumed by the struct page array itself */
        memmap_pages = calc_memmap_size(size, realsize);
        /* the memmap cost is not deducted for HIGHMEM */
        if (!is_highmem_idx(j)) {
            if (freesize >= memmap_pages) {
                freesize -= memmap_pages;
                if (memmap_pages)
                    printk(KERN_DEBUG
                           " %s zone: %lu pages used for memmap\n",
                           zone_names[j], memmap_pages);
            } else
                printk(KERN_WARNING
                    " %s zone: %lu pages exceeds freesize %lu\n",
                    zone_names[j], memmap_pages, freesize);
        }

        /* Account for reserved pages */
        if (j == 0 && freesize > dma_reserve) {
            freesize -= dma_reserve;
            printk(KERN_DEBUG " %s zone: %lu pages reserved\n",
                    zone_names[0], dma_reserve);
        }

        if (!is_highmem_idx(j))
            nr_kernel_pages += freesize;
        /* Charge for highmem memmap if there are enough kernel pages */
        else if (nr_kernel_pages > memmap_pages * 2)
            nr_kernel_pages -= memmap_pages;
        nr_all_pages += freesize;

        zone->spanned_pages = size;
        zone->present_pages = realsize;
        /*
         * Set an approximate value for lowmem here, it will be adjusted
         * when the bootmem allocator frees pages into the buddy system.
         * And all highmem pages will be managed by the buddy system.
         */
        zone->managed_pages = is_highmem_idx(j) ? realsize : freesize;
#ifdef CONFIG_NUMA
        zone->node = nid;
        zone->min_unmapped_pages = (freesize*sysctl_min_unmapped_ratio)
                        / 100;
        zone->min_slab_pages = (freesize * sysctl_min_slab_ratio) / 100;
#endif
        zone->name = zone_names[j];
        ...
        zone->zone_pgdat = pgdat;
        ...
        /* For bootup, initialized properly in watermark setup */
        mod_zone_page_state(zone, NR_ALLOC_BATCH, zone->managed_pages);
        ...
        if (!size)
            continue;
        ...
        setup_usemap(pgdat, zone, zone_start_pfn, size);
        ret = init_currently_empty_zone(zone, zone_start_pfn,
                        size, MEMMAP_EARLY);
        BUG_ON(ret);
        memmap_init(size, nid, j, zone_start_pfn);
        zone_start_pfn += size;
    }
}
The output of the functions above is as follows:

On node 0 totalpages: 262144                ---- 262144 * 4 KB = 1 GB
free_area_init_node: node 0, pgdat c0782480, node_mem_map eeffa000
  Normal zone: 1520 pages used for memmap   ---- struct page is 32 bytes, 194560 * 32 B / 4 KB = 1520 pages
  Normal zone: 0 pages reserved
  Normal zone: 194560 pages, LIFO batch:31  ---- 194560 * 4 KB = 760 MB
  HighMem zone: 67584 pages, LIFO batch:15  ---- 67584 * 4 KB = 264 MB

So ZONE_NORMAL covers physical addresses 0x60000000 - 0x8f800000, and ZONE_HIGHMEM covers physical addresses 0x8f800000 - 0xa0000000.
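The numbers in this log can be reproduced from the three page frame numbers found by find_limits. The following is a small arithmetic sketch, assuming 4 KB pages and the 32-byte struct page size quoted in the log; the helper names are hypothetical and only wrap the arithmetic.

```c
#include <stdint.h>

#define PAGE_SHIFT        12     /* assumed 4 KB pages */
#define STRUCT_PAGE_BYTES 32     /* struct page size quoted in the log */

/* Number of pages between two page frame numbers. */
static uint64_t zone_pages(uint64_t start_pfn, uint64_t end_pfn)
{
    return end_pfn - start_pfn;
}

/* Convert a page count to megabytes. */
static uint64_t pages_to_mb(uint64_t pages)
{
    return (pages << PAGE_SHIFT) >> 20;
}

/* Pages needed to hold one struct page per page in the zone, rounded up. */
static uint64_t memmap_pages(uint64_t pages)
{
    uint64_t bytes = pages * STRUCT_PAGE_BYTES;

    return (bytes + ((uint64_t)1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
}
```

With min_low_pfn = 0x60000, max_low_pfn = 0x8f800 and max_pfn = 0xa0000, ZONE_NORMAL spans 194560 pages (760 MB), ZONE_HIGHMEM spans 67584 pages (264 MB), and the memmap for ZONE_NORMAL takes 1520 pages.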




Watermarks are calculated for every zone during system initialization: WMARK_MIN, WMARK_LOW and WMARK_HIGH. kswapd uses these values when reclaiming page memory.

enum zone_watermarks {
    WMARK_MIN,
    WMARK_LOW,
    WMARK_HIGH,
    NR_WMARK
};

#define min_wmark_pages(z) (z->watermark[WMARK_MIN])
#define low_wmark_pages(z) (z->watermark[WMARK_LOW])
#define high_wmark_pages(z) (z->watermark[WMARK_HIGH])

struct zone {
    /* Read-mostly fields */

    /* zone watermarks, access with *_wmark_pages(zone) macros */
    unsigned long watermark[NR_WMARK];
    ...
};
min_free_kbytes, an important parameter for computing the watermarks, is calculated in init_per_zone_wmark_min:


module_init(init_per_zone_wmark_min)  ---- computes min_free_kbytes = 3489
    setup_per_zone_wmarks-->__setup_per_zone_wmarks  ---- computes WMARK_MIN/WMARK_LOW/WMARK_HIGH

/*
 * Initialise min_free_kbytes.
 *
 * For small machines we want it small (128k min).  For large machines
 * we want it large (64MB max).  But it is not linear, because network
 * bandwidth does not increase linearly with machine size.  We use
 *
 *    min_free_kbytes = 4 * sqrt(lowmem_kbytes), for better accuracy:
 *    min_free_kbytes = sqrt(lowmem_kbytes * 16)
 *
 * which yields
 *
 * 16MB:    512k
 * 32MB:    724k
 * 64MB:    1024k
 * 128MB:   1448k
 * 256MB:   2048k
 * 512MB:   2896k
 * 1024MB:  4096k
 * 2048MB:  5792k
 * 4096MB:  8192k
 * 8192MB:  11584k
 * 16384MB: 16384k
 */
int __meminit init_per_zone_wmark_min(void)
{
    unsigned long lowmem_kbytes;
    int new_min_free_kbytes;

    lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);  /* lowmem_kbytes = 761100 */
    new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);  /* square root of 761100 * 16 is 3489 */

    /* user_min_free_kbytes = -1, so min_free_kbytes = 3489, which is within [128 KB, 64 MB] */
    if (new_min_free_kbytes > user_min_free_kbytes) {
        min_free_kbytes = new_min_free_kbytes;
        if (min_free_kbytes < 128)
            min_free_kbytes = 128;
        if (min_free_kbytes > 65536)
            min_free_kbytes = 65536;
    } else {
        pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
                new_min_free_kbytes, user_min_free_kbytes);
    }
    setup_per_zone_wmarks();
    ...
    return 0;
}
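The formula is easy to check outside the kernel. The sketch below is a user-space approximation: isqrt is a generic integer square root written for this example (it is not the kernel's int_sqrt source), and calc_min_free_kbytes is a hypothetical wrapper that applies the same clamp to [128, 65536].

```c
#include <stdint.h>

/* Integer square root: largest r such that r * r <= x. */
static uint64_t isqrt(uint64_t x)
{
    uint64_t r = 0;
    uint64_t bit = (uint64_t)1 << 62;

    /* Start from the highest power of four not greater than x. */
    while (bit > x)
        bit >>= 2;
    while (bit != 0) {
        if (x >= r + bit) {
            x -= r + bit;
            r = (r >> 1) + bit;
        } else {
            r >>= 1;
        }
        bit >>= 2;
    }
    return r;
}

/* min_free_kbytes = sqrt(lowmem_kbytes * 16), clamped to [128, 65536]. */
static unsigned long calc_min_free_kbytes(unsigned long lowmem_kbytes)
{
    uint64_t v = isqrt((uint64_t)lowmem_kbytes * 16);

    if (v < 128)
        v = 128;
    if (v > 65536)
        v = 65536;
    return (unsigned long)v;
}
```

For lowmem_kbytes = 761100 this gives 3489, the value used in the rest of this section, and for 16 MB and 1024 MB of lowmem it gives 512 and 4096, matching the table in the kernel comment.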
The watermarks themselves are computed by __setup_per_zone_wmarks:

static void __setup_per_zone_wmarks(void)
{
    /* min_free_kbytes = 3489 and PAGE_SHIFT - 10 = 2, so pages_min = 3489 / 4 = 872 */
    unsigned long pages_min = min_free_kbytes >> (PAGE_SHIFT - 10);
    unsigned long lowmem_pages = 0;
    struct zone *zone;
    unsigned long flags;

    /* Calculate total number of !ZONE_HIGHMEM pages */
    for_each_zone(zone) {
        if (!is_highmem(zone))
            lowmem_pages += zone->managed_pages;  /* lowmem only, so lowmem_pages = 190273 */
    }

    for_each_zone(zone) {
        u64 tmp;

        spin_lock_irqsave(&zone->lock, flags);
        tmp = (u64)pages_min * zone->managed_pages;
        /* Normal: tmp = 872 * 190273 / 190273 = 872; HighMem: tmp = 872 * 67584 / 190273 = 309 */
        do_div(tmp, lowmem_pages);
        if (is_highmem(zone)) {
            /*
             * __GFP_HIGH and PF_MEMALLOC allocations usually don't
             * need highmem pages, so cap pages_min to a small
             * value here.
             *
             * The WMARK_HIGH-WMARK_LOW and (WMARK_LOW-WMARK_MIN)
             * deltas controls asynch page reclaim, and so should
             * not be capped for highmem.
             */
            unsigned long min_pages;

            min_pages = zone->managed_pages / 1024;
            min_pages = clamp(min_pages, SWAP_CLUSTER_MAX, 128UL);
            zone->watermark[WMARK_MIN] = min_pages;  /* HighMem: min_pages = 67584 / 1024 = 66 */
        } else {
            /*
             * If it's a lowmem zone, reserve a number of pages
             * proportionate to the zone's size.
             */
            zone->watermark[WMARK_MIN] = tmp;  /* Normal: 872 */
        }

        /* Normal: 872 + 872/4 = 1090; HighMem: 66 + 309/4 = 143 */
        zone->watermark[WMARK_LOW] = min_wmark_pages(zone) + (tmp >> 2);
        /* Normal: 872 + 872/2 = 1308; HighMem: 66 + 309/2 = 220 */
        zone->watermark[WMARK_HIGH] = min_wmark_pages(zone) + (tmp >> 1);

        __mod_zone_page_state(zone, NR_ALLOC_BATCH,
            high_wmark_pages(zone) - low_wmark_pages(zone) -
            atomic_long_read(&zone->vm_stat[NR_ALLOC_BATCH]));
        ...
        spin_unlock_irqrestore(&zone->lock, flags);
    }

    /* update totalreserve_pages */
    calculate_totalreserve_pages();
}
The information printed for each zone is as follows:

Normal min=872 low=1090 high=1308 zone_start_pfn=393216 managed_pages=190273 spanned_pages=194560 present_pages=194560   ---- size: 0x2f800 * 4 KB = 760 MB, usable: 190273 pages
HighMem min=66 low=143 high=220 zone_start_pfn=587776 managed_pages=67584 spanned_pages=67584 present_pages=67584   ---- size: 0x10800 * 4 KB = 264 MB, usable: 67584 pages
Movable min=32 low=32 high=32 zone_start_pfn=0 managed_pages=0 spanned_pages=0 present_pages=0
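These figures can be recomputed with a short user-space sketch of the watermark arithmetic. It follows the formulas in __setup_per_zone_wmarks, but struct wmarks and zone_wmarks are hypothetical names, and locking and the NR_ALLOC_BATCH update are left out.

```c
#include <stdint.h>

#define SWAP_CLUSTER_MAX 32UL    /* same value as the kernel constant */

struct wmarks { unsigned long min, low, high; };

/* Compute min/low/high watermarks (in pages) for one zone. */
static struct wmarks zone_wmarks(unsigned long pages_min,
                                 unsigned long managed_pages,
                                 unsigned long lowmem_pages,
                                 int is_highmem)
{
    struct wmarks w;
    /* Share of pages_min proportional to this zone's size. */
    uint64_t tmp = (uint64_t)pages_min * managed_pages / lowmem_pages;

    if (is_highmem) {
        /* Highmem zones get a small capped minimum instead. */
        unsigned long min_pages = managed_pages / 1024;

        if (min_pages < SWAP_CLUSTER_MAX)
            min_pages = SWAP_CLUSTER_MAX;
        if (min_pages > 128UL)
            min_pages = 128UL;
        w.min = min_pages;
    } else {
        w.min = (unsigned long)tmp;
    }
    w.low = w.min + (unsigned long)(tmp >> 2);
    w.high = w.min + (unsigned long)(tmp >> 1);
    return w;
}
```

With pages_min = 3489 >> 2 = 872 and lowmem_pages = 190273, the Normal zone gets 872/1090/1308, the HighMem zone gets 66/143/220, and an empty highmem-type zone falls back to the SWAP_CLUSTER_MAX floor of 32 for all three values, which is consistent with the Movable line in the log.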



5. Physical memory initialization

Physical pages have to be added to the buddy system, which is a dynamic storage management scheme. When a user requests memory, it allocates a block of suitable size; when the block is released, it is reclaimed.

The buddy system manages free pages by two properties: the size of the block, which is 2^order pages, and the migration type of the pages.

struct zone {
    ...
#ifndef CONFIG_SPARSEMEM
    /*
     * Flags for a pageblock_nr_pages block. See pageblock-flags.h.
     * In SPARSEMEM, this map is stored in struct mem_section
     */
    unsigned long *pageblock_flags;  /* MIGRATE_TYPE of each pageblock in the zone */
#endif /* CONFIG_SPARSEMEM */
    ...
    /* free areas of different sizes */
    struct free_area free_area[MAX_ORDER];  /* lists of free page blocks, one per order */
    ...
};
enum {
    /*
     * The contents of the page frame cannot be moved. The page must stay at
     * a fixed location in memory. Most pages allocated by the core kernel
     * fall into this category.
     */
    MIGRATE_UNMOVABLE,
    /*
     * The contents of the page frame cannot be moved directly, but can be
     * reclaimed, because the page can be rebuilt from some source. Data of
     * mapped files belongs to this category. kswapd reclaims these pages
     * periodically according to certain rules.
     */
    MIGRATE_RECLAIMABLE,
    /*
     * The contents of the page frame can be moved. Pages that belong to
     * user space applications are of this type. They are mapped through
     * page tables, so it is enough to copy the data to a new location and
     * update the page table entries. Note that a page may be shared by
     * several processes and so correspond to several page table entries.
     */
    MIGRATE_MOVABLE,
    /* The number of migration types kept on the per-CPU page frame cache lists. */
    MIGRATE_PCPTYPES,  /* the number of types on the pcp lists */
    ...
    /*
     * MIGRATE_CMA migration type is designed to mimic the way
     * ZONE_MOVABLE works.  Only movable pages can be allocated
     * from MIGRATE_CMA pageblocks and page allocator never
     * implicitly change migration type of MIGRATE_CMA pageblock.
     *
     * The way to use it is to change migratetype of a range of
     * pageblocks to MIGRATE_CMA which can be done by
     * __free_pageblock_cma() function.  What is important though
     * is that a range of pageblocks must be aligned to
     * MAX_ORDER_NR_PAGES should biggest page be bigger then
     * a single pageblock.
     */
    /*
     * Some memory is reserved for drivers. While a driver is not using it,
     * the buddy system may hand it out to user processes as anonymous
     * memory or page cache. When the driver needs it, the memory held by
     * those processes is reclaimed or migrated away so that the reserved
     * memory becomes available to the driver again.
     */
    MIGRATE_CMA,
    /*
     * Page frames cannot be allocated from this list. According to the
     * source this list is used specifically when physical pages are moved
     * across NUMA nodes, toward the CPU that uses them most often.
     */
    MIGRATE_ISOLATE,  /* can't allocate from here */
    MIGRATE_TYPES
};
/* If huge pages are not used, group by MAX_ORDER_NR_PAGES */
#define pageblock_order (MAX_ORDER-1)
#define pageblock_nr_pages (1UL << pageblock_order)
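To make the relation between order, block size and pageblock size concrete, here is a minimal sketch. It assumes MAX_ORDER = 11, the common default unless a configuration overrides it, and order_for_pages is a hypothetical helper for illustration (the kernel has its own helpers for this kind of calculation).

```c
/* Assumed value: MAX_ORDER is 11 unless a configuration overrides it. */
#define MAX_ORDER          11
#define pageblock_order    (MAX_ORDER - 1)
#define pageblock_nr_pages (1UL << pageblock_order)

/*
 * Smallest order whose block of 2^order pages can hold nr_pages.
 * Returns -1 for zero pages or when the request exceeds the largest block.
 */
static int order_for_pages(unsigned long nr_pages)
{
    int order;

    if (nr_pages == 0)
        return -1;
    for (order = 0; order < MAX_ORDER; order++) {
        if ((1UL << order) >= nr_pages)
            return order;
    }
    return -1;
}
```

Under this assumption a pageblock is 1024 pages (4 MB with 4 KB pages), a request for 3 pages is served from an order-2 block of 4 pages, and a request for more than 1024 contiguous pages cannot be satisfied by a single buddy block.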
