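Memory Paging Mechanism

Linux uses paging to implement a virtual memory system managed in units of pages. Address translation works as follows: a logical address is processed by the paging mechanism and converted into a physical address.

Control Registers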

CR0

Bit  Name  Full Name               Description
0    PE    Protected Mode Enable   If 1, the system is in protected mode; otherwise it is in real mode
1    MP    Monitor co-processor    Controls interaction of WAIT/FWAIT instructions with the TS flag in CR0
2    EM    Emulation               If set, no x87 floating-point unit is present; if clear, an x87 FPU is present
3    TS    Task switched           Allows saving the x87 task context upon a task switch only after an x87 instruction is used
4    ET    Extension type          On the 386, specified whether the external math coprocessor was an 80287 or 80387
5    NE    Numeric error           When set, enables internal x87 floating-point error reporting; otherwise enables PC-style x87 error detection
16   WP    Write protect           When set, the CPU cannot write to read-only pages even at privilege level 0
18   AM    Alignment mask          Alignment checking is enabled if AM is set, the AC flag (in EFLAGS) is set, and the privilege level is 3
29   NW    Not-write through       Globally enables/disables write-through caching
30   CD    Cache disable           Globally enables/disables the memory cache
31   PG    Paging                  If 1, enables paging and uses the CR3 register; otherwise paging is disabled
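
To make the bit layout concrete, here is a minimal user-space sketch that decodes a CR0 value using the bit positions from the table above. The macro and function names are illustrative assumptions, not kernel identifiers. Reading the real CR0 requires ring 0, so the value is passed in as a plain integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the CR0 table (names are illustrative). */
#define CR0_PE (UINT32_C(1) << 0)   /* Protected Mode Enable */
#define CR0_WP (UINT32_C(1) << 16)  /* Write protect */
#define CR0_PG (UINT32_C(1) << 31)  /* Paging */

/* Returns true when the given CR0 value has paging enabled. */
static bool cr0_paging_enabled(uint32_t cr0)
{
    return (cr0 & CR0_PG) != 0;
}

/* Paging requires protected mode, so PG=1 with PE=0 is not a valid state. */
static bool cr0_is_consistent(uint32_t cr0)
{
    if ((cr0 & CR0_PG) && !(cr0 & CR0_PE))
        return false;
    return true;
}
```

For example, cr0_paging_enabled(0x80000001) is true because both bit 31 and bit 0 are set, while a value with only bit 0 set describes protected mode without paging.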

CR1

Reserved.

CR2

Contains a value called Page Fault Linear Address (PFLA). When a page fault occurs, the address the program attempted to access is stored in the CR2 register.

CR3

Used when virtual addressing is enabled, that is, when the PG bit is set in CR0. CR3 enables the processor to translate linear addresses into physical addresses by locating the page directory and page tables for the current task. Typically (with 32-bit paging), the upper 20 bits of CR3 become the page directory base register (PDBR), which stores the physical address of the first page directory entry. If the PCIDE bit in CR4 is set, the lowest 12 bits are used for the process-context identifier (PCID).
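
The split between the table base and the low 12 bits can be sketched as follows. This is an illustration only: it assumes the 64-bit register layout, where the base occupies bits 12 to 51, and it ignores everything above bit 51. The names cr3_table_base() and cr3_pcid() are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

/* Low 12 bits: the PCID when CR4.PCIDE is set (assumed here). */
#define CR3_LOW12_MASK UINT64_C(0x0000000000000FFF)
/* Bits 12..51: physical address of the top-level page table. */
#define CR3_BASE_MASK  UINT64_C(0x000FFFFFFFFFF000)

/* Physical address of the top-level table stored in a CR3 value. */
static uint64_t cr3_table_base(uint64_t cr3)
{
    return cr3 & CR3_BASE_MASK;
}

/* Process-context identifier stored in the low 12 bits. */
static uint16_t cr3_pcid(uint64_t cr3)
{
    return (uint16_t)(cr3 & CR3_LOW12_MASK);
}
```

Because page tables are 4 KiB aligned, the low 12 bits of the base are always zero, which is what frees them for other uses.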

Enabling Paging

Setting bit 31 (PG) of CR0 to 1 enables paging:

PG  Paging  If 1, enables paging and uses the CR3 register; otherwise paging is disabled.

Taking the familiar x86 architecture as an example, the kernel source (arch/x86/Kconfig) contains:

config PGTABLE_LEVELS
        int
        default 5 if X86_5LEVEL
        default 4 if X86_64
        default 3 if X86_PAE
        default 2

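This configuration shows that X86_64 uses 4-level paging by default, and 5 levels when X86_5LEVEL is enabled.

Four-Level Page Table Model

The figure below shows the four-level page table model.

As the figure shows, there are four types of page tables: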

PGD
PUD
PMD
PTE
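Each Page Global Directory entry holds the address of a Page Upper Directory, each Page Upper Directory entry holds the address of a Page Middle Directory, and each Page Middle Directory entry holds the address of a Page Table. Each Page Table entry in turn corresponds to a page frame, that is, a physical page. A linear address is therefore split into five parts: the PGD, PUD, PMD, and PTE indexes, plus the offset within the page.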
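
The five-part split can be sketched in user-space C. The sketch assumes x86_64 with 4-level paging and 4 KiB pages, so each index is 9 bits wide and the offset is 12 bits. The shift constants use the same names and values as the kernel's for this configuration, while struct va_parts and split_va() are hypothetical names introduced for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT    12   /* 4 KiB pages: 12 offset bits */
#define PMD_SHIFT     21
#define PUD_SHIFT     30
#define PGDIR_SHIFT   39
#define PTRS_PER_TBL  512  /* 9 index bits per level */

struct va_parts {
    unsigned pgd;     /* index into the Page Global Directory */
    unsigned pud;     /* index into the Page Upper Directory */
    unsigned pmd;     /* index into the Page Middle Directory */
    unsigned pte;     /* index into the Page Table */
    unsigned offset;  /* byte offset inside the page */
};

/* Split a virtual address into its four table indexes and page offset. */
static struct va_parts split_va(uint64_t va)
{
    struct va_parts p;

    p.pgd    = (unsigned)((va >> PGDIR_SHIFT) & (PTRS_PER_TBL - 1));
    p.pud    = (unsigned)((va >> PUD_SHIFT)   & (PTRS_PER_TBL - 1));
    p.pmd    = (unsigned)((va >> PMD_SHIFT)   & (PTRS_PER_TBL - 1));
    p.pte    = (unsigned)((va >> PAGE_SHIFT)  & (PTRS_PER_TBL - 1));
    p.offset = (unsigned)(va & ((UINT64_C(1) << PAGE_SHIFT) - 1));
    return p;
}
```

Four 9-bit indexes plus a 12-bit offset account for 48 bits of the address; the remaining upper bits are not used as indexes in this configuration.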

Data Structures for the Four Page Table Types

pgd_t, pud_t, pmd_t, and pte_t are the entry types for the four page table levels, defined as follows:

typedef struct { pgdval_t pgd; } pgd_t;
typedef struct { pudval_t pud; } pud_t;
typedef struct { pmdval_t pmd; } pmd_t;
typedef struct { pteval_t pte; } pte_t;

Here pgdval_t, pudval_t, pmdval_t, and pteval_t are all of type unsigned long.
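
One practical effect of wrapping each value in a single-member struct is type checking: a pte_t cannot be passed where a pmd_t is expected, even though both hold an unsigned long. The sketch below reproduces two of the typedefs in user space. The helpers make_pte(), pte_value(), make_pmd(), and pmd_value() are hypothetical names standing in for the kernel's own accessors.

```c
typedef unsigned long pteval_t;
typedef unsigned long pmdval_t;

typedef struct { pmdval_t pmd; } pmd_t;
typedef struct { pteval_t pte; } pte_t;

/* Wrap a raw value as a page table entry. */
static inline pte_t make_pte(pteval_t v)
{
    pte_t p = { v };
    return p;
}

/* Unwrap a page table entry to its raw value. */
static inline pteval_t pte_value(pte_t p)
{
    return p.pte;
}

/* Wrap a raw value as a page middle directory entry. */
static inline pmd_t make_pmd(pmdval_t v)
{
    pmd_t p = { v };
    return p;
}

/* Unwrap a page middle directory entry to its raw value. */
static inline pmdval_t pmd_value(pmd_t p)
{
    return p.pmd;
}
```

With these definitions, a call such as pte_value(make_pmd(0)) fails to compile, whereas plain unsigned long values would be silently interchangeable.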

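Address Translation Process

Each process has its own page tables. The pgd member of the process's mm_struct points to the Page Global Directory.

The figure below shows how a virtual address is broken down.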

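As the figure shows, the value in CR3 plus the PGD index of the virtual address yields the Page Global Directory entry.
The PUD index yields the PUD entry.
The PMD index yields the PMD entry.
The PTE index yields the page table entry.
The offset yields the final physical address.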
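
The steps above can be sketched as a toy user-space model of the walk. This is not kernel code: the tables are ordinary heap allocations, ordinary pointers stand in for physical addresses, and bit 0 of each entry is used as the Present flag. All names (table_t, new_table(), map_page(), translate()) are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ENTRIES   512
#define PRESENT   UINT64_C(1)
#define ADDR_MASK (~UINT64_C(0xFFF))

typedef struct table { uint64_t e[ENTRIES]; } table_t;

/* Index for a level: 3 = PGD, 2 = PUD, 1 = PMD, 0 = PTE. */
static unsigned idx(uint64_t va, int level)
{
    return (unsigned)((va >> (12 + 9 * level)) & (ENTRIES - 1));
}

/* Allocate a zeroed, 4 KiB aligned table so the low 12 bits stay free. */
static table_t *new_table(void)
{
    table_t *t = aligned_alloc(4096, sizeof(table_t));
    if (t)
        memset(t, 0, sizeof(*t));
    return t;
}

/* Follow entry *e to the next-level table, creating it if absent. */
static table_t *next_table(uint64_t *e)
{
    if (!(*e & PRESENT)) {
        table_t *t = new_table();
        if (!t)
            return NULL;
        *e = (uint64_t)(uintptr_t)t | PRESENT;
    }
    return (table_t *)(uintptr_t)(*e & ADDR_MASK);
}

/* Install a mapping from the page containing va to the frame at pa. */
static int map_page(table_t *pgd, uint64_t va, uint64_t pa)
{
    table_t *t = pgd;
    for (int level = 3; level > 0; level--) {
        t = next_table(&t->e[idx(va, level)]);
        if (!t)
            return -1;
    }
    t->e[idx(va, 0)] = (pa & ADDR_MASK) | PRESENT;
    return 0;
}

/* Walk PGD -> PUD -> PMD -> PTE; returns -1 where hardware would fault. */
static int translate(const table_t *pgd, uint64_t va, uint64_t *pa)
{
    const table_t *t = pgd;
    for (int level = 3; level > 0; level--) {
        uint64_t e = t->e[idx(va, level)];
        if (!(e & PRESENT))
            return -1;
        t = (const table_t *)(uintptr_t)(e & ADDR_MASK);
    }
    uint64_t pte = t->e[idx(va, 0)];
    if (!(pte & PRESENT))
        return -1;
    *pa = (pte & ADDR_MASK) | (va & 0xFFF);
    return 0;
}
```

After map_page(pgd, 0x7f0000001abc, 0x5000) on an empty table created with new_table(), translate() for the same address stores 0x5abc: the frame address from the PTE combined with the 12-bit offset from the virtual address.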

Looking Up the Page Table from a Logical Address

Let's look at the corresponding code:

static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
                            unsigned long *start, unsigned long *end,
                            pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp)
{
        pgd_t *pgd;
        p4d_t *p4d;
        pud_t *pud;
        pmd_t *pmd;
        pte_t *ptep;

        /* The process's pgd pointer is stored in mm->pgd. */
        pgd = pgd_offset(mm, address); /* pointer to the PGD entry for address */
        if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
                goto out;
        /*
         * When USE_EARLY_PGTABLE_L5 is disabled, this returns directly
         * without doing anything.
         */
        p4d = p4d_offset(pgd, address); /* returns pgd unchanged */
        if (p4d_none(*p4d) || unlikely(p4d_bad(*p4d)))
                goto out;
        /* Return a pointer to the PUD entry for address. */
        pud = pud_offset(p4d, address);
        if (pud_none(*pud) || unlikely(pud_bad(*pud)))
                goto out;
        /* Return a pointer to the PMD entry for address. */
        pmd = pmd_offset(pud, address);
        VM_BUG_ON(pmd_trans_huge(*pmd));

        ...

        if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
                goto out;

        ...
        /* Return the page table entry (PTE) for address, mapped and locked. */
        ptep = pte_offset_map_lock(mm, pmd, address, ptlp);

        /*
         * Check the Present flag. If it is set, the page (or page table)
         * is in memory. If it is clear, the page is not in memory; when
         * such a page is accessed, the paging unit stores the address in
         * CR2 and raises exception 14, the page fault.
         */
        if (!pte_present(*ptep))
                goto unlock;
        *ptepp = ptep;
        return 0;
unlock:
        pte_unmap_unlock(ptep, *ptlp);
        if (start && end)
                mmu_notifier_invalidate_range_end(mm, *start, *end);
out:
        return -EINVAL;
}

Converting a PTE to a Physical Page

int follow_phys(struct vm_area_struct *vma,
                unsigned long address, unsigned int flags,
                unsigned long *prot, resource_size_t *phys)
{
        int ret = -EINVAL;
        pte_t *ptep, pte;
        spinlock_t *ptl;

        if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
                goto out;

        if (follow_pte(vma->vm_mm, address, &ptep, &ptl))
                goto out;
        pte = *ptep;

        if ((flags & FOLL_WRITE) && !pte_write(pte))
                goto unlock;

        *prot = pgprot_val(pte_pgprot(pte));
        *phys = (resource_size_t)pte_pfn(pte) << PAGE_SHIFT;

        ret = 0;
unlock:
        pte_unmap_unlock(ptep, ptl);
out:
        return ret;
}
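
The key step in follow_phys() is shifting the page frame number left by PAGE_SHIFT. A minimal user-space sketch of that arithmetic follows. It assumes the x86_64 entry layout with the Present flag in bit 0, the writable flag in bit 1, and the frame number in bits 12 to 51. The sketch_ functions are hypothetical simplifications, not the kernel's pte_present(), pte_write(), or pte_pfn(), which handle additional cases.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_RW       (UINT64_C(1) << 1)
#define PTE_PFN_MASK UINT64_C(0x000FFFFFFFFFF000)  /* bits 12..51 */

/* True if the entry's Present bit is set (simplified check). */
static bool sketch_pte_present(uint64_t pte)
{
    return (pte & PTE_PRESENT) != 0;
}

/* True if the entry allows writes. */
static bool sketch_pte_write(uint64_t pte)
{
    return (pte & PTE_RW) != 0;
}

/* Page frame number: the address bits shifted down by PAGE_SHIFT. */
static uint64_t sketch_pte_pfn(uint64_t pte)
{
    return (pte & PTE_PFN_MASK) >> PAGE_SHIFT;
}

/* Physical address of the page, as computed in follow_phys(). */
static uint64_t sketch_pte_phys(uint64_t pte)
{
    return sketch_pte_pfn(pte) << PAGE_SHIFT;
}
```

For an entry value of 0x8000000012345067, the frame number is 0x12345 and the physical page address is 0x12345000. The flag bits on both ends of the entry are masked away before the shift.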

Updating the CR3 Register (CR3 Holds a Physical Address)

load_cr3(next->pgd);

load_cr3

static inline void load_cr3(pgd_t *pgdir)
{
        write_cr3(__pa(pgdir));
}
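
Because pgdir is a kernel virtual address and CR3 needs a physical one, load_cr3() converts it with __pa(). For addresses in the kernel's direct mapping of physical memory, that conversion amounts to subtracting a fixed base. The sketch below illustrates only this case. The base value is an assumption (the x86_64 4-level default when address randomization does not move it), and sketch_pa() and sketch_va() are hypothetical names rather than the kernel's implementation.

```c
#include <stdint.h>

/* Assumed direct-map base for x86_64 with 4-level paging. */
#define SKETCH_PAGE_OFFSET UINT64_C(0xffff888000000000)

/* Direct-map virtual address -> physical address. */
static uint64_t sketch_pa(uint64_t vaddr)
{
    return vaddr - SKETCH_PAGE_OFFSET;
}

/* Physical address -> direct-map virtual address. */
static uint64_t sketch_va(uint64_t paddr)
{
    return paddr + SKETCH_PAGE_OFFSET;
}
```

Under this assumption, a page directory at virtual address 0xffff888000001000 lives at physical address 0x1000, which is the value that would be written to CR3.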

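Summary

Through the kernel code we took a brief look at Linux's four-level paging mechanism and gained an intuitive understanding of how a logical address is translated into a physical address.

The walk-through above also raises questions that we hope to study and answer next:

How are the memory descriptor struct mm_struct and the process descriptor struct task_struct related?
How is memory switched when the scheduler switches between processes?
How exactly is memory allocated and freed?
Who sets and clears the Present flag?

References: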
http://www.leviathan.vip/2019/02/16/Linux%E5%86%85%E6%A0%B8%E6%BA%90%E7%A0%81%E5%88%86%E6%9E%90-%E5%86%85%E5%AD%98%E5%88%86%E9%A1%B5%E6%9C%BA%E5%88%B6/

https://blog.csdn.net/farmwang/article/details/70141912
