Linux Kernel Virtual Memory Management: Analysis of Anonymous-Mapping Page Faults (Part 2)

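Line 2885 checks whether the vma that took the page fault is a private mapping; this function handles private anonymous mappings only.
Line 2898 allocates a page table if none exists (the last-level page table that holds the PTE for the faulting address may not exist yet).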
3.2 First read of an anonymous page

...
2905    /* Use the zero-page for reads */
2906    if (!(vmf->flags & FAULT_FLAG_WRITE) &&
2907                    !mm_forbids_zeropage(vma->vm_mm)) {
2908            entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
2909                                            vma->vm_page_prot));
2910            vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
2911                            vmf->address, &vmf->ptl);
2912            if (!pte_none(*vmf->pte))
2913                    goto unlock;
2914            ret = check_stable_address_space(vma->vm_mm);
2915            if (ret)
2916                    goto unlock;
2917            /* Deliver the page fault to userland, check inside PT lock */
2918            if (userfaultfd_missing(vma)) {
2919                    pte_unmap_unlock(vmf->pte, vmf->ptl);
2920                    return handle_userfault(vmf, VM_UFFD_MISSING);
2921            }
2922            goto setpte;
2923    }
...
2968 setpte:
2969    set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);

Lines 2906 to 2923 handle a read of a private anonymous page. This is where the zero page discussed earlier comes into play.
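Lines 2906 and 2907 check that the fault was caused by a read and that the zero page is not forbidden for this mm.
Lines 2908-2909 are the core: they build a PTE value that maps to the zero page.
Let us look closely at this statement. pfn_pte combines a page frame number and the page protection attributes into a PTE value: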
arch/arm64/include/asm/pgtable.h:

#define pfn_pte(pfn,prot)       \
        __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))

It shifts pfn left by PAGE_SHIFT bits (typically 12) to obtain the physical address, then ORs in pgprot_val(prot).
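To make the shift-and-OR concrete, here is a minimal user-space sketch. The names SKETCH_PAGE_SHIFT and sketch_pfn_pte are hypothetical and not kernel symbols. The sketch assumes 4 KiB pages, treats __phys_to_pte_val as the identity, and treats the protection bits as an opaque integer, so it only illustrates the arithmetic, not the real arm64 PTE layout.

```c
#include <assert.h>   /* used by the check in the usage note */
#include <stdint.h>

/* Assumption: 4 KiB pages, so a page frame number is shifted by 12 bits. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical user-space analogue of pfn_pte(): place the physical
 * address of the frame in the upper bits and OR the protection bits in.
 */
uint64_t sketch_pfn_pte(uint64_t pfn, uint64_t prot_bits)
{
    return (pfn << SKETCH_PAGE_SHIFT) | prot_bits;
}
```

For example, with pfn 0x1234 and protection bits 0x3, assert(sketch_pfn_pte(0x1234, 0x3) == 0x1234003ULL) holds: the frame number occupies the bits above bit 11 and the attributes sit below it.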
First, look at my_zero_pfn:
include/asm-generic/pgtable.h:

static inline unsigned long my_zero_pfn(unsigned long addr)
{
        extern unsigned long zero_pfn;
        return zero_pfn;
}

mm/memory.c:

unsigned long zero_pfn __read_mostly;
EXPORT_SYMBOL(zero_pfn);

unsigned long highest_memmap_pfn __read_mostly;

/*
 * CONFIG_MMU architectures set up ZERO_PAGE in their paging_init()
 */
static int __init init_zero_pfn(void)
{
        zero_pfn = page_to_pfn(ZERO_PAGE(0));
        return 0;
}
core_initcall(init_zero_pfn);

ZERO_PAGE in turn is defined by the architecture. On arm64:

arch/arm64/include/asm/pgtable.h:

/*
 * ZERO_PAGE is a global shared page that is always zero: used
 * for zero-mapped memory areas etc..
 */
extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
#define ZERO_PAGE(vaddr)        phys_to_page(__pa_symbol(empty_zero_page))

So the page frame number ultimately comes from empty_zero_page, the zero page set up during kernel initialization. Now look at the second argument of pfn_pte, vma->vm_page_prot. It holds the access permissions of the vma and is set when the mapping is created by mmap.
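What we want to know, then, is when the zero page is made read-only (that is, when the PTE is marked read-only).
Let us take this question into the kernel source. At this point the code gives few obvious clues, but if we find out when and how the vm_page_prot member of the vma is set, we may well find the answer.
Look in mm/mmap.c. Taking do_brk_flags, the function that sets up the heap, as an example, note that line 3040 sets vm_page_prot: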
3040    vma->vm_page_prot = vm_get_page_prot(flags);

vm_get_page_prot is defined as:
pgprot_t vm_get_page_prot(unsigned long vm_flags)
{
        pgprot_t ret = __pgprot(pgprot_val(protection_map[vm_flags &
                                (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
                        pgprot_val(arch_vm_get_page_prot(vm_flags)));

        return arch_filter_pgprot(ret);
}
EXPORT_SYMBOL(vm_get_page_prot);

vm_get_page_prot masks the incoming vm_flags with VM_READ|VM_WRITE|VM_EXEC|VM_SHARED and uses the result as an index into protection_map to obtain the matching combination of protection bits. Read on.
protection_map is defined as:

/* description of effects of mapping type and prot in current implementation.
 * this is due to the limited x86 page protection hardware.  The expected
 * behavior is in parens:
 *
 * map_type     prot
 *              PROT_NONE       PROT_READ       PROT_WRITE      PROT_EXEC
 * MAP_SHARED   r: (no) no      r: (yes) yes    r: (no) yes     r: (no) yes
 *              w: (no) no      w: (no) no      w: (yes) yes    w: (no) no
 *              x: (no) no      x: (no) yes     x: (no) yes     x: (yes) yes
 *
 * MAP_PRIVATE  r: (no) no      r: (yes) yes    r: (no) yes     r: (no) yes
 *              w: (no) no      w: (no) no      w: (copy) copy  w: (no) no
 *              x: (no) no      x: (no) yes     x: (no) yes     x: (yes) yes
 *
 * On arm64, PROT_EXEC has the following behaviour for both MAP_SHARED and
 * MAP_PRIVATE:
 *                                                              r: (no) no
 *                                                              w: (no) no
 *                                                              x: (yes) yes
 */
pgprot_t protection_map[16] __ro_after_init = {
        __P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
        __S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
};

The protection_map array defines 16 combinations, from __P000 to __S111. P stands for private and S for shared. The three digits that follow mirror the low bits of vm_flags, so read from left to right they stand for executable (VM_EXEC), writable (VM_WRITE), and readable (VM_READ). For example, __S010 means shared, not executable, writable, not readable, and __P001 means private and readable only.
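The index computation and the naming scheme can be sketched in user space as follows. The flag values match the low vm_flags bits in include/linux/mm.h, but prot_index and prot_name are hypothetical helpers written for illustration only; the kernel indexes the array directly and has no such functions.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stdio.h>
#include <string.h>

/* Low vm_flags bits, as defined in include/linux/mm.h. */
#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_EXEC   0x4UL
#define VM_SHARED 0x8UL

/* Hypothetical helper: the index vm_get_page_prot uses into protection_map. */
unsigned int prot_index(unsigned long vm_flags)
{
    return (unsigned int)(vm_flags &
                          (VM_READ | VM_WRITE | VM_EXEC | VM_SHARED));
}

/*
 * Hypothetical helper: write the name of the selected entry, such as
 * "__P011", into buf. The digits are printed in the order exec, write,
 * read, which is the bit order of the index from high to low.
 */
void prot_name(unsigned long vm_flags, char buf[7])
{
    unsigned int idx = prot_index(vm_flags);

    snprintf(buf, 7, "__%c%c%c%c",
             (idx & VM_SHARED) ? 'S' : 'P',
             (idx & VM_EXEC)   ? '1' : '0',
             (idx & VM_WRITE)  ? '1' : '0',
             (idx & VM_READ)   ? '1' : '0');
}
```

A private readable and writable mapping (VM_READ | VM_WRITE) gives index 3 and the name "__P011", while a shared writable mapping (VM_SHARED | VM_WRITE) gives index 10 and the name "__S010".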