科技大本营 | Linux Kernel Virtual Memory Management: Anonymous-Mapping Page Fault Analysis (Part 3)

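In the end we see that the page frame number comes from empty_zero_page, the zero page set up during kernel initialization. Now look at the second argument to pfn_pte, vma->vm_page_prot: this is the VMA's access permission, which is set when the mapping is created with mmap.
So what we want to know is: when is the zero page given its read-only attribute (that is, when is the page table entry set read-only)?
Let's take this question to the kernel source. Reading the code up to this point usually gives no obvious lead, but if we work out when and how the VMA's vm_page_prot member is set, we may well find the answer.
We look in mm/mmap.c, taking do_brk_flags as the example. This is the function that sets up the heap, and at line 3040 it assigns vm_page_prot: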
	vma->vm_page_prot = vm_get_page_prot(flags);
||
\/
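pgprot_t vm_get_page_prot(unsigned long vm_flags)
{
	pgprot_t ret = __pgprot(pgprot_val(protection_map[vm_flags &
				(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]) |
			pgprot_val(arch_vm_get_page_prot(vm_flags)));

	return arch_filter_pgprot(ret);
}
EXPORT_SYMBOL(vm_get_page_prot);

vm_get_page_prot masks the vm_flags it receives with VM_READ|VM_WRITE|VM_EXEC|VM_SHARED and uses the result as an index into protection_map, which yields the protection-bit combination. Read on: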
||
\/
/* description of effects of mapping type and prot in current implementation.
 * this is due to the limited x86 page protection hardware.  The expected
 * behavior is in parens:
 *
 * map_type	prot
 *		PROT_NONE	PROT_READ	PROT_WRITE	PROT_EXEC
 * MAP_SHARED	r: (no) no	r: (yes) yes	r: (no) yes	r: (no) yes
 *		w: (no) no	w: (no) no	w: (yes) yes	w: (no) no
 *		x: (no) no	x: (no) yes	x: (no) yes	x: (yes) yes
 *
 * MAP_PRIVATE	r: (no) no	r: (yes) yes	r: (no) yes	r: (no) yes
 *		w: (no) no	w: (no) no	w: (copy) copy	w: (no) no
 *		x: (no) no	x: (no) yes	x: (no) yes	x: (yes) yes
 *
 * On arm64, PROT_EXEC has the following behaviour for both MAP_SHARED and
 * MAP_PRIVATE:
 *								r: (no) no
 *								w: (no) no
 *								x: (yes) yes
 */
pgprot_t protection_map[16] __ro_after_init = {
	__P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
	__S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
};

The protection_map array defines 16 combinations, from __P000 to __S111. P means private and S means shared. The three digits that follow stand, from left to right, for executable, writable, and readable, matching the bit positions of VM_EXEC (0x4), VM_WRITE (0x2), and VM_READ (0x1) in the index. For example, __S010 means shared, not executable, writable, not readable.
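The index computation can be tried out in user space. The following is a minimal sketch, not kernel code: the VM_* values are the ones defined in include/linux/mm.h, while prot_index and prot_slot_name are hypothetical helper names introduced here purely for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Flag values as defined in include/linux/mm.h. */
#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_EXEC   0x4UL
#define VM_SHARED 0x8UL

/* Hypothetical helper: only the low four flag bits select the slot. */
static unsigned int prot_index(unsigned long vm_flags)
{
	return (unsigned int)(vm_flags &
			      (VM_READ | VM_WRITE | VM_EXEC | VM_SHARED));
}

/*
 * Hypothetical helper: write the slot name, e.g. "__P011", into buf.
 * The digits are printed in the order exec, write, read.
 */
static void prot_slot_name(unsigned long vm_flags, char *buf, size_t len)
{
	unsigned int idx = prot_index(vm_flags);

	assert(buf != NULL && len >= 7);	/* 6 characters plus NUL */
	snprintf(buf, len, "__%c%u%u%u",
		 (idx & VM_SHARED) ? 'S' : 'P',
		 (idx >> 2) & 1,		/* VM_EXEC  */
		 (idx >> 1) & 1,		/* VM_WRITE */
		 idx & 1);			/* VM_READ  */
}
```

Calling prot_slot_name(VM_READ | VM_WRITE, buf, sizeof buf) fills buf with the slot name for a private readable and writable mapping, and any flag bits above the low four are ignored by the mask.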
||
\/
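arch/arm64/include/asm/pgtable-prot.h:

	#define PAGE_NONE	__pgprot(((_PAGE_DEFAULT) ...

For a page in a private anonymous mapping, suppose vm_flags is VM_READ|VM_WRITE. The index is then 0x3, so the protection combination is __P011, which arm64 defines as PAGE_READONLY = __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN). It carries PTE_RDONLY and no PTE_WRITE, so the entry is not made writable.
So the page table entry ends up read-only!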
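To check that conclusion for every private combination rather than a single one, the table can be modelled in user space. This is only a sketch under stated assumptions: the MODEL_* values and all function names are hypothetical, and the slot assignments are transcribed from the arm64 __Pxxx/__Sxxx definitions as they stood in the 5.x-era pgtable-prot.h, so other architectures or kernel versions may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in include/linux/mm.h. */
#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_EXEC   0x4UL
#define VM_SHARED 0x8UL

/* Hypothetical stand-ins for the arm64 PAGE_* protection values. */
enum model_prot {
	MODEL_NONE,
	MODEL_READONLY,
	MODEL_READONLY_EXEC,
	MODEL_EXECONLY,
	MODEL_SHARED,
	MODEL_SHARED_EXEC
};

/* Assumed arm64 assignments, in protection_map order (__P000..__S111). */
static const enum model_prot model_protection_map[16] = {
	MODEL_NONE, MODEL_READONLY, MODEL_READONLY, MODEL_READONLY,
	MODEL_EXECONLY, MODEL_READONLY_EXEC, MODEL_READONLY_EXEC,
	MODEL_READONLY_EXEC,
	MODEL_NONE, MODEL_READONLY, MODEL_SHARED, MODEL_SHARED,
	MODEL_EXECONLY, MODEL_READONLY_EXEC, MODEL_SHARED_EXEC,
	MODEL_SHARED_EXEC
};

/* Same masking step as vm_get_page_prot, applied to the model table. */
static enum model_prot model_get_page_prot(unsigned long vm_flags)
{
	return model_protection_map[vm_flags &
		(VM_READ | VM_WRITE | VM_EXEC | VM_SHARED)];
}

/* In this model only the PAGE_SHARED variants carry write permission. */
static bool model_prot_writable(enum model_prot prot)
{
	return prot == MODEL_SHARED || prot == MODEL_SHARED_EXEC;
}

/* True if no private (VM_SHARED clear) combination is writable. */
static bool private_never_writable(void)
{
	unsigned long flags;

	for (flags = 0; flags < VM_SHARED; flags++)
		if (model_prot_writable(model_get_page_prot(flags)))
			return false;
	return true;
}

/* Convenience check a caller can run once. */
static void model_self_check(void)
{
	assert(private_never_writable());
	assert(model_prot_writable(model_get_page_prot(VM_SHARED | VM_WRITE)));
}
```

Under these assumptions a private mapping never starts out with a writable protection value, whatever its vm_flags; write permission has to be added later on the fault path.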
At line 2922 the code jumps to setpte, which writes the prepared value into the page table entry.
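When an anonymous page is written after having first been read, the read-only page table entry triggers a COW page fault. See the articles on COW for details; they are not repeated here. The figure illustrates this:
[Figure: Linux kernel virtual memory management, anonymous-mapping page fault analysis]
3.3 When the first access to an anonymous page is a write
We continue down through do_anonymous_page: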
static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
{
...

	/* Allocate our own private page. */
	if (unlikely(anon_vma_prepare(vma)))
		goto oom;
	page = alloc_zeroed_user_highpage_movable(vma, vmf->address);
	if (!page)
		goto oom;

	if (mem_cgroup_try_charge_delay(page, vma->vm_mm, GFP_KERNEL, &memcg,
					false))
		goto oom_free_page;

	/*
	 * The memory barrier inside __SetPageUptodate makes sure that
	 * preceeding stores to the page contents become visible before
	 * the set_pte_at() write.
	 */
	__SetPageUptodate(page);

	entry = mk_pte(page, vma->vm_page_prot);
	if (vma->vm_flags & VM_WRITE)
		entry = pte_mkwrite(pte_mkdirty(entry));

	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
			&vmf->ptl);
	if (!pte_none(*vmf->pte))
		goto release;

	ret = check_stable_address_space(vma->vm_mm);
	if (ret)
		goto release;

	/* Deliver the page fault to userland, check inside PT lock */
	if (userfaultfd_missing(vma)) {
		pte_unmap_unlock(vmf->pte, vmf->ptl);
		mem_cgroup_cancel_charge(page, memcg, false);
		put_page(page);
		return handle_userfault(vmf, VM_UFFD_MISSING);
	}

	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
	page_add_new_anon_rmap(page, vma, vmf->address, false);
	mem_cgroup_commit_charge(page, memcg, false, false);
	lru_cache_add_active_or_unevictable(page, vma);
setpte:
	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);

	/* No need to invalidate - it was non-present before */
	update_mmu_cache(vma, vmf->address, vmf->pte);
unlock:
	pte_unmap_unlock(vmf->pte, vmf->ptl);
	return ret;
release:
	mem_cgroup_cancel_charge(page, memcg, false);
	put_page(page);
	goto unlock;
oom_free_page:
	put_page(page);
oom:
	return VM_FAULT_OOM;
}
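Both access orders can be exercised from user space on Linux. The sketch below is an illustration, not part of the kernel: map_anon_page, read_then_write, and write_first are hypothetical helper names. It only observes the visible result, namely that a fresh private anonymous page reads as zero and accepts a write in either order. Which kernel path serves each fault (the zero page followed by COW, or a freshly allocated page) cannot be seen from this program.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one private anonymous page, NULL on failure. */
static unsigned char *map_anon_page(size_t *len_out)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	assert(len_out != NULL);
	if (page <= 0)
		return NULL;
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;
	*len_out = (size_t)page;
	return p;
}

/* First touch is a read, then a write. Returns 0 if both behave. */
static int read_then_write(unsigned char value)
{
	size_t len = 0;
	volatile unsigned char *p = map_anon_page(&len);
	unsigned char before, after;

	if (!p)
		return -1;
	before = p[0];		/* read fault: contents must be zero */
	p[0] = value;		/* write to the page that was just read */
	after = p[0];
	munmap((void *)p, len);
	return (before == 0 && after == value) ? 0 : 1;
}

/* First touch is a write. Returns 0 if the rest of the page is zero. */
static int write_first(unsigned char value)
{
	size_t len = 0;
	volatile unsigned char *p = map_anon_page(&len);
	unsigned char first, second;

	if (!p)
		return -1;
	p[0] = value;		/* write fault on an untouched page */
	first = p[0];
	second = p[1];		/* untouched byte of the same page */
	munmap((void *)p, len);
	return (first == value && second == 0) ? 0 : 1;
}
```

Both helpers return 0 on success, 1 if the observed contents are unexpected, and -1 if the mapping could not be created.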

