[PATCH] mm: ptd_alloc take ptlock
Second step in pushing down the page_table_lock. Remove the temporary
bridging hack from __pud_alloc, __pmd_alloc, __pte_alloc: expect callers not
to hold page_table_lock, whether it's on init_mm or a user mm; take
page_table_lock internally to check if a racing task already allocated.
Convert their callers from common code. But avoid coming back to change them
again later: instead of moving the spin_lock(&mm->page_table_lock) down,
switch over to new macros pte_alloc_map_lock and pte_unmap_unlock, which
encapsulate the mapping+locking and unlocking+unmapping together, and in the
end may use alternatives to the mm page_table_lock itself.
These callers all hold mmap_sem (some exclusively, some not), so at no level
can a page table be whipped away from beneath them; and pte_alloc uses the
"atomic" pmd_present to test whether it needs to allocate. It appears that on
all arches we can safely descend without page_table_lock.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ac5f044..ea0826f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -277,12 +277,15 @@
unsigned long addr;
for (addr = vma->vm_start; addr < vma->vm_end; addr += HPAGE_SIZE) {
+ src_pte = huge_pte_offset(src, addr);
+ if (!src_pte)
+ continue;
dst_pte = huge_pte_alloc(dst, addr);
if (!dst_pte)
goto nomem;
+ spin_lock(&dst->page_table_lock);
spin_lock(&src->page_table_lock);
- src_pte = huge_pte_offset(src, addr);
- if (src_pte && !pte_none(*src_pte)) {
+ if (!pte_none(*src_pte)) {
entry = *src_pte;
ptepage = pte_page(entry);
get_page(ptepage);
@@ -290,6 +293,7 @@
set_huge_pte_at(dst, addr, dst_pte, entry);
}
spin_unlock(&src->page_table_lock);
+ spin_unlock(&dst->page_table_lock);
}
return 0;
@@ -354,7 +358,6 @@
hugetlb_prefault_arch_hook(mm);
- spin_lock(&mm->page_table_lock);
for (addr = vma->vm_start; addr < vma->vm_end; addr += HPAGE_SIZE) {
unsigned long idx;
pte_t *pte = huge_pte_alloc(mm, addr);
@@ -389,11 +392,12 @@
goto out;
}
}
+ spin_lock(&mm->page_table_lock);
add_mm_counter(mm, file_rss, HPAGE_SIZE / PAGE_SIZE);
set_huge_pte_at(mm, addr, pte, make_huge_pte(vma, page));
+ spin_unlock(&mm->page_table_lock);
}
out:
- spin_unlock(&mm->page_table_lock);
return ret;
}