memory unplug: migration by kernel
In usual, migrate_pages(page,,) is called with holding mm->sem by system call.
(mm here is a mm_struct which maps the migration target page.)
This semaphore helps avoiding some race conditions.
But, if we want to migrate a page by some kernel codes, we have to avoid
some races. This patch adds check code for following race condition.
1. A page which page->mapping==NULL can be target of migration. Then, we have
to check page->mapping before calling try_to_unmap().
2. anon_vma can be freed while page is unmapped, but page->mapping remains as
it was. We drop page->mapcount to be 0. Then we cannot trust page->mapping.
So, use rcu_read_lock() to prevent anon_vma pointed by page->mapping from
being freed during migration.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/mm/migrate.c b/mm/migrate.c
index 34d8ada..c8d8722 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -632,18 +632,35 @@
goto unlock;
wait_on_page_writeback(page);
}
-
/*
- * Establish migration ptes or remove ptes
+ * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
+ * we cannot notice that anon_vma is freed while we migrates a page.
+ * This rcu_read_lock() delays freeing anon_vma pointer until the end
+ * of migration. File cache pages are no problem because of page_lock()
*/
+ rcu_read_lock();
+ /*
+ * This is a corner case handling.
+ * When a new swap-cache is read into, it is linked to LRU
+ * and treated as swapcache but has no rmap yet.
+ * Calling try_to_unmap() against a page->mapping==NULL page is
+ * BUG. So handle it here.
+ */
+ if (!page->mapping)
+ goto rcu_unlock;
+ /* Establish migration ptes or remove ptes */
try_to_unmap(page, 1);
+
if (!page_mapped(page))
rc = move_to_new_page(newpage, page);
if (rc)
remove_migration_ptes(page, page);
+rcu_unlock:
+ rcu_read_unlock();
unlock:
+
unlock_page(page);
if (rc != -EAGAIN) {