[PATCH] Fix RCU race in access of nohz_cpu_mask

Reading nohz_cpu_mask before incrementing rcp->cur is racy.  It can cause
tickless idle CPUs to be included in rsp->cpumask, which extends grace
periods unnecessarily.
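
The intended ordering can be sketched in userspace C11 as below.  This is
only an illustration under stated assumptions, not the kernel code:
start_batch(), online_mask and the 4-CPU layout are hypothetical names, and
atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb().  It
assumes the CPU entering tickless idle sets its nohz_cpu_mask bit, issues a
full barrier, and only then reads cur, so that at least one side observes
the other's store.

```c
#include <stdatomic.h>

#define NR_CPUS 4	/* hypothetical CPU count for the sketch */

/* Bit n set means CPU n is in tickless idle (analogue of nohz_cpu_mask). */
static atomic_ulong nohz_cpu_mask;
/* Current grace-period number (analogue of rcp->cur). */
static atomic_long cur;
/* All CPUs online (analogue of cpu_online_map). */
static const unsigned long online_mask = (1UL << NR_CPUS) - 1;

/*
 * Hypothetical analogue of the fixed ordering: publish the new grace
 * period first, then a full barrier, then sample the idle mask.
 * Returns the set of CPUs that must report a quiescent state.
 */
static unsigned long start_batch(void)
{
	unsigned long nohz;

	atomic_fetch_add_explicit(&cur, 1, memory_order_relaxed);

	/* Stand-in for smp_mb(): orders the store above before the load below. */
	atomic_thread_fence(memory_order_seq_cst);

	nohz = atomic_load_explicit(&nohz_cpu_mask, memory_order_relaxed);
	return online_mask & ~nohz;
}
```

With bit 2 set in nohz_cpu_mask and four CPUs online, start_batch() returns
0xb, i.e. the idle CPU is left out of the mask.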

Fix this race by computing rsp->cpumask after rcp->cur is incremented, with
an smp_mb() between the two.  Tested using extensions to the RCU torture
module that force various CPUs to become idle.

Signed-off-by: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index f45b917..48d3bce 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -257,15 +257,23 @@
 
 	if (rcp->next_pending &&
 			rcp->completed == rcp->cur) {
-		/* Can't change, since spin lock held. */
-		cpus_andnot(rsp->cpumask, cpu_online_map, nohz_cpu_mask);
-
 		rcp->next_pending = 0;
-		/* next_pending == 0 must be visible in __rcu_process_callbacks()
-		 * before it can see new value of cur.
+		/*
+		 * next_pending == 0 must be visible in
+		 * __rcu_process_callbacks() before it can see new value of cur.
 		 */
 		smp_wmb();
 		rcp->cur++;
+
+		/*
+		 * Accessing nohz_cpu_mask before incrementing rcp->cur needs a
+		 * barrier.  Otherwise it can cause tickless idle CPUs to be
+		 * included in rsp->cpumask, which will extend grace periods
+		 * unnecessarily.
+		 */
+		smp_mb();
+		cpus_andnot(rsp->cpumask, cpu_online_map, nohz_cpu_mask);
+
 	}
 }