[PATCH] sched: reduce overhead of calc_load
Currently, count_active_tasks() calls both nr_running() &
nr_interruptible(). Each of these functions does a "for_each_cpu" & reads
values from the runqueue of each cpu. Although this is not a lot of
instructions, each runqueue may be located on different node. Depending on
the architecture, a unique TLB entry may be required to access each
runqueue.
Since there may be more runqueues than cpu TLB entries, a scan of all
runqueues can trash the TLB. Each memory reference incurs a TLB miss &
refill.
In addition, the runqueue cacheline that contains nr_running &
nr_uninterruptible may be evicted from the cache between the two passes.
This causes unnecessary cache misses.
Combining nr_running() & nr_interruptible() into a single function
substantially reduces the TLB & cache misses on large systems. This should
have no measureable effect on smaller systems.
On a 128p IA64 system running a memory stress workload, the new function
reduced the overhead of calc_load() from 605 usec/call to 324 usec/call.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/kernel/sched.c b/kernel/sched.c
index a9ecac3..6e52e0a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1658,6 +1658,21 @@
return sum;
}
+unsigned long nr_active(void)
+{
+ unsigned long i, running = 0, uninterruptible = 0;
+
+ for_each_online_cpu(i) {
+ running += cpu_rq(i)->nr_running;
+ uninterruptible += cpu_rq(i)->nr_uninterruptible;
+ }
+
+ if (unlikely((long)uninterruptible < 0))
+ uninterruptible = 0;
+
+ return running + uninterruptible;
+}
+
#ifdef CONFIG_SMP
/*
diff --git a/kernel/timer.c b/kernel/timer.c
index 9062a82..6b812c0 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -825,7 +825,7 @@
*/
static unsigned long count_active_tasks(void)
{
- return (nr_running() + nr_uninterruptible()) * FIXED_1;
+ return nr_active() * FIXED_1;
}
/*