[PATCH] roundup_pow_of_two() 64-bit fix

fls() takes an integer, so roundup_pow_of_two() is busted for ulongs larger
than 2^32-1.

Fix this by implementing and using fls_long().

(Why does roundup_pow_of_two() return a long?)

(Why is roundup_pow_of_two() __attribute_const__ whereas long_log2() is
__attribute_pure__?)

(Why does long_log2() suck so much?  Because we were missing fls_long()?)

Cc: Roland Dreier <rdreier@cisco.com>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 208650b..f17525a 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -175,4 +175,11 @@
 	return (word >> shift) | (word << (32 - shift));
 }
 
+static inline unsigned fls_long(unsigned long l)
+{
+	if (sizeof(l) == 4)
+		return fls(l);
+	return fls64(l);
+}
+
 #endif
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index bb6e7dd..03d6cfa 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -154,9 +154,10 @@
 	return r;
 }
 
-static inline unsigned long __attribute_const__ roundup_pow_of_two(unsigned long x)
+static inline unsigned long
+__attribute_const__ roundup_pow_of_two(unsigned long x)
 {
-	return (1UL << fls(x - 1));
+	return 1UL << fls_long(x - 1);
 }
 
 extern int printk_ratelimit(void);