[MIPS] R4000/R4400 errata workarounds

 This is the gereric part of R4000/R4400 errata workarounds.  They include 
compiler and assembler support as well as some source code modifications 
to address the problems with some combinations of multiply/divide+shift 
instructions as well as the daddi and daddiu instructions.

 Changes included are as follows:

1. New Kconfig options to select workarounds by platforms as necessary.

2. Arch top-level Makefile to pass necessary options to the compiler; also 
   incompatible configurations are detected (-mno-sym32 unsupported as 
   horribly intrusive for little gain).

3. Bug detection updated and shuffled -- the multiply/divide+shift problem 
   is lethal enough that if not worked around it makes the kernel crash in 
   time_init() because of a division by zero; the daddiu erratum might 
   also trigger early potentially, though I have not observed it.  On the 
   other hand the daddi detection code requires the exception subsystem to 
   have been initialised (and is there mainly for information).

4. r4k_daddiu_bug() added so that the existence of the erratum can be 
   queried by code at the run time as necessary; useful for generated code 
   like TLB fault and copy/clear page handlers.

5. __udelay() updated as it uses multiplication in inline assembly.

 Note that -mdaddi requires modified toolchain (which has been maintained 
by myself and available from my site for ~4years now -- versions covered 
are GCC 2.95.4 - 4.1.2 and binutils from 2.13 onwards).  The -mfix-r4000 
and -mfix-r4400 have been standard for a while though.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
diff --git a/include/asm-mips/delay.h b/include/asm-mips/delay.h
index fab3213..de5105d 100644
--- a/include/asm-mips/delay.h
+++ b/include/asm-mips/delay.h
@@ -6,13 +6,16 @@
  * Copyright (C) 1994 by Waldorf Electronics
  * Copyright (C) 1995 - 2000, 01, 03 by Ralf Baechle
  * Copyright (C) 1999, 2000 Silicon Graphics, Inc.
+ * Copyright (C) 2007  Maciej W. Rozycki
  */
 #ifndef _ASM_DELAY_H
 #define _ASM_DELAY_H
 
 #include <linux/param.h>
 #include <linux/smp.h>
+
 #include <asm/compiler.h>
+#include <asm/war.h>
 
 static inline void __delay(unsigned long loops)
 {
@@ -50,7 +53,7 @@
 
 static inline void __udelay(unsigned long usecs, unsigned long lpj)
 {
-	unsigned long lo;
+	unsigned long hi, lo;
 
 	/*
 	 * The rates of 128 is rounded wrongly by the catchall case
@@ -70,11 +73,16 @@
 		: "=h" (usecs), "=l" (lo)
 		: "r" (usecs), "r" (lpj)
 		: GCC_REG_ACCUM);
-	else if (sizeof(long) == 8)
+	else if (sizeof(long) == 8 && !R4000_WAR)
 		__asm__("dmultu\t%2, %3"
 		: "=h" (usecs), "=l" (lo)
 		: "r" (usecs), "r" (lpj)
 		: GCC_REG_ACCUM);
+	else if (sizeof(long) == 8 && R4000_WAR)
+		__asm__("dmultu\t%3, %4\n\tmfhi\t%0"
+		: "=r" (usecs), "=h" (hi), "=l" (lo)
+		: "r" (usecs), "r" (lpj)
+		: GCC_REG_ACCUM);
 
 	__delay(usecs);
 }