Fixes for invoke/move-result fusing, recursion bug

Fix the Arm invoke/move-result fusing: NEW_FILLED_ARRAY wasn't being
handled properly.  Fusing stays disabled on x86.  Replace the
recursive DFS order computation with an iterative version.  The
iterative version could be improved, but I'll wait to see whether it
shows up as an issue during compile-time profiling.

The old recursive code stays in place for a little while, until we're
sure the new mechanism computes exactly the same orderings.
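
For illustration only, here is a minimal, self-contained sketch of the
general technique: a recursive DFS ordering next to an iterative one
that uses an explicit heap-allocated stack, written so the two can be
compared. It is not the code in this CL. The Graph alias, the
DfsOrders struct, and both function names are hypothetical stand-ins,
and it assumes successors are visited in list order.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CFG: succs[b] lists the successor ids of block b.
using Graph = std::vector<std::vector<int>>;

struct DfsOrders {
  std::vector<int> pre;   // blocks in the order they are first visited
  std::vector<int> post;  // blocks in the order their visit finishes
};

// Reference recursive version; call depth grows with the longest DFS path.
static void dfsRecursive(const Graph& succs, int block,
                         std::vector<bool>& visited, DfsOrders& out) {
  visited[block] = true;
  out.pre.push_back(block);
  for (int s : succs[block]) {
    if (!visited[s]) dfsRecursive(succs, s, visited, out);
  }
  out.post.push_back(block);
}

DfsOrders computeDfsRecursive(const Graph& succs, int entry) {
  assert(entry >= 0 && static_cast<std::size_t>(entry) < succs.size());
  DfsOrders out;
  std::vector<bool> visited(succs.size(), false);
  dfsRecursive(succs, entry, visited, out);
  return out;
}

// Iterative version: each stack entry is (block, index of next successor to try).
DfsOrders computeDfsIterative(const Graph& succs, int entry) {
  assert(entry >= 0 && static_cast<std::size_t>(entry) < succs.size());
  DfsOrders out;
  std::vector<bool> visited(succs.size(), false);
  std::vector<std::pair<int, std::size_t>> stack;
  visited[entry] = true;
  out.pre.push_back(entry);
  stack.emplace_back(entry, 0);
  while (!stack.empty()) {
    const int block = stack.back().first;
    const std::size_t next = stack.back().second;
    if (next < succs[block].size()) {
      ++stack.back().second;  // advance first; the push below may reallocate
      const int s = succs[block][next];
      if (!visited[s]) {
        visited[s] = true;
        out.pre.push_back(s);
        stack.emplace_back(s, 0);
      }
    } else {
      // All successors handled: the block's visit is finished.
      out.post.push_back(block);
      stack.pop_back();
    }
  }
  return out;
}
```

Running both functions on the same graph and comparing the pre and
post vectors is one way to check that the orderings match before the
recursive path is removed.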

With this CL we stop running out of thread stack memory on the 003
runtest.

Change-Id: Iab80f42135b081a3f49e1ee26a29220e602ae7e8
diff --git a/src/compiler/codegen/GenCommon.cc b/src/compiler/codegen/GenCommon.cc
index f114b45..29f3cca 100644
--- a/src/compiler/codegen/GenCommon.cc
+++ b/src/compiler/codegen/GenCommon.cc
@@ -624,6 +624,9 @@
       }
     }
   }
+  if (info->result.location != kLocInvalid) {
+    storeValue(cUnit, info->result, oatGetReturn(cUnit, false /* not fp */));
+  }
 }
 
 void genSput(CompilationUnit* cUnit, uint32_t fieldIdx, RegLocation rlSrc,