JNI: Move args in registers for @FastNative.

Golem results for art-opt-cc (higher is better):
linux-ia32                     before after
NativeDowncallStaticFast       222.00 222.17 (+0.0751%)
NativeDowncallStaticFast6      139.86 161.00 (+15.11%)
NativeDowncallStaticFastRefs6  131.00 137.86 (+5.238%)
NativeDowncallVirtualFast      211.79 217.17 (+2.543%)
NativeDowncallVirtualFast6     137.36 150.55 (+9.599%)
NativeDowncallVirtualFastRefs6 131.50 132.60 (+0.8382%)
linux-x64                      before after
NativeDowncallStaticFast       173.15 173.24 (+0.0499%)
NativeDowncallStaticFast6      135.50 157.61 (+16.31%)
NativeDowncallStaticFastRefs6  127.06 134.87 (+6.147%)
NativeDowncallVirtualFast      163.67 165.83 (+1.321%)
NativeDowncallVirtualFast6     128.18 147.35 (+14.96%)
NativeDowncallVirtualFastRefs6 123.44 130.74 (+5.914%)
linux-armv7                    before after
NativeDowncallStaticFast       21.622 21.622 (0%)
NativeDowncallStaticFast6      17.250 18.719 (+8.518%)
NativeDowncallStaticFastRefs6  14.757 15.663 (+6.145%)
NativeDowncallVirtualFast      21.027 21.319 (+1.388%)
NativeDowncallVirtualFast6     17.439 18.953 (+8.680%)
NativeDowncallVirtualFastRefs6 14.764 15.992 (+8.319%)
linux-armv8                    before after
NativeDowncallStaticFast       23.244 23.610 (+1.575%)
NativeDowncallStaticFast6      18.719 21.622 (+15.50%)
NativeDowncallStaticFastRefs6  14.757 18.491 (+20.89%)
NativeDowncallVirtualFast      20.197 21.319 (+5.554%)
NativeDowncallVirtualFast6     18.272 21.027 (+15.08%)
NativeDowncallVirtualFastRefs6 13.951 16.865 (+20.89%)
(The linux-armv8 NativeDowncallVirtualFast "before" value is
unusually low, which inflates the measured improvement to
+5.554%. As previous results jump between 20.197 and 20.741,
the actual improvement is probably only around 2.5%.)
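
For reference, the percentages above are relative changes of a
higher-is-better score. A minimal sketch of that arithmetic
follows; `ImprovementPercent` is a hypothetical helper written
for this note, not part of ART or Golem. With the higher of the
two previous linux-armv8 NativeDowncallVirtualFast results
(20.741) as the baseline, the same "after" value (21.319) gives
roughly +2.8% rather than +5.554%.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of ART or Golem): relative change of a
// higher-is-better benchmark score, expressed in percent.
double ImprovementPercent(double before, double after) {
  assert(before > 0.0);  // A zero or negative baseline has no meaningful ratio.
  return (after / before - 1.0) * 100.0;
}
```

Because the table values are rounded, results computed from them
can differ from the listed percentages in the last digit.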

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 172332525
Change-Id: I2b596414458b48a758826eafc223529e9f2fe059
diff --git a/compiler/utils/jni_macro_assembler.h b/compiler/utils/jni_macro_assembler.h
index 7f5dc2f..abb53b7 100644
--- a/compiler/utils/jni_macro_assembler.h
+++ b/compiler/utils/jni_macro_assembler.h
@@ -152,7 +152,16 @@
   virtual void LoadRawPtrFromThread(ManagedRegister dest, ThreadOffset<kPointerSize> offs) = 0;
 
   // Copying routines
-  virtual void MoveArguments(ArrayRef<ArgumentLocation> dests, ArrayRef<ArgumentLocation> srcs) = 0;
+
+  // Move arguments from `srcs` locations to `dests` locations.
+  //
+  // References shall be spilled to `refs` frame offsets (kInvalidReferenceOffset indicates
+  // a non-reference type) if they are in registers and corresponding `dests` shall be
+  // filled with `jobject` replacements. If the first argument is a reference, it is
+  // assumed to be `this` and cannot be null; all other reference arguments can be null.
+  virtual void MoveArguments(ArrayRef<ArgumentLocation> dests,
+                             ArrayRef<ArgumentLocation> srcs,
+                             ArrayRef<FrameOffset> refs) = 0;
 
   virtual void Move(ManagedRegister dest, ManagedRegister src, size_t size) = 0;
 
@@ -276,6 +285,8 @@
     emit_run_time_checks_in_debug_mode_ = value;
   }
 
+  static constexpr FrameOffset kInvalidReferenceOffset = FrameOffset(0);
+
  protected:
   JNIMacroAssembler() {}