Fix ARM64 SystemArrayCopy intrinsic with large constant dest position.
Make sure we do not deplete the whole VIXL scratch register pool, so
that VIXL can still use IP0 as a temporary when emitting
macro-instructions.
Test: art/test/testrunner/testrunner.py --optimizing --target --64
Bug: 37256530
Change-Id: I5da22e552297fad87db5763e2dab60ae6a7a43af
diff --git a/compiler/optimizing/intrinsics_arm64.cc b/compiler/optimizing/intrinsics_arm64.cc
index 423fd3c..8485c32 100644
--- a/compiler/optimizing/intrinsics_arm64.cc
+++ b/compiler/optimizing/intrinsics_arm64.cc
@@ -2755,9 +2755,17 @@
// Make sure `tmp` is not IP0, as it is clobbered by
// ReadBarrierMarkRegX entry points in
// ReadBarrierSystemArrayCopySlowPathARM64.
+ DCHECK(temps.IsAvailable(ip0));
temps.Exclude(ip0);
Register tmp = temps.AcquireW();
DCHECK_NE(LocationFrom(tmp).reg(), IP0);
+ // Put IP0 back in the pool so that VIXL has at least one
+ // scratch register available to emit macro-instructions (note
+ // that IP1 is already used for `tmp`). Indeed some
+ // macro-instructions used in GenSystemArrayCopyAddresses
+ // (invoked hereunder) may require a scratch register (for
+ // instance to emit a load with a large constant offset).
+ temps.Include(ip0);
// /* int32_t */ monitor = src->monitor_
__ Ldr(tmp, HeapOperand(src.W(), monitor_offset));