JNI: Improve argument passing for normal native.

Spill outgoing stack arguments directly to their stack slots
(except for `this` on x86) and convert such references to
`jobject` while spilling. Use the `MoveArguments()` call for
both argument spilling and loading register arguments to
let the assembler use multi-register stores.
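
Illustration only, not code from this change: a minimal sketch of the
reference-to-`jobject` conversion, assuming (as a simplification not stated
in this message) that a `jobject` for a stack-held reference is the address
of the slot holding it, and null when the reference is null. The names
`CompressedRef`, `JObjectSketch` and `ToJObject` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not ART types.
using CompressedRef = uint32_t;        // A spilled 32-bit reference; 0 models null.
using JObjectSketch = CompressedRef*;  // Modeled as "address of the spilled slot".

// Returns the jobject-like value for a reference already spilled to `slot`:
// nullptr for a null reference, otherwise the address of the slot itself.
inline JObjectSketch ToJObject(CompressedRef* slot) {
  assert(slot != nullptr);  // The slot must exist; only its content may be null.
  return (*slot == 0u) ? nullptr : slot;
}
```

Doing this conversion at spill time means the value written to the outgoing
stack slot is already the final one, so no second pass over the stack
arguments is needed.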

Improve arm64 JNI assembler to use LDP/STP in the relevant
situations when spilling and loading registers.
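
Illustration only, not the assembler code from this change: a rough sketch
of how two 64-bit spills to adjacent stack slots could be fused into one
STP. The `Spill` struct and `EmitSpills()` are hypothetical, the output is
pseudo-assembly text, and the STP immediate range limits are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one spill: store 64-bit register x<reg> at [sp, #offset].
struct Spill {
  int reg;
  int32_t offset;
};

// Emits pseudo-assembly for the spills in order. When the next spill targets the
// slot directly after the current one (offset + 8), both are emitted as one STP.
std::vector<std::string> EmitSpills(const std::vector<Spill>& spills) {
  constexpr int32_t kSlotSize = 8;  // Size of one 64-bit stack slot in bytes.
  std::vector<std::string> out;
  for (size_t i = 0; i < spills.size(); ++i) {
    const Spill& a = spills[i];
    assert(a.offset % kSlotSize == 0);  // This sketch assumes aligned slots.
    if (i + 1 < spills.size() && spills[i + 1].offset == a.offset + kSlotSize) {
      const Spill& b = spills[i + 1];
      out.push_back("stp x" + std::to_string(a.reg) + ", x" + std::to_string(b.reg) +
                    ", [sp, #" + std::to_string(a.offset) + "]");
      ++i;  // The second spill of the pair has been consumed.
    } else {
      out.push_back("str x" + std::to_string(a.reg) +
                    ", [sp, #" + std::to_string(a.offset) + "]");
    }
  }
  return out;
}
```

Passing all moves to the assembler in one call is what makes this kind of
pairing possible, since the assembler can then see neighbouring stores
together.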

Fix arm JNI assembler that called `CreateJObject()` with
a bogus input register in one case.

Golem results for art-opt-cc (higher is better):
linux-ia32                       before after
NativeDowncallStaticNormal6      25.074 25.578 (+2.011%)
NativeDowncallStaticNormalRefs6  25.248 25.248 (0%)
NativeDowncallVirtualNormal6     24.913 25.248 (+1.344%)
NativeDowncallVirtualNormalRefs6 25.074 25.086 (+0.0482%)
linux-x64                        before after
NativeDowncallStaticNormal6      27.000 26.987 (-0.0500%)
NativeDowncallStaticNormalRefs6  25.411 25.411 (0%)
NativeDowncallVirtualNormal6     25.248 25.086 (-0.6395%)
NativeDowncallVirtualNormalRefs6 25.086 25.074 (-0.0492%)
linux-armv7                      before after
NativeDowncallStaticNormal6      5.9259 6.0663 (+2.368%)
NativeDowncallStaticNormalRefs6  5.6232 5.7061 (+1.474%)
NativeDowncallVirtualNormal6     5.3659 5.4536 (+1.636%)
NativeDowncallVirtualNormalRefs6 5.0879 5.1595 (+1.407%)
linux-armv8                      before after
NativeDowncallStaticNormal6      6.0663 6.2651 (+3.277%)
NativeDowncallStaticNormalRefs6  5.7279 5.8824 (+2.696%)
NativeDowncallVirtualNormal6     5.9494 6.0663 (+1.964%)
NativeDowncallVirtualNormalRefs6 5.5581 5.6630 (+1.888%)
(The x86 and x86-64 differences seem to be lost in noise.)

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: run-gtests.sh
Test: testrunner.py --target --optimizing --jit
Bug: 172332525
Change-Id: Iaba8244c44d410bb1a4e31f90e4387ee5cc51bec
3 files changed