Do not create HandleScope for JNI transitions.

We previously crated a HandleScope in the JNI transition
frame to hold references passed as jobject (jclass, etc.)
to the native function and these references were actually
spilled twice during the transition.

We now construct the jobject as a pointer to the reference
spilled in the reserved out vreg area in the caller's frame.
And the jclass for static methods is just a pointer to the
method's declaring class. This reduces the amount of work
required in the JNI transition, both on entry (in compiled
stubs) and exit (in JniMethodEnd*).

Some additional work is required when GC visits references
of a native method as we need to walk over the method's
shorty which was unnecessary for a HandleScope.

Also fix Thread::InitStackHwm() to calculate correct stack
size needed by the new Thread::IsJniTransitionReference().

The results for StringToBytesBenchmark on blueline little
cores running at fixed frequency 1420800 are approximately
arm64 (medians from 3 runs) before after
timeGetBytesAscii EMPTY     447.33 436.86
timeGetBytesIso88591 EMPTY  440.52 431.13
timeGetBytesUtf8 EMPTY      432.31 409.82
arm (medians from 3 runs)   before after
timeGetBytesAscii EMPTY     500.53 490.87
timeGetBytesIso88591 EMPTY  496.45 495.30
timeGetBytesUtf8 EMPTY      488.84 472.68

Test: m test-art-host-gtest
Test: testrunner.py --host
Test: testrunner.py --host --gcstress
Test: testrunner.py --host --jit-on-first-use
Test: testrunner.py --host --jit-on-first-use --gcstress
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Test: boots.
Bug: 172332525
Change-Id: I658f9d87071587b3e89f31c65feca976a11e9cc2
43 files changed