Age | Commit message (Collapse) | Author |
|
Also remove MIPS assembler/disassembler support.
Test: aosp_taimen-userdebug boots.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 147346243
Change-Id: Id736074b97cd04987a7902741828b119508df1c0
|
|
Omit managed frame for @CriticalNative methods, do not check
for exceptions and and make a tail call when possible.
Pass the method pointer in a hidden argument to prepare for
implementing late binding for @CriticalNative methods.
This changes only the JNI compiler, Generic JNI shall be
updated in a separate change.
Performance improvements reported by Golem (art-opt-cc):
x86 x86-64 arm arm64
NativeDowncallStaticCritical6 +17% +50% +88% +139%
NativeDowncallStaticCritical +37% +32% +103% +216%
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: aosp_taimen-userdebug boots.
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 112189621
Change-Id: I5758c8f478627f2eee8f615b4537a907c211b9f8
|
|
Generated by clang-tidy, with IgnoreDestructors option enabled.
Test: m checkbuild
Bug: 116509795
Change-Id: I5dafa10c2cf605165581b8cf7dd2633ed101ed65
|
|
Handles compiler.
Bug: 116054210
Test: WITH_TIDY=1 mmma art
Change-Id: I5cdfe73c31ac39144838a2736146b71de037425e
|
|
Remove all uses of macros 'FINAL' and 'OVERRIDE' and replace them with
'final' and 'override' specifiers. Remove all definitions of these
macros as well, which were located in these files:
- libartbase/base/macros.h
- test/913-heaps/heaps.cc
- test/ti-agent/ti_macros.h
ART is now using C++14; the 'final' and 'override' specifiers have
been introduced in C++11.
Test: mmma art
Change-Id: I256c7758155a71a2940ef2574925a44076feeebf
|
|
Remove runtime/globals.h and make clients point to the right globals.h
(libartbase/base/globals.h). Also make within-libartbase includes
relative rather than using base/, etc.
Bug: 22322814
Test: make -j 40 checkbuild
Change-Id: I99de63fc851d48946ab401e2369de944419041c7
|
|
Test: ./testrunner.py --target --optimizing in QEMU
Test: mma test-art-host-gtest
Change-Id: I6ce5bdc86f951094f656c2f81ae8fc836d7a0b5c
|
|
This is mostly for clarity and future work.
This fixes the following:
- aui has an out reg, not an in/out reg
- maddv.df, msubv.df, fmadd.df, fmsub.df have an
in/out reg, not a simply out reg
This also ensures consistent marking of even-numbered 32-bit
FPRs used by FPR load and store instructions (odd-numbered
32-bit FPRs remain unmarked as if there are no paired FPRs;
we don't use odd-numbered 32-bit FPRs to hold single-precision
values).
Test: test-art-host-gtest
Test: booted MIPS32R2 in QEMU
Test: booted MIPS64R6 in QEMU
Test: testrunner.py --target --optimizing --32
(on CI20 and MIPS32R6)
Test: test-art-target-gtest32
(on CI20 and MIPS32R6)
Change-Id: I408b8ac063c9b1cc6f036dda095d1e3e1e2e1ef1
|
|
These instructions are needed for implementing Sum-of-Abs-Differences
visitor.
Test: mma test-art-host-gtest
Change-Id: Ie708f30a450b0558215f59f21bb49b68c852f247
|
|
These instructions are needed for SIMD reduction.
Also added assembler tests for each instruction.
Test: mma test-art-host-gtest
Change-Id: I0f02618a14b4cbcc3b81ce51dd2586fa4cdbfd18
|
|
Test: test-art-host-gtest
Test: booted MIPS32R2 in QEMU
Test: testrunner.py --target --optimizing --32
Test: repeat all of the above with suppressed generation
of HMipsPackedSwitch
Change-Id: Ic8a27d88cd2d7eebaf5826ce8fd1a5607a024844
|
|
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 25.1MiB -> 21.1MiB
BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB
This is because all the memory previously used by Scheduler
is reused by the register allocator; the register allocator
has a higher peak usage of the ArenaStack.
And continue the "arena"->"allocator" renaming.
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
|
|
CriticalNative methods shall not be suspended and hence do not
require MR to be refreshed in compiled JNI code.
This change is for ARM and ARM64 only.
Impact on Critical Native benchmarks times (median of 10 runs,
lower is better):
* angler-userdebug - ARMv7
** All cores
NativeDowncallStaticCritical -2.78%
NativeDowncallStaticCritical6 -1.79%
** Little cores only
NativeDowncallStaticCritical -1.66%
NativeDowncallStaticCritical6 -1.27%
** Big cores only
NativeDowncallStaticCritical -2.66%
NativeDowncallStaticCritical6 -1.70%
* angler-userdebug - ARMv8
** All cores
NativeDowncallStaticCritical -3.52%
NativeDowncallStaticCritical6 -1.79%
** Little cores only
NativeDowncallStaticCritical -1.63%
NativeDowncallStaticCritical6 -1.27%
** Big cores only
NativeDowncallStaticCritical -3.87%
NativeDowncallStaticCritical6 -1.75%
Test: m test-art-target
Test: m test-art-target with tree built with ART_USE_READ_BARRIER=false
Test: m test-art-host-gtest
Test: ARM64 device boot test
Test: ARM device boot test
Bug: b/37707231
Change-Id: I95d61b9ecde0afffdd5fd44763b19caa06025ec8
|
|
Remove mostly-unused include and move it to its users.
Test: m
Change-Id: Ibb40f919db64a490290c6e18cf1123aaf44199fc
|
|
Test: test-art-host-gtest
Test: booted MIPS64 (with 2nd arch MIPS32R6) in QEMU
Test: test-art-target-gtest32
Test: testrunner.py --target --optimizing --32
Test: same tests as above on CI20
Test: booted MIPS32R2 in QEMU
Change-Id: I7e1ba59993008014d0115ae20c56e0a71fef0fb0
|
|
The bulk of the change is in the assemblers and their
tests.
The main goal is to introduce "bare" branches to labels
(as opposed to the existing bare branches with relative
offsets, whose direct use we want to eliminate).
These branches' delay/forbidden slots are filled
manually and these branches do not promote to long (the
branch target must be within reach of the individual
branch instruction).
The secondary goal is to add more branch tests (mainly
for bare vs non-bare branches and a few extra) and
refactor and reorganize the branch test code a bit.
The third goal is to improve idiom recognition in the
disassembler, including branch idioms and a few others.
Further details:
- introduce bare branches (R2 and R6) to labels, making
R2 branches available for use on R6
- make use of the above in the code generators
- align beqz/bnez with their GNU assembler encoding to
simplify and shorten the test code
- update the CFI test because of the above
- add trivial tests for bare and non-bare branches
(addressing existing debt as well)
- add MIPS32R6 tests for long beqc/beqzc/bc (debt)
- add MIPS64R6 long beqzc test (debt)
- group branch tests together
- group constant/literal/address-loading tests together
- make the disassembler recognize:
- b/beqz/bnez (beq/bne with $zero reg)
- nal (bltzal with $zero reg)
- bal/bgezal (bal = bgezal with $zero reg)
- move (or with $zero reg)
- li (ori/addiu with $zero reg)
- dli (daddiu with $zero reg)
- disassemble 16-bit immediate operands (in andi, ori,
xori, li, dli) as signed or unsigned as appropriate
- drop unused instructions (bltzl, bltzall, addi) from
the disassembler as there are no plans to use them
Test: test-art-host-gtest
Test: booted MIPS64 (with 2nd arch MIPS32R6) in QEMU
Test: test-art-target-gtest
Test: testrunner.py --target --optimizing
Test: same tests as above on CI20
Test: booted MIPS32R2 in QEMU
Change-Id: I62b74a6c00ce0651528114806ba24a59ba564a73
|
|
Added maddv.df, msubv.df, fmadd.df and fmsub.df MSA instructions
in assembler, disassembler and tests.
These instructions are needed for multiplyaccumulate support in
ART Vectorizer.
Test: mma test-art-host-gtest
Change-Id: Idef7faaeed47f1fef83fa58676ce664afe24ffe8
|
|
Test: booted MIPS64 (with 2nd arch MIPS32R6) in QEMU
Test: test-art-target-gtest
Test: testrunner.py --target --optimizing
Test: same tests as above on CI20
Test: booted MIPS32 and MIPS64 in QEMU with poisoning
in configurations:
- with Baker read barrier thunks
- without Baker read barrier thunks
- ART_READ_BARRIER_TYPE=TABLELOOKUP
Change-Id: I79f320bf8862a04215c76cfeff3118ebc87f7ef2
|
|
MIPS32 implementation which uses MSA extension.
Note: Testing is done with checker parts of tests 640, 645, 646 and
651, locally changed to cover MIPS32 cases. These changes can't
be included in this patch since MSA is not a default option.
Test: ./testrunner.py --target --optimizing -j1 in QEMU (mips32r6)
Change-Id: Ieba28f94c48c943d5444017bede9a5d409149762
|
|
Test: mma test-art-host-gtest
Test: ./testrunner.py --optimizing --target in QEMU (MIPS)
Change-Id: I90b7baa1d5f910887bcc3ab80a1a48391ba80c45
|
|
|
|
Move iterator code from bit_utils.h into bit_utils_iterator.h. Move
Identity into stl_util_identity.h. Remove now unnecessary includes,
and fix up transitive users.
Test: m
Change-Id: Id1ce9cda66827c5d00584f39ed310b6b37629906
|
|
Added a number of MSA (The MIPS SIMD Architecture) instructions.
Added assembler tests for each instruction.
Test: mma test-art-host-gtest
Change-Id: I1d499309fc08923484f64d1883b9c3f95eadd3be
|
|
For MIPS32R6 replace instances of "sll/addu" to calculate the
address of an item in an array with "lsa". For other versions of
MIPS32 use the "sll/addu" sequence. Encapsulate this logic in an
assembler method to eliminate having a lot of statements like
"if (IsR6()) { ... } else { ... }" scattered throughout the code.
MIPS64 always supports R6. This means that all instances of
"dsll/daddu" used to calculate the address of an item in an array
can be replaced by "dlsa" so there is no need to encapsulate
conditional logic in a special method. The code can just emit
"dlsa" directly.
Test: mma -j2 ART_TEST_OPTIMIZING=true test-art-target-run-test
Tested on MIPS32, and MIPS64 QEMU.
Test: "make test-art-target-gtest32" on CI20 board.
Test: "cd art; test/testrunner/testrunner.py --target --optimizing --32"
on CI20 board.
Change-Id: Ibe5facc1bc2a6a7a6584e23d3a48e163ae38077d
|
|
Rationale: on MIPS64 64-bit loads and stores may be performed
as pairs of 32-bit loads/stores. Implicit null checks must be
associated with the first 32-bit load/store in a pair and not
the last. This change ensures proper association of said checks
(a few were done after the last 32-bit load/store in a pair)
and lays ground for further improvements in array/field get/set.
Additionally ported to MIPS32.
Test: mma test-art-target-run-test in QEMU
Test: mma test-art-host-gtest
Change-Id: If2612df62c21522959e69c637a36cc4ea962a32e
|
|
This is in preparation for read barrier support.
Bug: 12687968
Test: test-art-host-gtest
Test: booted MIPS32R2 in QEMU
Test: test-art-target
Test: booted MIPS64 (with 2nd arch MIPS32R6) in QEMU
Test: test-art-target (both MIPS64R6 and MIPS32R6)
Note: built with ART_HEAP_POISONING=true.
Change-Id: I0e6e04ff8de2fc8ca6126388409fa218e6920734
|
|
Use memcpy(3) to copy characters under the assumption that memcpy()
has been hand optimized for best performance on the platform being
tested.
Test: run-test --optimizing 020-string
Test: run-test 020-string
Test: run-test --no-prebuild --optimizing 020-string
Test: run-test --no-prebuild 020-string
Test: run-test --optimizing 082-inline-execute
Test: run-test 082-inline-execute
Test: run-test --no-prebuild --optimizing 082-inline-execute
Test: run-test --no-prebuild 082-inline-execute
Test: mma -j2 ART_TEST_OPTIMIZING=true test-art-target-run-test
Test: booted MIPS32R2 emulator.
Note: Tested against both the MIPS32R2, and MIPS64R6 emulators.
Change-Id: I4192cf6244db120c8de5cc4932d4132acfc9740d
|
|
Test: booted MIPS32R2 in QEMU
Test: test-art-target-run-test-optimizing (MIPS32R2) on CI20
Test: booted MIPS64 (with 2nd arch MIPS32R6) in QEMU
Test: test-art-target-run-test-optimizing (MIPS32R6) in QEMU
Test: test-art-host-gtest
Change-Id: I8a8127d8d29cb5df84ed6f4fd4478f8d889e5cb7
|
|
Static method dispatch via JNI requires a read barrier
for the ArtMethod::GetDeclaringClass() load before adding it to the
JNI StackHandleScope.
We used to call ReadBarrierJni unconditionally but add a branch
to skip calling it if the GC is not currently in the marking phase.
Test: ART_USE_READ_BARRIER=true make test-art-host test-art-target
Bug: 30437917
Change-Id: I4f505ebde17c0a67209c7bb51b3f39e37a06373a
|
|
Test: booted MIPS32 in QEMU
Test: test-art-target-run-test-optimizing on CI20
Test: test-art-host-gtest
Change-Id: Ifcf8c1e215e3768711c391e8da6f663bba71f8d9
|
|
Test: booted MIPS32R2 in QEMU
Test: test-art-target-run-test-optimizing (MIPS32R2) on CI20
Test: booted MIPS64 (with 2nd arch MIPS32R6) in QEMU
Test: test-art-target-run-test-optimizing (MIPS32R6) in QEMU
Test: test-art-host-gtest
Change-Id: I2e1a65ff1ba9406b84351ba7998f853b1ce4aef9
|
|
Test: booted MIPS32 in QEMU
Test: test-art-host-gtest
Test: test-art-target-gtest
Test: test-art-target-run-test-optimizing on CI20
Change-Id: I727e80753395ab99fff004cb5d2e0a06409150d7
|
|
Rationale: on MIPS32 64-bit loads and stores may be performed
as pairs of 32-bit loads/stores. Implicit null checks must be
associated with the first 32-bit load/store in a pair and not
the last. This change ensures proper association of said checks
(a few were done after the last 32-bit load/store in a pair)
and lays ground for further improvements in array/field get/set.
Test: booted MIPS32 in QEMU
Test: test-art-host-gtest
Test: test-art-target-run-test-optimizing in QEMU
Change-Id: I3674947c00bb17930790a7a47c9b7aadc0c030b8
|
|
Extract macro assembler functionality used by the JNI compiler from
the assembler interface. Templatize the new interface so that
type safety ensures correct usage.
Change-Id: Idb9f56e5b87e43ee6a7378853d8a9f01abe156b2
Test: m test-art-host
|
|
Move away from size_t to dedicated enum (class).
Bug: 30373134
Bug: 30419309
Test: m test-art-host
Change-Id: Id453c330f1065012e7d4f9fc24ac477cc9bb9269
|
|
Tested:
- MIPS32 Android boots in QEMU
- test-art-host-gtest
- test-art-target-run-test-optimizing in QEMU, on CI20
- test-art-target-gtest on CI20
Change-Id: I70fd5d5267f8594c3b29d5a4ccf66b8ca8b09df3
|
|
Improvements include:
- CodeGeneratorMIPS::GenerateStaticOrDirectCall() supports:
- MethodLoadKind::kDirectAddressWithFixup (via literals)
- CodePtrLocation::kCallDirectWithFixup (via literals)
- MethodLoadKind::kDexCachePcRelative
- 32-bit literals to support the above (not ready for general-
purpose applications yet because RA is not saved in leaf
methods, but is clobbered on MIPS32R2 when simulating
PC-relative addressing (MIPS32R6 is OK because it has
PC-relative addressing with the lwpc instruction))
- shorter instruction sequences for recursive static/direct
calls
Tested:
- test-art-host-gtest
- test-art-target-gtest and test-art-target-run-test-optimizing on:
- MIPS32R2 QEMU
- CI20 board
- MIPS32R6 (2nd arch) QEMU
Change-Id: Id5b137ad32d5590487fd154c9a01d3b3e7e044ff
|
|
|
|
Change-Id: I6c3773e8bc1233bcda83d5b7254438ef69e9570d
|
|
Precalculate callee saves at compile time and return them
as ArrayRef<> instead of keeping then in a std::vector<>.
Change-Id: I4fd7d2bbf6138dc31b0fe8554eac35b0777ec9ef
|
|
And clean up some APIs to return std::unique_ptr<> instead
of raw pointers that don't communicate ownership.
Change-Id: I3017302307a0253d661240750298802fb0d9585e
|
|
Change-Id: Ie871763b9a36075fd3d70ee6e2e241ae1ccc36cf
|
|
- abs(double) - abs(float) - abs(int)
- abs(long) - max(double, double) - max(float, float)
- max(int, int) - max(long, long) - min(double, double)
- min(float, float) - min(int, int) - min(long, long)
- sqrt(double)
The math intrinsics:
- ceil(double) - floor(double) - rint(double)
- round(double) - round(float)
aren't implemented because they require instructions which only exist
for MIPS64, or for MIPS32r6.
Change-Id: I943be3592b52a423fcb7ac40f46f38a5e2a58c50
|
|
- byte libcore.io.Memory.peekByte(long address)
- short libcore.io.Memory.peekShort(long address)
- int libcore.io.Memory.peekInt(long address)
- long libcore.io.Memory.peekLong(long address)
- void libcore.io.Memory.pokeByte(long address, byte value)
- void libcore.io.Memory.pokeShort(long address, short value)
- void libcore.io.Memory.pokeInt(long address, int value)
- void libcore.io.Memory.pokeLong(long address, long value)
- char java.lang.String.charAt(int index)
Change-Id: I5ff30b61d87313d00f0fd3f0ee09f1c454f9c9fa
|
|
32-bit FPUs."
|
|
with 32-bit FPUs.
Change-Id: If66932fb39cdd5946f6c05c82036191ad405a877
|
|
Change-Id: I767fe9623cc14e8480c31e305725eb5221cac282
|
|
Specifically:
- Use the delay slot in InvokeRuntime() for direct entry points
- Use kNoOutputOverlap wherever possible
- Improve and/or/xor/add/sub with 64-bit integer constants
- Improve 64-bit shifts by a constant amount on R2+
- More efficient load/store of 64-bit constants (especially, 0 & +0.0)
Change-Id: I86d2217c8b5b8e2a9371effc2ce38b9eec62782b
|
|
This also does a minor clean-up in the assembler and
its test.
Bug: 25559148
Change-Id: I9bad3c500b592a09013b56745f70752eb284a842
|
|
|