Age | Commit message (Collapse) | Author |
|
Add some functions from `BitVector` to `BitVectorView<>`
and use this to speed up the `SsaLivenessAnalysis` pass.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 331194861
Change-Id: I5bca36da7b4d53a3d5e60f45530168cf20290120
|
|
After the old implementation was renamed in
https://android-review.googlesource.com/2526708 ,
we introduce a new function with the old name but new
behavior, just `DCHECK()`-ing the instruction kind before
casting down the pointer. We change appropriate calls from
`As##type##OrNull()` to `As##type()` to avoid unncessary
run-time checks and reduce the size of libart-compiler.so.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 181943478
Change-Id: I025681612a77ca2157fed4886ca47f2053975d4e
|
|
The null type check in the current implementation of
`HInstruction::As##type()` often cannot be optimized away
by clang++. It is therefore beneficial to have two functions
HInstruction::As##type()
HInstruction::As##type##OrNull()
where the first function never returns null but the second
one can return null. The additional text "OrNull" shall also
flag the possibility of yielding null to the developer which
may help avoid bugs similar to what we have seen previously.
This requires renaming the existing function that can return
null and introducing new function that cannot. However,
defining the new function `HInstruction::As##type()` in the
same change as renaming the old one would risk introducing
bugs by missing a rename. Therefore we simply rename the old
function here and the new function shall be introduced in a
separate change with all behavioral changes being explicit.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: buildbot-build.sh --target
Bug: 181943478
Change-Id: I4defd85038e28fe3506903ba3f33f723682b3298
|
|
Also so some style cleanup.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I34304acb39bc5197dde03543a6c157b3c319f94f
|
|
This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0.
Reason for revert: Reland after some of the required work
was merged in other CLs.
Also address a TODO from the original CL to mark required
symbols with EXPORT in `intrinsic_objects.h`.
Also mark symbols in new files as HIDDEN.
Bug: 186902856
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
|
|
Follow-up to aosp/1988868 in which we added the (D)CHECK_IMPLIES
macro. This CL uses it on compiler/ occurrences found by a regex.
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: If63aed969bfb8b31d6fbbcb3bca2b04314c894b7
|
|
ART vectorizer assumes that there is single size of SIMD
register used for the whole program. Make this assumption explicit
and refactor the code.
Note: This is a base for the future introduction of SIMD slots of
size other than 8 or 16 bytes.
Test: test-art-target, test-art-host.
Change-Id: Id699d5e3590ca8c655ecd9f9ed4e63f49e3c4f9c
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
|
|
For SIMD graphs allocate 64 bit instead of 128 bit on stack for
each FP register to be preserved by the callee in the frame entry
as ABI suggests (currently 64-bit registers are preserved but
more space on stack is allocated).
Note: slow paths still require spilling full 128-bit Q-Registers
for SIMD graphs due to register allocator restrictions.
Test: test-art-target.
Change-Id: Ie0b12e4b769158445f3d0f4562c70d4fb0ea7744
|
|
Handles compiler.
Bug: 116054210
Test: WITH_TIDY=1 mmma art
Change-Id: I5cdfe73c31ac39144838a2736146b71de037425e
|
|
Rationale:
Currently we have some remaining ugliness around signed and unsigned
SIMD operations due to lack of kUint32 and kUint64 in the HIR. By
"softly" introducing these types, ABS/MIN/MAX/HALVING_ADD/SAD_ACCUMULATE
operations can solely rely on the packed data types to distinguish
between signed and unsigned operations. Cleaner, and also allows for
some code removal in the current loop optimizer.
Bug: 72709770
Test: test-art-host test-art-target
Change-Id: I68e4cdfba325f622a7256adbe649735569cab2a3
|
|
Reuse the memory previously allocated on the ArenaStack by
optimization passes.
This CL handles only the architecture-independent codegen
and slow paths, architecture-dependent codegen allocations
shall be moved to the ScopedArenaAllocator in a follow-up.
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 19.6MiB -> 18.5MiB (-1189KiB)
BatteryStats.dumpLocked(): 39.3MiB -> 37.0MiB (-2379KiB)
Also move definitions of functions that use bit_vector-inl.h
from bit_vector.h also to bit_vector-inl.h .
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 64312607
Change-Id: I84688c3a5a95bf90f56bd3a150bc31fedc95f29c
|
|
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 25.1MiB -> 21.1MiB
BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB
This is because all the memory previously used by Scheduler
is reused by the register allocator; the register allocator
has a higher peak usage of the ArenaStack.
And continue the "arena"->"allocator" renaming.
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
|
|
This CL adds all the necessary codegen for the Uint8 type
but does not add code transformations that use that code.
Vectorization codegens are modified to use Uint8 as the
packed type when appropriate. The side effects are now
disconnected from the instruction's type after the graph has
been built to allow changing HArrayGet/H*FieldGet/HVecLoad
to use a type different from the underlying field or array.
Note: HArrayGet for String.charAt() is modified to have
no side effects whatsoever; Strings are immutable.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: testrunner.py --target --optimizing on Nexus 6P
Test: Nexus 6P boots.
Bug: 23964345
Change-Id: If2dfffedcfb1f50db24570a1e9bd517b3f17bfd0
|
|
Replace most uses of the runtime's Primitive in compiler
with a new class DataType. This prepares for introducing
new types, such as Uint8, that the runtime does not need
to know about.
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 23964345
Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
|
|
Test: m test-art-host-gtest
Test: testrunner.py --host
Change-Id: I2b720e2ed8f96303cf80e9daa6d5278bf0c3da2f
|
|
Rationale:
The last ART vectorizer break-out CL \O/
This ensures spilling on x86 and x86_4 is correct.
Also, it paves the way to wider SIMD on ARM and MIPS.
Test: test-art-host
Bug: 34083438
Change-Id: I5b27d18c2045f3ab70b64c335423b3ff2a507ac2
|
|
Rationale:
This prepares requesting a different number of spill slots
during SIMD vectorization.
Bug: 34083438
Test: test-art-host, test-art-host-gtest-register_allocator_test
Change-Id: I6d22966ba483deec72b5eea5061c403c12b2ada7
|
|
Normal and environment use positions are held in separate
lists and the code never mixes them together. By using two
separate classes, we can reduce complexity and avoid an
unnecesary data member, reducing the memory usage.
Tracking allocations for a certain big app, the peak arena
memory usage is
before:
MEM: used: 79245960, ...
SsaLiveness 31221600
after:
MEM: used: 78754024, ...
SsaLiveness 30729664
Test: testrunner.py --host
Bug: 34053922
Change-Id: I02d3c9f564bbe3b1da0e03c33cf7c0f810f235dc
|
|
The class linker now tracks whether a method has a single implementation
and if so, the JIT compiler will try to devirtualize a virtual call for
the method into a direct call. If the single-implementation assumption
is violated due to additional class linking, compiled code that makes the
assumption is invalidated. Deoptimization is triggered for compiled code
live on stack. Instead of patching return pc's on stack, a CHA guard is
added which checks a hidden should_deoptimize flag for deoptimization.
This approach limits the number of deoptimization points.
This CL does not devirtualize abstract/interface method invocation.
Slides on CHA:
https://docs.google.com/a/google.com/presentation/d/1Ax6cabP1vM44aLOaJU3B26n5fTE9w5YU-1CRevIDsBc/edit?usp=sharing
Change-Id: I18bf716a601b6413b46312e925a6ad9e4008efa4
Test: ART_TEST_JIT=true m test-art-host/target-run-test test-art-host-gtest
|
|
barriers.""
This reverts commit 4a3aa578eff94eb10450fae1772deb7cb8ddc6a6.
The failing assertion (see b/30762467):
08-09 11:32:46.767 1654 1656 F dex2oatd: art/compiler/optimizing/register_allocation_resolver.cc:325] Check failed: interval->GetDefinedBy()->IsActualObject() IntermediateAddress@InstanceFieldGet
that motivated the initial revert has been removed by a
previous CL (commit
70e97462116a47ef2e582ea29a037847debcc029,
https://android-review.googlesource.com/#/c/254920/).
Test: ART host and target (ARM, ARM64) tests with `ART_USE_READ_BARRIER=true`.
Bug: 26601270
Bug: 12687968
Change-Id: I09cae0c6c38ca403924153e9f0eb0cc3ff4540e7
|
|
Rationale:
Ownership of graph's linear order and iterators was
a bit unclear now that other phases are using it.
New approach allows phases to compute their own
order, while ssa_liveness is sole owner for graph
(since it is not mutated afterwards).
Also shortens lifetime of loop's arena.
Test: test-art-host
Change-Id: Ib7137d1203a1e0a12db49868f4117d48a4277f30
|
|
Reducing the frame size makes stack maps smaller as we need
fewer bits for stack masks and some dex register locations
may use short location kind rather than long. On Nexus 9,
AOSP ToT, the boot.oat size reduction is
prebuilt multi-part boot image:
- 32-bit boot.oat: -416KiB (-0.6%)
- 64-bit boot.oat: -635KiB (-0.9%)
prebuilt multi-part boot image with read barrier:
- 32-bit boot.oat: -483KiB (-0.7%)
- 64-bit boot.oat: -703KiB (-0.9%)
on-device built single boot image:
- 32-bit boot.oat: -380KiB (-0.6%)
- 64-bit boot.oat: -632KiB (-0.9%)
on-device built single boot image with read barrier:
- 32-bit boot.oat: -448KiB (-0.6%)
- 64-bit boot.oat: -692KiB (-0.9%)
The other benefit is that at runtime, threads may need fewer
pages for their stacks, reducing overall memory usage.
We defer the calculation of the maximum spill size from
the main register allocator (linear scan or graph coloring)
to the RegisterAllocationResolver and do it based on the
live registers at slow path safepoints. The old notion of
an artificial slow path safepoint interval is removed as
it is no longer needed.
Test: Run ART test suite on host and Nexus 9.
Bug: 30212852
Change-Id: I40b3d114e278e2c5807982904fa49bf6642c6275
|
|
Change-Id: I245111eabb43368875c1215ca4f3a1f1918492fe
|
|
Test: m test-art-host
Change-Id: Ie82c2802f76f27512ef922ba583caeccf5675063
|