path: root/compiler/optimizing/register_allocation_resolver.cc
Age | Commit message | Author
2025-02-22  Optimizing: Speed up SSA Liveness Analysis.  Vladimir Marko

Add some functions from `BitVector` to `BitVectorView<>` and use this to speed up the `SsaLivenessAnalysis` pass.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 331194861
Change-Id: I5bca36da7b4d53a3d5e60f45530168cf20290120
2023-04-27  Optimizing: Add `HInstruction::As##type()`.  Vladimir Marko

After the old implementation was renamed in https://android-review.googlesource.com/2526708 , we introduce a new function with the old name but new behavior, just `DCHECK()`-ing the instruction kind before casting down the pointer. We change appropriate calls from `As##type##OrNull()` to `As##type()` to avoid unnecessary run-time checks and reduce the size of libart-compiler.so.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 181943478
Change-Id: I025681612a77ca2157fed4886ca47f2053975d4e
2023-04-27  Optimizing: Rename `As##type` to `As##type##OrNull`.  Vladimir Marko

The null type check in the current implementation of `HInstruction::As##type()` often cannot be optimized away by clang++. It is therefore beneficial to have two functions

    HInstruction::As##type()
    HInstruction::As##type##OrNull()

where the first function never returns null but the second one can return null. The additional text "OrNull" shall also flag the possibility of yielding null to the developer, which may help avoid bugs similar to what we have seen previously.

This requires renaming the existing function that can return null and introducing a new function that cannot. However, defining the new function `HInstruction::As##type()` in the same change as renaming the old one would risk introducing bugs by missing a rename. Therefore we simply rename the old function here and the new function shall be introduced in a separate change with all behavioral changes being explicit.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: buildbot-build.sh --target
Bug: 181943478
Change-Id: I4defd85038e28fe3506903ba3f33f723682b3298
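A minimal sketch of the two-function pattern described in these two CLs, using a made-up `HAdd` subclass and plain `assert()` in place of ART's `DCHECK()`; in the real code the helpers are generated by macros for every instruction kind:

    #include <cassert>

    // Simplified stand-ins for the ART classes; only the downcast helpers matter here.
    class HAdd;

    class HInstruction {
     public:
      virtual ~HInstruction() = default;
      virtual bool IsAdd() const { return false; }

      // Never returns null: the caller must already know the kind (checked in debug builds).
      HAdd* AsAdd();
      // May return null: the caller tests the result.
      HAdd* AsAddOrNull();
    };

    class HAdd : public HInstruction {
     public:
      bool IsAdd() const override { return true; }
    };

    HAdd* HInstruction::AsAdd() {
      assert(IsAdd());  // Debug-only kind check instead of a run-time null path.
      return static_cast<HAdd*>(this);
    }

    HAdd* HInstruction::AsAddOrNull() {
      return IsAdd() ? static_cast<HAdd*>(this) : nullptr;
    }

    int main() {
      HAdd add;
      HInstruction* instruction = &add;
      // Kind unknown: use the OrNull variant and test the result.
      if (HAdd* maybe_add = instruction->AsAddOrNull()) {
        (void)maybe_add;
      }
      // Kind already established (e.g. via IsAdd()): the plain variant skips the null path.
      if (instruction->IsAdd()) {
        HAdd* known_add = instruction->AsAdd();
        (void)known_add;
      }
      return 0;
    }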
2023-04-26  Remove unnecessary `HInstruction::As##type()` calls.  Vladimir Marko

Also do some style cleanup.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I34304acb39bc5197dde03543a6c157b3c319f94f
2022-11-07  Reland "Make compiler/optimizing/ symbols hidden."  Vladimir Marko

This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0.

Reason for revert: Reland after some of the required work was merged in other CLs. Also address a TODO from the original CL to mark required symbols with EXPORT in `intrinsic_objects.h`. Also mark symbols in new files as HIDDEN.

Bug: 186902856
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
2022-02-25  Update compiler/ implications to use (D)CHECK_IMPLIES  Santiago Aboy Solanes

Follow-up to aosp/1988868 in which we added the (D)CHECK_IMPLIES macro. This CL uses it on compiler/ occurrences found by a regex.

Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: If63aed969bfb8b31d6fbbcb3bca2b04314c894b7
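A small illustration of the implication check. The macro below only mirrors the semantics (fail when the antecedent holds and the consequent does not) with a plain `assert()`, and the variable names are made up; it is not the actual ART definition:

    #include <cassert>

    // Stand-in mirroring the semantics of (D)CHECK_IMPLIES: the check fires
    // only when `antecedent` is true and `consequent` is false.
    #define DCHECK_IMPLIES(antecedent, consequent) assert(!(antecedent) || (consequent))

    int main() {
      bool is_fixed_input = false;  // Hypothetical flags for illustration.
      bool has_register = true;
      // Previously this would be spelled DCHECK(!is_fixed_input || has_register);
      // the macro states the intended implication directly.
      DCHECK_IMPLIES(is_fixed_input, has_register);
      return 0;
    }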
2020-04-17  ART: Refactor SIMD slots and regs size processing.  Artem Serov

The ART vectorizer assumes that there is a single size of SIMD register used for the whole program. Make this assumption explicit and refactor the code.

Note: This is a base for the future introduction of SIMD slots of size other than 8 or 16 bytes.

Test: test-art-target, test-art-host.
Change-Id: Id699d5e3590ca8c655ecd9f9ed4e63f49e3c4f9c
2019-10-14  Revert "Make compiler/optimizing/ symbols hidden."  Vladimir Marko

This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.

Reason for revert: Breaks ASAN tests (ODR violation).

Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
2019-10-14  Make compiler/optimizing/ symbols hidden.  Vladimir Marko

Make symbols in compiler/optimizing hidden by a namespace attribute. The unit intrinsic_objects.{h,cc} is excluded as it is needed by dex2oat. As the symbols are no longer exported, gtests are now linked with the static version of the libartd-compiler library.

libart-compiler.so size:
- before: arm: 2396152, arm64: 3345280
- after:  arm: 2016176 (-371KiB, -15.9%), arm64: 2874480 (-460KiB, -14.1%)

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
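A rough sketch of the hiding technique on GCC/Clang. The `HIDDEN`/`EXPORT` spellings below are illustrative shorthands for the visibility attributes, not necessarily the exact macros ART defines:

    // Attribute shorthands (illustrative; ART defines its own macros for these).
    #define HIDDEN __attribute__((visibility("hidden")))
    #define EXPORT __attribute__((visibility("default")))

    // Everything declared inside the namespace defaults to hidden visibility,
    // so it is not exported from the shared library and can be dropped or folded.
    namespace art HIDDEN {

    int InternalHelper() { return 42; }

    // Individually re-exported because another module (e.g. dex2oat) needs it.
    EXPORT int PublicEntryPoint() { return InternalHelper(); }

    }  // namespace art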
2019-08-02  ART: ARM64: Optimize frame size for SIMD graphs.  Artem Serov

For SIMD graphs, allocate 64 bits instead of 128 bits on the stack for each FP register to be preserved by the callee in the frame entry, as the ABI suggests (currently 64-bit registers are preserved but more space on the stack is allocated).

Note: slow paths still require spilling full 128-bit Q-registers for SIMD graphs due to register allocator restrictions.

Test: test-art-target.
Change-Id: Ie0b12e4b769158445f3d0f4562c70d4fb0ea7744
2018-12-27  ART: Refactor for bugprone-argument-comment  Andreas Gampe

Handles compiler.

Bug: 116054210
Test: WITH_TIDY=1 mmma art
Change-Id: I5cdfe73c31ac39144838a2736146b71de037425e
2018-02-01  Clean up signed/unsigned in vectorizer.  Aart Bik

Rationale: Currently we have some remaining ugliness around signed and unsigned SIMD operations due to the lack of kUint32 and kUint64 in the HIR. By "softly" introducing these types, ABS/MIN/MAX/HALVING_ADD/SAD_ACCUMULATE operations can rely solely on the packed data types to distinguish between signed and unsigned operations. Cleaner, and also allows for some code removal in the current loop optimizer.

Bug: 72709770
Test: test-art-host test-art-target
Change-Id: I68e4cdfba325f622a7256adbe649735569cab2a3
2017-10-17  Use ScopedArenaAllocator for code generation.  Vladimir Marko

Reuse the memory previously allocated on the ArenaStack by optimization passes. This CL handles only the architecture-independent codegen and slow paths; architecture-dependent codegen allocations shall be moved to the ScopedArenaAllocator in a follow-up.

Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image:
- BatteryStats.dumpCheckinLocked(): 19.6MiB -> 18.5MiB (-1189KiB)
- BatteryStats.dumpLocked(): 39.3MiB -> 37.0MiB (-2379KiB)

Also move definitions of functions that use bit_vector-inl.h from bit_vector.h to bit_vector-inl.h.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 64312607
Change-Id: I84688c3a5a95bf90f56bd3a150bc31fedc95f29c
2017-10-09  Use ScopedArenaAllocator for register allocation.  Vladimir Marko

Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image:
- BatteryStats.dumpCheckinLocked(): 25.1MiB -> 21.1MiB
- BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB

This is because all the memory previously used by Scheduler is reused by the register allocator; the register allocator has a higher peak usage of the ArenaStack.

And continue the "arena"->"allocator" renaming.

Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
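The gist of the reuse, sketched with simplified stand-in classes rather than the real `ArenaStack`/`ScopedArenaAllocator` API: each pass allocates from a shared bump-pointer buffer and rewinds it when the pass's scope ends, so a later pass reuses the bytes the earlier pass released:

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Simplified stand-ins illustrating the idea only; the real ART classes
    // have a different interface and chain multiple arenas.
    class ArenaStackSketch {
     public:
      explicit ArenaStackSketch(size_t capacity) : buffer_(capacity), top_(0u) {}

      void* Alloc(size_t bytes) {
        assert(top_ + bytes <= buffer_.size());
        void* result = buffer_.data() + top_;
        top_ += bytes;
        return result;
      }

      size_t Top() const { return top_; }
      void Rewind(size_t mark) { top_ = mark; }

     private:
      std::vector<char> buffer_;
      size_t top_;
    };

    class ScopedArenaAllocatorSketch {
     public:
      explicit ScopedArenaAllocatorSketch(ArenaStackSketch* stack)
          : stack_(stack), mark_(stack->Top()) {}
      // Destroying the scope releases everything the pass allocated.
      ~ScopedArenaAllocatorSketch() { stack_->Rewind(mark_); }

      void* Alloc(size_t bytes) { return stack_->Alloc(bytes); }

     private:
      ArenaStackSketch* const stack_;
      const size_t mark_;
    };

    int main() {
      ArenaStackSketch stack(1u << 20);
      {
        ScopedArenaAllocatorSketch scheduler_allocator(&stack);
        scheduler_allocator.Alloc(4096);  // Scheduler's temporary data.
      }  // Rewound here.
      {
        ScopedArenaAllocatorSketch regalloc_allocator(&stack);
        regalloc_allocator.Alloc(8192);  // Register allocation reuses the same bytes.
      }
      return 0;
    }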
2017-10-03  ART: Introduce Uint8 compiler data type.  Vladimir Marko

This CL adds all the necessary codegen for the Uint8 type but does not add code transformations that use that code. Vectorization codegens are modified to use Uint8 as the packed type when appropriate.

The side effects are now disconnected from the instruction's type after the graph has been built to allow changing HArrayGet/H*FieldGet/HVecLoad to use a type different from the underlying field or array.

Note: HArrayGet for String.charAt() is modified to have no side effects whatsoever; Strings are immutable.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Test: testrunner.py --target --optimizing on Nexus 6P
Test: Nexus 6P boots.
Bug: 23964345
Change-Id: If2dfffedcfb1f50db24570a1e9bd517b3f17bfd0
2017-09-25  ART: Introduce compiler data type.  Vladimir Marko

Replace most uses of the runtime's Primitive in compiler with a new class DataType. This prepares for introducing new types, such as Uint8, that the runtime does not need to know about.

Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 23964345
Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
2017-06-01  Use IntrusiveForwardList<> for Env-/UsePosition.  Vladimir Marko

Test: m test-art-host-gtest
Test: testrunner.py --host
Change-Id: I2b720e2ed8f96303cf80e9daa6d5278bf0c3da2f
2017-03-23  Implement a SIMD spilling slot.  Aart Bik

Rationale: The last ART vectorizer break-out CL \O/ This ensures spilling on x86 and x86_64 is correct. Also, it paves the way to wider SIMD on ARM and MIPS.

Test: test-art-host
Bug: 34083438
Change-Id: I5b27d18c2045f3ab70b64c335423b3ff2a507ac2
2017-03-22  Change 1/2 spill slots to more general number of spill slots.  Aart Bik

Rationale: This prepares requesting a different number of spill slots during SIMD vectorization.

Bug: 34083438
Test: test-art-host, test-art-host-gtest-register_allocator_test
Change-Id: I6d22966ba483deec72b5eea5061c403c12b2ada7
2017-03-13  Introduce EnvUsePosition for liveness analysis.  Vladimir Marko

Normal and environment use positions are held in separate lists and the code never mixes them together. By using two separate classes, we can reduce complexity and avoid an unnecessary data member, reducing the memory usage.

Tracking allocations for a certain big app, the peak arena memory usage is
before: MEM: used: 79245960, ... SsaLiveness 31221600
after:  MEM: used: 78754024, ... SsaLiveness 30729664

Test: testrunner.py --host
Bug: 34053922
Change-Id: I02d3c9f564bbe3b1da0e03c33cf7c0f810f235dc
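One way to picture the split, using hypothetical simplified structs with `std::forward_list` standing in for ART's `IntrusiveForwardList<>`: instead of a single use-position class that carries an "is environment" flag, normal and environment uses get their own classes and their own lists, so the flag member disappears:

    #include <cstddef>
    #include <forward_list>

    // Hypothetical simplified classes; the real UsePosition/EnvUsePosition
    // carry more state and live in intrusive lists.
    struct UsePosition {
      size_t position;     // Lifetime position of the use.
      size_t input_index;  // Which input of the user reads the interval.
    };

    struct EnvUsePosition {
      size_t position;
      size_t input_index;
      // No is_environment flag needed: membership in the env list says it all.
    };

    struct LiveIntervalSketch {
      std::forward_list<UsePosition> uses;
      std::forward_list<EnvUsePosition> env_uses;
    };

    int main() {
      LiveIntervalSketch interval;
      interval.uses.push_front({/*position=*/10u, /*input_index=*/0u});
      interval.env_uses.push_front({/*position=*/12u, /*input_index=*/3u});
      return 0;
    }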
2016-12-01  Class Hierarchy Analysis (CHA)  Mingyao Yang

The class linker now tracks whether a method has a single implementation and if so, the JIT compiler will try to devirtualize a virtual call for the method into a direct call. If the single-implementation assumption is violated due to additional class linking, compiled code that makes the assumption is invalidated. Deoptimization is triggered for compiled code live on stack. Instead of patching return pc's on stack, a CHA guard is added which checks a hidden should_deoptimize flag for deoptimization. This approach limits the number of deoptimization points.

This CL does not devirtualize abstract/interface method invocation.

Slides on CHA: https://docs.google.com/a/google.com/presentation/d/1Ax6cabP1vM44aLOaJU3B26n5fTE9w5YU-1CRevIDsBc/edit?usp=sharing

Change-Id: I18bf716a601b6413b46312e925a6ad9e4008efa4
Test: ART_TEST_JIT=true m test-art-host/target-run-test test-art-host-gtest
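A conceptual sketch of the guard in plain C++ (illustrative names only, not generated code): while `Base::Foo` has a single known implementation, the call is emitted as a direct call preceded by a test of a flag that the runtime flips when later class loading invalidates the assumption:

    // Conceptual illustration of CHA-based devirtualization, not the ART runtime API.
    struct Base {
      virtual ~Base() = default;
      virtual int Foo() { return 1; }
    };

    // The single implementation known at compile time.
    struct OnlyImpl : Base {
      int Foo() override { return 2; }
    };

    // Stand-in for the hidden per-frame flag the runtime sets when the
    // single-implementation assumption is violated.
    static bool should_deoptimize = false;

    int CallFooDevirtualized(Base* receiver) {
      if (should_deoptimize) {
        // CHA guard failed: give up the assumption (ART deoptimizes here).
        return receiver->Foo();
      }
      // Assumption still holds: call the single implementation directly,
      // bypassing the vtable dispatch.
      return static_cast<OnlyImpl*>(receiver)->OnlyImpl::Foo();
    }

    int main() {
      OnlyImpl impl;
      return CallFooDevirtualized(&impl);  // Returns 2 via the direct call.
    }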
2016-11-04  Revert "Revert "Enable IntermediateAddress for primitive arrays with read barriers.""  Roland Levillain

This reverts commit 4a3aa578eff94eb10450fae1772deb7cb8ddc6a6.

The failing assertion (see b/30762467):

08-09 11:32:46.767  1654  1656 F dex2oatd: art/compiler/optimizing/register_allocation_resolver.cc:325] Check failed: interval->GetDefinedBy()->IsActualObject() IntermediateAddress@InstanceFieldGet

that motivated the initial revert has been removed by a previous CL (commit 70e97462116a47ef2e582ea29a037847debcc029, https://android-review.googlesource.com/#/c/254920/).

Test: ART host and target (ARM, ARM64) tests with `ART_USE_READ_BARRIER=true`.
Bug: 26601270
Bug: 12687968
Change-Id: I09cae0c6c38ca403924153e9f0eb0cc3ff4540e7
2016-10-05  Refactoring of graph linearization and linear order.  Aart Bik

Rationale: Ownership of the graph's linear order and iterators was a bit unclear now that other phases are using it. The new approach allows phases to compute their own order, while ssa_liveness is the sole owner for the graph (since it is not mutated afterwards). Also shortens the lifetime of the loop's arena.

Test: test-art-host
Change-Id: Ib7137d1203a1e0a12db49868f4117d48a4277f30
2016-09-05  Avoid excessive spill slots for slow paths.  Vladimir Marko

Reducing the frame size makes stack maps smaller as we need fewer bits for stack masks and some dex register locations may use short location kind rather than long. On Nexus 9, AOSP ToT, the boot.oat size reduction is

prebuilt multi-part boot image:
- 32-bit boot.oat: -416KiB (-0.6%)
- 64-bit boot.oat: -635KiB (-0.9%)
prebuilt multi-part boot image with read barrier:
- 32-bit boot.oat: -483KiB (-0.7%)
- 64-bit boot.oat: -703KiB (-0.9%)
on-device built single boot image:
- 32-bit boot.oat: -380KiB (-0.6%)
- 64-bit boot.oat: -632KiB (-0.9%)
on-device built single boot image with read barrier:
- 32-bit boot.oat: -448KiB (-0.6%)
- 64-bit boot.oat: -692KiB (-0.9%)

The other benefit is that at runtime, threads may need fewer pages for their stacks, reducing overall memory usage.

We defer the calculation of the maximum spill size from the main register allocator (linear scan or graph coloring) to the RegisterAllocationResolver and do it based on the live registers at slow path safepoints. The old notion of an artificial slow path safepoint interval is removed as it is no longer needed.

Test: Run ART test suite on host and Nexus 9.
Bug: 30212852
Change-Id: I40b3d114e278e2c5807982904fa49bf6642c6275
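A toy version of the deferred calculation, under made-up types (the real resolver works from register allocator and `LocationSummary` data): the number of slow-path spill slots reserved in the frame is the maximum, over all slow path safepoints, of the registers live at that safepoint:

    #include <algorithm>
    #include <bitset>
    #include <cstddef>
    #include <vector>

    // Hypothetical simplified representation of one slow path safepoint.
    struct SlowPathSafepoint {
      std::bitset<32> live_core_registers;
      std::bitset<32> live_fp_registers;
    };

    // Reserve only as many spill slots as the busiest safepoint needs,
    // instead of a worst case computed up front.
    size_t ComputeMaxSlowPathSpillSlots(const std::vector<SlowPathSafepoint>& safepoints) {
      size_t max_slots = 0u;
      for (const SlowPathSafepoint& safepoint : safepoints) {
        size_t slots = safepoint.live_core_registers.count() +
                       safepoint.live_fp_registers.count();
        max_slots = std::max(max_slots, slots);
      }
      return max_slots;
    }

    int main() {
      std::vector<SlowPathSafepoint> safepoints(2);
      safepoints[0].live_core_registers = 0b0111;  // 3 live core registers.
      safepoints[1].live_fp_registers = 0b0011;    // 2 live FP registers.
      return static_cast<int>(ComputeMaxSlowPathSpillSlots(safepoints));  // 3
    }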
2016-07-20  Fix accidental pass-by-value  Matthew Gharrity

Change-Id: I245111eabb43368875c1215ca4f3a1f1918492fe
2016-07-19  Refactor SSA deconstruction into its own class  Matthew Gharrity

Test: m test-art-host
Change-Id: Ie82c2802f76f27512ef922ba583caeccf5675063