path: root/compiler/optimizing/nodes.cc
Age Commit message Author
2025-02-27 Speed up DCE, CFRE and `ReplaceUsesDominatedBy()`... Vladimir Marko
... by using `BitVectorView<>`. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 331194861 Change-Id: If22934aeae82b21ebf9fc20e817d0958bd6edec8
2025-02-21 Speed up `HGraph::BuildDominatorTree()`. Vladimir Marko
Add some functions from `BitVector` to `BitVectorView<>` and use this to speed up `HGraph::BuildDominatorTree()`. Also clean up code sinking. This was a missed opportunity in https://android-review.googlesource.com/3500455 . Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 331194861 Change-Id: Iec03db8b44af38c549447ccfa0bf8dab731b550d
2025-02-20 Introduce `BitVectorView<>`. Vladimir Marko
Initially implement only simple bit getter and setters and use the new class to avoid overheads of `ArenaBitVector` in a few places. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 331194861 Change-Id: Ie29dfcd02286770e07131e43b65e6e9fb044a924
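As a rough illustration of what a view with only simple bit getters and setters might look like, here is a minimal sketch. The class name `SimpleBitView`, its members and the 32-bit word size are assumptions made for this example and are not taken from ART's `BitVectorView<>`; the sketch only assumes that the view does not own its storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a non-owning bit view.
// This is NOT ART's BitVectorView<>; names and layout are assumptions.
class SimpleBitView {
 public:
  static constexpr size_t kBitsPerWord = 32;

  // Number of 32-bit words needed to hold `bits` bits.
  static constexpr size_t WordsForBits(size_t bits) {
    return (bits + kBitsPerWord - 1) / kBitsPerWord;
  }

  // The caller owns `storage` and must keep it alive while the view is used.
  SimpleBitView(uint32_t* storage, size_t size_in_bits)
      : storage_(storage), size_in_bits_(size_in_bits) {}

  bool IsBitSet(size_t index) const {
    assert(index < size_in_bits_);
    return (storage_[index / kBitsPerWord] & (1u << (index % kBitsPerWord))) != 0u;
  }

  void SetBit(size_t index) {
    assert(index < size_in_bits_);
    storage_[index / kBitsPerWord] |= 1u << (index % kBitsPerWord);
  }

  void ClearBit(size_t index) {
    assert(index < size_in_bits_);
    storage_[index / kBitsPerWord] &= ~(1u << (index % kBitsPerWord));
  }

 private:
  uint32_t* storage_;    // Not owned.
  size_t size_in_bits_;  // Fixed size; the view never grows.
};
```

Because the size is fixed and the storage is borrowed, each access in this sketch is a bounds assertion plus one word operation, with no allocation or resizing logic.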
2025-02-17 Optimizing: Rename `GetNextInstructionId()`. Vladimir Marko
Rename to `AllocateInstructionId()` because it's clearly not a simple getter. Annotate it as `ALWAYS_INLINE`. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 181943478 Change-Id: I618181a05e8cf1e2d1808611af8bdb6ab4f55e3c
2025-02-17 Optimizing: Speed up `HInstruction::Add{,Env}UseAt()`. Vladimir Marko
Avoid three dependent loads to fetch the allocator on the hot paths. Inline the `FixupUserRecordsAfter*UseInsertion()` loop and use the fact that it's known to execute exactly one or two iterations. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 181943478 Change-Id: I7fd4d48caebc6aeb13fb9a9f8146a06129c72b2e
2025-02-13 Use HInstructionIteratorHandleChanges again in RTP Santiago Aboy Solanes
In http://r.android.com/2952876 we changed the RTP iterator to HInstructionIterator, but we want the HandleChanges one. Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Change-Id: I1e9e9cc84d45aa34c24a805f16798e86fd123fc3
2025-02-10 Optimize RemoveInstruction Santiago Aboy Solanes
We can skip two ifs as they are implied from previous ifs. Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Change-Id: Ia6088887832117791b82b07a2c31d2f9b8bf8b58
2025-01-17 Optimizing: Fix `InsertInputAt()`. Vladimir Marko
Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I500faee42b02dbc72474e30fa2a3c0388ae86674
2024-12-03 cleanup: Remove extra SetGraph calls Santiago Aboy Solanes
DeleteDeadEmptyBlock does SetGraph(nullptr) as the last step. Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Change-Id: I8a155d22ff62d55bd07c1f3733e8891c2f9fe3de
2024-11-21 Allow the inliner to devirtualize intrinsics Santiago Aboy Solanes
To do so update: * TryReplaceStringBuilderAppend * Code paths relevant to previously InvokeVirtual that are now InvokeStaticOrDirect * checker tests. Bug: 369206455 Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Change-Id: I4d40980e416f3130d3c344c5f07b7b331deb5c97
2024-10-30 cleanup FixUpInstructionType Santiago Aboy Solanes
It was expecting to have an HSelect as an input, so we might as well encode that with the C++ types. Rename it to FixUpSelectType. Also move the instructions around so that the ScopedObjectAccess scope is smaller. As a drive-by, move ScopedObjectAccess to CheckAgainstUpperBound so that we only call it in a subset of cases. Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Change-Id: Id6f179e68e0a460577d5e42b8c431f3d035405a4
2024-10-11 Refactor `HandleCache` out of `nodes.{h,cc}`. Vladimir Marko
Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I4989657309b46c1e8fec3e9eb4024f1fc329fbe0
2024-10-11 Move `HCondition` creation function to `HCondition`. Vladimir Marko
Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: Ibf7d27af872bf0bc9a91d1698d66047947b513f3
2024-10-11 Do not record dex PC in constant HIR. Vladimir Marko
Due to the deduplication of constants, the dex PC can be useless or even misleading. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I501abc3cca920415b3118e92b06a01b173b2406a
2024-10-11 Refactor `ReferenceTypeInfo` out of `nodes.{h,cc}`. Vladimir Marko
Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I0c2f310606bb03f264038534c23f15dd0fee5662
2024-10-07 Reland "Calculate the number of out vregs." Vladimír Marko
This reverts commit 434a327234f74eed3ef4072314d2e2bdb73e4dda. Reason for revert: Relanding with no change. The regressions that were the reason for the revert may reappear. However, these regressions are probably caused by subtle effects that are not directly related to this change. For example, a code size improvement can regress performance simply by moving the start of a loop from an aligned address to an unaligned address, or by splitting a loop across two cache lines. Bug: 358519867 Bug: 359722268 Change-Id: I997b8a4219418f79b3a5fc4e7e50817911f0a737
2024-10-04 Reduce memory used by `HEnvironment`. Vladimir Marko
Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 181943478 Change-Id: Ie05d4001e411a669e11b8edda375414e5da52ae2
2024-09-13 riscv64: Add node Rol, fix InstructionBuilder Anton Romanov
This reverts commit 744830cb242c82c4637e6fb303b36d0371c84979. Reason for revert: updated CHECKer test to use rolw instead of rol. Change-Id: I50e34c6ac69488a9c083f04c6382df4302e8e7d3
2024-09-11 Revert "riscv64: Add node Rol, fix InstructionBuilder" Nicolas Geoffray
This reverts commit 39927bc359ccbe65371213c4559126b05dcfb117. Reason for revert: Failure on bot with: error: Statement could not be matched starting from line 1089612 TestRotate.java:95: rol {{a\d+}}, {{a\d+}}, {{a\d+}} ISA_FEATURES = {'rv64gcv_zba_zbb_zbs': True} READ_BARRIER_TYPE = baker 567-checker-builder-intrinsics FAILED: [run-test:1074] CFG checker failed $ ssh -q -F /b/s/w/ir/cache/builder/art/test/testrunner/ssh_config -p 10001 ubuntu@localhost "rm -rf /home/ubuntu/art-test-chroot/data/run-test/test-343039" 567-checker-builder-intrinsics files deleted from host and from target ---------- test-art-target-run-test-ndebug-prebuild-optimizing-no-relocate-ntrace-cms-checkjni-picimage-ndebuggable-no-jvmti-567-checker-builder-intrinsics64 Change-Id: Ic1fd87c331c9eba315af6c98c3ad393766327417
2024-09-10 riscv64: Add node Rol, fix InstructionBuilder Anton Romanov
Add new IR node Rol (rotate left). This allows generating one RISC-V instruction from the Integer(Long).rotateLeft intrinsic instead of two instructions (Ror+Neg). Fix InstructionBuilder: build Rol from rotateLeft instead of Ror+Neg. Add unfolding of the Rol node to Neg+Ror in InstructionSimplifier (Arm, Arm64 and X86 Int64 type). Compiling all the methods of the applications below with dex2oat found 1 Ror+Neg pattern in Facebook and 5 Ror+Neg patterns in Minecraft. Test: art/test/testrunner/testrunner.py --target --64 --ndebug --optimizing Change-Id: Ic28610c6fab4f66386f2fbc0f7223ef2c0e644b6
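The equivalence this change relies on, that a left rotation equals a right rotation by the negated distance, can be checked with a small standalone sketch. The function names below are made up for illustration and are not ART code; the distance is masked to the low five bits, as Java's `Integer.rotateLeft` does.

```cpp
#include <cassert>
#include <cstdint>

// Rotate right by `distance` bits; only the low 5 bits of the distance are used.
constexpr uint32_t RotateRight32(uint32_t value, uint32_t distance) {
  uint32_t shift = distance & 31u;
  return (shift == 0u) ? value : ((value >> shift) | (value << (32u - shift)));
}

// Direct rotate left, corresponding to a single Rol operation.
constexpr uint32_t RotateLeft32(uint32_t value, uint32_t distance) {
  uint32_t shift = distance & 31u;
  return (shift == 0u) ? value : ((value << shift) | (value >> (32u - shift)));
}

// Rotate left expressed as Neg + Ror: negate the distance, then rotate right.
// The negation is done in unsigned arithmetic, so it wraps and cannot overflow.
constexpr uint32_t RotateLeftViaNegRor32(uint32_t value, uint32_t distance) {
  uint32_t negated = 0u - distance;
  return RotateRight32(value, negated);
}
```

Both forms yield the same value for every distance, which is why a backend without a rotate-left instruction can unfold Rol into Neg+Ror, while one that has it can emit a single instruction.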
2024-09-02 cleanup: change Set/GetIntrinsic in ArtMethod to use Intrinsics Santiago Aboy Solanes
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I253a6bfe6bba7e02e527722c4632cb60938fe1c6
2024-09-02 Typo fix: instrinsic -> intrinsic Santiago Aboy Solanes
Change-Id: I6116b792d156970cefc277d2ea6af05627917d09
2024-08-21 Revert "Calculate the number of out vregs." Vladimír Marko
This reverts commit 3e75615ad25b6af1842b194e78b429b0f585b46a. Reason for revert: Regressed some micro-benchmarks, see bug 359722268. Bug: 358519867 Bug: 359722268 Change-Id: I207cc78c88193564e90c98eda2c96a5ba354a588
2024-08-14 Clean up condition simplification. Vladimir Marko
Leave condition construction in the `HGraph` but move the rest of the condition simplification code to the simplifier where it belongs. Also clean up simplifier tests and a few other gtests. Note that `SuperblockClonerTest.IndividualInstrCloner` now clones an additional `HGoto` from the entry block. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I73ee8c227c1c100ac7eb9d4a3813c61ad928b6dd
2024-08-13 Calculate the number of out vregs. Vladimir Marko
Determine the number of out vregs needed by invokes that actually make a call, and by `HStringBuilderAppend`s. This can yield smaller frame sizes of compiled methods when some calls are inlined or fully intrinsified. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 358519867 Change-Id: I4930a9bd811b1de14658f5ef44e65eadea6a7961
2024-03-25 Remove extra uses of ClearAllBits Santiago Aboy Solanes
ArenaBitVector creation guarantees it starts empty. Add a debug check to make sure this assumption doesn't change. Note that ArenaAllocator guarantees zero-initialized memory but ScopedArenaAllocators do not. This is fine either way since the BitVector constructor calls ClearAllBits. Bug: 329037671 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: Icbf5e5dd1869e80b5d5828ecca9f13de30c0242b
2024-03-12 Remove default cases when all cases are defined Santiago Aboy Solanes
Bug: 328756212 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Test: m test-art-host-gtest Change-Id: I9584e1b93e49265b84a9e45c8b283ebaf8ad3eb2
2024-02-09 Clean up `HGraphVisitor::VisitBasicBlock()`. Vladimir Marko
Skip `HPhi::Accept()` and add functions to visit only Phis or only non-Phi instructions. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 181943478 Change-Id: Iba0690ae70f46d6a2bafa9055b2ae5167e58a2f4
2024-01-30 Speed up HConstantFoldingVisitor::PropagateValue Santiago Aboy Solanes
We can speed it up in two ways: 1) Don't call it if it has exactly one element, as we will never be able to replace its use in the if clause 2) Lazily compute the dominated blocks when needed Compiling locally GMS, HConstantFoldingVisitor::VisitIf goes down from 1.8% of the compile time to 0.7%. Most of this improvement (90%+) is coming from the `1)` optimization. This is because there are many cases where we have only one use (the if), which is in the same block so we compute the domination to always end up not doing the optimization. Bug: 278626992 Test: Locally compile gms Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: Ic17b4b44840c7efa0224504031bf635584850ced
2024-01-29 Optimizing: Remove block reachability information. Vladimir Marko
This code is dead after Partial LSE removal. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 298176183 Change-Id: If67efa9d1df908232b6c2f32f3d2c64fb91759ae
2024-01-12 Speed up HInstruction::ReplaceUsesDominatedBy Santiago Aboy Solanes
We can pre-calculate the dominated blocks so that the dominance calculation is faster overall. Methods which were affected by the previous implementation see 10x speedup with this CL. Bug: 319118425 Test: Locally compile the app in the bug Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I0e3b84485ea7082bec348b6852b5164e06aa7829
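One possible reading of "pre-calculate the dominated blocks" is sketched below: walk the dominator tree once from the dominating block and record every block reached, so that each later "is this use dominated?" query becomes a single lookup. The representation (block indices and a `dom_children` child list per block) is an assumption for this example and does not mirror ART's data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical representation: dom_children[b] lists the blocks whose
// immediate dominator is block b. Returns a table where entry i is true
// if block i is dominated by `dominator` (a block dominates itself).
std::vector<bool> ComputeDominatedBlocks(
    const std::vector<std::vector<size_t>>& dom_children, size_t dominator) {
  std::vector<bool> dominated(dom_children.size(), false);
  std::vector<size_t> worklist{dominator};
  while (!worklist.empty()) {
    size_t block = worklist.back();
    worklist.pop_back();
    dominated[block] = true;
    // Everything below `block` in the dominator tree is dominated as well.
    for (size_t child : dom_children[block]) {
      worklist.push_back(child);
    }
  }
  return dominated;
}
```

The table is built once per dominating block, so the cost of the tree walk is shared across all uses that are checked against it.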
2023-11-27 Simplify boxing followed by unboxing. Vladimir Marko
Also mark boxing `valueOf()` intrinsics as never null to avoid creating unnecessary `HNullCheck` instructions. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I86e7721e3af6c59407aa2ddfc1bd11bd2fdac83c
2023-10-16 Add a new helper RecomputeDominatorTree Santiago Aboy Solanes
It clears loop and dominance information, and builds the dominator tree. It also dchecks that we are not calling this method with irreducible loops, as it is not supported. When adding this helper we found a partial LSE bug as it was recomputing dominator tree for irreducible loops. Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Bug: 304749506 Change-Id: Ia4cc72cd19779ad881fa686e52b43679fe5a64d3
2023-10-10 Add optimization to simplify Select+Binary/Unary ops Santiago Aboy Solanes
We can simplify a Select + Binary/Unary Op if: * Both inputs to the Select instruction are constant, and * The Select instruction is not used in another instruction to avoid duplicating Selects. * In the case of Binary ops, both inputs can't be Select. Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: Ic716155e9a8515126c2867bb1d54593fa63011ae
2023-08-18 An instruction cannot be found before itself Santiago Aboy Solanes
If we call FoundBefore with the same instruction in both parameters, we will return true instead of false. By swapping the two ifs inside the for, we can achieve what we want. FoundBefore is used to sort in code sinking, and otherwise we would not be covering irreflexivity, so we wouldn't have a strict weak ordering. Bug: 296538046 Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Change-Id: I0abab631842ed5e1442437608aaa16c106fe37a8
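The effect of swapping the two checks can be seen in a small sketch. The `Instr` type, the vector-based instruction order and the fallback return value are assumptions for this example; only the order of the two `if` statements reflects what the commit message describes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal instruction type, for illustration only.
struct Instr {
  int id;
};

// Returns true if `first` appears strictly before `second` in `order`.
bool FoundBefore(const std::vector<const Instr*>& order,
                 const Instr* first,
                 const Instr* second) {
  for (const Instr* current : order) {
    // `second` is checked first, so FoundBefore(x, x) returns false.
    // With the checks in the opposite order it would return true.
    if (current == second) {
      return false;
    }
    if (current == first) {
      return true;
    }
  }
  // Assumption for this sketch: neither instruction is in the list.
  return false;
}
```

A comparator passed to `std::sort` must be irreflexive (`comp(x, x)` is false) to be a strict weak ordering, which is the property the reordered checks restore.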
2023-07-14 Clean up ART intrinsics. Vladimir Marko
Change `intrinsics_list.h` to a normal include file instead of the weird include-use-and-undef pattern. Prefix macros defined in that file with `ART_`. And also remove blank lines at end of some files and address some comments on merged changes. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 283082089 Change-Id: I9c462f973c0c4bb53eff39fbe191014f6321d7c5
2023-07-12 Support autovectorization of diamond loops. Artem Serov
This CL enables predicated autovectorization of loops with control flow, currently only for simple diamond pattern ones:

    header------------------+
      |                     |
    diamond_hif             |
     /      \               |
diamond_true diamond_false  |
     \      /               |
    back_edge               |
      |                     |
      +---------------------+

Original author: Artem Serov <Artem.Serov@linaro.org> Test: ./art/test.py --host --optimizing --jit Test: ./art/test.py --target --optimizing --jit Test: 661-checker-simd-cf-loops. Test: target tests on arm64 with SVE (for details see art/test/README.arm_fvp). Change-Id: I8dbc266278b4ab074b831d6c224f02024030cc8a
2023-04-27 Optimizing: Add `HInstruction::As##type()`. Vladimir Marko
After the old implementation was renamed in https://android-review.googlesource.com/2526708 , we introduce a new function with the old name but new behavior, just `DCHECK()`-ing the instruction kind before casting down the pointer. We change appropriate calls from `As##type##OrNull()` to `As##type()` to avoid unnecessary run-time checks and reduce the size of libart-compiler.so. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Bug: 181943478 Change-Id: I025681612a77ca2157fed4886ca47f2053975d4e
2023-04-27 Optimizing: Rename `As##type` to `As##type##OrNull`. Vladimir Marko
The null type check in the current implementation of `HInstruction::As##type()` often cannot be optimized away by clang++. It is therefore beneficial to have two functions, `HInstruction::As##type()` and `HInstruction::As##type##OrNull()`, where the first function never returns null but the second one can return null. The additional text "OrNull" shall also flag the possibility of yielding null to the developer, which may help avoid bugs similar to what we have seen previously. This requires renaming the existing function that can return null and introducing a new function that cannot. However, defining the new function `HInstruction::As##type()` in the same change as renaming the old one would risk introducing bugs by missing a rename. Therefore we simply rename the old function here and the new function shall be introduced in a separate change with all behavioral changes being explicit. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: buildbot-build.sh --target Bug: 181943478 Change-Id: I4defd85038e28fe3506903ba3f33f723682b3298
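The difference between the two accessors can be sketched as follows. The classes here (`Instruction`, `Add`, `Goto`) are simplified stand-ins invented for this example, and plain `assert` stands in for a debug-only check; the real functions are generated per instruction type by macros and are not reproduced here.

```cpp
#include <cassert>

class Add;

// Hypothetical, simplified instruction hierarchy for illustration.
class Instruction {
 public:
  enum class Kind { kAdd, kGoto };

  explicit Instruction(Kind kind) : kind_(kind) {}

  bool IsAdd() const { return kind_ == Kind::kAdd; }

  // May return null: the caller is expected to check the result.
  Add* AsAddOrNull();

  // Never returns null: the kind is only checked by a debug assertion,
  // so no run-time branch remains in builds where assertions are disabled.
  Add* AsAdd();

 private:
  Kind kind_;
};

class Add : public Instruction {
 public:
  Add() : Instruction(Kind::kAdd) {}
};

class Goto : public Instruction {
 public:
  Goto() : Instruction(Kind::kGoto) {}
};

inline Add* Instruction::AsAddOrNull() {
  return IsAdd() ? static_cast<Add*>(this) : nullptr;
}

inline Add* Instruction::AsAdd() {
  assert(IsAdd());
  return static_cast<Add*>(this);
}
```

Call sites that have already established the kind can use the asserting variant, while call sites that use the cast as a type test keep the `OrNull` variant and its explicit null check.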
2023-04-27 Reland "Don't enable intrinsic optimizations in debuggable runtime" Mythri Alle
This reverts commit b5fcab944b3786f27ab6b698685109bfc7f785fd. Reason for revert: test/988 is a CTS test and we shouldn't modify the Main to do any real work other than calling run. Also there's no way to call ensureJitCompiled from atests, so restoring 988 to original and adding another test for testing JIT tracing Bug: 279547861 Test: test.py -t 988, 2263 Change-Id: I0908c29996a550b93ba6c38f99460ff0d51a2964
2023-04-26 Remove unnecessary `HInstruction::As##type()` calls. Vladimir Marko
Also do some style cleanup. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I34304acb39bc5197dde03543a6c157b3c319f94f
2023-04-25 Revert "Don't enable intrinsic optimizations in debuggable runtime" Mythri Alle
This reverts commit cb008914fbc5a2334e3c00366afdb5f8af5a23ba. Reason for revert: Failures on some configs https://buganizer.corp.google.com/issues/279562617 Change-Id: I4d26cd00e76d8ec4aef76ab26987418eab24d217
2023-04-25 Don't enable intrinsic optimizations in debuggable runtime Mythri Alle
We have optimizations that generate code inline for intrinsics instead of leaving them as invoke for better performance. Some debug features like method entry / exit or setting a breakpoint on intrinsics wouldn't work if intrinsics are inlined. So disable those optimizations in debuggable runtimes. Also update 988-method-trace test to test intrinsics on JITed code. Test: art/test.py -t 988 Bug: 279547861 Change-Id: Ic7c61d1b1541ff534faa24ccec5c2d0b574b0537
2023-02-22 Set more RTI only if they are valid Santiago Aboy Solanes
Follow-up to aosp/2442280. We haven't seen crashes with these ones, but we can't guarantee that the RTI will be valid in these code paths. Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I80da85a6549ba0275a80027016363e0cf9fb8045
2023-01-13 Update the graph flags and check consistency Santiago Aboy Solanes
Check that the flags are up to date in graph checker. Mainly a correctness check CL but it brings slight code size reduction (e.g. not needing vreg info if HasMonitorOperations is false). Update loop_optimization_test to stop using `LocalRun` directly as it meant that it was breaking assumptions (i.e. top_loop_ was nullptr when it was expected to have a value). Bug: 264278131 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I29765b3be46d4bd7c91ea9c80f7565a3c88fae2e
2022-12-16 Set HasMonitorOperations in the outer graph when inlining Santiago Aboy Solanes
If the callee graph has monitor operations, the outer graph will have them too when inlining. If we don't do this, e.g. we will skip adding the needed vreg info https://cs.android.com/android/platform/superproject/+/master:art/compiler/optimizing/code_generator.cc;l=1193;drc=434d968b4af0bc8af9889170250bee3e08839bea. Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I97817ea93c12727bb7d198fb5abea21c32d181c9
2022-12-14 Move adding extra goto blocks to InlineInto Santiago Aboy Solanes
There are some cases in which we need to add extra goto blocks when inlining to avoid critical edges. The `TryBoundary` of `kind:exit` instructions will always have more than one successor (normal flow and exceptional flow). If its normal flow successor has more than one predecessor, we would be introducing a critical edge. We can avoid the critical edge in InlineInto instead of doing it in the builder which helps decoupling those two stages as well as simplifying surrounding code. We also have the benefit of adding the extra goto blocks only when necessary. Bug: 227283224 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: Ibe21623c94c798f7cea60ff892064e63a38a787a
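The critical-edge condition described above (a block with more than one successor flowing into a block with more than one predecessor) and the fix of inserting an extra block on that edge can be sketched as follows. `Block`, `Graph` and `SplitIfCritical` are hypothetical names for this example; the sketch does not model ART's `HBasicBlock`, `HGoto` or `TryBoundary`.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical minimal CFG types, for illustration only.
struct Block {
  std::vector<Block*> predecessors;
  std::vector<Block*> successors;
};

struct Graph {
  std::vector<std::unique_ptr<Block>> blocks;

  Block* NewBlock() {
    blocks.push_back(std::make_unique<Block>());
    return blocks.back().get();
  }

  static void AddEdge(Block* from, Block* to) {
    from->successors.push_back(to);
    to->predecessors.push_back(from);
  }
};

// An edge is critical if its source has several successors and its
// target has several predecessors.
bool IsCriticalEdge(const Block* from, const Block* to) {
  return from->successors.size() > 1u && to->predecessors.size() > 1u;
}

// If from->to is critical, inserts an empty block on the edge (playing the
// role of an extra goto block) and returns it; otherwise returns nullptr.
Block* SplitIfCritical(Graph* graph, Block* from, Block* to) {
  if (!IsCriticalEdge(from, to)) {
    return nullptr;  // Nothing to do; add blocks only when necessary.
  }
  Block* middle = graph->NewBlock();
  std::replace(from->successors.begin(), from->successors.end(), to, middle);
  std::replace(to->predecessors.begin(), to->predecessors.end(), from, middle);
  middle->predecessors.push_back(from);
  middle->successors.push_back(to);
  return middle;
}
```

After the split, the new block has exactly one predecessor and one successor, so neither of its edges is critical.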
2022-12-12 Allow to inline invokes that sometimes throw into try blocks Santiago Aboy Solanes
We supported inlining these invokes into regular blocks, and we can extend that support for try blocks too. Bug: 227283224 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: Id45a009adabc610f4bf7a0457880ad7b9d772178
2022-12-07 Allow inlining invokes that contain try catches into catch blocks Santiago Aboy Solanes
Since catch blocks are never considered try blocks, we can guarantee that its invokes are not inside a TryBoundary (which is the blocker for enabling inlining of try catch invokes inside try blocks). Bug: 227283224 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I747e2e8c2515e36041ad3966ca6a6388ef7d91df
2022-12-06 Update domination chain and RPO manually in MaybeAddExtraGotoBlocks Santiago Aboy Solanes
There's no need to recompute the whole graph as we know what changed. As a drive-by, we now don't return false for graphs with irreducible loops so we can remove that restriction from the builder. However, if a graph with irreducible loops hits this path it means that: A) it's being inlined B) Has irreducible loops We don't inline graphs with irreducible loops, and after building for inline we don't remove them either because constant folding and instruction simplifier don't remove them, and DCE doesn't run for graphs with irreducible loops. So, in terms of dex2oat's outputs nothing should change. Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I8cbf1b5f0518bb5dd14ffd751100ea81f5478863