path: root/compiler/optimizing/optimization.cc
Age Commit message Author
2025-01-24 Optimizing: Rename `HCodeFlowSimplifier`... Vladimir Marko
... to `HControlFlowSimplifier` because "control flow" is the correct technical term. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I2607ac699fa33c3e7ca7f54364e1e8497148412b
2025-01-17 Optimizing: Rename `HSelectGenerator`... Vladimir Marko
... to `HCodeFlowSimplifier` in preparation for adding more optimizations to this pass. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: Icb05c3455d93a7b939f82ced9b08165e533bb21a
2024-10-30 Run RTP after GVN to remove more NullCheck instructions Santiago Aboy Solanes
After GVN, we deduplicate instructions and we might have more information regarding the nullability of the SSA variable. We can run RTP to insert the BoundType instructions, which will later be used by InstructionSimplifier to remove the NullCheck. RTP will be run conditionally, only if GVN did at least one replacement. It improves ~0.1% of odex size for speed compiled apps. Bug: 369206455 Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Change-Id: I0a4a104690b3fe9ac4118642ab9e9916dc30a9c5
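The conditional scheduling described above (run RTP only if GVN made at least one replacement) can be sketched as a tiny pass runner. This is an illustration only: `Pass`, `depends_on`, and `RunPasses` are hypothetical names and are not taken from optimization.cc.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an optimization pass.
struct Pass {
  std::string name;
  // Runs the pass and returns true if it changed the graph.
  std::function<bool()> run;
  // Index of an earlier pass this one depends on, or -1 to always run.
  int depends_on;
};

// Runs the passes in order and skips any pass whose dependency
// reported no change. Returns the names of the passes that ran.
std::vector<std::string> RunPasses(const std::vector<Pass>& passes) {
  std::vector<bool> changed(passes.size(), false);
  std::vector<std::string> executed;
  for (std::size_t i = 0; i < passes.size(); ++i) {
    const Pass& pass = passes[i];
    if (pass.depends_on >= 0 &&
        !changed[static_cast<std::size_t>(pass.depends_on)]) {
      continue;  // The dependency did nothing, so skip this pass.
    }
    changed[i] = pass.run();
    executed.push_back(pass.name);
  }
  return executed;
}
```

With a "gvn" pass at index 0 and an "rtp" pass that depends on index 0, the second pass only costs compile time when the first one actually replaced something.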
2024-04-04 riscv64: Create InstructionSimplifierRiscv64, ShiftAdd Anton Romanov
Create InstructionSimplifierRiscv64 optimization. Replace Shl (1|2|3) and Add with Riscv64ShiftAdd IR instruction. By compiling with dex2oat all the methods of applications below I got: Facebook: 45 cases TikTok: 26 cases YouTube: 19 cases of the pattern. Test: art/test/testrunner/testrunner.py --target --64 --ndebug --optimizing Change-Id: I88903450d998983bb2a628942112d7518099c3f5
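The pattern named in this commit, an Add whose operand is a Shl by 1, 2, or 3, can be recognized with a small matcher. The sketch below uses a made-up mini IR (`Kind`, `Node`, `MatchShiftAdd` are hypothetical); it is not the ART IR and only shows the shape of the check.

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical mini IR, for illustration only.
enum class Kind { kParam, kConst, kShl, kAdd };

struct Node {
  Kind kind;
  const Node* lhs;   // First input, or nullptr.
  const Node* rhs;   // Second input, or nullptr.
  int64_t value;     // Constant value, used only for kConst.
};

// Returns the shift distance (1, 2, or 3) if `node` has the form
// Add(Shl(x, distance), y) in either operand order, and 0 otherwise.
int MatchShiftAdd(const Node& node) {
  if (node.kind != Kind::kAdd) {
    return 0;
  }
  for (const Node* operand : {node.lhs, node.rhs}) {
    if (operand != nullptr && operand->kind == Kind::kShl &&
        operand->rhs != nullptr && operand->rhs->kind == Kind::kConst &&
        operand->rhs->value >= 1 && operand->rhs->value <= 3) {
      return static_cast<int>(operand->rhs->value);
    }
  }
  return 0;
}
```

A non-zero result marks a candidate for replacement by a single fused shift-and-add node; any other shift distance is left alone.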
2024-01-30 Reland^2 "Run optimizations with baseline compilation." Nicolas Geoffray
This reverts commit 3dccb13f4e92db37a13359e126c5ddc12cb674b5.
Also includes the fix for incrementing hotness that got reverted: aosp/2906378
Bug: 313040662
Reduces jank on compose view scrolling for 4 iterations:
- For Go Mokey:
  - Before: ~698 frames drawn / ~13.87% janky frames
  - After: ~937 frames drawn / ~5.52% janky frames
- For Pixel 8 pro:
  - Before: ~2440 frames drawn / ~0.90% janky frames
  - After: ~2450 frames drawn / ~0.55% janky frames
Reason for revert: Reduce inlining threshold for baseline.
Change-Id: Iee5cd4c3ceb7715caf9299b56551aae6f0259769
2024-01-15 Revert "Restrict the use of ConstantFolding's VisitIf" Santiago Aboy Solanes
This reverts commit 5eb1fd0dae3832ceee2102613bb08c291daca6f3.
Reason for revert: In aosp/2903248 we implemented a faster way of doing `ReplaceUsesDominatedBy` which is used by `VisitIf`. The impact of `VisitIf` is now small enough that running VisitIf in all passes is faster than the previous implementation running some of the time.
This CL re-enables the optimization in all constant folding passes because:
A) Lets this optimization (and others that can use the result) kick in earlier
B) Run it for callee graphs in the inliner (which has been disabled as of CL aosp/2543831)
C) Consistency of the ConstantFolding pass, which helps to have a simpler mental model
Bug: 278626992
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Test: Locally compiled GMS and compared time to compile
Change-Id: I5dc5f591557c8de0bc4d23dbfd0b91b5b7e56ab5
2024-01-15 Revert "Reland "Run optimizations with baseline compilation."" Nicolas Geoffray
This reverts commit 1a6b5b318aa69903a74dd10312a77bd8ee7c4cf6. Reason for revert: asan failure Change-Id: Ie9da0b04c899d6cb37148e7a3542190e65737787
2024-01-05 Reland "Run optimizations with baseline compilation." Nicolas Geoffray
This reverts commit c8309515d099992b7cab8f2b8c6db3ed77671ff4. Bug: 313040662 Reason for revert: remove call to slow path on back edges. Change-Id: I3fe52295afcb0be4b4062f8d9060adb4abb64375
2024-01-04 Revert "Run optimizations with baseline compilation." Almaz Mingaleev
This reverts commit 41c5dde40d1c75d36a7f984c8d72ec65fbff3111. Reason for revert: breaks test.java.util.Arrays.Sorting Change-Id: I03385c9f1efff4b8e8bd315827dde6ed774bbb52
2024-01-03 Run optimizations with baseline compilation. Nicolas Geoffray
And introduce inlined inline caches, which customize an inline cache for the top-level method being compiled.
Reduces jank on compose view scrolling for 20 seconds:
- For Go Mokey:
  - Before: ~525 frames drawn / ~14.64% janky frames
  - After: ~891 frames drawn / ~4.74% janky frames
- For Pixel 8 pro:
  - Before: ~2443 frames drawn / ~0.91% janky frames
  - After: ~2447 frames drawn / ~0.65% janky frames
Bug: 313040662
Test: test.py
Change-Id: Ibaa746c6bd3c665b18ec9cd29cb477cf21023467
2023-10-30 Replace `gUseReadBarrier` with compiler option in compiler. Vladimir Marko
Leave a few `gUseReadBarrier` uses in JNI macro assemblers. We shall deal with these later. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Bug: 289805127 Change-Id: I9d2aa245cee4c650129f169a82beda7dc0dd6a35
2023-10-17 riscv64: Implement `CriticalNativeAbiFixupRiscv64`. Vladimir Marko
And pass integral stack args sign-extended to 64 bits for direct @CriticalNative calls. Enable direct @CriticalNative call codegen unconditionally and also enable `HClinitCheck` codegen and extend the 178-app-image-native-method run-test to properly test these use cases. Test: # Edit `run-test` to disable checker, then testrunner.py --target --64 --ndebug --optimizing # Ignore 6 pre-existing failures (down from 7). Bug: 283082089 Change-Id: Ia514c62006c7079b04182cc39e413eb2deb089c1
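As a small illustration of "integral stack args sign-extended to 64 bits", the sketch below widens a 32-bit argument into a 64-bit slot and contrasts it with zero extension. The function names are hypothetical and the code says nothing about how the ART fixup pass itself is written.

```cpp
#include <cstdint>

// Widens a 32-bit integral argument to a 64-bit stack slot by sign
// extension: the upper 32 bits copy the sign bit of the argument.
constexpr uint64_t SignExtendToSlot(int32_t arg) {
  return static_cast<uint64_t>(static_cast<int64_t>(arg));
}

// For contrast: zero extension leaves the upper 32 bits clear.
constexpr uint64_t ZeroExtendToSlot(int32_t arg) {
  return static_cast<uint64_t>(static_cast<uint32_t>(arg));
}
```

The two functions agree for non-negative arguments and differ only in the upper half of the slot for negative ones.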
2023-09-08 Remove some obsolete TODO comments, fix indentation. Vladimir Marko
Test: Rely on TreeHugger. Change-Id: I4e3c0ba13d576ef62121d47ebc4965f6667b624f
2023-04-19 Restrict the use of ConstantFolding's VisitIf Santiago Aboy Solanes
It was taking a lot of time for the improvement it got. We can get 99.99% of the improvement, with only one VisitIf call. This is roughly 20% of the compile time it used to take. Bug: 278626992 Fixes: 278626992 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: Icc00c9ad6a9eb4f4fd18677bcb65655cbbe9d027
2023-01-30 Skip InductionVarAnalysis for a pathological case Santiago Aboy Solanes
Having a long chain of loop header phis hangs up the compiler. Note that we can still compile the method if we skip the InductionVarAnalysis phase. Bug: 246249941 Fixes: 246249941 Bug: 32545907 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Test: dex2oat compile the app in 246249941 Change-Id: Id2d14b1c4d787f98d656055274c7cfcae6491686
2023-01-04 Add a write barrier elimination pass Santiago Aboy Solanes
We can eliminate redundant write barriers as we don't need several for the same receiver. For example:
```
MyObject o;
o.inner_obj = io;
o.inner_obj2 = io2;
o.inner_obj3 = io3;
```
We can keep the write barrier for `inner_obj` and remove the other two.
Note that we cannot perform this optimization across invokes, suspend check, or instructions that can throw.
Local improvements (pixel 5, speed compile):
 * System server: -280KB (-0.56%)
 * SystemUIGoogle: -330KB (-1.16%)
 * AGSA: -3876KB (-1.19%)
Bug: 260843353
Fixes: 260843353
Change-Id: Ibf98efbe891ee00e46125853c3e97ae30aa3ff30
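The rule in this commit message (one write barrier per receiver, reset at invokes, suspend checks, and throwing instructions) can be sketched over a straight-line list of instructions. `Op`, `Instr`, and `NeedsWriteBarrier` are hypothetical names for this illustration, and the real pass may track more state than a set of receivers.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical instruction kinds, for illustration only.
enum class Op { kFieldSet, kInvoke, kSuspendCheck, kCanThrow, kOther };

struct Instr {
  Op op;
  int receiver;  // Object id being written to; used only for kFieldSet.
};

// Returns, for each instruction, whether a write barrier must be emitted.
// A field store needs one only if its receiver has no barrier since the
// last invoke, suspend check, or throwing instruction.
std::vector<bool> NeedsWriteBarrier(const std::vector<Instr>& block) {
  std::set<int> covered;  // Receivers that already have a barrier.
  std::vector<bool> needs(block.size(), false);
  for (std::size_t i = 0; i < block.size(); ++i) {
    switch (block[i].op) {
      case Op::kFieldSet:
        // insert() reports whether the receiver was newly added.
        needs[i] = covered.insert(block[i].receiver).second;
        break;
      case Op::kInvoke:
      case Op::kSuspendCheck:
      case Op::kCanThrow:
        covered.clear();  // Barriers cannot be shared across these.
        break;
      case Op::kOther:
        break;
    }
  }
  return needs;
}
```

For three consecutive stores to the same receiver, only the first keeps its barrier, which matches the `inner_obj` example in the message.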
2022-11-07 Reland "Make compiler/optimizing/ symbols hidden." Vladimír Marko
This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0. Reason for revert: Reland after some of the required work was merged in other CLs. Also address a TODO from the original CL to mark required symbols with EXPORT in `intrinsic_objects.h`. Also mark symbols in new files as HIDDEN. Bug: 186902856 Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
2022-10-10 Compiler implementation of try catch inlining Santiago Aboy Solanes
Notable changes:
1) Wiring of the graph now allows for inlinees graph ending in TryBoundary, or Goto in some special cases.
2) Building a graph with try catch for inlining may add an extra Goto block.
3) Oat version bump.
4) Reduced kMaximumNumberOfCumulatedDexRegisters from 32 to 20.
Bug: 227283224
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: Ic2fd956de24b72d1de29b4cd3d0b2a1ddab231d8
2022-08-17 Reland "Propagating values from if clauses to its successors" Santiago Aboy Solanes
This reverts commit fa1034c563b44c4f557814c50e2678e14dcd1d13.
Reason for revert: Relanding after float/double fix. In short, don't deal with floats/doubles since they bring a lot of edge cases e.g.
  if (f == 0.0f) {
    // f is not guaranteed to be 0.0f, e.g. it could be -0.0f.
  }
Bug: 240543764
Change-Id: I400bdab71dba0934e6f1740538fe6e6c0a7bf5fc
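The edge case mentioned here follows from IEEE 754 comparison rules: `-0.0f == 0.0f` is true, yet the two values are distinguishable. A minimal sketch (the helper name `SignOf` is made up for this example):

```cpp
#include <cmath>

// Returns +1.0f or -1.0f according to the sign bit of `f`,
// which distinguishes -0.0f from +0.0f.
float SignOf(float f) {
  return std::copysign(1.0f, f);
}

// True when `f` passes the `f == 0.0f` test.
bool ComparesEqualToZero(float f) {
  return f == 0.0f;
}
```

Both zeros pass `ComparesEqualToZero`, but `SignOf` gives different results for them, so substituting the constant `0.0f` for `f` inside the `if` branch could change the program's behavior.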
2022-08-09 Revert "Propagating values from if clauses to its successors" Santiago Aboy Solanes
This reverts commit c6b816ceb2b35300c937ef2e7d008598b6afba21. Reason for revert: Broke libcore test https://ci.chromium.org/ui/p/art/builders/ci/angler-armv7-ndebug/3179/overview Change-Id: I4f238bd20cc485e49078104e0225c373cac23415
2022-08-09 Propagating values from if clauses to its successors Santiago Aboy Solanes
We have knowledge of the value of some variables at compile time due to the fact they are used in if clauses. For example:
  if (variable == constant) {
    // SSA `variable` guaranteed to be equal to constant here.
  } else {
    // No guarantees can be made here (except for booleans since
    // they only have two values).
  }
Similarly with `variable != constant`.
We can also apply this to boolean parameters e.g.
  void foo (boolean val) {
    if (val) {
      // `val` guaranteed to be true here.
      ...
    }
    ...
  }
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I55df0252d672870993d06e5ac92f5bba44d902bd
2020-08-21 Improved LSE: Replacing loads with Phis. Vladimir Marko
Create "Phi placeholders" for tracking heap values that can merge from different values and try to match existing Phis or create new Phis to replace loads. For Phi placeholders from loop headers we do not know whether they are fed by unknown values through back-edges when processing the loop header, so we delay processing loads that depend on them until we walked the entire graph. We then try to match them with existing instructions (when the location is unchanged in the loop) or Phis or create new Phis if needed. If we find a loop Phi placeholder fed with unknown value from a back-edge, we mark the Phi placeholder unreplaceable and reprocess loads and stores to propagate the unknown value. This can sometimes allow other loads to be replaced. At the end we re-calculate the heap values to find stores that can be eliminated because they write over the same value.
Golem results:
  art-opt-cc           arm     arm64   x86     x86-64
    CaffeineFloat      +6.7%   +3.0%   +5.9%   +3.8%
    KotlinMicroWhen    +33.7%  +4.8%   +1.8%   +0.6%
  art-opt (more noisy than art-opt-cc)
    CaffeineFloat      +4.1%   +4.4%   +7.8%   +10.5%
    KotlinMicroWhen    +33.6%  +2.0%   +1.8%   +1.8%
The MoveLiteralColumn benchmark seems to gain significantly (up to 22% on art-opt-cc but under 10% on art-opt) but it is very noisy and the results are therefore unreliable.
Insignificant code size changes for aosp_blueline-userdebug:
  - before:
    arm boot*.oat: 15303468
    arm64 boot*.oat: 18184736
    services.odex: 25195944
    grep -c pAllocObject boot.arm64.oatdump.txt: 27213
    grep -c pAllocArray boot.arm64.oatdump.txt: 3620
  - after:
    arm boot*.oat: 15299524 (-4KiB, -0.03%)
    arm64 boot*.oat: 18176528 (-8KiB, -0.05%)
    services.odex: 25191832 (-4KiB, -0.02%)
    grep -c pAllocObject boot.arm64.oatdump.txt: 27206 (-7)
    grep -c pAllocArray boot.arm64.oatdump.txt: 3615 (-5)
Test: New tests in 530-checker-lse.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: blueline-userdebug boots.
Bug: 77906240
Change-Id: Ia9fe0cd3530f9d3941650dfefc00a7f7fd821994
2020-08-10 ARM: Allow FP args in core regs for @CriticalNative. Vladimir Marko
If a float or double argument needs to be passed in core register to a @CriticalNative method due to soft-float native ABI, insert a fake call to Float.floatToRawIntBits() or Double.doubleToRawLongBits() to satisfy type checks in the compiler. We cannot do that for intrinsics that expect those inputs in actual FP registers, so we still prevent such intrinsics from using `kCallCriticalNative`. This should be irrelevant if an actual intrinsic implementation is emitted. There are currently two unimplemented intrinsics that are affected by the carve-out, namely MathRoundDouble and FP16ToHalf, and four intrinsics implemented only when ARMv8A is supported, namely MathRint, MathRoundFloat, MathCeil and MathFloor. Test: testrunner.py --target --32 -t 178-app-image-native-method Bug: 112189621 Change-Id: Id14ef4f49f8a0e6489f97dc9588c0e6a5c122632
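Float.floatToRawIntBits() and Double.doubleToRawLongBits() reinterpret the bits of a floating-point value as an integer without converting the value, which is why they suit moving an FP argument into a core register. A C++ sketch of the same reinterpretation, with function names chosen to mirror the Java methods:

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the 32 bits of a float as an int32_t (no numeric conversion).
int32_t FloatToRawIntBits(float value) {
  static_assert(sizeof(int32_t) == sizeof(float), "float must be 32 bits");
  int32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Reinterprets the 64 bits of a double as an int64_t (no numeric conversion).
int64_t DoubleToRawLongBits(double value) {
  static_assert(sizeof(int64_t) == sizeof(double), "double must be 64 bits");
  int64_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}
```

The sketch assumes IEEE 754 binary32 and binary64 representations for `float` and `double`, which holds on the ARM targets discussed here.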
2020-07-28 More inclusive language in the runtime David Srbecky
Test: m Bug: 161896447 Bug: 161850439 Bug: 161336379 Change-Id: Iabc29fa43b4b5a403699d6bca95e9a2cb8945d77
2020-06-17 ART: Simplify HRem to reuse existing HDiv Evgeny Astigeevich
A pattern seen in libcore and SPECjvm2008 workloads is a pair of HRem/HDiv having the same dividend and divisor. The code generator processes them separately and generates duplicated instructions calculating HDiv. This CL adds detection of such a pattern to the instruction simplifier. This optimization affects HInductionVarAnalysis and HLoopOptimization preventing some loop optimizations. To avoid this the instruction simplifier has the loop_friendly mode which means not to optimize HRems if they are in a loop.
A microbenchmark run on Pixel 3 shows the following improvements:
              | little cores | big cores
  arm32 Int32 | +21%         | +40%
  arm32 Int64 | +46%         | +44%
  arm64 Int32 | +27%         | +14%
  arm64 Int64 | +33%         | +27%
Test: 411-checker-instruct-simplifier-hrem
Test: test.py --host --optimizing --jit --gtest --interpreter
Test: test.py --target --optimizing --jit --interpreter
Test: run-gtests.sh
Change-Id: I376a1bd299d7fe10acad46771236edd5f85dfe56
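The arithmetic behind reusing a division for the remainder is the identity `a % b == a - (a / b) * b` for truncating integer division. A minimal sketch, with the hypothetical names `DivRem` and `DivRemShared`, assuming a non-zero divisor and excluding the overflowing case INT32_MIN / -1:

```cpp
#include <cstdint>

struct DivRem {
  int32_t quotient;
  int32_t remainder;
};

// Computes the quotient once and derives the remainder from it,
// instead of performing a second division.
// Precondition (assumed by this sketch): divisor != 0 and not
// (dividend == INT32_MIN && divisor == -1).
DivRem DivRemShared(int32_t dividend, int32_t divisor) {
  const int32_t quotient = dividend / divisor;  // Truncates toward zero.
  const int32_t remainder = dividend - quotient * divisor;
  return {quotient, remainder};
}
```

Because both C++ and Java truncate integer division toward zero, the derived remainder takes the sign of the dividend, as `%` does.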
2020-06-08 Run LSA as a part of the LSE pass. Vladimir Marko
Make LSA a helper class, not an optimization pass. Move all its allocations to ScopedArenaAllocator to reduce the peak memory usage a little bit. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I7fc634abe732d22c99005921ffecac5207bcf05f
2020-05-13 Move HandleCache to HGraph. Vladimir Marko
This avoids passing the `VariableSizedHandleScope*` argument around and eliminates HGraph::inexact_object_rti_ and its initialization. The latter shall allow running Optimizing gtests that do not require type information without creating a Runtime in future. (To be implemented in a separate CL.) Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: aosp_taimen-userdebug boots. Change-Id: I36fe9bc556c6d610d644c8c14cc74c9985a14d64
2020-04-17 ART: Refactor SIMD slots and regs size processing. Artem Serov
ART vectorizer assumes that there is single size of SIMD register used for the whole program. Make this assumption explicit and refactor the code. Note: This is a base for the future introduction of SIMD slots of size other than 8 or 16 bytes. Test: test-art-target, test-art-host. Change-Id: Id699d5e3590ca8c655ecd9f9ed4e63f49e3c4f9c
2020-02-13 Remove MIPS support from Optimizing. Vladimir Marko
Test: aosp_taimen-userdebug boots. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 147346243 Change-Id: I97fdc15e568ae3fe390efb1da690343025f84944
2019-10-14 Revert "Make compiler/optimizing/ symbols hidden." Vladimir Marko
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf. Reason for revert: Breaks ASAN tests (ODR violation). Bug: 142365358 Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
2019-10-14 Make compiler/optimizing/ symbols hidden. Vladimir Marko
Make symbols in compiler/optimizing hidden by a namespace attribute. The unit intrinsic_objects.{h,cc} is excluded as it is needed by dex2oat. As the symbols are no longer exported, gtests are now linked with the static version of the libartd-compiler library.
libart-compiler.so size:
  - before:
    arm: 2396152
    arm64: 3345280
  - after:
    arm: 2016176 (-371KiB, -15.9%)
    arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
2018-12-27 ART: Refactor for bugprone-argument-comment Andreas Gampe
Handles compiler. Bug: 116054210 Test: WITH_TIDY=1 mmma art Change-Id: I5cdfe73c31ac39144838a2736146b71de037425e
2018-12-06 Refactor CompilerDriver::CompileAll(). Vladimir Marko
Treat verification results and image classes as mutable only in CompilerDriver::PreCompile(), and treat them as immutable during compilation, accessed through the CompilerOptions. This severs the dependency of the inliner on the CompilerDriver. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I594a0213ca6a5003c19b4bd488af98db4358d51d
2018-11-08 Emit bit manipulation instructions for x86 and x86_64 Shalini Salomi Bodapati
This patch performs instruction simplification to generate instructions andn, blsmsk and blsr on cpus that have avx2. Test: test.py --host --64, test-art-host-gtest Change-Id: Ie41a1b99ac2980f1e9f6a831a7d639bc3e248f0f Signed-off-by: Shalini Salomi Bodapati <shalini.salomi.bodapati@intel.com>
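The three instructions named here each correspond to a short bitwise expression, and an instruction simplifier would look for those expressions. The sketch below gives the scalar equivalents; the function names are made up for the example.

```cpp
#include <cstdint>

// andn: bitwise AND of the complement of `a` with `b`.
constexpr uint32_t AndNot(uint32_t a, uint32_t b) {
  return ~a & b;
}

// blsr: clears the lowest set bit of `x`.
constexpr uint32_t ResetLowestSetBit(uint32_t x) {
  return x & (x - 1);
}

// blsmsk: sets all bits up to and including the lowest set bit of `x`.
constexpr uint32_t MaskUpToLowestSetBit(uint32_t x) {
  return x ^ (x - 1);
}
```

For `x = 0xC` (binary 1100), `ResetLowestSetBit` yields `0x8` and `MaskUpToLowestSetBit` yields `0x7`.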
2018-09-28 Remove need for intrinsic recognizer to be a pass. Nicolas Geoffray
Instead just recognize the intrinsic when creating an invoke instruction. Also remove some old code related to compiler driver sharpening. Test: test.py Change-Id: Iecb668f30e95034970fcf57160ca12092c9c610d
2018-09-19 Remove sharpening as an optimization pass. Nicolas Geoffray
Make the last sharpening helper (methods) like the other helpers: being invoked by the instruction builder. Test: test.py Change-Id: Ic80a454f9b59b0b4ef7825590b24402500ba851c
2018-07-13 Merge "Revert "Emit vector multiply and accumulate instructions for x86."" Hans Boehm
2018-07-13 Revert "Emit vector multiply and accumulate instructions for x86." Hans Boehm
This reverts commit 61908880e6565acfadbafe93fa64de000014f1a6. Reason for revert: By failing to round multiply results, it does not follow Java rounding rules. Change-Id: Ic0ef08691bef266c9f8d91973e596e09ff3307c6
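The rounding concern can be shown with scalar doubles: a fused multiply-add rounds once, while a multiply followed by an add rounds the product first. This sketch illustrates that general difference only; it does not reproduce the reverted x86 vector code, and the function names are hypothetical.

```cpp
#include <cmath>

// Multiplies, rounds the product to double, then adds (two roundings).
// The volatile store keeps the compiler from fusing the two operations.
double MultiplyThenAdd(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Computes a * b + c with a single rounding at the end.
double FusedMultiplyAdd(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With `a = 1 + 2^-27` and `c = -(1 + 2^-26)`, the exact product `a * a` is `1 + 2^-26 + 2^-54`. Rounding it to double drops the `2^-54` term, so `MultiplyThenAdd(a, a, c)` gives 0, while `FusedMultiplyAdd(a, a, c)` keeps the term and gives `2^-54`.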
2018-07-02 Merge "Emit vector multiply and accumulate instructions for x86." Treehugger Robot
2018-07-02 Emit vector multiply and accumulate instructions for x86. Gupta Kumar, Sanjiv
This patch adds a new cpu variant named kabylake and performs instruction simplification to generate VectorMulitplyAccumulate. Test: ./test.py --host --64 Change-Id: Ie6cc882dadf1322dd4d3ae49bfdb600b0c447765 Signed-off-by: Gupta Kumar, Sanjiv <sanjiv.kumar.gupta@intel.com>
2018-06-28 Remove CompilerDriver::support_boot_image_fixup_. Vladimir Marko
Check for non-PIC boot image as a testing config instead. Honor the config for HInvokeStaticOrDirect sharpening. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I3645f4fefe322f1fd64ea88a2b41a35ceccea688
2018-06-25 Move instruction_set_ to CompilerOptions. Vladimir Marko
Removes CompilerDriver dependency from ImageWriter and several other classes. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: Pixel 2 XL boots. Test: m test-art-target-gtest Test: testrunner.py --target --optimizing Change-Id: I3c5b8ff73732128b9c4fad9405231a216ea72465
2018-04-30 Step 2 of 2: conditional passes. Aart Bik
Rationale: The change introduces actual conditional passes (dependence on inliner). This ensures more cases are optimized downstream without needlessly introducing compile-time. NOTE: Some checker tests needed to be rewritten due to subtle changes in the phase ordering. No optimizations were harmed in the process, though. Bug: b/78171933, b/74026074 Test: test-art-host,target Change-Id: I335260df780e14ba1f22499ad74d79060c7be44d
2018-01-08 Clean up CodeItemAccessors and Compact/StandardDexFile Mathieu Chartier
Change constructor to use a reference to a dex file. Remove duplicated logic for GetCodeItemSize. Bug: 63756964 Test: test-art-host Change-Id: I69af8b93abdf6bdfa4454e16db8f4e75883bca46
2018-01-05 Create dex subdirectory David Sehr
Move all the DexFile related source to a common subdirectory dex/ of runtime. Bug: 71361973 Test: make -j 50 test-art-host Change-Id: I59e984ed660b93e0776556308be3d653722f5223
2017-12-22 Make CodeItem fields private Mathieu Chartier
Make code item fields private and use accessors. Added a handful of friend classes to reduce the size of the change. Changed default to be nullable and removed CreateNullable. CreateNullable was a bad API since it defaulted to the unsafe, may add a CreateNonNullable if it's important for performance. Motivation: Have a different layout for code items in cdex. Bug: 63756964 Test: test-art-host-gtest Test: test/testrunner/testrunner.py --host Test: art/tools/run-jdwp-tests.sh '--mode=host' '--variant=X32' --debug Change-Id: I42bc7435e20358682075cb6de52713b595f95bf9
2017-12-08 Determine HLoadClass/String load kind early. Vladimir Marko
This helps save memory by avoiding the allocation of HEnvironment and related objects for AOT references to boot image strings and classes (kBootImage* load kinds) and also for JIT references (kJitTableAddress).
Compiling aosp_taimen-userdebug boot image, the most memory hungry method BatteryStats.dumpLocked() needs
  - before:
    Used 55105384 bytes of arena memory...
    ...
    UseListNode 10009704
    Environment 423248
    EnvVRegs 20676560
    ...
  - after:
    Used 50559176 bytes of arena memory...
    ...
    UseListNode 8568936
    Environment 365680
    EnvVRegs 17628704
    ...
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 34053922
Change-Id: I68e73a438e6ac8e8908e6fccf53bbeea8a64a077
2017-11-20 Refactored optimization passes setup. Aart Bik
Rationale: Refactors the way we set up optimization passes in the compiler into a more centralized approach. The refactoring also found some "holes" in the existing mechanism (missing string lookup in the debugging mechanism, or inability to set alternative name for optimizations that may repeat). Bug: 64538565 Test: test-art-host test-art-target Change-Id: Ie5e0b70f67ac5acc706db91f64612dff0e561f83
2017-08-11 optimizing: Refactor statistics to use OptimizingCompilerStats helper Igor Murashkin
Remove all copies of 'MaybeRecordStat', replacing them with a single OptimizingCompilerStats::MaybeRecordStat helper. Change-Id: I83b96b41439dccece3eee2e159b18c95336ea933
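The benefit of a single `MaybeRecordStat` helper is that the null check for "statistics disabled" lives in one place. The sketch below shows that shape with hypothetical types (`Stat`, `CompilerStats`); it is not the actual OptimizingCompilerStats class.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical statistic identifiers, for illustration only.
enum class Stat : std::size_t { kRemovedNullCheck, kInlinedInvoke, kCount };

// Hypothetical counter storage.
class CompilerStats {
 public:
  void Record(Stat stat, uint32_t count) {
    counts_[static_cast<std::size_t>(stat)] += count;
  }
  uint32_t Get(Stat stat) const {
    return counts_[static_cast<std::size_t>(stat)];
  }

 private:
  std::array<uint32_t, static_cast<std::size_t>(Stat::kCount)> counts_{};
};

// Shared helper: records the statistic only when collection is enabled,
// which this sketch models as a non-null `stats` pointer.
inline void MaybeRecordStat(CompilerStats* stats, Stat stat, uint32_t count = 1) {
  if (stats != nullptr) {
    stats->Record(stat, count);
  }
}
```

Each pass can then call `MaybeRecordStat(stats, Stat::kRemovedNullCheck)` without carrying its own copy of the null check.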
2015-06-24 ART: Run GraphChecker after Builder and SsaBuilder David Brazdil
This patch refactors the way GraphChecker is invoked, utilizing the same scoping mechanism as pass timing and graph visualizer. Therefore, GraphChecker will now run not just after instances of HOptimization but after the builders and reg alloc, too. Change-Id: I8173b98b79afa95e1fcbf3ac9630a873d7f6c1d4
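A scoping mechanism of this kind is commonly written as an RAII object: the constructor marks the start of a stage and the destructor runs the end-of-stage work, so every stage wrapped in the scope gets checked. The sketch below assumes that design; `PassScope` and `RunStage` are hypothetical names, and a log of strings stands in for timing, visualization, and graph checking.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical RAII scope: logs the start of a stage on construction,
// then a check and the end of the stage on destruction.
class PassScope {
 public:
  PassScope(std::string name, std::vector<std::string>* log)
      : name_(std::move(name)), log_(log) {
    log_->push_back("begin " + name_);
  }
  ~PassScope() {
    log_->push_back("check " + name_);  // Stand-in for running a checker.
    log_->push_back("end " + name_);
  }
  PassScope(const PassScope&) = delete;
  PassScope& operator=(const PassScope&) = delete;

 private:
  std::string name_;
  std::vector<std::string>* log_;
};

// Runs any stage (builder, optimization, register allocation) inside a
// scope, so the check happens regardless of the stage's type.
void RunStage(const std::string& name, const std::function<void()>& body,
              std::vector<std::string>* log) {
  PassScope scope(name, log);
  body();
}
```

Because the check is tied to the scope rather than to a particular pass class, stages that are not optimizations get it as well.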