summaryrefslogtreecommitdiff
path: root/compiler/optimizing/induction_var_range.h
AgeCommit message (Collapse)Author
2024-07-16Update InductionVarRange::Replace to match more cases Santiago Aboy Solanes
We were missing to update some cases when `instruction` wasn't present in the induction analysis but `fetch` was. We can use `fetch` itself when querying for the InductionInfo to cover both the previous and new cases. Bug: 351868772 Fixes: 351868772 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Test: m test-art-host-gtest Change-Id: I3c412efdef36d242b0182ed80ce62567c0e3eafc
2023-11-24Improve linear induction var range creation Santiago Aboy Solanes
We can now detect and remove loops that require an is_taken test e.g. int a = 0; for (int i = 0; i < n; i++) { a += 1; } can be turned into `if (n < 0) then 0 else n`. Part of this logic can be reused in the future to help eliminate BoundsCheck instructions. Bug: 304967775 Fixes: 304967775 Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Change-Id: I944f3408e623a0652977d4c3f72d29caf9c1f908
2023-11-10Improve linear loop optimization overflow checks Santiago Aboy Solanes
We can move the logic from GenerateLastValueLinear to GenerateCode, where we can: * be more granular, and * reuse that code for other methods. To do so we add an allow_potential_overflow variable. This is done because BCE does perform operations that might overflow, but it uses HDeoptimize instructions to guard against that. Loop optimization doesn't add HDeoptimize instructions so we must be more careful. Bug: 231415860 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I1d689e5ddd87d96707673b6065ab0cc48b288617
2023-04-27Optimizing: Add `HInstruction::As##type()`. Vladimir Marko
After the old implementation was renamed in https://android-review.googlesource.com/2526708 , we introduce a new function with the old name but new behavior, just `DCHECK()`-ing the instruction kind before casting down the pointer. We change appropriate calls from `As##type##OrNull()` to `As##type()` to avoid unncessary run-time checks and reduce the size of libart-compiler.so. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Bug: 181943478 Change-Id: I025681612a77ca2157fed4886ca47f2053975d4e
2023-04-27Optimizing: Rename `As##type` to `As##type##OrNull`. Vladimir Marko
The null type check in the current implementation of `HInstruction::As##type()` often cannot be optimized away by clang++. It is therefore beneficial to have two functions HInstruction::As##type() HInstruction::As##type##OrNull() where the first function never returns null but the second one can return null. The additional text "OrNull" shall also flag the possibility of yielding null to the developer which may help avoid bugs similar to what we have seen previously. This requires renaming the existing function that can return null and introducing new function that cannot. However, defining the new function `HInstruction::As##type()` in the same change as renaming the old one would risk introducing bugs by missing a rename. Therefore we simply rename the old function here and the new function shall be introduced in a separate change with all behavioral changes being explicit. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: buildbot-build.sh --target Bug: 181943478 Change-Id: I4defd85038e28fe3506903ba3f33f723682b3298
2022-11-07Reland "Make compiler/optimizing/ symbols hidden." VladimĂ­r Marko
This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0. Reason for revert: Reland after some of the required work was merged in other CLs. Also address a TODO from the original CL to mark required symbols with EXPORT in `intrinsic_objects.h`. Also mark symbols in new files as HIDDEN. Bug: 186902856 Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
2022-07-26Make linear loop optimization safe from overflow Santiago Aboy Solanes
In the calcuation of `a * i + b`, `a` itself is calculated by doing: `(end - start) + (step - 1) / step` (Note that we add `step - 1` as a way of doing `ceiling`). This way of calculating `a` can overflow and produce the wrong result if end and start are in opposite sides of the spectrum. We can force `a` to be a constant to guarantee that the right result will be generated when doing loop optimization. Bug: 231415860 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: Ic056441f8d672b3c48cbbd2f3e4ebd7528e2c65b
2022-03-28Fix last value generation in loop optimization. Vladimir Marko
Instead of `in_body`, propagate the context block and loop information to make better decisions for trip count if the context is outside the loop. In particular, fix `InductionVarRange::IsConstant()` to take and use this information instead of assuming that we are asking about values in the loop body. For trip count with context outside the loop, we know that the value shall be the maximum trip count if the context is dominated by the loop control exit block. Test: Enable run-test 835-b216762268. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 216762268 Change-Id: Id564ba75c812d54abdd9b229e643cc8ab4701c52
2019-10-14Revert "Make compiler/optimizing/ symbols hidden." Vladimir Marko
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf. Reason for revert: Breaks ASAN tests (ODR violation). Bug: 142365358 Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
2019-10-14Make compiler/optimizing/ symbols hidden. Vladimir Marko
Make symbols in compiler/optimizing hidden by a namespace attribute. The unit intrinsic_objects.{h,cc} is excluded as it is needed by dex2oat. As the symbols are no longer exported, gtests are now linked with the static version of the libartd-compiler library. libart-compiler.so size: - before: arm: 2396152 arm64: 3345280 - after: arm: 2016176 (-371KiB, -15.9%) arm64: 2874480 (-460KiB, -14.1%) Test: m test-art-host-gtest Test: testrunner.py --host --optimizing --jit Bug: 142365358 Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
2018-03-26ART: Implement scalar loop unrolling. Artem Serov
Implement scalar loop unrolling for small loops (on arm64) with known trip count to reduce loop check and branch penalty and to provide more opportunities for instruction scheduling. Note: this functionality is turned off by default now. Test: cloner_test.cc Test: test-art-target, test-art-host Change-Id: Ic27fd8fb0bc0d7b69251252da37b8b510bc30acc
2017-08-08Set basic framework for detecting reductions. Aart Bik
Rationale: Recognize reductions in loops. Note that reductions are *not* optimized yet (we would proceed with e.g. unrolling and vectorization). This CL merely sets up the basic detection framework. Also does a bit of cleanup on loop optimization code. Bug: 64091002 Test: test-art-host Change-Id: I0f52bd7ca69936315b03d02e83da743b8ad0ae72
2017-06-29Improved subscript and data dependence analysis. Aart Bik
Rationale: We missed vectorizing a simple stencil operation due to inaccurate unit stride analysis and failure to detect single runtime data dependence test. Test: test-art-host, test-art-target Change-Id: I07ba03455bfb1c0aff371c1244a1328f885d0916
2017-04-07Fixed missing context while detecting unit strides. Aart Bik
With regression test (found by fuzz testing). Bug: 37033123 Test: test-art-target Change-Id: Id738b2a3a353985c3d0bf3beeb581a31f1fcbc3f
2017-03-01New utilities for induction variables. Aart Bik
Rationale: Break-out CL of ART Vectorizer: 2 OF many. The purpose is making the original CL smaller and easier to review. Bug: 34083438 Test: test-art-host Change-Id: I46d297eba504af3850a5998ee279ea9f7b38bed8
2017-01-13Complete unrolling of loops with small body and trip count one. Aart Bik
Rationale: Avoids the unnecessary loop control overhead, suspend check, and exposes more opportunities for constant folding in the resulting loop body. Fully unrolls loop in execute() of the Dhrystone benchmark (3% to 8% improvements). Test: test-art-host Change-Id: If30f38caea9e9f87a929df041dfb7ed1c227aba3
2016-12-09Added polynomial induction variables analysis. With tests. Aart Bik
Rationale: Information on polynomial sequences is nice to further enhance BCE and last-value assignment. In this case, this CL enables more loop optimizations for benchpress' Sum (80 x speedup). Also changed rem-based geometric induction to wrap-around induction. Test: test-art-host Change-Id: Ie4d2659edefb814edda2c971c1f70ba400c31111
2016-12-05Added geometric induction variables analysis. Aart Bik
Rationale: Information on geometric and polynomial (coming soon) sequences are nice to have to further enhance BCE and last-value assignment. Test: test-art-host Change-Id: Ib5e2998c3eb1009def6fd00b82935da7c3ba7c6e
2016-10-24Improved induction variable analysis and loop optimizations. Aart Bik
Rationale: Rather than half-baked reconstructing cycles during loop optimizations, this CL passes the SCC computed during induction variable analysis to the loop optimizer (trading some memory for more optimizations). This further improves CaffeineLogic from 6000us down to 4200us (dx) and 2200us to 1690us (jack). Note that this is on top of prior improvements in previous CLs. Also, some narrowing type concerns are taken care of during transfer operations. Test: test-art-host Change-Id: Ice2764811a70073c5014b3a05fb51f39fd2f4c3c
2016-10-18Enable last value generation of periodic sequence. Aart Bik
Rationale: This helps to eliminate more dead induction. For example, CaffeineLogic when compiled with latest Jack improves with a 1.3 speedup (2900us -> 2200us) due to eliminating first loop (second loop can be removed also, but for a later case). The currently benchmarks.dex has a different construct for the periodics, however, still to be recognized. Test: test-art-host Change-Id: Ia81649a207a2b1f03ead0855436862ed4e4f45e0
2016-10-12Recognize XOR-based periodic induction. Aart Bik
Rationale: This is a commonly used construct (e.g. x = !x for booleans and x ^= 1 for integers). This CL prepares some upcoming optimizations that exploit such inductions. Change-Id: I46edffb9de1075a836995daf5c2dfff7891f3034 Test: 530-checker-loops2 and induction_var_analysis_test
2016-10-11Improved and simplified loop optimizations. Aart Bik
Rationale: Empty preheader simplification has been simplified to a much more general empty block removal optimization step. Incremental updating of induction variable analysis enables repeated elimination or simplification of induction cycles. This enabled an extra layer of optimization for e.g. Benchpress Loop (17.5us. -> 0.24us. -> 0.08us). So the original 73x speedup is now multiplied by another 3x, for a total of about 218x. Test: 618-checker-induction et al. Change-Id: I394699981481cdd5357e0531bce88cd48bd32879
2016-09-15Added ability to generate last-value of linear induction. Aart Bik
Also added utility to update fetches in induction nodes. Rationale: This is a first step towards the larger CL that introduces a new loop optimization framework in the optimizing compiler (see https://android-review.googlesource.com/#/c/271392/3). Change-Id: Ibecd674c8146d9665340e68718c498555646129a Tests: induction_var_range_test
2016-06-29Improvements in induction range analysis. Aart Bik
Rationale: Uses range analysis while determining whether trip-counts are "safe", which improves analysis of triangular loops. Also implements more effective triangular loop analysis by evaluating induction information only once and using a top level hint (instead of the "iterative refinement" that was used earlier). Also fixes analysis of triangular trip counts that may wrap-around. All with tests. Test: see induction_var_range_test/530-checker-loops* BUG=27151190 Change-Id: I1877c8ce0c9a52005900eb9dfdbb1918df100278
2016-02-24Use range analysis for better trip count analysis Aart Bik
Rationale: Marking more loops as always-taken avoids generating unnecessary new top tests while marking more loops are non-infinite enables more optimizations. This CL helps with these improvements. Also, some more code is shared between induction and range analysis and a bug with refining ranges has been fixed. Bug: 27151190 Change-Id: Iecc0d7f32ae4779ee5424cda9dcc20816220935e
2016-02-03Minor improvement on static BCE analysis. Aart Bik
Rationale: Avoid testing initial range if nothing is known. Change-Id: I22646a5fd6e4481245d1a2f57891d2805550489f
2015-12-15Various induction/range analysis improvements. Aart Bik
Rationale: this change list improves analysis of triangular loops both by changing loop order for induction analysis (enabling range analysis in inner loops) and by some symbolic improvements during range analysis; also, a mul/div bug has been fixed (with pass/fail unit tests); lastly this change list prepares some follow up optimizations. Change-Id: I84a03e848405009541c3fa8e3d3c2f430e100087
2015-12-03Step-wise improvement of range analysis with outer loop induction. Aart Bik
Rationale: Using a step-wise approach (rather than expanding all ranges at once) increases the opportunities for statically removing bound checks, as demonstrated by the new checker tests. Change-Id: Icbfd9406523a069e1fb7508546ea94f896e5a255
2015-11-04Finalized all components of range analysis needed for dynamic bce. Aart Bik
Rationale: added ability to generate taken-test, prompt back need for finite-test; cleaned up the API now that bounds check needs are all known. Change-Id: I3d09b249965d1a980c09381240de175ca4b2e455
2015-10-19Added ability to generate induction range code. Aart Bik
Rationale: used by dynamic BCE (done in another CL). Change-Id: Ia6ce75da57b5298fba74622822ae0bae69c74188
2015-09-30Implemented trip-count safety information. Aart Bik
As shown in the induction analysis presentation, trip-counts need to deal with potential taken/not-taken situations (so that trip-count is either valid in the full loop or just in the loop-body proper) and potential finite/infinite situations (the latter can still be analyzed but may need to run-time test later to guard against the infinite conditions). This CL provides that information. Change-Id: I0445d8e836b80a3614af217ce3e39d766e77b986
2015-09-23Minor cleanup in range analysis. Aart Bik
(1) replaced min/max macro as previously required. (2) removed some redundant code by merging min/max into one. Change-Id: I610879a06d550346bfac7e6e12ec0299ba226a37
2015-09-22Various improvements in range analysis. Aart Bik
Rationale: Using min/max values for "unknowns" is a bit wasteful, since it eliminates two useful values. Replaced this with additional boolean to make cases more accurate. Added few cases to handle examples found in real-life. Change-Id: I211f8d9a28b1ae79abdb55fb4569716f21d8043b
2015-09-15Use induction variable range analysis in BCE (statically). Aart Bik
Rationale: Finally! After lots of very large CLs, now a small CL that uses the new induction variable analysis in BCE (statically, using this dynamically with de-opt is TBD). Despite its relative small size, be aware though, since the CL introduces a new phase to the compiler. Change-Id: If5555a173fd5d55d147c63138ef51fc296fa1414
2015-09-10Induction variable range analysis. Aart Bik
Rationale: by computing an upper bound on the trip count of each loop after induction var analysis has completed, a simple range analysis yields lower and upper bounds on all induced expressions in a loop; this analysis plugs directly into BCE (follow-up CL). Change-Id: I46a3fe48721ca372547199b39a3498c47992597d