summaryrefslogtreecommitdiff
path: root/compiler/optimizing/builder.cc
AgeCommit message (Collapse)Author
2024-10-07Reland "Calculate the number of out vregs." Vladimír Marko
This reverts commit 434a327234f74eed3ef4072314d2e2bdb73e4dda. Reason for revert: Relanding with no change. The regressions that were the reason for the revert may reappear. However, these regressions are probably caused by subtle effects that are not directly related to this change. For example, a code size improvement can regress performance simply by moving the start of a loop from an aligned address to an unaligned address, or by splitting a loop across two cache lines. Bug: 358519867 Bug: 359722268 Change-Id: I997b8a4219418f79b3a5fc4e7e50817911f0a737
2024-08-21Revert "Calculate the number of out vregs." Vladimír Marko
This reverts commit 3e75615ad25b6af1842b194e78b429b0f585b46a. Reason for revert: Regressed some micro-benchmarks, see bug 359722268. Bug: 358519867 Bug: 359722268 Change-Id: I207cc78c88193564e90c98eda2c96a5ba354a588
2024-08-13Calculate the number of out vregs. Vladimir Marko
Determine the number of out vregs needed by invokes that actually make a call, and by `HStringBuilderAppend`s. This can yield smaller frame sizes of compiled methods when some calls are inlined or fully intrinsified. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 358519867 Change-Id: I4930a9bd811b1de14658f5ef44e65eadea6a7961
2024-02-06CompilerOptions refactor after aosp/2808063 Santiago Aboy Solanes
* Updated comments * Made constants constexpr * Renamed kBaselineMaxCodeUnits to include "Inline" Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I37569b3d9e5eecfd65a505a79945bbe5b290fbbf
2024-01-30Allow compilation of large methods with no branches Santiago Aboy Solanes
Popular apps include such methods in their profiles. Having the extra heuristic of skipping compilation for large methods with no branches can be unintuitive for developers who created those profiles. Some apps see startup improvements with this heuristic removed. Bug: 316617683 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I21a8da93e89399dac0e45c3ab43a8bbedc925a44
2023-01-13Update the graph flags and check consistency Santiago Aboy Solanes
Check that the flags are up to date in graph checker. Mainly a correctness check CL but it brings slight code size reduction (e.g. not needing vreg info if HasMonitorOperations is false). Update loop_optimization_test to stop using `LocalRun` directly as it meant that it was breaking assumptions (i.e. top_loop_ was nullptr when it was expected to have a value). Bug: 264278131 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I29765b3be46d4bd7c91ea9c80f7565a3c88fae2e
2022-12-14Move adding extra goto blocks to InlineInto Santiago Aboy Solanes
There are some cases in which we need to add extra goto blocks when inlining to avoid critical edges. The `TryBoundary` of `kind:exit` instructions will always have more than one successor (normal flow and exceptional flow). If its normal flow successor has more than one predecessor, we would be introducing a critical edge. We can avoid the critical edge in InlineInto instead of doing it in the builder which helps decoupling those two stages as well as simplifying surrounding code. We also have the benefit of adding the extra goto blocks only when necessary. Bug: 227283224 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: Ibe21623c94c798f7cea60ff892064e63a38a787a
2022-12-08Update loop information correctly in MaybeAddExtraGotoBlocks Santiago Aboy Solanes
There are cases in which we have a Return->TryBoundary kind:exit->Exit chain inside of a loop, and that said TryBoundary has loop information. Set that information in the new goto block. Bug: 261731237 Fixes: 261731237 Test: dex2oat compiling the apps in the bug Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I224d6cb449d1a7a1a987f4e9022098f84ae4fba2
2022-12-06Update domination chain and RPO manually in MaybeAddExtraGotoBlocks Santiago Aboy Solanes
There's no need to recompute the whole graph as we know what changed. As a drive-by, we now don't return false for graphs with irreducible loops so we can remove that restriction from the builder. However, if a graph with irreducible loops hits this path it means that: A) it's being inlined B) Has irreducible loops We don't inline graphs with irreducible loops, and after building for inline we don't remove them either because constant folding and instruction simplifier don't remove them, and DCE doesn't run for graphs with irreducible loops. So, in terms of dex2oat's outputs nothing should change. Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I8cbf1b5f0518bb5dd14ffd751100ea81f5478863
2022-11-07Reland "Make compiler/optimizing/ symbols hidden." Vladimír Marko
This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0. Reason for revert: Reland after some of the required work was merged in other CLs. Also address a TODO from the original CL to mark required symbols with EXPORT in `intrinsic_objects.h`. Also mark symbols in new files as HIDDEN. Bug: 186902856 Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
2022-10-13Update MaybeAddExtraGotoBlocks to bail for irreducible loops Santiago Aboy Solanes
We may have graphs with irreducible loops at this point and recomputing the loop information is not supported. Bug: 253242440 Test: dex2oat compiling the apps in the bug Change-Id: I181b4fb9d9812ba2f14cd21602cf0e5a4e1fd18e
2022-10-10Compiler implementation of try catch inlining Santiago Aboy Solanes
Notable changes: 1) Wiring of the graph now allows for inlinees graph ending in TryBoundary, or Goto in some special cases. 2) Building a graph with try catch for inlining may add an extra Goto block. 3) Oat version bump. 4) Reduced kMaximumNumberOfCumulatedDexRegisters from 32 to 20. Bug: 227283224 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: Ic2fd956de24b72d1de29b4cd3d0b2a1ddab231d8
2021-07-14Remove the need of VerifiedMethod in the compiler. Nicolas Geoffray
The compiler only needs to know if a method is compilable or not. So just record a set of uncompilable methods (in some cases, we cannot have an ArtMethod, but the method can still be compiled). Test: test.py Bug: 28313047 Change-Id: Ic4235bc8160ec91daa5ebf6504741089b43e99cb
2021-03-23Remove Vdex::GetQuickenedInfoOf and all its users. Nicolas Geoffray
Test: test.py Bug: 170086509 Change-Id: I1e1a4abf71245c0fd37f951c9af85f62feba18ca
2020-12-04Remove DexCache arrays from image. David Srbecky
Remove the hashtable storage from the image and allocate it at runtime instead (but keep the DexCache object in the image). For compiled code, we have largely moved to using .bss, so the DexCache just costs us unnecessary extra space and dirty pages. For interpreted code, the hashtables are too small and will be overridden many times over at run-time regardless. The next step will be to make DexCache variable-size so it can adapt to both of the extremes (taking minimal amount of memory for compiled code and avoiding cache evictions in interpreter). Test: test.py --host Change-Id: I9f89e8f19829b812cf85dea1a964259ed8b87f4d
2020-05-13Move HandleCache to HGraph. Vladimir Marko
This avoids passing the `VariableSizedHandleScope*` argument around and eliminates HGraph::inexact_object_rti_ and its initialization. The latter shall allow running Optimizing gtests that do not require type information without creating a Runtime in future. (To be implemented in a separate CL.) Test: m test-art-host-gtest Test: testrunner.py --host --optmizing Test: aosp_taimen-userdebug boots. Change-Id: I36fe9bc556c6d610d644c8c14cc74c9985a14d64
2019-10-14Revert "Make compiler/optimizing/ symbols hidden." Vladimir Marko
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf. Reason for revert: Breaks ASAN tests (ODR violation). Bug: 142365358 Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
2019-10-14Make compiler/optimizing/ symbols hidden. Vladimir Marko
Make symbols in compiler/optimizing hidden by a namespace attribute. The unit intrinsic_objects.{h,cc} is excluded as it is needed by dex2oat. As the symbols are no longer exported, gtests are now linked with the static version of the libartd-compiler library. libart-compiler.so size: - before: arm: 2396152 arm64: 3345280 - after: arm: 2016176 (-371KiB, -15.9%) arm64: 2874480 (-460KiB, -14.1%) Test: m test-art-host-gtest Test: testrunner.py --host --optimizing --jit Bug: 142365358 Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
2018-11-01Do not cache RequiresConstructorBarrier() results. Vladimir Marko
Avoid caching the results. Caching was broken for JIT in the presence of class unloading; entries for unloaded dex files were leaked and potentially used erroneously with a newly loaded dex file. Test: m test-art-host-gtest Test: testrunner.py --host Test: Pixel 2 XL boots. Test: m test-art-target-gtest Test: testrunner.py --target Bug: 118808764 Change-Id: Ic1163601170364e060c2e3009752f543c9bb37b7
2018-01-13Revert "Revert "Move quickening info logic to its own table"" Mathieu Chartier
Bug: 71605148 Bug: 63756964 Test: test-art-target on angler This reverts commit 6716941120ae9f47ba1b8ef8e79820c4b5640350. Change-Id: Ic01ea4e8bb2c1de761fab354c5bbe27290538631
2018-01-12Revert "Move quickening info logic to its own table" Nicolas Geoffray
Bug: 71605148 Bug: 63756964 Seems to fail on armv7. This reverts commit f5245188d9c61f6b90eb30cca0875fbdcc493b15. Change-Id: I37786c04a8260ae3ec4a2cd73710126783c3ae7e
2018-01-11Move quickening info logic to its own table Mathieu Chartier
Added a table that is indexed by dex method index. To prevent size overhead, there is only one slot for each 16 method indices. This means there is up to 15 loop iterations to get the quickening info for a method. The quickening infos are now prefixed by a leb encoded length. This allows methods that aren't quickened to only have 1.25 bytes of space overhead. The value was picked arbitrarily, there is little advantage to increasing the value since the table only takes 1 byte per 4 method indices currently. JIT benchmarks do not regress with the change. There is a net space saving from removing 8 bytes from each quickening info since most scenarios have more quickened methods than compiled methods. For getting quick access to the table, a 4 byte preheader was added to each dex in the vdex file Removed logic that stored the quickening info in the CodeItem debug_info_offset field. The change adds a small quicken table for each method index, this means that filters that don't quicken will have a slight increase in size. The worst case scenario is compiling all the methods, this results in 0.3% larger vdex for this case. The change also disables deduping since the quicken infos need to be in dex method index order. For filters that don't compile most methods like quicken and speed-profile, there is space savings. For quicken, the vdex is 2% smaller. Bug: 71605148 Bug: 63756964 Test: test-art-host Change-Id: I89cb679538811369c36b6ac8c40ea93135f813cd
2017-12-22Make CodeItem fields private Mathieu Chartier
Make code item fields private and use accessors. Added a hand full of friend classes to reduce the size of the change. Changed default to be nullable and removed CreateNullable. CreateNullable was a bad API since it defaulted to the unsafe, may add a CreateNonNullable if it's important for performance. Motivation: Have a different layout for code items in cdex. Bug: 63756964 Test: test-art-host-gtest Test: test/testrunner/testrunner.py --host Test: art/tools/run-jdwp-tests.sh '--mode=host' '--variant=X32' --debug Change-Id: I42bc7435e20358682075cb6de52713b595f95bf9
2017-11-15Use intrinsic codegen for compiling intrinsic methods. Vladimir Marko
When compiling an intrinsic method, generate a graph that invokes the same method and try to compile it. If the call is actually intrinsified (or simplified to other HIR) and yields a leaf method, use the result of this compilation attempt, otherwise compile the actual code or JNI stub. Note that CodeGenerator::CreateThrowingSlowPathLocations() actually marks the locations as kNoCall if the throw is not in a catch block, thus considering some throwing methods (for example, String.charAt()) as leaf methods. We would ideally want to use the intrinsic codegen for all intrinsics that do not generate a slow-path call to the default implementation. Relying on the leaf method is suboptimal as we're missing out on methods that do other types of calls, for example runtime calls. This shall be fixed in a subsequent CL. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 67717501 Change-Id: I640fda7c22d4ff494b5ff77ebec3b7f5f75af652
2017-10-11Use ScopedArenaAllocator for building HGraph. Vladimir Marko
Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image: BatteryStats.dumpCheckinLocked() : 21.1MiB -> 20.2MiB BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB This is because all the memory previously used by the graph builder is reused by later passes. And finish the "arena"->"allocator" renaming; make renamed allocator pointers that are members of classes const when appropriate (and make a few more members around them const). Test: m test-art-host-gtest Test: testrunner.py --host Bug: 64312607 Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
2017-10-06ART: Use ScopedArenaAllocator for pass-local data. Vladimir Marko
Passes using local ArenaAllocator were hiding their memory usage from the allocation counting, making it difficult to track down where memory was used. Using ScopedArenaAllocator reveals the memory usage. This changes the HGraph constructor which requires a lot of changes in tests. Refactor these tests to limit the amount of work needed the next time we change that constructor. Test: m test-art-host-gtest Test: testrunner.py --host Test: Build with kArenaAllocatorCountAllocations = true. Bug: 64312607 Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
2017-09-25ART: Introduce compiler data type. Vladimir Marko
Replace most uses of the runtime's Primitive in compiler with a new class DataType. This prepares for introducing new types, such as Uint8, that the runtime does not need to know about. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 23964345 Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
2017-08-11optimizing: Refactor statistics to use OptimizingCompilerStats helper Igor Murashkin
Remove all copies of 'MaybeRecordStat', replacing them with a single OptimizingCompilerStats::MaybeRecordStat helper. Change-Id: I83b96b41439dccece3eee2e159b18c95336ea933
2016-10-18Remove mirror:: and ArtMethod deps in utils.{h,cc} David Sehr
The latest chapter in the ongoing saga of attempting to dump a DEX file without having to start a whole runtime instance. This episode finds us removing references to ArtMethod/ArtField/mirror. One aspect of this change that I would like to call out specfically is that the utils versions of the "Pretty*" functions all were written to accept nullptr as an argument. I have split these functions up as follows: 1) an instance method, such as PrettyClass that obviously requires this != nullptr. 2) a static method, that behaves the same way as the util method, but calls the instance method if p != nullptr. This requires using a full class qualifier for the static methods, which isn't exactly beautiful. I have tried to remove as many cases as possible where it was clear p != nullptr. Bug: 22322814 Test: test-art-host Change-Id: I21adee3614aa697aa580cd1b86b72d9206e1cb24
2016-04-07Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" David Brazdil
This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
2016-04-04Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" David Brazdil
Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
2016-04-04Refactor HGraphBuilder and SsaBuilder to remove HLocals David Brazdil
This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
2016-04-04Build dominator tree before generating HInstructions David Brazdil
Second CL in the series of merging HGraphBuilder and SsaBuilder. This patch refactors the builders so that dominator tree can be built before any HInstructions are generated. This puts the SsaBuilder removal of HLoadLocals/HStoreLocals straight after HGraphBuilder's HInstruction generation phase. Next CL will therefore be able to merge them. This patch also adds util classes for iterating bytecode and switch tables which allowed to simplify the code. Bug: 27894376 Change-Id: Ic425d298b2e6e7980481ed697230b1a0b7904526
2016-03-29Optimizing: Improve const-string code generation. Vladimir Marko
For strings in the boot image, use either direct pointers or pc-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -692KiB (-0.9%) - 64-bit boot.oat: -948KiB (-1.1%) - 32-bit dalvik cache total: -900KiB (-0.9%) - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -380KiB (-0.5%) - 64-bit boot.oat: -928KiB (-1.0%) - 32-bit dalvik cache total: -468KiB (-0.4%) - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache) Bug: 26884697 Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
2016-03-24Merge "Optimizing: Do not insert suspend checks on back-edges." Vladimir Marko
2016-03-23Optimizing: Do not insert suspend checks on back-edges. Vladimir Marko
Rely on HGraph::SimplifyLoop() to insert suspend checks. CodeGenerator's CheckLoopEntriesCanBeUsedForOsr() checks the dex pcs of suspend checks against branch targets to verify that we always have an appropriate point for OSR transition. However, the HSuspendChecks that were added by HGraphBuilder to support the recently removed "baseline" interfered with this in a specific case, namely an infinite loop where the back-branch jumps to a nop. In that case, the HSuspendCheck added by HGraphBuilder had a dex pc different from the block and the branch target but its presence would stop the HGraph::SimplifyLoop() from adding a new HSuspendCheck with the correct dex pc. Bug: 27623547 Change-Id: I83566a260210bc05aea0c44509a39bb490aa7003
2016-03-23Revert "Revert "Use compiler filter to determine oat file status."" Andreas Gampe
This reverts commit 845e5064580bd37ad5014f7aa0d078be7265464d. Add an option to change what OatFileManager considers up-to-date. In our tests we're allowed to write to the dalvik-cache, so it cannot be kSpeed. Bug: 27689078 Change-Id: I0c578705a9921114ed1fb00d360cc7448addc93a
2016-03-23Revert "Use compiler filter to determine oat file status." Nicolas Geoffray
Bots are red. Tentative reverting as this is likely the offender. Bug: 27689078 This reverts commit a62d2f04a6ecf804f8a78e722a6ca8ccb2dfa931. Change-Id: I3ec6947a5a4be878ff81f26f17dc36a209734e2a
2016-03-22Use compiler filter to determine oat file status. Richard Uhler
Record the compiler filter in the oat header. Use that to determine when the oat file is up-to-date with respect to a target compiler filter level. New xxx-profile filter levels are added to specify if a profile should be used instead of testing for the presence of a profile file. This change should allow for different compiler-filters to be set for different package manager use cases. Bug: 27689078 Change-Id: Id6706d0ed91b45f307142692ea4316aa9713b023
2016-03-21Optimizing: Fix register allocator validation memory usage. Vladimir Marko
Also attribute ArenaBitVector allocations to appropriate passes. This was used to track down the source of the excessive memory alloactions. Bug: 27690481 Change-Id: Ib895984cb7c04e24cbc7abbd8322079bab8ab100
2016-02-26Optimizing: Reduce memory usage of HInstructions. Vladimir Marko
Pack narrow fields and flags into a single 32-bit field. Change-Id: Ib2f7abf987caee0339018d21f0d498f8db63542d
2016-02-24Remove HNativeDebugInfo from start of basic blocks. David Srbecky
We do not require full environment at the start of basic block. The dex pc contained in basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
2016-02-15ART: Run SsaBuilder from HGraphBuilder David Brazdil
First step towards merging the two passes, which will later result in HGraphBuilder directly producing SSA form. This CL mostly just updates tests broken by not being able to inspect the pre-SSA form. Using HLocals outside the HGraphBuilder is now deprecated. Bug: 27150508 Change-Id: I00fb6050580f409dcc5aa5b5aa3a536d6e8d759e
2016-02-12ART: Remove HTemporary David Brazdil
Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
2016-02-05Revert "Revert "Implement on-stack replacement for arm/arm64/x86/x86_64."" Nicolas Geoffray
This reverts commit bd89a5c556324062b7d841843b039392e84cfaf4. Change-Id: I08d190431520baa7fcec8fbdb444519f25ac8d44
2016-01-28Optimizing compiler support for directly calling interface methods Alex Light
This teaches the optimizing compiler how to perform invoke-super on interfaces. This should make the invokes generally faster. Bug: 24618811 Change-Id: I7f9b0fb1209775c1c8837ab5d21f8acba3cc72a5
2016-01-14ART: Remove incorrect HFakeString optimization David Brazdil
Simplification of HFakeString assumes that it cannot be used until String.<init> is called which is not true and causes different behaviour between the compiler and the interpreter. This patch removes the optimization together with the HFakeString instruction. Instead, HNewInstance is generated and an empty String allocated until it is replaced with the result of the StringFactory call. This is consistent with the behaviour of the interpreter but is too conservative. A follow-up CL will attempt to optimize out the initial allocation when possible. Bug: 26457745 Bug: 26486014 Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
2016-01-06ART: Create BoundType for CheckCast early David Brazdil
ReferenceTypePropagation creates a BoundType for each CheckCast and replaces all dominated uses of the casted object with it. This does not include Phi uses on the boundary of the dominated scope, reducing typing precision. This patch creates the BoundType in Builder, causing SsaBuilder to replace uses of the object automatically. Bug: 26081304 Change-Id: I083979155cccb348071ff58cb9060a896ed7d2ac
2015-12-23Generate more stack maps during native debugging. David Srbecky
Generate extra stack map at the start of each java statement. The stack maps are later translated to DWARF which allows LLDB to set breakpoints and view local variables. Change-Id: If00ab875513308e4a1399d1e12e0fe8934a6f0c3
2015-12-10Don't generate a slow path for strings in the dex cache. Nicolas Geoffray
Change-Id: I1d258f1a89bf0ec7c7ddd134be9215d480f0b09a