summaryrefslogtreecommitdiff
path: root/compiler/optimizing/builder.cc
AgeCommit message (Collapse)Author
2018-01-13Revert "Revert "Move quickening info logic to its own table"" Mathieu Chartier
Bug: 71605148 Bug: 63756964 Test: test-art-target on angler This reverts commit 6716941120ae9f47ba1b8ef8e79820c4b5640350. Change-Id: Ic01ea4e8bb2c1de761fab354c5bbe27290538631
2018-01-12Revert "Move quickening info logic to its own table" Nicolas Geoffray
Bug: 71605148 Bug: 63756964 Seems to fail on armv7. This reverts commit f5245188d9c61f6b90eb30cca0875fbdcc493b15. Change-Id: I37786c04a8260ae3ec4a2cd73710126783c3ae7e
2018-01-11Move quickening info logic to its own table Mathieu Chartier
Added a table that is indexed by dex method index. To prevent size overhead, there is only one slot for each 16 method indices. This means there is up to 15 loop iterations to get the quickening info for a method. The quickening infos are now prefixed by a leb encoded length. This allows methods that aren't quickened to only have 1.25 bytes of space overhead. The value was picked arbitrarily, there is little advantage to increasing the value since the table only takes 1 byte per 4 method indices currently. JIT benchmarks do not regress with the change. There is a net space saving from removing 8 bytes from each quickening info since most scenarios have more quickened methods than compiled methods. For getting quick access to the table, a 4 byte preheader was added to each dex in the vdex file Removed logic that stored the quickening info in the CodeItem debug_info_offset field. The change adds a small quicken table for each method index, this means that filters that don't quicken will have a slight increase in size. The worst case scenario is compiling all the methods, this results in 0.3% larger vdex for this case. The change also disables deduping since the quicken infos need to be in dex method index order. For filters that don't compile most methods like quicken and speed-profile, there is space savings. For quicken, the vdex is 2% smaller. Bug: 71605148 Bug: 63756964 Test: test-art-host Change-Id: I89cb679538811369c36b6ac8c40ea93135f813cd
2017-12-22Make CodeItem fields private Mathieu Chartier
Make code item fields private and use accessors. Added a hand full of friend classes to reduce the size of the change. Changed default to be nullable and removed CreateNullable. CreateNullable was a bad API since it defaulted to the unsafe, may add a CreateNonNullable if it's important for performance. Motivation: Have a different layout for code items in cdex. Bug: 63756964 Test: test-art-host-gtest Test: test/testrunner/testrunner.py --host Test: art/tools/run-jdwp-tests.sh '--mode=host' '--variant=X32' --debug Change-Id: I42bc7435e20358682075cb6de52713b595f95bf9
2017-11-15Use intrinsic codegen for compiling intrinsic methods. Vladimir Marko
When compiling an intrinsic method, generate a graph that invokes the same method and try to compile it. If the call is actually intrinsified (or simplified to other HIR) and yields a leaf method, use the result of this compilation attempt, otherwise compile the actual code or JNI stub. Note that CodeGenerator::CreateThrowingSlowPathLocations() actually marks the locations as kNoCall if the throw is not in a catch block, thus considering some throwing methods (for example, String.charAt()) as leaf methods. We would ideally want to use the intrinsic codegen for all intrinsics that do not generate a slow-path call to the default implementation. Relying on the leaf method is suboptimal as we're missing out on methods that do other types of calls, for example runtime calls. This shall be fixed in a subsequent CL. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 67717501 Change-Id: I640fda7c22d4ff494b5ff77ebec3b7f5f75af652
2017-10-11Use ScopedArenaAllocator for building HGraph. Vladimir Marko
Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image: BatteryStats.dumpCheckinLocked() : 21.1MiB -> 20.2MiB BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB This is because all the memory previously used by the graph builder is reused by later passes. And finish the "arena"->"allocator" renaming; make renamed allocator pointers that are members of classes const when appropriate (and make a few more members around them const). Test: m test-art-host-gtest Test: testrunner.py --host Bug: 64312607 Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
2017-10-06ART: Use ScopedArenaAllocator for pass-local data. Vladimir Marko
Passes using local ArenaAllocator were hiding their memory usage from the allocation counting, making it difficult to track down where memory was used. Using ScopedArenaAllocator reveals the memory usage. This changes the HGraph constructor which requires a lot of changes in tests. Refactor these tests to limit the amount of work needed the next time we change that constructor. Test: m test-art-host-gtest Test: testrunner.py --host Test: Build with kArenaAllocatorCountAllocations = true. Bug: 64312607 Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
2017-09-25ART: Introduce compiler data type. Vladimir Marko
Replace most uses of the runtime's Primitive in compiler with a new class DataType. This prepares for introducing new types, such as Uint8, that the runtime does not need to know about. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 23964345 Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
2017-08-11optimizing: Refactor statistics to use OptimizingCompilerStats helper Igor Murashkin
Remove all copies of 'MaybeRecordStat', replacing them with a single OptimizingCompilerStats::MaybeRecordStat helper. Change-Id: I83b96b41439dccece3eee2e159b18c95336ea933
2016-10-18Remove mirror:: and ArtMethod deps in utils.{h,cc} David Sehr
The latest chapter in the ongoing saga of attempting to dump a DEX file without having to start a whole runtime instance. This episode finds us removing references to ArtMethod/ArtField/mirror. One aspect of this change that I would like to call out specfically is that the utils versions of the "Pretty*" functions all were written to accept nullptr as an argument. I have split these functions up as follows: 1) an instance method, such as PrettyClass that obviously requires this != nullptr. 2) a static method, that behaves the same way as the util method, but calls the instance method if p != nullptr. This requires using a full class qualifier for the static methods, which isn't exactly beautiful. I have tried to remove as many cases as possible where it was clear p != nullptr. Bug: 22322814 Test: test-art-host Change-Id: I21adee3614aa697aa580cd1b86b72d9206e1cb24
2016-04-07Revert "Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals"" David Brazdil
This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) This CL fixed an issue with parsing quickened instructions. Bug: 27894376 Bug: 27998571 Bug: 27995065 Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
2016-04-04Revert "Refactor HGraphBuilder and SsaBuilder to remove HLocals" David Brazdil
Bug: 27995065 This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f. Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
2016-04-04Refactor HGraphBuilder and SsaBuilder to remove HLocals David Brazdil
This patch merges the instruction-building phases from HGraphBuilder and SsaBuilder into a single HInstructionBuilder class. As a result, it is not necessary to generate HLocal, HLoadLocal and HStoreLocal instructions any more, as the builder produces SSA form directly. Saves 5-15% of arena-allocated memory (see bug for more data): GMS 20.46MB => 19.26MB (-5.86%) Maps 24.12MB => 21.47MB (-10.98%) YouTube 28.60MB => 26.01MB (-9.05%) Bug: 27894376 Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
2016-04-04Build dominator tree before generating HInstructions David Brazdil
Second CL in the series of merging HGraphBuilder and SsaBuilder. This patch refactors the builders so that dominator tree can be built before any HInstructions are generated. This puts the SsaBuilder removal of HLoadLocals/HStoreLocals straight after HGraphBuilder's HInstruction generation phase. Next CL will therefore be able to merge them. This patch also adds util classes for iterating bytecode and switch tables which allowed to simplify the code. Bug: 27894376 Change-Id: Ic425d298b2e6e7980481ed697230b1a0b7904526
2016-03-29Optimizing: Improve const-string code generation. Vladimir Marko
For strings in the boot image, use either direct pointers or pc-relative addresses. For other strings, use PC-relative access to the dex cache arrays for AOT and direct address of the string's dex cache slot for JIT. For aosp_flounder-userdebug: - 32-bit boot.oat: -692KiB (-0.9%) - 64-bit boot.oat: -948KiB (-1.1%) - 32-bit dalvik cache total: -900KiB (-0.9%) - 64-bit dalvik cache total: -3672KiB (-1.5%) (contains more files than the 32-bit dalvik cache) For aosp_flounder-userdebug forced to compile PIC: - 32-bit boot.oat: -380KiB (-0.5%) - 64-bit boot.oat: -928KiB (-1.0%) - 32-bit dalvik cache total: -468KiB (-0.4%) - 64-bit dalvik cache total: -1928KiB (-0.8%) (contains more files than the 32-bit dalvik cache) Bug: 26884697 Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
2016-03-24Merge "Optimizing: Do not insert suspend checks on back-edges." Vladimir Marko
2016-03-23Optimizing: Do not insert suspend checks on back-edges. Vladimir Marko
Rely on HGraph::SimplifyLoop() to insert suspend checks. CodeGenerator's CheckLoopEntriesCanBeUsedForOsr() checks the dex pcs of suspend checks against branch targets to verify that we always have an appropriate point for OSR transition. However, the HSuspendChecks that were added by HGraphBuilder to support the recently removed "baseline" interfered with this in a specific case, namely an infinite loop where the back-branch jumps to a nop. In that case, the HSuspendCheck added by HGraphBuilder had a dex pc different from the block and the branch target but its presence would stop the HGraph::SimplifyLoop() from adding a new HSuspendCheck with the correct dex pc. Bug: 27623547 Change-Id: I83566a260210bc05aea0c44509a39bb490aa7003
2016-03-23Revert "Revert "Use compiler filter to determine oat file status."" Andreas Gampe
This reverts commit 845e5064580bd37ad5014f7aa0d078be7265464d. Add an option to change what OatFileManager considers up-to-date. In our tests we're allowed to write to the dalvik-cache, so it cannot be kSpeed. Bug: 27689078 Change-Id: I0c578705a9921114ed1fb00d360cc7448addc93a
2016-03-23Revert "Use compiler filter to determine oat file status." Nicolas Geoffray
Bots are red. Tentative reverting as this is likely the offender. Bug: 27689078 This reverts commit a62d2f04a6ecf804f8a78e722a6ca8ccb2dfa931. Change-Id: I3ec6947a5a4be878ff81f26f17dc36a209734e2a
2016-03-22Use compiler filter to determine oat file status. Richard Uhler
Record the compiler filter in the oat header. Use that to determine when the oat file is up-to-date with respect to a target compiler filter level. New xxx-profile filter levels are added to specify if a profile should be used instead of testing for the presence of a profile file. This change should allow for different compiler-filters to be set for different package manager use cases. Bug: 27689078 Change-Id: Id6706d0ed91b45f307142692ea4316aa9713b023
2016-03-21Optimizing: Fix register allocator validation memory usage. Vladimir Marko
Also attribute ArenaBitVector allocations to appropriate passes. This was used to track down the source of the excessive memory alloactions. Bug: 27690481 Change-Id: Ib895984cb7c04e24cbc7abbd8322079bab8ab100
2016-02-26Optimizing: Reduce memory usage of HInstructions. Vladimir Marko
Pack narrow fields and flags into a single 32-bit field. Change-Id: Ib2f7abf987caee0339018d21f0d498f8db63542d
2016-02-24Remove HNativeDebugInfo from start of basic blocks. David Srbecky
We do not require full environment at the start of basic block. The dex pc contained in basic block is sufficient for line mapping. Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
2016-02-15ART: Run SsaBuilder from HGraphBuilder David Brazdil
First step towards merging the two passes, which will later result in HGraphBuilder directly producing SSA form. This CL mostly just updates tests broken by not being able to inspect the pre-SSA form. Using HLocals outside the HGraphBuilder is now deprecated. Bug: 27150508 Change-Id: I00fb6050580f409dcc5aa5b5aa3a536d6e8d759e
2016-02-12ART: Remove HTemporary David Brazdil
Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
2016-02-05Revert "Revert "Implement on-stack replacement for arm/arm64/x86/x86_64."" Nicolas Geoffray
This reverts commit bd89a5c556324062b7d841843b039392e84cfaf4. Change-Id: I08d190431520baa7fcec8fbdb444519f25ac8d44
2016-01-28Optimizing compiler support for directly calling interface methods Alex Light
This teaches the optimizing compiler how to perform invoke-super on interfaces. This should make the invokes generally faster. Bug: 24618811 Change-Id: I7f9b0fb1209775c1c8837ab5d21f8acba3cc72a5
2016-01-14ART: Remove incorrect HFakeString optimization David Brazdil
Simplification of HFakeString assumes that it cannot be used until String.<init> is called which is not true and causes different behaviour between the compiler and the interpreter. This patch removes the optimization together with the HFakeString instruction. Instead, HNewInstance is generated and an empty String allocated until it is replaced with the result of the StringFactory call. This is consistent with the behaviour of the interpreter but is too conservative. A follow-up CL will attempt to optimize out the initial allocation when possible. Bug: 26457745 Bug: 26486014 Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
2016-01-06ART: Create BoundType for CheckCast early David Brazdil
ReferenceTypePropagation creates a BoundType for each CheckCast and replaces all dominated uses of the casted object with it. This does not include Phi uses on the boundary of the dominated scope, reducing typing precision. This patch creates the BoundType in Builder, causing SsaBuilder to replace uses of the object automatically. Bug: 26081304 Change-Id: I083979155cccb348071ff58cb9060a896ed7d2ac
2015-12-23Generate more stack maps during native debugging. David Srbecky
Generate extra stack map at the start of each java statement. The stack maps are later translated to DWARF which allows LLDB to set breakpoints and view local variables. Change-Id: If00ab875513308e4a1399d1e12e0fe8934a6f0c3
2015-12-10Don't generate a slow path for strings in the dex cache. Nicolas Geoffray
Change-Id: I1d258f1a89bf0ec7c7ddd134be9215d480f0b09a
2015-12-10Fix braino when resolving an invoke-super. Nicolas Geoffray
We should check the actual_method, and not the resolved_method, on whether it is in the same dex file. bug:26022686 Change-Id: I8a9b0c68e162015e0aec397545d0607482949967
2015-12-08ART: Check invoke-interface earlier in verifier Andreas Gampe
Invoke-interface should only be called on an interface method. Move the check earlier, as otherwise we'll try to resolve and potentially inject a method into the dex cache. Also templatize ResolveMethod with a version always checking the invoke type, and on a cache miss check whether type target type is an interface when an interface invoke type was given. Bug: 21869691 Change-Id: Ica27158f675b5aa223d9229248189612f4706832
2015-12-02Revert "Revert "Don't use the compiler driver for method resolution."" Nicolas Geoffray
This reverts commit c88ef3a10c474045a3476a02ae75d07ddd3230b7. Change-Id: I0ed88a48b313a8d28bc39fae40631123aadb13ef
2015-12-01Revert "Don't use the compiler driver for method resolution." Nicolas Geoffray
Fails 425 in debuggable mode. This reverts commit 4db0bf9c4db6a09716c3388b7d2f88d534470339. Change-Id: I346df8f75674564fc4fb241c60f23e250fc7f0a7
2015-12-01Don't use the compiler driver for method resolution. Nicolas Geoffray
The compiler driver makes assumptions that don't hold for the optimizing compiler, and will for example always go to slow path for an invoke-super when there's no verified method. Also fix GenerateInvokeVirtual in the presence of intrinsics. Next change will address some of the TODOs in sharpening.cc. Change-Id: I2b0e543ee9b9bebcadb2d26de29e850c59ad58b9
2015-11-24Optimize HLoadClass when we know the class is in the cache. Nicolas Geoffray
Change-Id: Iaa74591eed0f2eabc9ba9f9988681d9582faa320
2015-11-24A few more optimizations on avoiding HClinit. Nicolas Geoffray
Change-Id: I622a98b620e9d261cb654e2f5ab578bd8b3484b1
2015-11-23Fix lint error. Nicolas Geoffray
Change-Id: I29632dc7e49f7ec63040455fa40fcf87e9282e5e
2015-11-20Explicitly add HLoadClass/HClinitCheck for HNewInstance. Nicolas Geoffray
bug:25735083 bug:25173758 Change-Id: Ie81cfa4fa9c47cc025edb291cdedd7af209a03db
2015-11-18Revert "Revert "Enable store elimination for singleton objects."" Mingyao Yang
This reverts commit 55d02cf056f993aeafebd54e7b7c68c7a48507c9, and makes the following change: Currently we leverage loop side effects to decide whether heap values are killed by the loop. Stores need to be kept if heap values may be killed by loops and the corresponding loads cannot be eliminated. Similar thing need to be done for each predecessor when we merge predecessor heap values. To do that, the HInstanceFieldSet instruction itself is put in the heap value array instead of the value of the store instruction. The store instruction may be added to possibly_removed_stores_ first, but can later be removed from possibly_removed_stores_ when it's found out that the store needs to be kept due to merging/loop side effects. Change-Id: I4f7bb1960f7b47240873e00ff1adac46fc102a02
2015-11-09Merge "Optimizing: Remove unused ArtMethod* input from HInvokeStaticOrDirect." Vladimir Marko
2015-11-06Optimizing: Remove unused ArtMethod* input from HInvokeStaticOrDirect. Vladimir Marko
Change-Id: Iea99fa683440673ff517e246f35fade96600f229
2015-11-06ART: Fix simplification of catch blocks in the presence of dead code David Brazdil
Simplification of catch blocks transforms the code so that catch blocks have only exceptional predecessors. However, it is invoked before trivially dead code is eliminated which breaks simple assumptions such as the fact that a catch block cannot start with move-exception if it has non-exceptional predecessors. This patch fixes the algorithm to work under these relaxed conditions. Bug: 25494450 Bug: 25492628 Change-Id: Idc8d010102a4b8b9a6cd918b98d6e11d1838db0c
2015-10-29Revert "Enable store elimination for singleton objects." Andreas Gampe
This reverts commit 7f43a3d48fc29045875d50e10bbc5d6ffc25d61e. Fails booting. Bug: 25357772 Change-Id: Ied19536f3ce8d81e76885cb6baed4853e2ed6714
2015-10-27Enable store elimination for singleton objects. Mingyao Yang
Enable store elimination for singleton objects. However for finalizable object, don't eliminate stores. Also added a testcase. Change-Id: Icf991e7ded5b490f55f580ef928ece5c45e89902
2015-10-27Merge "Optimizing: Determine invoke-static/-direct dispatch early." Vladimir Marko
2015-10-23Optimizing: Determine invoke-static/-direct dispatch early. Vladimir Marko
Determine the dispatch type of invoke-static/-direct in a special pass right after the type inference. This allows the inliner to pass the "needs dex cache" check and inline more. It also allows the code generator to avoid requesting a register location for the ArtMethod* for kDexCachePcRelative and direct methods. The supported dispatch check handles also situations that the CompilerDriver currently doesn't allow. The cleanup of the CompilerDriver and required changes to Quick will come in a separate change. Change-Id: I3f8e903a119949e95871d8ab0a995f4731a13a07
2015-10-22Revert "Revert "load store elimination."" Mingyao Yang
This reverts commit 8030c4100d2586fac39ed4007c61ee91d4ea4f25. Change-Id: I79558d85484be5f5d04e4a44bea7201fece440f0
2015-10-20Merge "Revert "Revert "optimizing: propagate type information of arguments""" Calin Juravle