path: root/compiler/optimizing/scheduler.cc
Age Commit message Author
2025-03-19 Introduce abstract instruction `HFieldAccess`. Vladimir Marko
Change-Id: Iaccc3a000f53a4b7198a45f04142983897f194f4
2024-02-13 Optimizing: Refactor `HScheduler`. Vladimir Marko
Move `SchedulingLatencyVisitor{ARM,ARM64}` to .cc files. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Change-Id: I15cb1a4cbef00a328fec947189412c502bf80f46
2024-02-13 Unresolved field access HIR are not schedulable. Vladimir Marko
Add a comment and remove dead code. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: Ie14a1bd9633f322be7cb4bb312a1232df6697cbd
2024-01-31 Do not create random scheduling selector if not needed. Vladimir Marko
Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 181943478 Change-Id: I1fc3234243315ee1133d5a12c52d7f42ea8273bc
2023-12-06 Remove partial LSE Santiago Aboy Solanes
It has been disabled for a while and it has bit rotted. Bug: 298176183 Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing Test: m test-art-host-gtest-art_compiler_tests64 Change-Id: I4fcd8b3d18a3388e078b5cb3c340b2e270aefef7
2023-04-27 Optimizing: Add `HInstruction::As##type()`. Vladimir Marko
After the old implementation was renamed in https://android-review.googlesource.com/2526708 , we introduce a new function with the old name but new behavior, just `DCHECK()`-ing the instruction kind before casting down the pointer. We change appropriate calls from `As##type##OrNull()` to `As##type()` to avoid unnecessary run-time checks and reduce the size of libart-compiler.so. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Bug: 181943478 Change-Id: I025681612a77ca2157fed4886ca47f2053975d4e
2023-04-27 Optimizing: Rename `As##type` to `As##type##OrNull`. Vladimir Marko
The null type check in the current implementation of `HInstruction::As##type()` often cannot be optimized away by clang++. It is therefore beneficial to have two functions, `HInstruction::As##type()` and `HInstruction::As##type##OrNull()`, where the first function never returns null but the second one can return null. The additional text "OrNull" shall also flag the possibility of yielding null to the developer, which may help avoid bugs similar to what we have seen previously. This requires renaming the existing function that can return null and introducing a new function that cannot. However, defining the new function `HInstruction::As##type()` in the same change as renaming the old one would risk introducing bugs by missing a rename. Therefore we simply rename the old function here and the new function shall be introduced in a separate change with all behavioral changes being explicit. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: buildbot-build.sh --target Bug: 181943478 Change-Id: I4defd85038e28fe3506903ba3f33f723682b3298
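The split between a null-returning cast and an asserting cast described in the two commits above can be sketched as follows. This is a minimal illustration, not ART's code: `Instruction`, `Add`, `Nop` and `Kind` are hypothetical stand-ins for the `HInstruction` hierarchy, the functions are written by hand rather than generated by the `As##type` macro, and `assert()` stands in for `DCHECK()`.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the HInstruction hierarchy.
enum class Kind { kAdd, kNop };

class Add;

class Instruction {
 public:
  explicit Instruction(Kind kind) : kind_(kind) {}

  bool IsAdd() const { return kind_ == Kind::kAdd; }

  // Checked downcast: may return nullptr, so callers must test the result.
  Add* AsAddOrNull();

  // Asserting downcast: the kind is only verified in debug builds and the
  // result is never nullptr, so release builds pay no run-time check.
  Add* AsAdd();

 private:
  Kind kind_;
};

class Add : public Instruction {
 public:
  Add() : Instruction(Kind::kAdd) {}
};

class Nop : public Instruction {
 public:
  Nop() : Instruction(Kind::kNop) {}
};

inline Add* Instruction::AsAddOrNull() {
  return IsAdd() ? static_cast<Add*>(this) : nullptr;
}

inline Add* Instruction::AsAdd() {
  assert(IsAdd());  // Stands in for DCHECK() in this sketch.
  return static_cast<Add*>(this);
}
```

A caller that has already established the kind (for example inside an `IsAdd()` branch) would use `AsAdd()`, while a caller probing an unknown instruction would use `AsAddOrNull()` and test the result.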
2022-11-07 Reland "Make compiler/optimizing/ symbols hidden." Vladimír Marko
This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0. Reason for revert: Reland after some of the required work was merged in other CLs. Also address a TODO from the original CL to mark required symbols with EXPORT in `intrinsic_objects.h`. Also mark symbols in new files as HIDDEN. Bug: 186902856 Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
2022-10-24 Allow LSA to run with acquire/release operations Santiago Aboy Solanes
LSA will run in graphs with acquire loads (i.e. monitor enter and volatile load) and release stores (i.e. monitor exit and volatile stores). Helps both LSE and the Scheduler, and brings code size and memory use reductions. For example, ~40KB (~0.1%) reduction in memory use when compiling android framework in armv8. Code size gains (locally run on Pixel 5 w/ AOSP): Android Google Search App (AGSA): 209KB System server: 44KB System UI: 20KB which is ~0.1% for each compile. Bug: 227283233 Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I9ac79cf2324348414186f95e531c98b4215b28ea
2022-08-09 Rename HNativeDebugInfo to HNop Santiago Aboy Solanes
We can generalize HNativeDebugInfo to be used as a Nop (i.e. no instructions are generated), and give it the option of having an environment to keep the current HNativeDebugInfo logic working. Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: I06b3a36e8b124bcda858d2c9cd8ff0ab21caea36
2022-02-25 Update compiler/ implications to use (D)CHECK_IMPLIES Santiago Aboy Solanes
Follow-up to aosp/1988868 in which we added the (D)CHECK_IMPLIES macro. This CL uses it on compiler/ occurrences found by a regex. Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b Change-Id: If63aed969bfb8b31d6fbbcb3bca2b04314c894b7
2021-05-21 Fix scheduler's `FieldAccessHeapLocation()`. Vladimir Marko
Use the correct target for predicated get. Also remove an always-false condition from LSE. Test: m Bug: 188188275 Bug: 188847019 Change-Id: I731e181c8c0d812120dc4fad0c011158053fa7a8
2021-01-25 Revert^4 "Partial Load Store Elimination" Alex Light
This reverts commit 791df7a161ecfa28eb69862a4bc285282463b960. This unreverts commit fc1ce4e8be0d977e3d41699f5ec746d68f63c024. This unreverts commit b8686ce4c93eba7192ed7ef89e7ffd9f3aa6cd07. We incorrectly failed to include PredicatedInstanceFieldGet in a few conditions, including a DCHECK. This caused tests to fail under the read-barrier-table-lookup configuration. Reason for revert: Fixed 2 incorrect checks Bug: 67037140 Test: ./art/test/testrunner/run_build_test_target.py -j70 art-gtest-read-barrier-table-lookup Change-Id: I32b01b29fb32077fb5074e7c77a0226bd1fcaab4
2021-01-24 Revert "Revert^2 "Partial Load Store Elimination"" Nicolas Geoffray
This reverts commit fc1ce4e8be0d977e3d41699f5ec746d68f63c024. Bug: 67037140 Reason for revert: Fails read-barrier-table-lookup tests. Change-Id: I373867c728789bc14a4370b93a045481167d5f76
2021-01-22 Revert^2 "Partial Load Store Elimination" Alex Light
This reverts commit 47ac53100303e7e864b7f6d65f17b23088ccf1d6. There was a bug in LSE where we would incorrectly record the shadow$_monitor_ field as not having a default initial value. This caused partial LSE to be unable to compile the Object.identityHashCode function, causing crashes. This issue was fixed in a parent CL. Also updated all Offsets in LSE_test to be outside of the object header regardless of configuration. Test: ./test.py --host Bug: 67037140 Reason for revert: Fixed issue with shadow$_monitor_ field and offsets Change-Id: I4fb2afff4d410da818db38ed833927dfc0f6be33
2021-01-22 Revert "Partial Load Store Elimination" Nicolas Geoffray
This reverts commit b8686ce4c93eba7192ed7ef89e7ffd9f3aa6cd07. Bug: 67037140 Reason for revert: Fails a few tests. Change-Id: Icf0635bffbfbba93bf0a5b854a9582c418198136
2021-01-21 Partial Load Store Elimination Alex Light
Add partial load-store elimination to the LSE pass. Partial LSE will move object allocations which only escape along certain execution paths closer to the escape point and allow more values to be eliminated. It does this by creating new predicated load and store instructions that are used when an object has only escaped some of the time. In cases where the object has not escaped a default value will be used. Test: ./test.py --host Test: ./test.py --target Bug: 67037140 Change-Id: Idde67eb59ec90de79747cde17b552eec05b58497
2020-11-18 Revert^4 "Partial LSE analysis & store removal" Alex Light
We incorrectly handled merging unknowns in some situations. Specifically in cases where we are unable to materialize loop-phis we could end up with PureUnknowns which could end up hiding stores that need to be kept. In an unrelated issue we were incorrectly considering some values as escapes when live at the point of an invoke. Since SearchPhiPlaceholdersForKeptStores used a more precise notion of escapes we could end up removing stores without being able to replace the values. This reverts commit 2316b3a0779f3721a78681f5c70ed6624ecaebef. This unreverts commit b6837f0350ff66c13582b0e94178dd5ca283ff0a This reverts commit fe270426c8a2a69a8f669339e83b86fbf40e25a1. This unreverts commit bb6cda60e4418c0ab557ea4090e046bed8206763. Bug: 67037140 Bug: 173120044 Reason for revert: Fixed issue causing incorrect store elimination Test: ./test.py --host Test: Boot cuttlefish atest FrameworksServicesTests:com.android.server.job.BackgroundRestrictionsTest#testPowerWhiteList Change-Id: I2ebae9ccfaf5169d551c5019b547589d0fce1dc9
2020-11-14 Revert^3 "Partial LSE analysis & store removal" Alex Light
This reverts commit b6837f0350ff66c13582b0e94178dd5ca283ff0a This unreverts commit fe270426c8a2a69a8f669339e83b86fbf40e25a1. This rereverts commit bb6cda60e4418c0ab557ea4090e046bed8206763. Bug: 67037140 Bug: 173120044 Reason for revert: Git-blame seems to point to the CL as cause of b/173120044. Revert during investigation. Change-Id: I46f557ce79c15f07f4e77aacded1926b192754c3
2020-11-13 Revert^2 "Partial LSE analysis & store removal" Alex Light
A ScopedArenaAllocator in a single test was accidentally loaded using operator new which is not supported. This caused a memory leak. This reverts commit fe270426c8a2a69a8f669339e83b86fbf40e25a1. This unreverts commit bb6cda60e4418c0ab557ea4090e046bed8206763. Bug: 67037140 Reason for revert: Fixed memory leak in LoadStoreAnalysisTest.PartialEscape test case Test: SANITIZE_HOST=address ASAN_OPTIONS=detect_leaks=0 m test-art-host-gtest-dependencies Run art_compiler_tests Change-Id: I34fa2079df946ae54b8c91fa771a44d56438a719
2020-11-12 Revert "Partial LSE analysis & store removal" Nicolas Geoffray
This reverts commit bb6cda60e4418c0ab557ea4090e046bed8206763. Bug: 67037140 Reason for revert: memory leak detected in the test. Change-Id: I81cc2f61494e96964d8be40389eddcd7c66c9266
2020-11-12 Partial LSE analysis & store removal Alex Light
This is the first piece of partial LSE for art. This CL adds analysis tools needed to implement partial LSE. More immediately, it improves LSE so that it will remove stores that are provably non-observable based on the location they occur. For example:

```
Foo foo = new Foo();
if (xyz) {
  check(foo);
  foo.x++;
} else {
  foo.x = 12;
}
return foo.x;
```

The store of 12 can be removed because the only escape in this method is unreachable and was not executed by the point we reach the store.

The main purpose of this CL is to add the analysis tools needed to implement partial Load-Store elimination. Namely it includes tracking of which blocks are escaping and the groups of blocks that we cannot remove allocations from.

The actual impact of this change is incredibly minor, being triggered only once in AOSP code. go/lem shows only minor effects to compile-time and no effect on the compiled code. See go/lem-allight-partial-lse-2 for numbers. Compile time shows an average of 1.4% regression (max regression is 7% with 0.2 noise).

This CL adds a new 'reachability' concept to the HGraph. If this has been calculated it allows one to quickly query whether there is any execution path containing two blocks in a given order. This is used to define a notion of sections of graph from which the escape of some allocation is inevitable.

Test: art_compiler_tests Test: treehugger Bug: 67037140 Change-Id: I0edc8d6b73f7dd329cb1ea7923080a0abe913ea6
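The 'reachability' query mentioned in this commit message (is there any execution path containing two blocks in a given order?) can be illustrated with a precomputed transitive closure over block successors. This is only a sketch under assumptions: the `Reachability` class, its `PathBetween()` method and the use of Warshall's algorithm are illustrative choices, not necessarily how `HGraph` computes or stores the information.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: precomputed block-to-block reachability for a small CFG.
class Reachability {
 public:
  // successors[b] lists the ids of the blocks directly reachable from block b.
  explicit Reachability(const std::vector<std::vector<size_t>>& successors)
      : reach_(successors.size(),
               std::vector<bool>(successors.size(), false)) {
    const size_t n = successors.size();
    for (size_t b = 0; b < n; ++b) {
      for (size_t s : successors[b]) {
        reach_[b][s] = true;
      }
    }
    // Warshall's transitive closure: if i reaches k and k reaches j,
    // then i reaches j.
    for (size_t k = 0; k < n; ++k) {
      for (size_t i = 0; i < n; ++i) {
        if (!reach_[i][k]) continue;
        for (size_t j = 0; j < n; ++j) {
          if (reach_[k][j]) reach_[i][j] = true;
        }
      }
    }
  }

  // True if some execution path visits `from` and later visits `to`.
  bool PathBetween(size_t from, size_t to) const {
    assert(from < reach_.size() && to < reach_.size());
    return reach_[from][to];
  }

 private:
  std::vector<std::vector<bool>> reach_;
};
```

For the if/else example in the message, with block 0 as the entry, 1 as the `then` branch, 2 as the `else` branch and 3 as the return, `PathBetween(1, 2)` is false: no path runs the escaping `check(foo)` call and then the store of 12, which is the property that lets the store be removed.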
2020-06-08 Run LSA as a part of the LSE pass. Vladimir Marko
Make LSA a helper class, not an optimization pass. Move all its allocations to ScopedArenaAllocator to reduce the peak memory usage a little bit. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I7fc634abe732d22c99005921ffecac5207bcf05f
2019-10-29 Fix intersecting live ranges created by instruction scheduler Evgeny Astigeevich
When scheduling code like the following:

  LOOP:
    v2=phi(v0, v1)
    use(v2)
    v1=...
    goto LOOP

the instruction scheduler can move 'v1=...' before 'use(v2)'. This causes the live ranges of v1 and v2 to intersect and results in a MOV instruction being created. The CL fixes this.

Improvements, Pixel3:
  Little CPU, arm64
    micro/GCCLoops
      Example12 14.1%
      Example10b 11.0%
      Example23 8.1%
      Example24 6.6%
      Example10a 5.0%
    FFT workload 4.7%
    Compress workload 1.2%
  Little CPU, arm32
    micro/GCCLoops
      Example23 7.5%
      Example24 4.3%
    MonteCarlo workload 1.35%
  Big CPU, arm32 and arm64
    No significant improvements

No significant regressions (> 5%) are found.

Test: test.py --host --optimizing --jit --gtest Test: test.py --target --optimizing --jit Test: run-gtests.sh Change-Id: I1e4282af18f2d51fde5325a0c00a57e8bbc4fbed
2019-10-14 Revert "Make compiler/optimizing/ symbols hidden." Vladimir Marko
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf. Reason for revert: Breaks ASAN tests (ODR violation). Bug: 142365358 Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
2019-10-14 Make compiler/optimizing/ symbols hidden. Vladimir Marko
Make symbols in compiler/optimizing hidden by a namespace attribute. The unit intrinsic_objects.{h,cc} is excluded as it is needed by dex2oat. As the symbols are no longer exported, gtests are now linked with the static version of the libartd-compiler library. libart-compiler.so size: - before: arm: 2396152 arm64: 3345280 - after: arm: 2016176 (-371KiB, -15.9%) arm64: 2874480 (-460KiB, -14.1%) Test: m test-art-host-gtest Test: testrunner.py --host --optimizing --jit Bug: 142365358 Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
2019-05-16 ART: Refactor SchedulingGraph for consistency and clarity Evgeny Astigeevich
The CL moves functionality from SchedulingGraph to other classes, deletes unused code and moves code used for testing to the tests source file:
1. SchedulingGraph::AddDependency: move checks whether a dependency has been added to SchedulingNode::Add*Predecessor as it is a SchedulingNode responsibility to keep a unique set of predecessors.
2. Create SideEffectDependencyAnalysis class. Code doing side effect dependency analysis is moved from SchedulingGraph into the class.
3. Remove SchedulingGraph::HasImmediate*Dependency methods as there are SchedulingNode::Has*Dependency methods for such kind of checks.
4. SchedulingGraph::HasImmediate*Dependency(HInstruction,HInstruction) methods are only used by tests. Their code is moved to a new class TestSchedulingGraph in the tests source file.
Test: test.py --host --optimizing --jit --gtest Test: test.py --target --optimizing --jit Test: run-gtests.sh Change-Id: Id16eb6e9f8b9706e616dff0ccc1d0353ed968367
2018-12-27 ART: Refactor for bugprone-argument-comment Andreas Gampe
Handles compiler. Bug: 116054210 Test: WITH_TIDY=1 mmma art Change-Id: I5cdfe73c31ac39144838a2736146b71de037425e
2018-08-16 Reduce memory usage by other deps in scheduler. Vladimir Marko
Rely on transitive dependencies instead of adding the full set of side effect dependencies. Compiling a certain apk, the most memory hungry method has the scheduler memory allocations in ArenaStack hidden by the register allocator: - before: MEM: used: 155744672, allocated: 168446408, lost: 12036488 Scheduler 155744672 - after: MEM: used: 5181680, allocated: 7096776, lost: 114752 SsaLiveness 4683440 RegAllocator 314312 RegAllocVldt 183928 The total arena memory used, including the ArenaAllocator not listed above, goes from 167170024 to 16607032 (-90%). (Measured with kArenaAllocatorCountAllocations=true, kArenaAllocatorPreciseTracking=false.) Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 64312607 Change-Id: I825bcfb490171070c46ad6d1f460785f4e75cfd7
2018-08-02 Reuse arena memory for each block in scheduler. Vladimir Marko
This reduces the peak memory used for large methods with multiple blocks to schedule. Compiling the aosp_taimen-userdebug boot image, the most memory hungry method BatteryStats.dumpLocked has the Scheduler memory allocations in ArenaStack hidden by the register allocator: - before: MEM: used: 8300224, allocated: 9175040, lost: 197360 Scheduler 8300224 - after: MEM: used: 5914296, allocated: 7864320, lost: 78200 SsaLiveness 5532840 RegAllocator 144968 RegAllocVldt 236488 The total arena memory used, including the ArenaAllocator not listed above, goes from 44333648 to 41950324 (-5.4%). (Measured with kArenaAllocatorCountAllocations=true, kArenaAllocatorPreciseTracking=false.) Also remove one unnecessary -Wframe-larger-than= workaround and add one workaround for large frame with the above arena alloc tracking flags. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 34053922 Change-Id: I7fd8d90dcc13b184b1e5bd0bcac072388710a129
2018-05-15 Refactoring LSE/LSA: introduce heap location type Aart Bik
Rationale: This refactoring introduces data types to heap locations. This will allow better type disambiguation in the future. As a first showcase, it already removes rather error-prone "exceptional" code in LSE dealing with array types on null values. Furthermore, many LSA specific details started to "leak" into clients, which is also error-prone. This refactoring moves such details back into just LSA, where it belongs. Test: test-art-host,target Bug: b/77906240 Change-Id: Id327bbe86dde451a942c9c5f9e83054c36241882
2018-04-26 Step 1 of 2: conditional passes. Aart Bik
Rationale: The change adds a return value to Run() in preparation of conditional pass execution. The value returned by Run() is best effort, returning false means no optimizations were applied or no useful information was obtained. I filled in a few cases with more exact information, others still just return true. In addition, it integrates inlining as a regular pass, avoiding the ugly "break" into optimizations1 and optimizations2. Bug: b/78171933, b/74026074 Test: test-art-host,target Change-Id: Ia39c5c83c01dcd79841e4b623917d61c754cf075
2018-03-07 Introduce MIN/MAX/ABS as HIR nodes. Aart Bik
Rationale: Having explicit MIN/MAX/ABS operations (in contrast with intrinsics) simplifies recognition and optimization of these common operations (e.g. constant folding, hoisting, detection of saturation arithmetic). Furthermore, mapping conditionals, selectors, intrinsics, etc. (some still TBD) onto these operations generalizes the way they are optimized downstream substantially. Bug: b/65164101 Test: test-art-host,target Change-Id: I69240683339356e5a012802f179298f0b04c6726
2018-03-01 Introduce ABS as HIR nodes. Aart Bik
NOTE: step 1 of 2 for "Introduce MIN/MAX/ABS as HIR nodes." Rationale: Having explicit MIN/MAX/ABS operations (in contrast with intrinsics) simplifies recognition and optimization of these common operations (e.g. constant folding, hoisting, detection of saturation arithmetic). Furthermore, mapping conditionals, selectors, intrinsics, etc. (some still TBD) onto these operations generalizes the way they are optimized downstream substantially. Bug: b/65164101 Test: test-art-host,target Change-Id: I9c93987197216158ba02c8aca2385086adedabc4
2017-11-09 Support VecLoad and VecStore in LSA. xueliang.zhong
Test: test-art-host Test: test-art-target Test: load_store_analysis_test Change-Id: I7d819061ec9ea12f86a926566c3845231fce6e26
2017-11-02 ART: Make InstructionSet an enum class and add kLast. Vladimir Marko
Adding InstructionSet::kLast shall make it easier to encode the InstructionSet in fewer bits using BitField<>. However, introducing `kLast` into the `art` namespace is not a good idea, so we change the InstructionSet to an enum class. This also uncovered a case of InstructionSet::kNone being erroneously used instead of vixl32::Condition::None(), so it's good to remove `kNone` from the `art` namespace. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: I6fa6168dfba4ed6da86d021a69c80224f09997a6
2017-10-11 Use ScopedArenaAllocator for building HGraph. Vladimir Marko
Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image: BatteryStats.dumpCheckinLocked() : 21.1MiB -> 20.2MiB BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB This is because all the memory previously used by the graph builder is reused by later passes. And finish the "arena"->"allocator" renaming; make renamed allocator pointers that are members of classes const when appropriate (and make a few more members around them const). Test: m test-art-host-gtest Test: testrunner.py --host Bug: 64312607 Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
2017-10-09 Use ScopedArenaAllocator for register allocation. Vladimir Marko
Memory needed to compile the two most expensive methods for aosp_angler-userdebug boot image: BatteryStats.dumpCheckinLocked() : 25.1MiB -> 21.1MiB BatteryStats.dumpLocked(): 49.6MiB -> 42.0MiB This is because all the memory previously used by Scheduler is reused by the register allocator; the register allocator has a higher peak usage of the ArenaStack. And continue the "arena"->"allocator" renaming. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 64312607 Change-Id: Idfd79a9901552b5147ec0bf591cb38120de86b01
2017-10-06 ART: Use ScopedArenaAllocator for pass-local data. Vladimir Marko
Passes using local ArenaAllocator were hiding their memory usage from the allocation counting, making it difficult to track down where memory was used. Using ScopedArenaAllocator reveals the memory usage. This changes the HGraph constructor which requires a lot of changes in tests. Refactor these tests to limit the amount of work needed the next time we change that constructor. Test: m test-art-host-gtest Test: testrunner.py --host Test: Build with kArenaAllocatorCountAllocations = true. Bug: 64312607 Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
2017-09-25 ART: Introduce compiler data type. Vladimir Marko
Replace most uses of the runtime's Primitive in compiler with a new class DataType. This prepares for introducing new types, such as Uint8, that the runtime does not need to know about. Test: m test-art-host-gtest Test: testrunner.py --host Bug: 23964345 Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
2017-08-10 Merge "scheduler should not schedule volatile field accesses." Mingyao Yang
2017-08-10 scheduler should not schedule volatile field accesses. Mingyao Yang
Unresolved field accesses are not scheduled either since it's not known whether they are volatile or not, and they are already expensive anyway. Test: 706-checker-scheduler Change-Id: Ie736542590a2459ee9b597e090fbedd4b527782a
2017-08-09 Run HeapLocationCollector once in scheduler instead of locally. Mingyao Yang
HeapLocationCollector does alias analysis globally instead of at block scale. For example doing it locally breaks the pre-existence based alias analysis. It's also expensive to do it for each basic block. Test: run-test/gtest on target/host, 662-regression-alias Bug: 64018485 Change-Id: If001e2961b5a52b50b1bcefd5e4a89d9c25f25b8
2017-06-30 Disambiguate memory accesses in instruction scheduling xueliang.zhong
Based on aliasing information from heap location collector, instruction scheduling can further eliminate side-effect dependencies between memory accesses to different locations, and perform better scheduling on memory loads and stores.

Performance improvements of this CL, measured on Cortex-A53:

| benchmarks     | ARM64 backend | ARM backend |
|----------------|---------------|-------------|
| algorithm      | 0.1 %         | 0.1 %       |
| benchmarksgame | 0.5 %         | 1.3 %       |
| caffeinemark   | 0.0 %         | 0.0 %       |
| math           | 5.1 %         | 5.0 %       |
| stanford       | 1.1 %         | 0.6 %       |
| testsimd       | 0.4 %         | 0.1 %       |

Compilation time impact is negligible, because this heap location load store analysis is only performed on loop basic blocks that get instruction scheduled.

Test: m test-art-host Test: m test-art-target Test: 706-checker-scheduler Change-Id: I43d7003c09bfab9d3a1814715df666aea9a7360b
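The idea of dropping side-effect dependencies between accesses to different locations can be sketched as a small predicate. Everything here is an assumption made for illustration: `MemAccess`, `NeedsOrdering()` and the `may_alias` matrix are hypothetical names, and the matrix merely stands in for whatever aliasing answer the heap location collector provides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one memory access in a block.
struct MemAccess {
  size_t location;  // Index of the heap location being accessed.
  bool is_store;    // True for a store, false for a load.
};

// Decides whether `later` must stay ordered after `earlier`.
// may_alias[a][b] is an assumed oracle: true if heap locations a and b
// may refer to the same memory.
bool NeedsOrdering(const MemAccess& earlier,
                   const MemAccess& later,
                   const std::vector<std::vector<bool>>& may_alias) {
  assert(earlier.location < may_alias.size());
  assert(later.location < may_alias[earlier.location].size());
  // Two loads never conflict, whatever they read.
  if (!earlier.is_store && !later.is_store) {
    return false;
  }
  // At least one store: a dependency is needed only if the locations may
  // alias. Without alias information this would always have to be true.
  return may_alias[earlier.location][later.location];
}
```

With such a predicate, a load from one field no longer has to wait behind a store to a provably different field, which gives the scheduler more freedom to reorder loads and stores.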
2017-06-29 Add CHECKs to help diagnose a crash seen internally. Nicolas Geoffray
bug: 62855731 Test: test.py Change-Id: I7904257174ce11a138ca769172dbc2e33e10ef76
2017-05-11 Clean up some uses of "auto". Vladimir Marko
Make actual types more explicit, either by replacing "auto" with actual type or by assigning std::pair<> elements of an "auto" variable to typed variables. Avoid binding const references to temporaries. Avoid copying a container. Test: m test-art-host-gtest Change-Id: I1a59f9ba1ee15950cacfc5853bd010c1726de603
2017-05-08 Instruction scheduling for ARM. xueliang.zhong
Performance improvements on various benchmarks with this CL:

benchmarks       improvements
---------------------------
algorithm        1%
benchmarksgame   2%
caffeinemark     2%
math             3%
stanford         4%

Tested on ARM Cortex-A53 CPU. The code size impact is negligible.

Test: m test-art-host Test: m test-art-target Change-Id: I314c90c09ce27e3d224fc686ef73c7d94a6b5a2c
2017-01-25 AArch64: Add HInstruction scheduling support. Alexandre Rames
This commit adds a new `HInstructionScheduling` pass that performs basic scheduling on the `HGraph`. Currently, scheduling is performed at the block level, so no `HInstruction` ever leaves its block in this pass. The scheduling process iterates through blocks in the graph. For blocks that we can and want to schedule:
1) Build a dependency graph for instructions. It includes data dependencies (inputs/uses), but also environment dependencies and side-effect dependencies.
2) Schedule the dependency graph. This is a topological sort of the dependency graph, using heuristics to decide what node to schedule first when there are multiple candidates. Currently the heuristics only consider instruction latencies and schedule first the instructions that are on the critical path.
Test: m test-art-host Test: m test-art-target Change-Id: Iec103177d4f059666d7c9626e5770531fbc5ccdc
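The two steps in this commit message (build a dependency graph, then topologically sort it with a critical-path heuristic) can be sketched as a small list scheduler. This is an illustration under assumptions, not the pass itself: `Node`, `Schedule()`, the latency numbers and the top-down traversal are all hypothetical choices, and predecessor ids are assumed to be smaller than the node's own id (original program order).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node of a per-block dependency graph.
struct Node {
  int latency;                // Assumed cost of the instruction.
  std::vector<size_t> preds;  // Ids that must be scheduled first (all smaller).
};

// Returns instruction ids in scheduled order.
std::vector<size_t> Schedule(const std::vector<Node>& nodes) {
  const size_t n = nodes.size();

  // Critical path of a node: its latency plus the longest latency chain
  // through its successors. Successors have larger ids, so walking ids
  // backwards sees every successor before the node itself.
  std::vector<int> critical(n, 0);
  for (size_t i = n; i-- > 0;) {
    critical[i] += nodes[i].latency;
    for (size_t p : nodes[i].preds) {
      critical[p] = std::max(critical[p], critical[i]);
    }
  }

  // Successor lists and the number of unscheduled predecessors per node.
  std::vector<std::vector<size_t>> succs(n);
  std::vector<size_t> pending(n, 0);
  std::vector<size_t> ready;
  for (size_t i = 0; i < n; ++i) {
    pending[i] = nodes[i].preds.size();
    for (size_t p : nodes[i].preds) succs[p].push_back(i);
    if (pending[i] == 0) ready.push_back(i);
  }

  // Topological sort: among ready candidates, pick the one with the
  // longest critical path.
  std::vector<size_t> order;
  while (!ready.empty()) {
    auto best = std::max_element(
        ready.begin(), ready.end(),
        [&](size_t a, size_t b) { return critical[a] < critical[b]; });
    const size_t id = *best;
    ready.erase(best);
    order.push_back(id);
    for (size_t s : succs[id]) {
      if (--pending[s] == 0) ready.push_back(s);
    }
  }
  assert(order.size() == n);  // Holds when the graph is acyclic.
  return order;
}
```

Given a cheap independent instruction followed by a long-latency load and its dependent multiply, this sketch starts the load first so that the cheap instruction can fill the time spent waiting on it.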