Age | Commit message (Collapse) | Author |
|
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 331194861
Change-Id: If1b467579613864fd924fbb8776ccddaa9ad18c1
|
|
By adding extra bookkeeping, we can speed up the ValueSet reusable
algorithm of GVN. A block's ValueSet is reused if it will not be
used again in the future. This happens when said blocks last
dominated block or successor is visited, since we visit blocks
in RPO. This lets us create a list of free sets to be reused and
skip iterating through all visited blocks. This optimization is
better the bigger the graph.
Based on local compiles, this speeds up GVN by 12-27% and overall
compilation times by 0.6-1.6%.
Bug: 393108375
Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing
Change-Id: I3731e67796a2055907b795656146aaea4f448614
|
|
Skip calling DeleteAllImpureWhich for side effects which
MayDependOn will always return false, which happened 65-75% of
the times. In fact, SideEffects::None() was passed on ~50% of
the calls to Kill.
Based on local compiles, this CL improves GVN runtime by ~15% and
overall dex2oat runtime by ~1%.
Bug: 393108375
Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing
Change-Id: Ib5cdb33c9caa5f2cfffbc1a650dabbda185a3c6d
|
|
After GVN, we deduplicate instructions and we might have more
information regarding the nullability of the SSA variable. We can
run RTP to insert the BoundType instructions, which will later be
used by InstructionSimplifier to remove the NullCheck. RTP will be
run conditionally, only if GVN did at least one replacement.
It improves ~0.1% of odex size for speed compiled apps.
Bug: 369206455
Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing
Change-Id: I0a4a104690b3fe9ac4118642ab9e9916dc30a9c5
|
|
ArenaBitVector creation guarantees it starts empty. Add a debug
check to make sure this assumption doesn't change.
Note that ArenaAllocator guarantees zero-initialized memory but
ScopedArenaAllocators do not. This is fine either way since the
BitVector constructor calls ClearAllBits.
Bug: 329037671
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: Icbf5e5dd1869e80b5d5828ecca9f13de30c0242b
|
|
On local compiles I saw that DeleteAllImpureWhich was the third
most time consuming method, using pprofs sorting in bottom-up.
By passing the functor it uses as rvalue, the method speeds up
~10% and it drops from the third most time consuming to the fourth.
The other modified methods in the CL are not showing up in the
pprof profile that I took, but changing them to use rvalues
shouldn't affect them negatively.
Test: Locally compile, take a trace, and observe time spent
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I6c363d5601fd4865f4e7881e64b883bd6bbedb69
|
|
After the old implementation was renamed in
https://android-review.googlesource.com/2526708 ,
we introduce a new function with the old name but new
behavior, just `DCHECK()`-ing the instruction kind before
casting down the pointer. We change appropriate calls from
`As##type##OrNull()` to `As##type()` to avoid unncessary
run-time checks and reduce the size of libart-compiler.so.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 181943478
Change-Id: I025681612a77ca2157fed4886ca47f2053975d4e
|
|
The null type check in the current implementation of
`HInstruction::As##type()` often cannot be optimized away
by clang++. It is therefore beneficial to have two functions
HInstruction::As##type()
HInstruction::As##type##OrNull()
where the first function never returns null but the second
one can return null. The additional text "OrNull" shall also
flag the possibility of yielding null to the developer which
may help avoid bugs similar to what we have seen previously.
This requires renaming the existing function that can return
null and introducing new function that cannot. However,
defining the new function `HInstruction::As##type()` in the
same change as renaming the old one would risk introducing
bugs by missing a rename. Therefore we simply rename the old
function here and the new function shall be introduced in a
separate change with all behavioral changes being explicit.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: buildbot-build.sh --target
Bug: 181943478
Change-Id: I4defd85038e28fe3506903ba3f33f723682b3298
|
|
This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0.
Reason for revert: Reland after some of the required work
was merged in other CLs.
Also address a TODO from the original CL to mark required
symbols with EXPORT in `intrinsic_objects.h`.
Also mark symbols in new files as HIDDEN.
Bug: 186902856
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
|
|
The GVN system didn't handle the deoptimization very well since
deoptimize is a predicated operation. This means that HDeoptimize
needs to prevent some code motion but if it isn't taken the operation
has no effect. This confused the GVN system into thinking that it
cannot merge deoptimizations. To fix this we special cased the side
effects GVN considers deoptimizations to have.
Test: ./test.py --host
Change-Id: Ic79d975f9ae584a07026647cee2768ed1105e5a9
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
|
|
GVN is now using ScopedArenaAllocator and explicitly
filling allocated memory with zeros as needed.
Test: Rely on TreeHugger.
Change-Id: Ic2059014ee4ab9157a8cce64ba0206f9b276481e
|
|
Handles compiler.
Bug: 116054210
Test: WITH_TIDY=1 mmma art
Change-Id: I5cdfe73c31ac39144838a2736146b71de037425e
|
|
Support BoundType instruction treatment in GVN.
Note: BoundType must not be a subject to LICM as it must not be
moved from more control dependent basic blocks to less control
dependent (e.g. hoisted out from the loop) due to semantics of
bounding the type.
Test: 477-checker-bound-type.
Test: test-art-target, test-art-host.
Change-Id: I64263d6ec7d9ad75d1fb07d3a89e9973be67682b
|
|
Rationale:
The change adds a return value to Run() in preparation of
conditional pass execution. The value returned by Run() is
best effort, returning false means no optimizations were
applied or no useful information was obtained. I filled
in a few cases with more exact information, others
still just return true. In addition, it integrates inlining
as a regular pass, avoiding the ugly "break" into
optimizations1 and optimziations2.
Bug: b/78171933, b/74026074
Test: test-art-host,target
Change-Id: Ia39c5c83c01dcd79841e4b623917d61c754cf075
|
|
Enforce the layering that code in runtime/base should not depend on
runtime by separating it into libartbase. Some of the code in
runtime/base depends on the Runtime class, so it cannot be moved yet.
Also, some of the tests depend on CommonRuntimeTest, which itself needs
to be factored (in a subsequent CL).
Bug: 22322814
Test: make -j 50 checkbuild
make -j 50 test-art-host
Change-Id: I8b096c1e2542f829eb456b4b057c71421b77d7e2
|
|
HClinitCheck obviously does reads so it's side effects should include
all reads and writes, just like HInvoke. GVN now explicitly allows
clinit check to be reused, which would otherwise be disallowed based on
the dependency introduced by the new side effects. Also make licm's
logic cleaner and treat clinit check as a special case also,
otherwise licm can't hoist clinit check due to the dependency
introduced by the new side effects also.
Test: run-test on host.
Change-Id: I16886cfe557803d84d84ce68fbb185ebfc0b84dc
|
|
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 20.0MiB -> 19.7MiB (-331KiB)
BatteryStats.dumpLocked(): 39.9MiB -> 39.4MiB (-458KiB)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 64312607
Change-Id: I4bbb21bf3ecc4b91db7c374737c5f0e0cb7fa462
|
|
Passes using local ArenaAllocator were hiding their memory
usage from the allocation counting, making it difficult to
track down where memory was used. Using ScopedArenaAllocator
reveals the memory usage.
This changes the HGraph constructor which requires a lot of
changes in tests. Refactor these tests to limit the amount
of work needed the next time we change that constructor.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Build with kArenaAllocatorCountAllocations = true.
Bug: 64312607
Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
|
|
Make actual types more explicit, either by replacing "auto"
with actual type or by assigning std::pair<> elements of
an "auto" variable to typed variables. Avoid binding const
references to temporaries. Avoid copying a container.
Test: m test-art-host-gtest
Change-Id: I1a59f9ba1ee15950cacfc5853bd010c1726de603
|
|
Test: m build-art-host
Test: m cpplint-art
Change-Id: Ifc6ce3d0d645c4a8dca72dd483fc03fc05077130
|
|
Use range-based loops instead, introducing helper functions
ReverseRange() for iteration in reverse order in containers.
When the contents of the underlying container change inside
the loop, use an index-based loop that better exposes the
container data modifications, compared to the old iterator
interface that's hiding it which may lead to subtle bugs.
Test: m test-art-host
Change-Id: I2a4e6c508b854c37a697fc4b1e8423a8c92c5ea0
|
|
GVN was implicitly extending the liveness of an instruction across
an irreducible loop.
Fix this problem by clearing the value set at loop entries that contain
an irreducible loop.
bug:28252896
Change-Id: I68823cb88dceb4c2b4545286ba54fd0c958a48b0
|
|
Added extra comments and removed redundant code as requested.
Bug: 28173563
Bug: 28287086
Change-Id: If6aff68c4c30427a86a27ffba5df1ae135edd294
(cherry picked from commit 94408d3144061bd6efc74b3d884d38169969c63f)
|
|
Implement recycling of ValueSet data structures which the GVN
algorithm will not access any more.
Savings depend on the shape of the graph, but can be as high as 93%.
Peak memory usage for GSA drops from 32MB to 26MB, compile times seem
unaffected.
Bug: 28173563
Bug: 28287086
Change-Id: If227177449bc90ad24fa68c37b0c2615924af1ed
(cherry picked from commit cc857cfbe4a179dfa7935b7334f1efbb21f2ac76)
|
|
Also attribute ArenaBitVector allocations to appropriate
passes. This was used to track down the source of the
excessive memory alloactions.
Bug: 27690481
Change-Id: Ib895984cb7c04e24cbc7abbd8322079bab8ab100
|
|
So we don't fallback to the interpreter in the presence of
irreducible loops.
Implications:
- A loop pre-header does not necessarily dominate a loop header.
- Non-constant redundant phis will be kept in loop headers, to
satisfy our linear scan register allocation algorithm.
- while-graph optimizations, such as gvn, licm, lse, and dce
need to know when they are dealing with irreducible loops.
Change-Id: I2cea8934ce0b40162d215353497c7f77d6c9137e
|
|
bug:25735083
bug:25173758
Change-Id: Ie81cfa4fa9c47cc025edb291cdedd7af209a03db
|
|
Motivation is using arenas in the verifier.
Bug: 10921004
Change-Id: I3c7ed369194b2309a47b12a621e897e0f2f65fcf
|
|
Implement dchecked_vector<> template that DCHECK()s element
access and insert()/emplace()/erase() positions. Change the
ArenaVector<> and ScopedArenaVector<> aliases to use the new
template instead of std::vector<>. Remove DCHECK()s that
have now become unnecessary from the Optimizing compiler.
Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
|
|
Tag previously "Misc" arena allocations with more specific
allocation types. Move some native heap allocations to the
arena in BCE.
Bug: 23736311
Change-Id: If8ef15a8b614dc3314bdfb35caa23862c9d4d25c
|
|
Replace GrowableArray with ArenaVector and tag arena
allocations with new allocation types.
As part of this, make the register allocator a bit more
efficient, doing bulk insert/erase. Some loops are now
O(n) instead of O(n^2).
Change-Id: Ifac0871ffb34b121cc0447801a2d07eefd308c14
|
|
Replace GrowableArray with ArenaVector in HGraph and related
classes HEnvironment, HLoopInformation, HInvoke and HPhi,
and tag allocations with new arena allocation types.
Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
|
|
Replace GrowableArray with ArenaVector in HBasicBlock and,
to track the source of allocations, assign one new and two
Quick's arena allocation types to these vectors. Rename
kArenaAllocSuccessor to kArenaAllocSuccessors.
Bug: 23736311
Change-Id: Ib52e51698890675bde61f007fe6039338cf1a025
|
|
Reverting so that we can have more discussion about the STL API.
This reverts commit 91e11c0c840193c6822e66846020b6647de243d5.
Change-Id: I187fe52f2c16b6e7c5c9d49c42921eb6c7063dba
|
|
Replace GrowableArray with ArenaVector in HBasicBlock and,
to track the source of allocations, assign one new and two
Quick's arena allocation types to these vectors. Rename
kArenaAllocSuccessor to kArenaAllocSuccessors.
Bug: 23736311
Change-Id: I984aef6e615ae2380a532f5c6726af21015f43f5
|
|
This will be used by incoming architecture specific optimizations. The
dependencies must be conservative. When an HInstruction is created we
may not be sure whether it can trigger GC. In that case the
'ChangesGC' dependency must be set. We control at code-generation time
that HInstructions that can call have the 'ChangesGC' dependency
set.
Change-Id: Iea6a7f430009f37a9599b0a0039207049906e45d
|
|
Rationale:
Types (int, float etc.) and access type (field vs. array)
can be used to disambiguate write/read side-effects analysis.
This directly improves e.g. dead code elimination and licm.
Change-Id: I371f6909a3f42bda13190a03f04c4a867bde1d06
|
|
Also fixed some lines that were too long, and a few other minor
details.
Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb
|
|
The basic hash table in Optimizing's GVN pass does not scale for
larger methods and quickly becomes a bottleneck. This patch provides
a different implementation, focusing on the following:
(1) Proper buckets with chaining for near constant-time lookup.
(2) Bucket inheritance for faster cloning. A clone does not actually
copy the entries until a first change is made.
(3) Table resizing for better load management. Done during cloning.
(4) Kill() and IntersectWith() applied only on impure instructions.
This is achieved by splitting (im)pure entries between even- and
odd-indexed buckets.
Benchmarks show that this optimization speeds up GVN by ~10%, which
translates to a rougly 2% change in the overall compilation time.
Change-Id: Ib4058359701d990194cfd49c6ee46ac2372f090c
|
|
Change-Id: I415b50d58b30cab4ec38077be22373eb9598ec40
|
|
During the GVN analysis, a basic block inherits the set of movable
instructions from its dominator. If the block is the only successor
of the dominating block, there is no need for cloning of the parent
set (a very expensive operation).
Change-Id: I59e033b9e9e093984dc8e903e3a7be1cb3645cc2
|
|
- Move SideEffectsAnalysis to its own file.
- Move most of gvn.h to gvn.cc.
- Don't call Resize in GrowableArray constructor, but just set num_used
directly.
Change-Id: I1f1291207945d678d3c99cc0ec1ec155bcae82f6
|
|
LICM also needs the side effects information of loops, so move
the GVN::ComputeSideEffects method into its own analysis class.
Change-Id: I810c8230a0eb6b9b536e8f808e17a3a4ad72f7db
|
|
When a predecessor block was killing instructions in a set, we were
not taking into account side effects of blocks between the dominator to
this predecessor.
Implementation now intersects the copied set of the dominator with
the predecessors to take these side effects into account.
Change-Id: If297439cc4e50cee91e9fffd028216a3e49e19ef
|
|
Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general,
and -Wunused-but-set-parameter for GCC builds.
Change-Id: I81bbdd762213444673c65d85edae594a523836e5
|
|
Change-Id: Ibe0efa4e84fd020a53ded310a92e0b4363f91b12
|