Age | Commit message (Collapse) | Author |
|
This reverts commit 434a327234f74eed3ef4072314d2e2bdb73e4dda.
Reason for revert: Relanding with no change. The regressions
that were the reason for the revert may reappear. However,
these regressions are probably caused by subtle effects that
are not directly related to this change. For example, a code
size improvement can regress performance simply by moving
the start of a loop from an aligned address to an unaligned
address, or by splitting a loop across two cache lines.
Bug: 358519867
Bug: 359722268
Change-Id: I997b8a4219418f79b3a5fc4e7e50817911f0a737
|
|
This reverts commit 3e75615ad25b6af1842b194e78b429b0f585b46a.
Reason for revert: Regressed some micro-benchmarks, see bug
359722268.
Bug: 358519867
Bug: 359722268
Change-Id: I207cc78c88193564e90c98eda2c96a5ba354a588
|
|
Determine the number of out vregs needed by invokes that
actually make a call, and by `HStringBuilderAppend`s.
This can yield smaller frame sizes of compiled methods when
some calls are inlined or fully intrinsified.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 358519867
Change-Id: I4930a9bd811b1de14658f5ef44e65eadea6a7961
|
|
* Updated comments
* Made constants constexpr
* Renamed kBaselineMaxCodeUnits to include "Inline"
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I37569b3d9e5eecfd65a505a79945bbe5b290fbbf
|
|
Popular apps include such methods in their profiles. Having
the extra heuristic of skipping compilation for large methods
with no branches can be unintuitive for developers who created
those profiles.
Some apps see startup improvements with this heuristic removed.
Bug: 316617683
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I21a8da93e89399dac0e45c3ab43a8bbedc925a44
|
|
Check that the flags are up to date in graph checker. Mainly a
correctness check CL but it brings slight code size reduction
(e.g. not needing vreg info if HasMonitorOperations is false).
Update loop_optimization_test to stop using `LocalRun` directly
as it meant that it was breaking assumptions (i.e. top_loop_ was
nullptr when it was expected to have a value).
Bug: 264278131
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I29765b3be46d4bd7c91ea9c80f7565a3c88fae2e
|
|
There are some cases in which we need to add extra goto blocks
when inlining to avoid critical edges. The `TryBoundary` of
`kind:exit` instructions will always have more than one successor
(normal flow and exceptional flow). If its normal flow successor
has more than one predecessor, we would be introducing a critical edge.
We can avoid the critical edge in InlineInto instead of doing it in
the builder which helps decoupling those two stages as well as
simplifying surrounding code.
We also have the benefit of adding the extra goto blocks only
when necessary.
Bug: 227283224
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: Ibe21623c94c798f7cea60ff892064e63a38a787a
|
|
There are cases in which we have a
Return->TryBoundary kind:exit->Exit
chain inside of a loop, and that said TryBoundary has loop
information. Set that information in the new goto block.
Bug: 261731237
Fixes: 261731237
Test: dex2oat compiling the apps in the bug
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I224d6cb449d1a7a1a987f4e9022098f84ae4fba2
|
|
There's no need to recompute the whole graph as we know what changed.
As a drive-by, we now don't return false for graphs with irreducible
loops so we can remove that restriction from the builder. However,
if a graph with irreducible loops hits this path it means that:
A) it's being inlined
B) Has irreducible loops
We don't inline graphs with irreducible loops, and after building
for inline we don't remove them either because constant folding
and instruction simplifier don't remove them, and DCE doesn't
run for graphs with irreducible loops. So, in terms of dex2oat's
outputs nothing should change.
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I8cbf1b5f0518bb5dd14ffd751100ea81f5478863
|
|
This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0.
Reason for revert: Reland after some of the required work
was merged in other CLs.
Also address a TODO from the original CL to mark required
symbols with EXPORT in `intrinsic_objects.h`.
Also mark symbols in new files as HIDDEN.
Bug: 186902856
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
|
|
We may have graphs with irreducible loops at this point and
recomputing the loop information is not supported.
Bug: 253242440
Test: dex2oat compiling the apps in the bug
Change-Id: I181b4fb9d9812ba2f14cd21602cf0e5a4e1fd18e
|
|
Notable changes:
1) Wiring of the graph now allows for inlinees graph ending in
TryBoundary, or Goto in some special cases.
2) Building a graph with try catch for inlining may add an extra
Goto block.
3) Oat version bump.
4) Reduced kMaximumNumberOfCumulatedDexRegisters from 32 to 20.
Bug: 227283224
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: Ic2fd956de24b72d1de29b4cd3d0b2a1ddab231d8
|
|
The compiler only needs to know if a method is compilable or not. So
just record a set of uncompilable methods (in some cases, we cannot have
an ArtMethod, but the method can still be compiled).
Test: test.py
Bug: 28313047
Change-Id: Ic4235bc8160ec91daa5ebf6504741089b43e99cb
|
|
Test: test.py
Bug: 170086509
Change-Id: I1e1a4abf71245c0fd37f951c9af85f62feba18ca
|
|
Remove the hashtable storage from the image and allocate it at
runtime instead (but keep the DexCache object in the image).
For compiled code, we have largely moved to using .bss, so the
DexCache just costs us unnecessary extra space and dirty pages.
For interpreted code, the hashtables are too small and will be
overridden many times over at run-time regardless.
The next step will be to make DexCache variable-size so it can
adapt to both of the extremes (taking minimal amount of memory
for compiled code and avoiding cache evictions in interpreter).
Test: test.py --host
Change-Id: I9f89e8f19829b812cf85dea1a964259ed8b87f4d
|
|
This avoids passing the `VariableSizedHandleScope*` argument
around and eliminates HGraph::inexact_object_rti_ and its
initialization. The latter shall allow running Optimizing
gtests that do not require type information without creating
a Runtime in future. (To be implemented in a separate CL.)
Test: m test-art-host-gtest
Test: testrunner.py --host --optmizing
Test: aosp_taimen-userdebug boots.
Change-Id: I36fe9bc556c6d610d644c8c14cc74c9985a14d64
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
|
|
Avoid caching the results. Caching was broken for JIT in the
presence of class unloading; entries for unloaded dex files
were leaked and potentially used erroneously with a newly
loaded dex file.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Pixel 2 XL boots.
Test: m test-art-target-gtest
Test: testrunner.py --target
Bug: 118808764
Change-Id: Ic1163601170364e060c2e3009752f543c9bb37b7
|
|
Bug: 71605148
Bug: 63756964
Test: test-art-target on angler
This reverts commit 6716941120ae9f47ba1b8ef8e79820c4b5640350.
Change-Id: Ic01ea4e8bb2c1de761fab354c5bbe27290538631
|
|
Bug: 71605148
Bug: 63756964
Seems to fail on armv7.
This reverts commit f5245188d9c61f6b90eb30cca0875fbdcc493b15.
Change-Id: I37786c04a8260ae3ec4a2cd73710126783c3ae7e
|
|
Added a table that is indexed by dex method index. To prevent size
overhead, there is only one slot for each 16 method indices. This
means there is up to 15 loop iterations to get the quickening info
for a method. The quickening infos are now prefixed by a leb
encoded length. This allows methods that aren't quickened to only
have 1.25 bytes of space overhead.
The value was picked arbitrarily, there is little advantage to
increasing the value since the table only takes 1 byte per 4 method
indices currently. JIT benchmarks do not regress with the change.
There is a net space saving from removing 8 bytes from each
quickening info since most scenarios have more quickened methods than
compiled methods.
For getting quick access to the table, a 4 byte preheader was added
to each dex in the vdex file
Removed logic that stored the quickening info in the CodeItem
debug_info_offset field.
The change adds a small quicken table for each method index, this
means that filters that don't quicken will have a slight increase in
size. The worst case scenario is compiling all the methods, this
results in 0.3% larger vdex for this case. The change also disables
deduping since the quicken infos need to be in dex method index
order.
For filters that don't compile most methods like quicken and
speed-profile, there is space savings. For quicken, the vdex is 2%
smaller.
Bug: 71605148
Bug: 63756964
Test: test-art-host
Change-Id: I89cb679538811369c36b6ac8c40ea93135f813cd
|
|
Make code item fields private and use accessors. Added a hand full of
friend classes to reduce the size of the change.
Changed default to be nullable and removed CreateNullable.
CreateNullable was a bad API since it defaulted to the unsafe, may
add a CreateNonNullable if it's important for performance.
Motivation:
Have a different layout for code items in cdex.
Bug: 63756964
Test: test-art-host-gtest
Test: test/testrunner/testrunner.py --host
Test: art/tools/run-jdwp-tests.sh '--mode=host' '--variant=X32' --debug
Change-Id: I42bc7435e20358682075cb6de52713b595f95bf9
|
|
When compiling an intrinsic method, generate a graph that
invokes the same method and try to compile it. If the call
is actually intrinsified (or simplified to other HIR) and
yields a leaf method, use the result of this compilation
attempt, otherwise compile the actual code or JNI stub.
Note that CodeGenerator::CreateThrowingSlowPathLocations()
actually marks the locations as kNoCall if the throw is not
in a catch block, thus considering some throwing methods
(for example, String.charAt()) as leaf methods.
We would ideally want to use the intrinsic codegen for all
intrinsics that do not generate a slow-path call to the
default implementation. Relying on the leaf method is
suboptimal as we're missing out on methods that do other
types of calls, for example runtime calls. This shall be
fixed in a subsequent CL.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 67717501
Change-Id: I640fda7c22d4ff494b5ff77ebec3b7f5f75af652
|
|
Memory needed to compile the two most expensive methods for
aosp_angler-userdebug boot image:
BatteryStats.dumpCheckinLocked() : 21.1MiB -> 20.2MiB
BatteryStats.dumpLocked(): 42.0MiB -> 40.3MiB
This is because all the memory previously used by the graph
builder is reused by later passes.
And finish the "arena"->"allocator" renaming; make renamed
allocator pointers that are members of classes const when
appropriate (and make a few more members around them const).
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 64312607
Change-Id: Ia50aafc80c05941ae5b96984ba4f31ed4c78255e
|
|
Passes using local ArenaAllocator were hiding their memory
usage from the allocation counting, making it difficult to
track down where memory was used. Using ScopedArenaAllocator
reveals the memory usage.
This changes the HGraph constructor which requires a lot of
changes in tests. Refactor these tests to limit the amount
of work needed the next time we change that constructor.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Build with kArenaAllocatorCountAllocations = true.
Bug: 64312607
Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
|
|
Replace most uses of the runtime's Primitive in compiler
with a new class DataType. This prepares for introducing
new types, such as Uint8, that the runtime does not need
to know about.
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 23964345
Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
|
|
Remove all copies of 'MaybeRecordStat', replacing them with a single
OptimizingCompilerStats::MaybeRecordStat helper.
Change-Id: I83b96b41439dccece3eee2e159b18c95336ea933
|
|
The latest chapter in the ongoing saga of attempting to dump a DEX
file without having to start a whole runtime instance. This episode
finds us removing references to ArtMethod/ArtField/mirror.
One aspect of this change that I would like to call out specfically
is that the utils versions of the "Pretty*" functions all were written
to accept nullptr as an argument. I have split these functions up as
follows:
1) an instance method, such as PrettyClass that obviously requires
this != nullptr.
2) a static method, that behaves the same way as the util method, but
calls the instance method if p != nullptr.
This requires using a full class qualifier for the static methods,
which isn't exactly beautiful. I have tried to remove as many cases
as possible where it was clear p != nullptr.
Bug: 22322814
Test: test-art-host
Change-Id: I21adee3614aa697aa580cd1b86b72d9206e1cb24
|
|
This patch merges the instruction-building phases from HGraphBuilder
and SsaBuilder into a single HInstructionBuilder class. As a result,
it is not necessary to generate HLocal, HLoadLocal and HStoreLocal
instructions any more, as the builder produces SSA form directly.
Saves 5-15% of arena-allocated memory (see bug for more data):
GMS 20.46MB => 19.26MB (-5.86%)
Maps 24.12MB => 21.47MB (-10.98%)
YouTube 28.60MB => 26.01MB (-9.05%)
This CL fixed an issue with parsing quickened instructions.
Bug: 27894376
Bug: 27998571
Bug: 27995065
Change-Id: I20dbe1bf2d0fe296377478db98cb86cba695e694
|
|
Bug: 27995065
This reverts commit e3ff7b293be2a6791fe9d135d660c0cffe4bd73f.
Change-Id: I5363c7ce18f47fd422c15eed5423a345a57249d8
|
|
This patch merges the instruction-building phases from HGraphBuilder
and SsaBuilder into a single HInstructionBuilder class. As a result,
it is not necessary to generate HLocal, HLoadLocal and HStoreLocal
instructions any more, as the builder produces SSA form directly.
Saves 5-15% of arena-allocated memory (see bug for more data):
GMS 20.46MB => 19.26MB (-5.86%)
Maps 24.12MB => 21.47MB (-10.98%)
YouTube 28.60MB => 26.01MB (-9.05%)
Bug: 27894376
Change-Id: Iefe28d40600c169c5d306fd2c77034ae19476d90
|
|
Second CL in the series of merging HGraphBuilder and SsaBuilder. This
patch refactors the builders so that dominator tree can be built
before any HInstructions are generated. This puts the SsaBuilder
removal of HLoadLocals/HStoreLocals straight after HGraphBuilder's
HInstruction generation phase. Next CL will therefore be able to
merge them.
This patch also adds util classes for iterating bytecode and switch
tables which allowed to simplify the code.
Bug: 27894376
Change-Id: Ic425d298b2e6e7980481ed697230b1a0b7904526
|
|
For strings in the boot image, use either direct pointers
or pc-relative addresses. For other strings, use PC-relative
access to the dex cache arrays for AOT and direct address of
the string's dex cache slot for JIT.
For aosp_flounder-userdebug:
- 32-bit boot.oat: -692KiB (-0.9%)
- 64-bit boot.oat: -948KiB (-1.1%)
- 32-bit dalvik cache total: -900KiB (-0.9%)
- 64-bit dalvik cache total: -3672KiB (-1.5%)
(contains more files than the 32-bit dalvik cache)
For aosp_flounder-userdebug forced to compile PIC:
- 32-bit boot.oat: -380KiB (-0.5%)
- 64-bit boot.oat: -928KiB (-1.0%)
- 32-bit dalvik cache total: -468KiB (-0.4%)
- 64-bit dalvik cache total: -1928KiB (-0.8%)
(contains more files than the 32-bit dalvik cache)
Bug: 26884697
Change-Id: Iec7266ce67e6fedc107be78fab2e742a8dab2696
|
|
|
|
Rely on HGraph::SimplifyLoop() to insert suspend checks.
CodeGenerator's CheckLoopEntriesCanBeUsedForOsr() checks the
dex pcs of suspend checks against branch targets to verify
that we always have an appropriate point for OSR transition.
However, the HSuspendChecks that were added by HGraphBuilder
to support the recently removed "baseline" interfered with
this in a specific case, namely an infinite loop where the
back-branch jumps to a nop. In that case, the HSuspendCheck
added by HGraphBuilder had a dex pc different from the block
and the branch target but its presence would stop the
HGraph::SimplifyLoop() from adding a new HSuspendCheck with
the correct dex pc.
Bug: 27623547
Change-Id: I83566a260210bc05aea0c44509a39bb490aa7003
|
|
This reverts commit 845e5064580bd37ad5014f7aa0d078be7265464d.
Add an option to change what OatFileManager considers up-to-date.
In our tests we're allowed to write to the dalvik-cache, so it
cannot be kSpeed.
Bug: 27689078
Change-Id: I0c578705a9921114ed1fb00d360cc7448addc93a
|
|
Bots are red. Tentative reverting as this is likely the offender.
Bug: 27689078
This reverts commit a62d2f04a6ecf804f8a78e722a6ca8ccb2dfa931.
Change-Id: I3ec6947a5a4be878ff81f26f17dc36a209734e2a
|
|
Record the compiler filter in the oat header. Use that to determine
when the oat file is up-to-date with respect to a target compiler
filter level.
New xxx-profile filter levels are added to specify if a profile should
be used instead of testing for the presence of a profile file.
This change should allow for different compiler-filters to be set for
different package manager use cases.
Bug: 27689078
Change-Id: Id6706d0ed91b45f307142692ea4316aa9713b023
|
|
Also attribute ArenaBitVector allocations to appropriate
passes. This was used to track down the source of the
excessive memory alloactions.
Bug: 27690481
Change-Id: Ib895984cb7c04e24cbc7abbd8322079bab8ab100
|
|
Pack narrow fields and flags into a single 32-bit field.
Change-Id: Ib2f7abf987caee0339018d21f0d498f8db63542d
|
|
We do not require full environment at the start of basic block.
The dex pc contained in basic block is sufficient for line mapping.
Change-Id: I5ba9e5f5acbc4a783ad544769f9a73bb33e2bafa
|
|
First step towards merging the two passes, which will later result in
HGraphBuilder directly producing SSA form. This CL mostly just updates
tests broken by not being able to inspect the pre-SSA form.
Using HLocals outside the HGraphBuilder is now deprecated.
Bug: 27150508
Change-Id: I00fb6050580f409dcc5aa5b5aa3a536d6e8d759e
|
|
Change-Id: I21b984224370a9ce7a4a13a9652503cfb03c5f03
|
|
This reverts commit bd89a5c556324062b7d841843b039392e84cfaf4.
Change-Id: I08d190431520baa7fcec8fbdb444519f25ac8d44
|
|
This teaches the optimizing compiler how to perform invoke-super on
interfaces. This should make the invokes generally faster.
Bug: 24618811
Change-Id: I7f9b0fb1209775c1c8837ab5d21f8acba3cc72a5
|
|
Simplification of HFakeString assumes that it cannot be used until
String.<init> is called which is not true and causes different
behaviour between the compiler and the interpreter. This patch
removes the optimization together with the HFakeString instruction.
Instead, HNewInstance is generated and an empty String allocated
until it is replaced with the result of the StringFactory call. This
is consistent with the behaviour of the interpreter but is too
conservative. A follow-up CL will attempt to optimize out the initial
allocation when possible.
Bug: 26457745
Bug: 26486014
Change-Id: I7139e37ed00a880715bfc234896a930fde670c44
|
|
ReferenceTypePropagation creates a BoundType for each CheckCast and
replaces all dominated uses of the casted object with it. This does
not include Phi uses on the boundary of the dominated scope, reducing
typing precision. This patch creates the BoundType in Builder, causing
SsaBuilder to replace uses of the object automatically.
Bug: 26081304
Change-Id: I083979155cccb348071ff58cb9060a896ed7d2ac
|
|
Generate extra stack map at the start of each java statement.
The stack maps are later translated to DWARF which allows
LLDB to set breakpoints and view local variables.
Change-Id: If00ab875513308e4a1399d1e12e0fe8934a6f0c3
|
|
Change-Id: I1d258f1a89bf0ec7c7ddd134be9215d480f0b09a
|