Age | Commit message | Author |
|
Add app image relocations for classes in the app image,
similar to the existing relocations for boot image. This new
load kind lets the compiled code avoid the null check and
slow path.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --speed-profile
Test: run-test.sh
Test: testrunner.py --target --optimizing --speed-profile
Bug: 38313278
Change-Id: Iffd76fe9ac6b95c37c2781fd6257e1d5cd0790d0
|
|
Treat app image objects similar to boot image objects and
avoid unnecessary read barriers for app image `HLoadClass`
and `HInstanceOf` checks with app image `HLoadClass` input.
Extend other optimizations to treat app image classes the
same way as boot image classes even though this remains
mostly dormant because we currently do not initialize app
image classes with class initializers; the experimental
flag `--initialize-app-image-classes` is false by default.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --speed-profile
Bug: 38313278
Change-Id: I359dd8897f6d128213602f5731d40edace298ab8
|
|
Prepare for adding app image patches to the same section.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 38313278
Change-Id: Ib552f005b3a2859152d0de9fa6b2fcd48a0f3feb
|
|
This reverts commit 53ca944020bb86199f6f80d8594d5deb1b1d46dd.
Bug: 297147201
Reason for revert: Crash on bot
Change-Id: Ibf3b53a8fe67aa633686990881a96acb783af9a3
|
|
In aosp/2876518, JIT code made runtime calls.
Bug: 297147201
Test: ./art/test/testrunner/testrunner.py --host --64 --jit -b
Test: ./art/test/testrunner/testrunner.py --host --64 -b
Test: ./art/test.py --host -b
Change-Id: Ifdfd3ace9419b34f8079c9ec4b1b2de31cb50ef7
|
|
Create the InstructionSimplifierRiscv64 optimization.
Replace Shl by (1|2|3) followed by Add with a Riscv64ShiftAdd IR
instruction.
Compiling all the methods of the applications below with dex2oat,
I found the following numbers of occurrences of the pattern:
Facebook: 45 cases
TikTok: 26 cases
YouTube: 19 cases
Test: art/test/testrunner/testrunner.py --target --64 --ndebug --optimizing
Change-Id: I88903450d998983bb2a628942112d7518099c3f5
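A minimal sketch of the arithmetic this pattern targets; the toy function below stands in for the IR rewrite and does not use the real ART HShl/HAdd/HRiscv64ShiftAdd classes:

```cpp
#include <cstdint>
#include <cstdio>

// On RISC-V with the Zba extension, sh1add/sh2add/sh3add compute
// rd = rs2 + (rs1 << n) in a single instruction, so the IR pattern
// Add(Shl(x, n), y) with n in {1, 2, 3} can fold into one operation.
uint64_t ShiftAdd(uint64_t x, uint64_t y, unsigned n) {
  return y + (x << n);  // candidate for shNadd when n is 1, 2, or 3
}

int main() {
  // Typical source of the pattern: array indexing,
  // base + (index << 3) for 8-byte elements.
  printf("%llu\n", static_cast<unsigned long long>(ShiftAdd(5, 100, 3)));  // 140
}
```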
|
|
Move the `LOG(FATAL)` code to the `HUnaryOperation` and
`HBinaryOperation` classes to avoid duplication in derived
classes. For consistency, implement an unreachable function
`HUnaryOperation::Evaluate(HIntConstant*)` even though all
currently existing subclasses override it.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I908c494b125e370ce9fe253c97ebacdd54ed1d96
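A minimal sketch of the deduplication described above, with toy stand-ins for the ART classes (the real `Evaluate()` overloads and the `LOG(FATAL)` macro differ):

```cpp
#include <cstdio>
#include <cstdlib>

struct HIntConstant { int value; };

struct HUnaryOperation {
  virtual ~HUnaryOperation() = default;
  // The base class provides the single "unreachable" implementation, so
  // derived classes no longer need their own LOG(FATAL) bodies; subclasses
  // that can fold integer constants override this.
  virtual HIntConstant* Evaluate(HIntConstant*) const {
    fprintf(stderr, "Unreachable: Evaluate(HIntConstant*)\n");  // LOG(FATAL) stand-in
    abort();
  }
};

struct HNeg : HUnaryOperation {
  HIntConstant* Evaluate(HIntConstant* x) const override {
    return new HIntConstant{-x->value};
  }
};

int main() {
  HNeg neg;
  HIntConstant five{5};
  printf("%d\n", neg.Evaluate(&five)->value);  // -5
}
```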
|
|
When clearing the NeedsTypeCheck flag in ArraySet instructions,
we can remove the environment as it is no longer needed.
Also add a check in GraphChecker that instructions have an
environment iff they need one.
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I698d9d88bc7c6c8569caf6397cbebf29b34585d5
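A minimal sketch of the new invariant, assuming simplified accessors (the real GraphChecker walks `HInstruction`s with richer `HasEnvironment()`/`NeedsEnvironment()` logic):

```cpp
#include <cassert>

struct HInstruction {
  bool has_environment = false;
  bool needs_environment = false;
  bool HasEnvironment() const { return has_environment; }
  bool NeedsEnvironment() const { return needs_environment; }
};

// Instructions should carry the (potentially large) environment exactly
// when they can deoptimize or otherwise need interpreter state; an ArraySet
// whose NeedsTypeCheck flag was cleared no longer needs one.
void CheckEnvironmentInvariant(const HInstruction& instruction) {
  assert(instruction.HasEnvironment() == instruction.NeedsEnvironment());
}

int main() {
  HInstruction array_set;  // type check cleared: no environment, none needed
  CheckEnvironmentInvariant(array_set);
}
```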
|
|
This reverts commit 35a1479ab434257e9db629fda5f4ca96bfbef3fc.
Reason for revert: Disable failing test on debuggable
Change-Id: Icd012ac9e3b37c1187adf5e915ba7c1ffc415805
|
|
This reverts commit e872656585952f993eb84633a66e0aedcbdf52ac.
Reason for revert: Test failures
Change-Id: I05aadb695b87f661063ff87f63eb68048d16e050
|
|
Skip `HPhi::Accept()` and add functions to visit only Phis
or only non-Phi instructions.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 181943478
Change-Id: Iba0690ae70f46d6a2bafa9055b2ae5167e58a2f4
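A rough sketch of the split visitation, using a toy IR (the real `HGraphVisitor` and its per-kind dispatch are more involved):

```cpp
#include <cstdio>
#include <vector>

struct HInstruction { virtual ~HInstruction() = default; };
struct HPhi : HInstruction {};

struct HBasicBlock {
  std::vector<HPhi*> phis;                  // Phis live in their own list...
  std::vector<HInstruction*> instructions;  // ...separate from the rest.
};

struct Visitor {
  virtual ~Visitor() = default;
  virtual void Visit(HInstruction* instruction) { (void)instruction; }
  // Passes that only care about one group can skip the other list entirely,
  // avoiding the dispatch that HPhi::Accept() used to funnel through.
  void VisitPhis(HBasicBlock& block) {
    for (HPhi* phi : block.phis) Visit(phi);
  }
  void VisitNonPhiInstructions(HBasicBlock& block) {
    for (HInstruction* instruction : block.instructions) Visit(instruction);
  }
};

int main() {
  HBasicBlock block;
  Visitor visitor;
  visitor.VisitPhis(block);
  visitor.VisitNonPhiInstructions(block);
  printf("ok\n");
}
```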
|
|
Investigating DCE, I noticed that it was the 3rd most time-consuming
optimization phase (after the Inliner and GVN) in local pprof profiles.
Inside RemoveDeadInstructions we call IsDeadAndRemovable for every
instruction and Phi. We can speed it up by:
* Swapping the order of IsDead and IsRemovable to break earlier on
instructions like LoadClass. LoadClass instructions are used by
ClinitCheck instructions (until very late in the graph), so they
are never going to be removed by DCE.
* Skipping the IsRemovable check for Phi instructions, which
always pass it.
Swapping the order improves RemoveDeadInstructions, DCE's most
time-consuming method, by ~20%. Overall, DCE improves by ~5% and
is now the 4th most time-consuming optimization in my local trace
(LSE is now 3rd).
The Phi optimization didn't show up in my pprof profile, but it may
help apps with many Phi instructions.
Test: Locally compile and take a look at pprof profiles
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I59932a8d8d627fc71628e2255582f35282dd0b4e
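A minimal sketch of the reordering, with toy predicates (the real checks live on `HInstruction` and are considerably richer):

```cpp
struct HInstruction {
  bool has_uses = false;
  bool removable_kind = true;
  // Cheap: essentially a use-list emptiness check.
  bool IsDead() const { return !has_uses; }
  // More expensive: inspects the instruction kind and side effects.
  bool IsRemovable() const { return removable_kind; }
};

bool IsDeadAndRemovable(const HInstruction& instruction) {
  // Testing IsDead() first short-circuits on instructions that still have
  // uses, e.g. a LoadClass kept alive by its ClinitCheck, so the costlier
  // IsRemovable() never runs for them. For Phis, IsRemovable() is always
  // true and can be skipped altogether.
  return instruction.IsDead() && instruction.IsRemovable();
}

int main() {
  HInstruction load_class{/*has_uses=*/true, /*removable_kind=*/false};
  return IsDeadAndRemovable(load_class) ? 1 : 0;  // 0: kept
}
```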
|
|
If profiling doesn't benefit the method, switch the baseline
compilation into an optimized one.
Reduces the number of JIT compilations on the Sheets benchmark from
~3100 (2250 baseline, 850 optimized) to ~2750 (2250 baseline, 500
optimized).
Test: test.py
Change-Id: I94760481d130d2dc168152daa94429baf201f66e
|
|
This code is dead after Partial LSE removal.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 298176183
Change-Id: If67efa9d1df908232b6c2f32f3d2c64fb91759ae
|
|
This reverts commit 1ba3516e8c3e2b86c73084893dd297f468469181.
Reason for revert:
PS1 is the reland as-is.
PS2 has two related fixes (for RISC-V and arm64) taking into
account that when we store zero, we use a special register.
Bug: 301833859
Bug: 310755375
Bug: 260843353
Test: lunch cf_riscv64_wear-trunk_staging-userdebug && m
Test: lunch starnix_wear_yukawa-trunk_staging-userdebug && m
Change-Id: I5e69890fd56404ddde56ebc457179241363d9243
|
|
This reverts commit 1be176f5a78750e2f0e32470f8c83e3d1643954d.
Reason for revert: Potential cause of build breakage for cf_riscv64_wear-trunk_staging-userdebug build 11353124
Change-Id: I5db1c9fba1edd4ab1eef30e2b547bb9649af5c10
|
|
This reverts commit 31b949bc4a76e5c6d00a8e18c346f123b5321a1c.
Reason for revert:
PS1 is the reland as-is.
PS2 has two fixes:
* Missed poisoning heap references in a code path
* Removed incorrect DCHECK
Change-Id: I81b317ddc704dbd8a173f5d5c624dbc69e2d9e60
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Test: art/test/testrunner/testrunner.py --target --64 --optimizing -b
Both commands were run with `export ART_HEAP_POISONING=true`.
Bug: 301833859
Bug: 310755375
Bug: 260843353
|
|
This reverts commit b5b98b9bb31acb2deffb692c50d0fbc71476663b.
Reason for revert: Breaks tests in arm64 + heap poison configurations
Bug: 310755375
Bug: 260843353
Change-Id: I682c74987a365497e0dbe47eba26a9ccf0513561
|
|
This reverts commit 9f8df195b7ff2ce47eec4e9b193ff3214ebed19c.
Reason for revert: Fix for x86_64 with heap poisoning enabled.
The code uses a temp with index `1` in the regular FieldSet case
because GenerateVarHandleSet also calls HandleFieldSet. The bug
was that we were allocating only one temp in the regular FieldSet
case, so the temp with index `1` was not available.
PS1 is the revert as-is.
PS2 contains the fix.
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Test: Same command with heap poison enabled too
Bug: 301833859
Bug: 310755375
Bug: 260843353
Change-Id: Ie2740b4c443158c4e72810ce1d8268353c5f0055
|
|
This reverts commit 7c1dd6e2d1893f288214413c4b97273980f3aa4a.
Reason for revert: build breakages, using a different number of temps vs the expected (crashing in https://cs.android.com/android/platform/superproject/main/+/main:art/compiler/optimizing/code_generator_x86_64.cc;l=5488;drc=7c1dd6e2d1893f288214413c4b97273980f3aa4a)
Change-Id: I843c039394dd666776ea5bcb5b10b1f47df12d53
|
|
This reverts commit 5a3271d7caafefd10a20f5a5db09d2c178838b76.
Reason for revert:
This CL has two fixes (codegen not doing a null check if a write
barrier is being relied on, and codegen not recomputing the
skipping of write barriers), regression tests, a new runtime
check which runs in debug mode for the CC GC to ensure that the
card table is set correctly for skipped write barriers, and new
compile-time (graph checker) tests to ensure graph consistency.
This patchset updates the WriteBarrierKind to be
{emit being relied on, emit not being relied on, don't emit},
which leaves the null check skip implementation to codegen.
Test 2247- is removed from knownfailures.json but is still
skipped in MTS due to SLO requirements.
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Bug: 301833859
Bug: 310755375
Bug: 260843353
Change-Id: I025597e284b2765986e2091538680ee629fb5ae7
|
|
This reverts commit d014fd019e84471665ac02f2de285541009892cd.
Reason for revert: Fix codegen to do a runtime call in JIT for now.
Bug: 297147201
Test: ./art/test/testrunner/testrunner.py --host --64 --optimizing -b
Test: ./art/test/testrunner/testrunner.py --jvm -b
Test: ./art/test.py --host -b
Test: ./art/test/testrunner/testrunner.py --host --64 --jit -b
Change-Id: I0f01c8391b09659bb6195955ecd8f88159141872
|
|
This reverts commit a4fd8bb141fdb877bfd0d69700dad4e2859634a7.
Bug: 259258187
Reason for revert: Failures on bots:
https://ci.chromium.org/ui/p/art/builders/ci/angler-armv8-non-gen-cc/3617/overview
Change-Id: Ia4aa3532b137d9022853a9f82ef6bacc9246d0ce
|
|
The current method is passed in a register, so we can use
GetCurrentMethod as an input to the method entry / exit hooks. In
most cases the current method may already be in the register on
method entry.
Bug: 259258187
Test: art/test.py
Change-Id: Iea75f41b0ec5ebbc2aef857c84f39846b594e8e7
|
|
Define a new optimization flag for source and destination
position match. Use it to avoid the forward-copy check
(where the assembler optimized away a BLT instruction,
so we had just a useless BNE to the next instruction) and
one position sign check.
Avoid checking that the position is inside the array. The
subsequent subtraction cannot underflow an `int32_t` and
the following BLT shall go to the slow path for negative
values anyway.
Rewrite the array type check to avoid unnecessary checks
and read barriers.
Use an allocated temporary instead of scratch register
for the marking in the read barrier slow path. Simplify
the gray bit check and the fake dependency.
Use constant position and length locations for small
constant values. (It was probably an oversight that we
used it only for large constant values.)
Emit the threshold check even when the length equals the source
or destination length. The old code allowed the intrinsic to
process an array copy of arbitrary length.
Use `ShNAdd()` for faster array address calculations.
Use helper functions and lambdas to simplify the code.
Pass registers and locations by value. Prefer load/store
macro instructions over raw load/store instructions. Use
a bare conditional branch to assert that `TMP` shall not
be clobbered.
Test: testrunner.py --target --64 --ndebug --optimizing
Bug: 283082089
Change-Id: I3f697b4a74497d6d712a92450a6a45e772430662
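A small sketch of the address computation that `ShNAdd()` covers, in plain C++ arithmetic (the actual assembler emits sh1add/sh2add/sh3add from the Zba extension):

```cpp
#include <cstdint>

// Element address = data base + (index << log2(element size)).
// Without Zba this costs a shift plus an add; with it, one shNadd.
uintptr_t ElementAddress(uintptr_t data, uintptr_t index, unsigned log2_size) {
  return data + (index << log2_size);
}

int main() {
  // e.g. the 4-byte-element case used when copying an int[].
  return ElementAddress(0x1000, 3, 2) == 0x100c ? 0 : 1;
}
```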
|
|
It has been disabled for a while and has bit-rotted.
Bug: 298176183
Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing
Test: m test-art-host-gtest-art_compiler_tests64
Change-Id: I4fcd8b3d18a3388e078b5cb3c340b2e270aefef7
|
|
Until we evaluate its usefulness and reduce its overhead.
Bug: 306638020
Test: test.py
Change-Id: Ibb01c70a7ea19b03802dcc1b0792d3d2ff4f4d67
|
|
Also mark boxing `valueOf()` intrinsics as never null
to avoid creating unnecessary `HNullCheck` instructions.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I86e7721e3af6c59407aa2ddfc1bd11bd2fdac83c
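A minimal sketch of the property being exploited, with a hypothetical flag (the real code records this on the intrinsic's return type information; the API shown is illustrative only):

```cpp
struct HInvoke {
  bool is_boxing_value_of = false;  // e.g. Integer.valueOf(int)
  bool CanBeNull() const {
    // Boxing valueOf() always returns an object (from the cache or freshly
    // allocated), so no HNullCheck needs to be created for its result.
    return !is_boxing_value_of;
  }
};

int main() {
  HInvoke value_of{/*is_boxing_value_of=*/true};
  return value_of.CanBeNull() ? 1 : 0;  // 0: never null
}
```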
|
|
Currently unused. Follow-up CLs will make use of the data.
Test: test.py
Bug: 304969871
Change-Id: I486faba3de030061715d06ab9fdb33970d319d9b
|
|
It clears loop and dominance information, and builds the dominator
tree. It also DCHECKs that we are not calling this method with
irreducible loops, as that is not supported.
When adding this helper, we found a partial LSE bug: it was
recomputing the dominator tree for irreducible loops.
Test: art/test/testrunner/testrunner.py --host --64 -b --optimizing
Bug: 304749506
Change-Id: Ia4cc72cd19779ad881fa686e52b43679fe5a64d3
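A hedged sketch of what such a helper bundles; the structure matches the description above, but the method bodies here are placeholders, not the real HGraph implementation:

```cpp
#include <cassert>

struct HGraph {
  bool has_irreducible_loops = false;

  void ClearLoopInformation() { /* drop loop headers and back edges */ }
  void ClearDominanceInformation() { /* drop immediate-dominator links */ }
  void ComputeDominanceInformation() { /* rebuild the dominator tree */ }

  void RecomputeDominatorTree() {
    // Recomputation is unsupported with irreducible loops (and doing it
    // anyway was the partial LSE bug), hence the debug-mode check.
    assert(!has_irreducible_loops);  // DCHECK stand-in
    ClearLoopInformation();
    ClearDominanceInformation();
    ComputeDominanceInformation();
  }
};

int main() {
  HGraph graph;
  graph.RecomputeDominatorTree();
}
```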
|
|
We can simplify a Select + Binary/Unary op if:
* Both inputs to the Select instruction are constant, and
* The Select instruction is not used by another instruction,
to avoid duplicating Selects.
* In the case of Binary ops, both inputs can't be Selects.
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: Ic716155e9a8515126c2867bb1d54593fa63011ae
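A worked example of the rewrite on concrete values, with C++ conditionals standing in for the IR; both Select inputs are constant, so the operation folds into each arm and the Select moves outward:

```cpp
#include <cstdio>

// Before: Add(Select(cond, 4, 8), y)
int Before(bool cond, int y) { return (cond ? 4 : 8) + y; }

// After: Select(cond, Add(4, y), Add(8, y)). The Add is duplicated into
// each arm, which is why the rewrite requires the Select to have no
// other users.
int After(bool cond, int y) { return cond ? (4 + y) : (8 + y); }

int main() {
  printf("%d %d\n", Before(true, 1), After(true, 1));    // 5 5
  printf("%d %d\n", Before(false, 1), After(false, 1));  // 9 9
}
```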
|
|
Bug: none
Test: treehugger
Change-Id: I589cc36a95e505c3c356ad245dd03286680aa4f7
|
|
std::iterator was deprecated in C++17 and after upgrading libc++, the
compiler warns about the many uses of it in ART.
p0174r1 gives a couple of reasons for its deprecation:
"The long sequence of void arguments is much less clear to the reader
than simply providing the expected typedefs in the class definition
itself, which is the approach taken by the current working draft, ...
"In addition to the reduced clarity, the iterator template also lays a
trap for the unwary, as in typical usage it will be a dependent base
class, which means it will not be looked into during name lookup from
within the class or its member functions."
The first reason is illustrated by the comments in
BitTable::const_iterator. The second reason is illustrated by the
various "using foo = typename std::iterator<...>::foo" declarations.
Follow p0174r1's advice and simply provide the 5 typedefs in each
iterator class.
Bug: 175635923
Test: treehugger
Change-Id: I2fba5af68eb05fd0a8ba5e2add0c8b8ed1ebee1a
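A minimal sketch of the change p0174r1 recommends, on a toy iterator (not the actual BitTable::const_iterator):

```cpp
#include <cstddef>
#include <iterator>

template <typename T>
class MyIterator {
 public:
  // Spell out the five typedefs directly. Unlike the deprecated base class
  //   std::iterator<std::forward_iterator_tag, T>
  // these are found by ordinary name lookup inside the class, and the
  // reader sees them without decoding a sequence of template arguments.
  using iterator_category = std::forward_iterator_tag;
  using value_type = T;
  using difference_type = std::ptrdiff_t;
  using pointer = T*;
  using reference = T&;

  explicit MyIterator(T* position) : position_(position) {}
  reference operator*() const { return *position_; }
  MyIterator& operator++() { ++position_; return *this; }
  bool operator!=(const MyIterator& other) const {
    return position_ != other.position_;
  }

 private:
  T* position_;
};

int main() {
  int data[] = {1, 2, 3};
  int sum = 0;
  for (MyIterator<int> it(data); it != MyIterator<int>(data + 3); ++it) {
    sum += *it;
  }
  return sum == 6 ? 0 : 1;
}
```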
|
|
Test: Rely on TreeHugger.
Change-Id: I4e3c0ba13d576ef62121d47ebc4965f6667b624f
|
|
Polymorphic invokes are expensive, and some of the methods in the
Atomic* classes use polymorphic methods. We use intrinsics to
generate efficient code for them. Intrinsic optimization was
disabled in debuggable runtimes to be able to support breakpoints
in intrinsic functions. It might be less useful to break in such
methods, so we want to enable intrinsic optimization for
polymorphic invokes, which are performance-sensitive.
Bug: 296298460
Test: art/test.py
Change-Id: I575695d82e8bc7d703cfbf5ff22ea7d5a35f6937
|
|
Be consistent when checking bss kind.
Test: m
Change-Id: If6f6c06d79fba8caea8dded962c20f34f553dc7f
|
|
This CL enables predicated autovectorization of loops with
control flow, currently only for simple diamond pattern ones:
        header -----------------+
          |                     |
      diamond_hif               |
       /        \               |
diamond_true  diamond_false     |
       \        /               |
       back_edge                |
          |                     |
          +---------------------+
Original author: Artem Serov <Artem.Serov@linaro.org>
Test: ./art/test.py --host --optimizing --jit
Test: ./art/test.py --target --optimizing --jit
Test: 661-checker-simd-cf-loops.
Test: target tests on arm64 with SVE (for details see
art/test/README.arm_fvp).
Change-Id: I8dbc266278b4ab074b831d6c224f02024030cc8a
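A loop with the diamond shape sketched above, as a C++ stand-in (the checker tests themselves are Java, e.g. 661-checker-simd-cf-loops):

```cpp
// With predication (e.g. Arm SVE), both arms execute under complementary
// predicates instead of branching, so the whole loop body can vectorize.
void DiamondLoop(int* a, const int* c, int n) {
  for (int i = 0; i < n; ++i) {  // header
    if (c[i] != 0) {             // diamond_hif
      a[i] += 1;                 // diamond_true
    } else {
      a[i] -= 1;                 // diamond_false
    }                            // join, then back_edge
  }
}

int main() {
  int a[4] = {0, 0, 0, 0};
  const int c[4] = {1, 0, 1, 0};
  DiamondLoop(a, c, 4);  // a becomes {1, -1, 1, -1}
}
```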
|
|
Test: m test-art-host-gtest
Bug: 283082089
Change-Id: Icb40802df7504791f3eaba7b0ce06538c9194ff6
|
|
Bug: 169680875
Test: mmm art
Change-Id: Ic0cc320891c42b07a2b5520a584d2b62052e7235
|
|
After the old implementation was renamed in
https://android-review.googlesource.com/2526708,
we introduce a new function with the old name but new
behavior, just `DCHECK()`-ing the instruction kind before
casting down the pointer. We change appropriate calls from
`As##type##OrNull()` to `As##type()` to avoid unnecessary
run-time checks and reduce the size of libart-compiler.so.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 181943478
Change-Id: I025681612a77ca2157fed4886ca47f2053975d4e
|
|
The null type check in the current implementation of
`HInstruction::As##type()` often cannot be optimized away
by clang++. It is therefore beneficial to have two functions
HInstruction::As##type()
HInstruction::As##type##OrNull()
where the first function never returns null but the second
one can return null. The additional text "OrNull" shall also
flag the possibility of yielding null to the developer which
may help avoid bugs similar to what we have seen previously.
This requires renaming the existing function that can return
null and introducing new function that cannot. However,
defining the new function `HInstruction::As##type()` in the
same change as renaming the old one would risk introducing
bugs by missing a rename. Therefore we simply rename the old
function here and the new function shall be introduced in a
separate change with all behavioral changes being explicit.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: buildbot-build.sh --target
Bug: 181943478
Change-Id: I4defd85038e28fe3506903ba3f33f723682b3298
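A minimal sketch of the resulting two-accessor contract for a single kind (toy classes; ART generates these per instruction kind from a macro):

```cpp
#include <cassert>

struct HAdd;

struct HInstruction {
  bool is_add = false;
  // Never returns null: the caller has already established the kind, so
  // only a DCHECK-style assertion remains and clang++ sees no null path.
  HAdd* AsAdd();
  // May return null: the caller must test, and the "OrNull" suffix flags
  // that at every call site.
  HAdd* AsAddOrNull();
};

struct HAdd : HInstruction {};

HAdd* HInstruction::AsAdd() {
  assert(is_add);  // DCHECK stand-in
  return static_cast<HAdd*>(this);
}

HAdd* HInstruction::AsAddOrNull() {
  return is_add ? static_cast<HAdd*>(this) : nullptr;
}

int main() {
  HAdd add;
  add.is_add = true;
  HInstruction* instruction = &add;
  return instruction->AsAddOrNull() == instruction->AsAdd() ? 0 : 1;
}
```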
|
|
This reverts commit b5fcab944b3786f27ab6b698685109bfc7f785fd.
Reason for revert: test/988 is a CTS test and we shouldn't modify its
Main to do any real work other than calling run. Also, there's no way
to call ensureJitCompiled from atests, so restore 988 to the original
and add another test for JIT tracing.
Bug: 279547861
Test: test.py -t 988, 2263
Change-Id: I0908c29996a550b93ba6c38f99460ff0d51a2964
|
|
Also do some style cleanup.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I34304acb39bc5197dde03543a6c157b3c319f94f
|
|
The original code introduced in
https://android-review.googlesource.com/355337
was technically correct because `HDeoptimize` had its own
`GetKind()` function, which was hiding the one from the base
class. However, that function was renamed in
https://android-review.googlesource.com/391053
without updating the `InstructionDataEquals()`.
Test: m test-art-host-gtest
Change-Id: Idfb35bd935858a50d16bd6ea0237ba1230787158
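An illustration of the name-hiding hazard behind this fix, reduced to toy classes:

```cpp
#include <cstdio>

struct HInstruction {
  int GetKind() const { return 0; }  // instruction kind in the base class
};

struct HDeoptimize : HInstruction {
  // This used to be named GetKind() too, hiding the base-class function,
  // so unqualified GetKind() calls in HDeoptimize code compared
  // deoptimization kinds. After the rename, the same calls silently
  // resolve to HInstruction::GetKind() unless every call site is updated.
  int GetDeoptimizationKind() const { return 1; }
};

int main() {
  HDeoptimize deopt;
  // Once compared deoptimization kinds; now compares instruction kinds.
  printf("%d vs %d\n", deopt.GetKind(), deopt.GetDeoptimizationKind());  // 0 vs 1
}
```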
|
|
One overload used `down_cast<>` and the other used
`static_cast<>`, so make it consistent.
Also avoid some unnecessary `As##type()` calls and make some
style adjustments.
Test: m test-art-host-gtest
Change-Id: I1f368a0c21647b44fffb7361dbb92d8a09fbe904
|
|
This reverts commit cb008914fbc5a2334e3c00366afdb5f8af5a23ba.
Reason for revert: Failures on some configs
https://buganizer.corp.google.com/issues/279562617
Change-Id: I4d26cd00e76d8ec4aef76ab26987418eab24d217
|
|
We have optimizations that generate code inline for intrinsics instead
of leaving them as invokes, for better performance. Some debug features
like method entry / exit or setting a breakpoint on intrinsics wouldn't
work if intrinsics are inlined, so disable those optimizations in
debuggable runtimes.
Also update 988-method-trace test to test intrinsics on JITed code.
Test: art/test.py -t 988
Bug: 279547861
Change-Id: Ic7c61d1b1541ff534faa24ccec5c2d0b574b0537
|
|
This should improve optimization opportunities for clang++.
Test: buildbot-build.sh
Change-Id: Ib0c1ebfb157176a9063cca2ed465cff6fb280442
|
|
Note that these fields were the last packed fields in their
instruction definitions, so the higher bits were unused and
always zero; therefore, the erroneously larger field size did
not cause any observable bugs.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I2410fbb09d86f4d49a69b598748701d59860c55b
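A sketch of the packed-field arithmetic involved, with a hypothetical shift and width (ART uses BitField-style helpers over a shared flags word):

```cpp
#include <cstdint>

constexpr unsigned kFieldShift = 8;
constexpr unsigned kFieldSizeBits = 2;  // the fix: width matches the stored enum
constexpr uint32_t kFieldMask = ((1u << kFieldSizeBits) - 1u) << kFieldShift;

uint32_t GetField(uint32_t packed) {
  // Because the field was the last one packed into the word, an oversized
  // width only covered bits that were always zero, so reads still returned
  // the right value and the mistake stayed invisible.
  return (packed & kFieldMask) >> kFieldShift;
}

int main() {
  return GetField(0x3u << kFieldShift) == 0x3u ? 0 : 1;
}
```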
|
|
Follow-up to aosp/2442280. We haven't seen crashes with these,
but we can't guarantee that the RTI will be valid in these code paths.
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I80da85a6549ba0275a80027016363e0cf9fb8045
|