Age | Commit message | Author |
|
Support all condition types inside the condition when performing
diamond loop auto-vectorization. This allows diamond loop
auto-vectorization to be performed on a greater variety of loops.
To support this change, new vector condition nodes are added to
mirror the scalar condition nodes.
Also add a new gtest class to test whether predicated vectorization
can be performed on different combinations of condition types and
data types.
Authors: Chris Jones <christopher.jones@arm.com>,
Konstantin Baladurin <konstantin.baladurin@arm.com>
Test: export ART_FORCE_TRY_PREDICATED_SIMD=true && \
art/test.py --target --optimizing
Test: art/test.py --target --host --optimizing
Test: 661-checker-simd-cf-loops
Test: art/test.py --gtest art_compiler_tests
Change-Id: Ic9c925f1a58ada13d9031de3b445dcd4f77764b7
|
|
This patch fixes code generation for VecPredToBoolean so it updates
conditional flags itself based on its predicate input. Prior to this
patch, code generation for VecPredToBoolean (incorrectly) implicitly
assumed that the conditional flags were always updated by its input
HIR (VecPredWhile) and that it immediately followed that HIR.
Authors: Konstantin Baladurin <konstantin.baladurin@arm.com>
Chris Jones <christopher.jones@arm.com>
Test: env ART_FORCE_TRY_PREDICATED_SIMD=true
art/test.py --target --optimizing
Test: art/tools/run-gtests.sh
Change-Id: Id4c2494cdefd008509f9039e36081151aaf0e4a6
|
|
When running codegen tests, verify that the ISA features used in the
codegen are supported by the hardware. This strengthens checks for
gtests that may use ISA features which are not supported by the
hardware and therefore should not be run on that hardware.
Test: art/test.py --gtest art_compiler_tests
Change-Id: I56ebd6e964947fe004f466010a18faceb019bfae
|
|
This patch fixes a bug in arm64 PackedSwitch code generation
for very large methods where we exceeded the range of the Adr
instruction because jump tables were emitted at the very end of the
method. We now emit the jump table in place as part of
the PackedSwitch visitor, the same way it is done
in the arm32 backend.
This patch also removes an incorrect assumption that the size of
a method has a linear dependency on the number of its HIR
instructions. This was used to choose whether to emit a jump
table for a PackedSwitch.
Test: art/test.py --target --host --optimizing
Test: art/test.py --gtest art_compiler_tests
Change-Id: I0795811a6408a25021879ab6be9e23ef5f1f50e4
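The range problem described above can be illustrated with a minimal sketch. It relies on the A64 encoding of ADR, which carries a signed 21-bit byte offset (roughly +/-1 MiB from the instruction). The helper names `IsAdrReachable` and `AdrOffsetOrDie` are hypothetical and are not taken from ART or VIXL.

```cpp
#include <cassert>
#include <cstdint>

// AArch64 ADR encodes a signed 21-bit byte offset relative to the PC,
// so the target must lie within [-2^20, 2^20 - 1] bytes.
constexpr int64_t kAdrMinOffset = -(int64_t{1} << 20);
constexpr int64_t kAdrMaxOffset = (int64_t{1} << 20) - 1;

// Hypothetical helper: true if a jump table at `table_offset` can be
// addressed by an ADR at `adr_offset` (both are byte offsets from the
// start of the method).
constexpr bool IsAdrReachable(int64_t adr_offset, int64_t table_offset) {
  const int64_t delta = table_offset - adr_offset;
  return delta >= kAdrMinOffset && delta <= kAdrMaxOffset;
}

// Hypothetical helper: returns the offset to encode, asserting it is in range.
inline int64_t AdrOffsetOrDie(int64_t adr_offset, int64_t table_offset) {
  assert(IsAdrReachable(adr_offset, table_offset));
  return table_offset - adr_offset;
}
```

A table emitted next to its PackedSwitch stays within range regardless of method size, whereas a table at the end of a multi-megabyte method may not.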
|
|
... instead of the instruction type argument.
And continue with loop construction cleanup in gtests.
Test: m test-art-host-gtest
Change-Id: I8cb83ae0c6d3cdb2a2ee4da0608cfeb69df722eb
|
|
Make `OptimizingUnitTestHelper::Make*()` functions add
the new instruction to the block. If the block already ends
with a control flow instruction, the new instruction is
inserted before the control flow instruction (some tests
create the control flow before adding instructions). Add new
helper functions for additional instruction types, rename
and clean up existing helpers.
Test: m test-art-host-gtest
Change-Id: I0bb88bc4d2ff6ce98ddbec25990a1ae68f582042
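The insertion rule described above can be sketched with toy types. `ToyInstruction`, `ToyBlock` and `MakeAdd` are hypothetical stand-ins invented for this illustration; they are not the ART classes or the real helper signatures.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an instruction (hypothetical, not HInstruction).
struct ToyInstruction {
  std::string name;
  bool is_control_flow;
};

// Toy stand-in for a basic block (hypothetical, not HBasicBlock).
class ToyBlock {
 public:
  // Appends `insn`, but keeps an existing trailing control flow
  // instruction in the last position.
  void AddInstruction(const ToyInstruction& insn) {
    if (!insns_.empty() && insns_.back().is_control_flow) {
      // Assumption of this sketch: a block ends with one control flow instruction.
      assert(!insn.is_control_flow);
      insns_.insert(insns_.end() - 1, insn);
    } else {
      insns_.push_back(insn);
    }
  }

  const std::vector<ToyInstruction>& instructions() const { return insns_; }

 private:
  std::vector<ToyInstruction> insns_;
};

// Hypothetical analogue of a Make*() helper: create and add in one step.
inline void MakeAdd(ToyBlock* block) {
  block->AddInstruction({"Add", false});
}
```

With this shape, a test can create the `Goto` first and still call `MakeAdd()` afterwards without producing a malformed block.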
|
|
This CL enables predicated autovectorization of loops with
control flow, currently only for simple diamond pattern ones:
          header------------------+
            |                     |
       diamond_hif                |
         /     \                  |
diamond_true  diamond_false       |
         \     /                  |
        back_edge                 |
            |                     |
            +---------------------+
Original author: Artem Serov <Artem.Serov@linaro.org>
Test: ./art/test.py --host --optimizing --jit
Test: ./art/test.py --target --optimizing --jit
Test: 661-checker-simd-cf-loops.
Test: target tests on arm64 with SVE (for details see
art/test/README.arm_fvp).
Change-Id: I8dbc266278b4ab074b831d6c224f02024030cc8a
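As a rough illustration of what predicating a diamond means, the sketch below emulates vector lanes with plain arrays. The lane count, the loop body and the function names are assumptions made for this example; the real implementation works on HIR and emits SVE instructions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // Assumed vector width for illustration.
using Pred = std::array<bool, kLanes>;

// Scalar reference: a loop whose body is a simple diamond.
void ScalarDiamond(int* a, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    if (a[i] > 0) {
      a[i] += 1;  // diamond_true
    } else {
      a[i] -= 1;  // diamond_false
    }
  }
}

// Predicated emulation: a loop predicate masks off the tail lanes, and the
// diamond condition splits the active lanes between the two arms.
void PredicatedDiamond(int* a, std::size_t n) {
  assert(a != nullptr || n == 0);
  for (std::size_t base = 0; base < n; base += kLanes) {
    Pred true_pred{};
    Pred false_pred{};
    for (std::size_t l = 0; l < kLanes; ++l) {
      const bool loop_pred = (base + l) < n;           // Lane is inside the array.
      const bool cond = loop_pred && a[base + l] > 0;  // Only read active lanes.
      true_pred[l] = cond;
      false_pred[l] = loop_pred && !cond;
    }
    // Both arms run for every vector; each touches only its own lanes.
    for (std::size_t l = 0; l < kLanes; ++l) {
      if (true_pred[l]) a[base + l] += 1;
    }
    for (std::size_t l = 0; l < kLanes; ++l) {
      if (false_pred[l]) a[base + l] -= 1;
    }
  }
}
```

Both functions should produce the same result for any input, including lengths that are not a multiple of the lane count.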
|
|
The code used to copy the final generated code twice: from assembler to
CodeAllocator, and then from CodeAllocator to SwapAllocator/JitMemory.
The assemblers never depended on the exact location of the generated
code, so just drop that feature.
Test: test.py
Change-Id: I8dc82e4926097092b9aac336a5a5d40f79dc62ca
|
|
Update some of our gtests to create a Runtime, as
ReferenceTypePropagation expects to have one.
Test: run gtests
Change-Id: I75986b1a9dc0227ee05f507f2b03ffa8aa8f8e58
|
|
This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0.
Reason for revert: Reland after some of the required work
was merged in other CLs.
Also address a TODO from the original CL to mark required
symbols with EXPORT in `intrinsic_objects.h`.
Also mark symbols in new files as HIDDEN.
Bug: 186902856
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
|
|
See https://source.android.com/setup/contribute/respectful-code for
reference
Bug: 161896447
Bug: 161850439
Bug: 161336379
Test: m -j checkbuild cts docs tests
Change-Id: I32d869c274a5d9a3dac63221e25874fe685d38c4
|
|
The only Optimizing test that actually needs a Runtime is
the ReferenceTypePropagationTest, so we make it subclass
CommonCompilerTest explicitly and change OptimizingUnitTest
to subclass CommonArtTest for the other tests.
On host, each test that initializes the Runtime takes ~220ms
more than without initializing the Runtime. For example, the
ConstantFoldingTest that has 10 individual tests previously
took over 2.2s to run but without the Runtime initialization
it takes around 3-5ms. On target, running 32-bit gtests on
taimen with run-gtests.sh (single-threaded) goes from
~28m47s to ~26m13s, a reduction of ~9%.
Test: m test-art-host-gtest
Test: run-gtests.sh
Change-Id: I43e50ed58e52cc0ad04cdb4d39801bfbae840a3d
|
|
Test: aosp_taimen-userdebug boots.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 147346243
Change-Id: I97fdc15e568ae3fe390efb1da690343025f84944
|
|
VIXL requires the NEONHalf CPUFeature to emit half-precision
floating-point NEON instructions.
Test: codegen_test
Change-Id: I797d7a27087103491871e86d283f9860d3f20624
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
|
|
For SIMD graphs, allocate 64 bits instead of 128 bits on the stack for
each FP register that the callee preserves in the frame entry,
as the ABI suggests (currently 64-bit registers are preserved, but
more stack space is allocated than needed).
Note: slow paths still require spilling full 128-bit Q-Registers
for SIMD graphs due to register allocator restrictions.
Test: test-art-target.
Change-Id: Ie0b12e4b769158445f3d0f4562c70d4fb0ea7744
|
|
The VIXL macroassembler must be initialized properly
to support Armv8.X features in order to emit the corresponding
instructions.
Test: codegen_test.cc, relative_patcher_arm64_test.
Test: test-art-host, test-art-target.
Change-Id: I2f9e155c28b4d2252a3cfb19717f5d25824d5e11
|
|
The TODO has been there since M (so forever :)):
https://android-review.googlesource.com/c/platform/art/+/122794/13//COMMIT_MSG#13
We hardly see the issue in our tests as we need to have:
1) A GC happening while creating the NPE object.
2) ParallelMoves between the NullCheck and the implicit null check operation
that move references.
The CL piggybacks on the "IsEmittedAtUseSite" flag and sets it on implicit
null checks. The liveness analysis then special cases implicit
null checks to record environment uses at the location of the actual
instruction that will do the implicit null check.
Test: test.py --gcstress
Test: run-libcore-tests --gcstress
bug: 111545159
Change-Id: I3ecea4fe0d7e483e93db83281ca10db47da228c5
|
|
Removes CompilerDriver dependency from ImageWriter and
several other classes.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: Pixel 2 XL boots.
Test: m test-art-target-gtest
Test: testrunner.py --target --optimizing
Change-Id: I3c5b8ff73732128b9c4fad9405231a216ea72465
|
|
Enforce the layering that code in runtime/base should not depend on
runtime by separating it into libartbase. Some of the code in
runtime/base depends on the Runtime class, so it cannot be moved yet.
Also, some of the tests depend on CommonRuntimeTest, which itself needs
to be factored (in a subsequent CL).
Bug: 22322814
Test: make -j 50 checkbuild
make -j 50 test-art-host
Change-Id: I8b096c1e2542f829eb456b4b057c71421b77d7e2
|
|
Previously, the code item was not necessarily 32-bit aligned. This
caused bus errors on armv7.
Also create a real dexfile object instead of casting zero-initialized
memory to a dex file pointer. Previously we were just lucky that the cdex
boolean was false.
Test: test-art-target-gtest
Bug: 63756964
Bug: 71605148
Change-Id: Ic7199f2b97bbd421de1d702efa5c6531ff45c022
|
|
Move all the DexFile related source to a common subdirectory dex/ of
runtime.
Bug: 71361973
Test: make -j 50 test-art-host
Change-Id: I59e984ed660b93e0776556308be3d653722f5223
|
|
Adding InstructionSet::kLast shall make it easier to encode
the InstructionSet in fewer bits using BitField<>. However,
introducing `kLast` into the `art` namespace is not a good
idea, so we change the InstructionSet to an enum class.
This also uncovered a case of InstructionSet::kNone being
erroneously used instead of vixl32::Condition::None(), so
it's good to remove `kNone` from the `art` namespace.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I6fa6168dfba4ed6da86d021a69c80224f09997a6
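The motivation for `kLast` and for the scoped enum can be sketched as follows. The enumerator list and the names `ToyInstructionSet` and `MinimumBitsToStore` are assumptions for this example and do not reproduce the real `InstructionSet` definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scoped enum. Being an `enum class`, its enumerators
// (including kNone and kLast) do not leak into the enclosing namespace.
enum class ToyInstructionSet : uint8_t {
  kNone,
  kArm,
  kArm64,
  kX86,
  kX86_64,
  kLast = kX86_64
};

// Minimum number of bits needed to represent values in [0, max_value].
constexpr std::size_t MinimumBitsToStore(uint32_t max_value) {
  std::size_t bits = 0;
  while (max_value != 0) {
    ++bits;
    max_value >>= 1;
  }
  return bits;
}

// With kLast available, the width of a bit field can be derived instead
// of being hard-coded.
constexpr std::size_t kToyInstructionSetBits =
    MinimumBitsToStore(static_cast<uint32_t>(ToyInstructionSet::kLast));
static_assert(kToyInstructionSetBits == 3, "max value 4 fits in 3 bits");

// Hypothetical helper: converts to an index, checking the range.
inline std::size_t ToIndex(ToyInstructionSet isa) {
  assert(isa <= ToyInstructionSet::kLast);
  return static_cast<std::size_t>(isa);
}
```

Because the enum is scoped, a bare `kNone` no longer compiles in place of another library's "none" value, which is the kind of mix-up the commit message mentions.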
|
|
Passes using local ArenaAllocator were hiding their memory
usage from the allocation counting, making it difficult to
track down where memory was used. Using ScopedArenaAllocator
reveals the memory usage.
This changes the HGraph constructor which requires a lot of
changes in tests. Refactor these tests to limit the amount
of work needed the next time we change that constructor.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Build with kArenaAllocatorCountAllocations = true.
Bug: 64312607
Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
|
|
Replace most uses of the runtime's Primitive in compiler
with a new class DataType. This prepares for introducing
new types, such as Uint8, that the runtime does not need
to know about.
Test: m test-art-host-gtest
Test: testrunner.py --host
Bug: 23964345
Change-Id: Iec2ad82454eec678fffcd8279a9746b90feb9b0c
|
|
The AArch32 VIXL-based code generator has been the default
ARM code generator in ART for some time now. The old ARM
code generator does not compile anymore; retiring it.
Test: test.py
Bug: 63316036
Change-Id: Iab8fbc4ac73eac2c1a809cd7b22fec6b619755db
|
|
Make actual types more explicit, either by replacing "auto"
with actual type or by assigning std::pair<> elements of
an "auto" variable to typed variables. Avoid binding const
references to temporaries. Avoid copying a container.
Test: m test-art-host-gtest
Change-Id: I1a59f9ba1ee15950cacfc5853bd010c1726de603
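A small sketch of the kind of cleanup described above, using standard containers only. The names `ToyTable`, `InsertOrGet`, `MakeLabel` and `SumValues` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using ToyTable = std::map<std::string, int>;

// Inserts `key` if absent and returns the value stored for it.
inline int InsertOrGet(ToyTable* table, const std::string& key, int value) {
  assert(table != nullptr);
  // Spelling out the pair type, then naming its elements, documents what
  // insert() returns better than `auto result` and `result.second` do.
  std::pair<ToyTable::iterator, bool> result =
      table->insert(std::make_pair(key, value));
  ToyTable::iterator it = result.first;
  bool inserted = result.second;
  return inserted ? value : it->second;
}

// Returns by value. Callers can hold the result in a plain std::string
// rather than binding a const reference to the temporary.
inline std::string MakeLabel(int id) {
  return "block" + std::to_string(id);
}

// Iterates by const reference with the exact element type, so no
// element (and no container) is copied.
inline int SumValues(const ToyTable& table) {
  int sum = 0;
  for (const std::pair<const std::string, int>& entry : table) {
    sum += entry.second;
  }
  return sum;
}
```

Note that the element type of a `std::map` has a `const` key; writing `std::pair<std::string, int>` in the loop would silently copy each entry.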
|
|
Test: m test-art-host-gtest-codegen_test
Bug: 34760542
Bug: 34834461
Change-Id: I7e716c4b665ed51af9908042f88fb2e4bcefb849
|
|
Test: test-art-host, test-art-target
Change-Id: Ifb931a99d34ea77602a0e0781040ed092de9faaa
|
|
|
|
When acquiring a scratch register to emit a move between two
double stack slots, ask for a FP register first, to avoid
depleting the core scratch register pool, which is used in
vixl::aarch64::MacroAssembler::LoadStoreMacro when the
offset does not fit in the immediate field of the load
instruction.
Test: make test-art-target (on ARM64)
Bug: 34760542
Change-Id: Ie9b37d007ed6ec5886931a35dcb22a9aff73bbbe
|
|
This commit adds a new `HInstructionScheduling` pass that performs
basic scheduling on the `HGraph`.
Currently, scheduling is performed at the block level, so no
`HInstruction` ever leaves its block in this pass.
The scheduling process iterates through blocks in the graph. For
blocks that we can and want to schedule:
1) Build a dependency graph for instructions. It includes data
dependencies (inputs/uses), but also environment dependencies and
side-effect dependencies.
2) Schedule the dependency graph. This is a topological sort of the
dependency graph, using heuristics to decide what node to schedule
first when there are multiple candidates. Currently the heuristics
only consider instruction latencies and schedule first the
instructions that are on the critical path.
Test: m test-art-host
Test: m test-art-target
Change-Id: Iec103177d4f059666d7c9626e5770531fbc5ccdc
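The two steps above can be sketched as a small list scheduler. This is a simplified forward scheduler over a toy graph, written under the assumption that every dependency edge points from a lower to a higher node index; `ToyNode` and `ScheduleCriticalPathFirst` are hypothetical names, and the actual pass may order its work differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy dependency graph node (hypothetical).
struct ToyNode {
  int latency;                          // Assumed cost of the instruction.
  std::vector<std::size_t> successors;  // Nodes that depend on this one.
};

// Returns a topological order that, among ready nodes, picks the one
// with the longest latency path to the end of the block.
std::vector<std::size_t> ScheduleCriticalPathFirst(
    const std::vector<ToyNode>& nodes) {
  const std::size_t n = nodes.size();
  std::vector<int> critical(n, 0);
  std::vector<std::size_t> pending_preds(n, 0);
  // Reverse order: successors have higher indices, so they are done first.
  for (std::size_t i = n; i-- > 0;) {
    int longest = 0;
    for (std::size_t s : nodes[i].successors) {
      longest = std::max(longest, critical[s]);
      ++pending_preds[s];
    }
    critical[i] = nodes[i].latency + longest;
  }
  std::vector<std::size_t> candidates;
  for (std::size_t i = 0; i < n; ++i) {
    if (pending_preds[i] == 0) candidates.push_back(i);
  }
  std::vector<std::size_t> order;
  while (!candidates.empty()) {
    // Heuristic: the candidate with the largest critical path goes first.
    auto best = std::max_element(
        candidates.begin(), candidates.end(),
        [&critical](std::size_t a, std::size_t b) {
          return critical[a] < critical[b];
        });
    const std::size_t node = *best;
    candidates.erase(best);
    order.push_back(node);
    for (std::size_t s : nodes[node].successors) {
      if (--pending_preds[s] == 0) candidates.push_back(s);
    }
  }
  assert(order.size() == n);  // Holds when the graph is acyclic.
  return order;
}
```

For a block where a cheap instruction and an expensive load are both ready, this heuristic starts the load first so that its latency overlaps with the independent work.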
|
|
* Update the target suppression file.
* Disable the detection of mismatched free() / delete / delete []
calls, since it results in a lot of false positives (a known
Valgrind limitation associated with asymmetric inlining of
operator new() and operator delete()).
* Avoid a memory leak in the code generator tests, caused by the
fact that the VIXL-based ARM code generator does not always use
the arena allocator.
* Fix an access to uninitialized memory.
Test: m valgrind-test-art-target
Test: valgrind --leak-check=full --show-mismatched-frees=no \
--ignore-range-below-sp=1024-1 \
--suppressions=valgrind-target-suppressions.txt \
dalvikvm ...
Change-Id: I891a3247aa9828226b4e62c69d6e1c8398d757b8
|
|
In ParallelMoveResolverARMVIXL::Exchange(int mem1, int mem2),
a scratch general-purpose register was used without any spilling
(as in StoreToOffset), which left no scratch register
for a VLDR with a large offset. It now uses two scratch S-registers.
Test: ART_USE_VIXL_ARM_BACKEND=true m test-art-host
Test: ART_USE_VIXL_ARM_BACKEND=true m test-art-target
Change-Id: I0416a69e281d09a04dd1689efa5a8c1994c82638
|
|
Switch to char versions of find variants.
Add "explicit" constructor variants or refactor and
remove defaults.
Use const references.
Bug: 32619234
Test: m test-art-host
Change-Id: I970cc2f47d6cf8f0c74104b994b075b2fafb3d45
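The three kinds of fixes listed above look roughly like this in isolation. The functions and the `ToyOption` class are hypothetical examples, not code from the change.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first '.' in `name`, or npos.
// Takes the string by const reference to avoid a copy.
inline std::size_t FindFirstDot(const std::string& name) {
  // The single-character overload is preferred over name.find(".").
  return name.find('.');
}

// Hypothetical class with a single-argument constructor. Marking it
// `explicit` prevents an int from silently converting to a ToyOption.
class ToyOption {
 public:
  explicit ToyOption(int level) : level_(level) {
    assert(level >= 0);  // Assumption of this sketch: levels are non-negative.
  }
  int level() const { return level_; }

 private:
  int level_;
};

inline int LevelOf(const ToyOption& option) {
  return option.level();
}
```

With the `explicit` constructor, `LevelOf(3)` no longer compiles; the caller has to write `LevelOf(ToyOption(3))`.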
|
|
Legacy code for compatibility with quick?
Test: test-art-host CC
Change-Id: I9de261daea67dfd9bd3df89826ba9d10f135e29e
|
|
codegen_tests.""
This VIXL32-based code generator is not enabled in the optimizing
compiler by default. Changes in codegen_test.cc test it in parallel with
the existing ARM backend.
This patch provides a base for further work; the new backend will not
be enabled in the optimizing compiler until parity is proven with the
current ARM backend and assembler.
Test: gtest-codegen_test on host and target
This reverts commit 7863a2152865a12ad9593d8caad32698264153c1.
Change-Id: Ia09627bac22e78732ca982d207dc0b00bda435bb
|
|
codegen_tests.""
|
|
Failing with:
art/compiler/optimizing/code_generator_arm_vixl.cc:396:47: error: too few arguments to function call, expected 3, have 2
ValidateInvokeRuntime(instruction, slow_path);
This reverts commit b138dfbd76f9d8b64fb9dbaf1a7c25e2549b2a8c.
Change-Id: Idccfe076f5905ea92ecbe3afbc7c8c64ecda94be
|
|
|
|
This VIXL32-based code generator is not enabled in the optimizing
compiler by default. Changes in codegen_test.cc test it in parallel
with the existing ARM backend.
This patch provides a base for further work; the new backend will not
be enabled in the optimizing compiler until parity is proven with the
current ARM backend and assembler.
Test: gtest-codegen_test on host and target
Change-Id: Id556a975b2645bf1d98ab2984650e8435b2312c2
|
|
Test: test-art-host-gtest-codegen_test
Test: test-art-target-gtest-codegen_test (MIPS32R2 & R6, MIPS64)
Change-Id: Ieae0fdb2ed30f262baac0eb7c6b658341c511a47
|
|
This will be used in a later patch to test a new VIXL32-based backend
in parallel with the existing code_generator_arm.
Test: gtest-codegen_test on host and target
Change-Id: I0316da0430fa6da0a7c668315f531888d18e7eb3
|
|
Test: booted MIPS32 in QEMU
Test: test-art-host-gtest
Test: test-art-target-gtest-codegen_test in QEMU
Test: test-art-target-run-test-optimizing on CI20
Change-Id: Ia3da5902d967cd7af313f03ebf414320b0063619
|
|
Add conditionals around more code that is only used for codegen for
specific architectures, and move a few more files into the
architecture-specific codegen lists.
Tests: ART_HOST_CODEGEN_ARCHS="x86_64 mips" m -j ART_TARGET_CODEGEN_ARCHS=svelte test-art-host
Bug: 30928847
Change-Id: I0444d15e1cafe4c9b13ff78718c3b13b544270e7
|
|
The codegen unit tests are supposed to use special "test" code
generators when targeting ARM and x86 (due to differing calling
conventions between the C++ source code and the generated code),
yet TestCodeGeneratorX86 was not being used. This fixes that.
(The tests were only succeeding because the register allocator happened
to not assign the EBX register.)
Test: m test-art-host-gtest-codegen_test
Change-Id: Ia3dd6998c38e9ff27b8c2734457f86b3fed44ab4
|
|
Allow alternate register allocation strategies to be implemented
in subclasses of a common register allocation base class.
Test: m test-art-host
Change-Id: I7c5866aa9ddff8f53fcaf721bad47654ab221b4f
|
|
This will allow a cleaner commit in an upcoming
refactoring of register allocation.
Test: m test-art-host
Change-Id: If420c97b088b3c934411ff83373e024003120746
|
|
First step towards merging the two passes, which will later result in
HGraphBuilder directly producing SSA form. This CL mostly just updates
tests broken by not being able to inspect the pre-SSA form.
Using HLocals outside the HGraphBuilder is now deprecated.
Bug: 27150508
Change-Id: I00fb6050580f409dcc5aa5b5aa3a536d6e8d759e
|