Age | Commit message (Collapse) | Author |
|
This reduces `sizeof(LocationSummary)` from 128B down to 88B
on 64-bit architectures.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I24a35fb433e89533727f6c786eb1253178cc05bf
|
|
It has been obsolete since graph coloring was removed.
Bug: 281793697
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Test: m test-art-host-gtest
Change-Id: I8b42b0fe39a8601da7aa1288a8581ab8b4742614
|
|
Test: testrunner.py --target --64 --ndebug --optimizing
Bug: 283082089
Change-Id: I8016cb046d1fbaa5ffe71917a4cce685dfc65f02
|
|
Change `Location::ConstantLocation()` to allow passing any
instruction and `DCHECK()` that it is indeed a constant.
Skip explicit calls to `HInstruction::AsConstant()` before
calling `Location::ConstantLocation()`.
Also cache results of `instruction->InputAt(.)` in some cases
when it's used more than once.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Change-Id: I3c07642f6b3523b576ec229e4d234561ad74a20e
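
A minimal sketch of the idea described above, not the actual ART code: the factory accepts any instruction and checks in debug builds that it is a constant, so callers no longer need an explicit cast first. `Instruction`, `Loc` and `GetConstant()` are hypothetical stand-ins, and `assert()` stands in for `DCHECK()`.

```cpp
#include <cassert>

// Hypothetical stand-in for HInstruction; only the constant-ness matters here.
class Instruction {
 public:
  explicit Instruction(bool is_constant) : is_constant_(is_constant) {}
  bool IsConstant() const { return is_constant_; }

 private:
  bool is_constant_;
};

// Hypothetical stand-in for Location.
class Loc {
 public:
  // Accepts any instruction; the check replaces an explicit cast at call sites.
  static Loc ConstantLocation(const Instruction* instruction) {
    assert(instruction != nullptr);
    assert(instruction->IsConstant());  // stand-in for DCHECK()
    return Loc(instruction);
  }

  const Instruction* GetConstant() const { return constant_; }

 private:
  explicit Loc(const Instruction* constant) : constant_(constant) {}
  const Instruction* constant_;
};
```

With this shape, a call site can pass the input instruction directly, and a non-constant argument is caught by the debug check rather than by a separate cast.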
|
|
This reverts commit 0a51605ddd81635135463dab08b6f7c21b58ffb0.
Reason for revert: Reland after some of the required work
was merged in other CLs.
Also address a TODO from the original CL to mark required
symbols with EXPORT in `intrinsic_objects.h`.
Also mark symbols in new files as HIDDEN.
Bug: 186902856
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: I936d448983928af23614ca82c2d0bf9a645e2c52
|
|
We can skip SuspendCheck if the method is e.g. a leaf method with a read barrier call in the slow path.
Bug: 135477345
Test: ART tests
Change-Id: I6e04f10544ec61b46bb5763a88c28248e88193bf
|
|
Replace many occurrences of `typedef` with `using`. For now,
do not update typedefs for function types and aligned types
and do not touch some parts such as jvmti or dmtracedump.
Test: m
Change-Id: Ie97ecbc5abf7e7109ef4b01f208752e2dc26c36d
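
For illustration only, the kind of rewrite this describes looks roughly as follows. The alias names are made up for the example and do not come from the ART sources.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <type_traits>

// Before: the alias name is buried at the end of the declaration.
typedef std::map<std::string, uint32_t> NameToIndexOld;

// After: the alias name comes first, and the same syntax works for templates.
using NameToIndex = std::map<std::string, uint32_t>;

// Both spellings name exactly the same type.
static_assert(std::is_same<NameToIndexOld, NameToIndex>::value,
              "typedef and using aliases must be interchangeable");

// Function-type typedefs like this one are the kind left untouched for now.
typedef void (*VisitorCallback)(int);
```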
|
|
This commit checks if a VarHandle access mode is supported. If not, an
UnsupportedOperationException is raised by calling the runtime to handle it.
I added the polymorphic intrinsics case in the IntrinsicSlowPath
code generation to handle all the eventual exceptions. For now,
none of the operations are actually compiled. If the slow path is
not called, the runtime handles the operation.
Bug: b/65872996
Test: art/test.py --host -r -t 712-varhandle-invocations --32
Test: art/test.py --host --all-compiler -r
Change-Id: I5a637561549b3fdd64fa53e2d7dbf835d3ae0d64
|
|
Pass enums by value instead of const reference.
Do not generate operator<< sources for headers that have no
enums or no declarations of operator<<. Do not define the
operator<< for flag enums; these were unused anyway.
Add generated operator<< for some enums in nodes.h . Change
the operator<< for ComparisonBias so that the graph
visualizer can use it but do not use the generated
operator<< yet as that would require changing checker tests.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: Ifd4c455c2fa921a9668c966a13068d43b9c6e173
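
A small sketch of an `operator<<` that takes the enum by value, as described above. The `Bias` enum, its printed strings and the `ToString()` helper are assumptions made for the example; they are not the generated ART code.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical enum, loosely modelled on a comparison bias.
enum class Bias { kNoBias, kGtBias, kLtBias };

// The enum is passed by value: it is as cheap to copy as an int, so a
// const reference only adds an indirection.
std::ostream& operator<<(std::ostream& os, Bias rhs) {
  switch (rhs) {
    case Bias::kNoBias: return os << "none";
    case Bias::kGtBias: return os << "gt";
    case Bias::kLtBias: return os << "lt";
  }
  // Fallback for values outside the declared enumerators.
  return os << "Bias[" << static_cast<int>(rhs) << "]";
}

// Hypothetical helper that makes the printed form easy to inspect.
std::string ToString(Bias bias) {
  std::ostringstream oss;
  oss << bias;
  return oss.str();
}
```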
|
|
The ART vectorizer assumes that there is a single size of SIMD
register used for the whole program. Make this assumption explicit
and refactor the code.
Note: This is a base for the future introduction of SIMD slots of
size other than 8 or 16 bytes.
Test: test-art-target, test-art-host.
Change-Id: Id699d5e3590ca8c655ecd9f9ed4e63f49e3c4f9c
|
|
This reverts commit e2727154f25e0db9a5bb92af494d8e47b181dfcf.
Reason for revert: Breaks ASAN tests (ODR violation).
Bug: 142365358
Change-Id: I38103d74a1297256c81d90872b6902ff1e9ef7a4
|
|
Make symbols in compiler/optimizing hidden by a namespace
attribute. The unit intrinsic_objects.{h,cc} is excluded as
it is needed by dex2oat.
As the symbols are no longer exported, gtests are now linked
with the static version of the libartd-compiler library.
libart-compiler.so size:
- before:
arm: 2396152
arm64: 3345280
- after:
arm: 2016176 (-371KiB, -15.9%)
arm64: 2874480 (-460KiB, -14.1%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing --jit
Bug: 142365358
Change-Id: I1fb04a33351f53f00b389a1642e81a68e40912a8
|
|
Passes using local ArenaAllocator were hiding their memory
usage from the allocation counting, making it difficult to
track down where memory was used. Using ScopedArenaAllocator
reveals the memory usage.
This changes the HGraph constructor which requires a lot of
changes in tests. Refactor these tests to limit the amount
of work needed the next time we change that constructor.
Test: m test-art-host-gtest
Test: testrunner.py --host
Test: Build with kArenaAllocatorCountAllocations = true.
Bug: 64312607
Change-Id: I34939e4086b500d6e827ff3ef2211d1a421ac91a
|
|
Rationale:
The last ART vectorizer break-out CL \O/
This ensures spilling on x86 and x86_64 is correct.
Also, it paves the way to wider SIMD on ARM and MIPS.
Test: test-art-host
Bug: 34083438
Change-Id: I5b27d18c2045f3ab70b64c335423b3ff2a507ac2
|
|
Rationale:
Break-out CL of ART Vectorizer. We need to save 128 bits
of data (the default ABI of the ART runtime only saves 64 bits).
Note that this is *only* done for xmm registers that
are live, so overhead is not too big.
Bug: 34083438
Test: test-art-host
Change-Id: Ic89988b0acb0c104634271d0c6c3e29b6596d59b
|
|
Bug: 12687968
Bug: 32577579
Test: test-art-host, test-art-target CC
Change-Id: Ia57099d499fa704803cc5f0135f0f53fefe39826
|
|
Location is a ValueObject and should be trivially copyable. Move
copy constructor and copy assignment to default.
Add static assert.
Bug: 32619234
Test: m
Change-Id: I1ef8b65aafdbf84e3d4b7724b93f13936b590eba
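
As a sketch of the pattern (with a hypothetical `ValueLoc` class rather than the real `Location`), defaulting the copy operations and adding a static assert could look like this:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical value object wrapping a single machine word.
class ValueLoc {
 public:
  ValueLoc() : value_(0) {}
  explicit ValueLoc(uintptr_t value) : value_(value) {}

  // Defaulted copy operations keep the type trivially copyable; a
  // user-provided copy constructor or copy assignment would not.
  ValueLoc(const ValueLoc& other) = default;
  ValueLoc& operator=(const ValueLoc& other) = default;

  uintptr_t GetValue() const { return value_; }

 private:
  uintptr_t value_;
};

// Fails to compile if someone later adds a non-trivial copy operation.
static_assert(std::is_trivially_copyable<ValueLoc>::value,
              "ValueLoc should be trivially copyable");
```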
|
|
Change DivZeroCheck, BoundsCheck and explicit NullCheck
slow path entrypoints to conform to kSaveEverything.
On Nexus 9, AOSP ToT, the boot.oat size reduction is
prebuilt multi-part boot image:
- 32-bit boot.oat: -12KiB (-0.04%)
- 64-bit boot.oat: -24KiB (-0.06%)
on-device built single boot image:
- 32-bit boot.oat: -8KiB (-0.03%)
- 64-bit boot.oat: -16KiB (-0.04%)
Test: Run ART test suite including gcstress on host and Nexus 9.
Test: Manually disable implicit null checks and test as above.
Change-Id: If82a8082ea9ae571c5d03b5e545e67fcefafb163
|
|
* Add explicit keyword to conversion constructors,
or NOLINT for implicit converters.
Bug: 28341362
Test: build with WITH_TIDY=1
Change-Id: I1e1ee2661812944904fedadeff97b620506db47d
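
To illustrate the two cases, here is a sketch with made-up classes: one constructor is marked `explicit`, the other is an intentional implicit converter annotated with `NOLINT`. The class names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// A single-argument constructor marked explicit: no silent int -> RegisterId.
class RegisterId {
 public:
  explicit RegisterId(int id) : id_(id) {}
  int id() const { return id_; }

 private:
  int id_;
};

// An intentional implicit converter, annotated so the linter accepts it.
class Immediate {
 public:
  Immediate(int value) : value_(value) {}  // NOLINT
  int value() const { return value_; }

 private:
  int value_;
};

static_assert(!std::is_convertible<int, RegisterId>::value,
              "explicit constructor must block implicit conversion");
static_assert(std::is_convertible<int, Immediate>::value,
              "implicit converter is intentionally allowed");
```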
|
|
Reducing the frame size makes stack maps smaller as we need
fewer bits for stack masks and some dex register locations
may use short location kind rather than long. On Nexus 9,
AOSP ToT, the boot.oat size reduction is
prebuilt multi-part boot image:
- 32-bit boot.oat: -416KiB (-0.6%)
- 64-bit boot.oat: -635KiB (-0.9%)
prebuilt multi-part boot image with read barrier:
- 32-bit boot.oat: -483KiB (-0.7%)
- 64-bit boot.oat: -703KiB (-0.9%)
on-device built single boot image:
- 32-bit boot.oat: -380KiB (-0.6%)
- 64-bit boot.oat: -632KiB (-0.9%)
on-device built single boot image with read barrier:
- 32-bit boot.oat: -448KiB (-0.6%)
- 64-bit boot.oat: -692KiB (-0.9%)
The other benefit is that at runtime, threads may need fewer
pages for their stacks, reducing overall memory usage.
We defer the calculation of the maximum spill size from
the main register allocator (linear scan or graph coloring)
to the RegisterAllocationResolver and do it based on the
live registers at slow path safepoints. The old notion of
an artificial slow path safepoint interval is removed as
it is no longer needed.
Test: Run ART test suite on host and Nexus 9.
Bug: 30212852
Change-Id: I40b3d114e278e2c5807982904fa49bf6642c6275
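
A rough sketch of the computation described above, under the assumption that each slow path safepoint records bit masks of live core and floating-point registers. The struct, function and parameter names are hypothetical and the real resolver code may differ.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of the registers live at one slow path safepoint.
struct SafepointLiveRegs {
  uint32_t core_mask;  // one bit per live core register
  uint32_t fp_mask;    // one bit per live floating-point register
};

// Returns the largest number of bytes any single safepoint needs for spills.
size_t MaxSpillSize(const std::vector<SafepointLiveRegs>& safepoints,
                    size_t core_slot_size,
                    size_t fp_slot_size) {
  size_t max_size = 0;
  for (const SafepointLiveRegs& sp : safepoints) {
    const size_t core_count = std::bitset<32>(sp.core_mask).count();
    const size_t fp_count = std::bitset<32>(sp.fp_mask).count();
    max_size = std::max(max_size,
                        core_count * core_slot_size + fp_count * fp_slot_size);
  }
  return max_size;
}
```

Sizing the spill area from the registers that are actually live, rather than from a fixed worst case, is what allows the frame to shrink.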
|
|
Previously, the gtest only exercised the default register allocator.
Note that the line count is high due mostly to whitespace changes.
Test: m test-art-host-gtest-register_allocator_test
Change-Id: I783edf98ae11d605d4f69834866c387abb71d34f
|
|
Test: m test-art-host
Change-Id: I8c0d77f339ab02b33588a54b96ecce5c8322cfce
|
|
Some of the intrinsics call on both the main path and the slow path. This patch
adds support for such a CallKind and marks the intrinsics accordingly.
This will be exercised by a later patch that refactors all the runtime
calls to use InvokeRuntime().
Please note that without this patch, the calls to ValidateInvokeRuntime()
exercised by the following patches would fail.
Change-Id: I450571b8b47280a004b714996189ba6db13fb57d
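
One way to picture such a CallKind is the sketch below. `kCallOnMainOnly` and `kCallOnMainAndSlowPath` are named in this series; the other two enumerators and both helper functions are assumptions added for the example.

```cpp
#include <cassert>

enum class CallKind {
  kNoCall,                 // assumed: never calls
  kCallOnMainAndSlowPath,  // calls on the main path and on the slow path
  kCallOnSlowPath,         // assumed: calls on the slow path only
  kCallOnMainOnly          // calls on the main path only
};

// Hypothetical helper: does the instruction call on the main path?
constexpr bool CallsOnMainPath(CallKind kind) {
  return kind == CallKind::kCallOnMainOnly ||
         kind == CallKind::kCallOnMainAndSlowPath;
}

// Hypothetical helper: does the instruction call on the slow path?
constexpr bool CallsOnSlowPath(CallKind kind) {
  return kind == CallKind::kCallOnSlowPath ||
         kind == CallKind::kCallOnMainAndSlowPath;
}
```

A validation step can then accept a runtime call from either path only when the corresponding predicate holds.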
|
|
This patch renames kCall to kCallOnMainOnly in preparation for
the next patch in this series which will be adding kCallOnMainAndSlowPath.
Note: With this patch there will be places where we use kCallOnMainOnly
even though we call on the slow path too. The next patch in this series
will fix that.
Test: ART host tests.
Change-Id: Iabfdb0901990d163be5d780f3bdd2fab6fa17b32
|
|
Also extend sun.misc.Unsafe test coverage to exercise
sun.misc.Unsafe.{get,put}{Int,Long,Object}Volatile.
Bug: 26205973
Bug: 29516905
Change-Id: I4d8da7cee5c8a310c8825c1631f71e5cb2b80b30
Test: Covered by ART's run-tests.
|
|
This first implementation uses slow paths to instrument heap
reference loads and GC root loads for the concurrent copying
collector, respectively calling the artReadBarrierSlow and
artReadBarrierForRootSlow (new) runtime entry points.
Notes:
- This implementation does not instrument HInvokeVirtual
nor HInvokeInterface instructions (for class reference
loads), as the corresponding read barriers are not strictly
required with the current concurrent copying collector.
- Intrinsics which may eventually call (on slow path) are
disabled when read barriers are enabled, as the current
slow path infrastructure does not support this case.
- When read barriers are enabled, the code generated for a
HArraySet instruction always goes into the array set slow
path for object arrays (delegating the operation to the
runtime), as we are lacking a mechanism to keep a
temporary register live across a runtime call (needed for
the instrumentation of type checking code, which requires
two successive read barriers).
Bug: 12687968
Change-Id: I14cd6107233c326389120336f93955b28ffbb329
|
|
A long constant needs to be in a register to store to memory.
By allowing stores of constants that are outside of the range of
int32_t, we reduce register usage.
Also support sets of float/double constants by using integer stores.
Rename RegisterOrInt32LongConstant to RegisterOrInt32Constant as it
now handles any type of constant.
Change-Id: I025d9ef889a5a433e45aa03b376bae40f14197d2
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
Implement dchecked_vector<> template that DCHECK()s element
access and insert()/emplace()/erase() positions. Change the
ArenaVector<> and ScopedArenaVector<> aliases to use the new
template instead of std::vector<>. Remove DCHECK()s that
have now become unnecessary from the Optimizing compiler.
Change-Id: Ib8506bd30d223f68f52bd4476c76d9991acacadc
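
The following is a simplified sketch of the idea, not the real `dchecked_vector<>`: a thin wrapper over `std::vector<>` that checks element access and erase positions. `CheckedVector` is a hypothetical name, `assert()` stands in for `DCHECK()`, and only a few members are forwarded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
class CheckedVector : private std::vector<T> {
  using Base = std::vector<T>;

 public:
  using iterator = typename Base::iterator;
  using const_iterator = typename Base::const_iterator;
  using size_type = typename Base::size_type;

  // Forward the unchecked parts of the interface unchanged.
  using Base::Base;
  using Base::begin;
  using Base::end;
  using Base::cbegin;
  using Base::cend;
  using Base::size;
  using Base::empty;
  using Base::push_back;

  // Element access is bounds-checked in debug builds.
  T& operator[](size_type n) {
    assert(n < size());
    return Base::operator[](n);
  }
  const T& operator[](size_type n) const {
    assert(n < size());
    return Base::operator[](n);
  }

  // Erase positions must point at an existing element.
  iterator erase(const_iterator pos) {
    assert(cbegin() <= pos && pos < cend());
    return Base::erase(pos);
  }
};
```

Because the checks live in the container, call sites no longer need their own index checks before `operator[]`.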
|
|
Tag previously "Misc" arena allocations with more specific
allocation types. Move some native heap allocations to the
arena in BCE.
Bug: 23736311
Change-Id: If8ef15a8b614dc3314bdfb35caa23862c9d4d25c
|
|
And completely remove the deprecated GrowableArray.
Replace GrowableArray with ArenaVector in code generators
and related classes and tag arena allocations.
Label arrays use direct allocations from ArenaAllocator
because Label is non-copyable and non-movable and as such
cannot be really held in a container. The GrowableArray
never actually constructed them, instead relying on the
zero-initialized storage from the arena allocator to be
correct. We now actually construct the labels.
Also avoid StackMapStream::ComputeDexRegisterMapSize() being
passed null references, even though unused.
Change-Id: I26a46fdd406b23a3969300a67739d55528df8bf4
|
|
Replace GrowableArray with ArenaVector and tag arena
allocations with new allocation types.
As part of this, make the register allocator a bit more
efficient, doing bulk insert/erase. Some loops are now
O(n) instead of O(n^2).
Change-Id: Ifac0871ffb34b121cc0447801a2d07eefd308c14
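
As an illustration of why bulk erase helps, the sketch below removes all flagged elements in one pass with the erase-remove idiom instead of erasing them one at a time, which shifts the tail on every erase. The `Interval` struct and `RemoveDead()` are hypothetical and not taken from the register allocator.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical interval record; `dead` marks entries to drop.
struct Interval {
  int start;
  bool dead;
};

// Erasing one element at a time is O(n) per erase, O(n^2) overall.
// std::remove_if compacts the survivors in a single O(n) pass, and one
// range erase then trims the tail.
void RemoveDead(std::vector<Interval>* intervals) {
  intervals->erase(
      std::remove_if(intervals->begin(), intervals->end(),
                     [](const Interval& interval) { return interval.dead; }),
      intervals->end());
}
```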
|
|
Change-Id: Ic9c6b62e36706e571fd71c18d24d8e76ae2d5c7b
|
|
Replace GrowableArray with ArenaVector in HGraph and related
classes HEnvironment, HLoopInformation, HInvoke and HPhi,
and tag allocations with new arena allocation types.
Change-Id: I3d79897af405b9a1a5b98bfc372e70fe0b3bc40d
|
|
This reverts commit a5fc140ff315dda9bc0a8e59963ed547676cd941.
Change-Id: Ic322484176e55d0c7cd7250d629b9e5046006a4f
|
|
register_allocator_test32 fails.
This reverts commit 283b8541546e7673d33d104241623d07c91cf500.
Change-Id: I2a46f3c68de3e8273e402102065c13797045c481
|
|
A temporary with an explicit RegisterLocation, such as ESI on x86 didn't
have the register marked as allocated. This caused it to not be
saved/restored in the prologue/epilogue, causing problems in the caller
routine, which expected it to be saved. Found while implementing
https://android-review.googlesource.com/#/c/157522/.
Change-Id: I22ca2b24c2d21b1c6ab6cfb7dec26cb38034a891
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
Fix was to special case baseline for x86, which does not have enough
registers to allocate the current method.
This reverts commit c345f141f11faad177aa9635a78088d00cf66086.
Change-Id: I5997aa52f8d4df373ae5ff4d4150dac0c44c4c10
|
|
Fails on baseline/x86.
This reverts commit 38207af82afb6f99c687f64b15601ed20d82220a.
Change-Id: Ib71018367eb7c6046965494a7e996c22af3de403
|
|
Change-Id: I0d15244b6b44c8b10079398c55da5071a3e3af66
|
|
This code has no functionality change. It adds a placeholder
for chaining inlined frames.
Change-Id: I5ec57335af76ee406052345b947aad98a6a4423a
|
|
The algorithm of ParallelMoveResolverNoSwap() is almost the same as
ParallelMoveResolverWithSwap(), except for the way we resolve a circular
dependency. NoSwap() uses an additional scratch register to resolve the
circular dependency. For example, (0->1) (1->2) (2->0) will be performed
as (2->scratch) (1->2) (0->1) (scratch->0).
On architectures without swap register support, NoSwap() can reduce the
number of moves from 3x(N-1) to (N+1) when there is a circular dependency
with N moves.
In addition, the NoSwap() algorithm does not depend on architecture register
layout information, which means it can support register pairs on arm32
and X/W, D/S registers on arm64 without additional modification.
Change-Id: Idf56bd5469bb78c0e339e43ab16387428a082318
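
The scratch-register approach can be sketched as below. This is a simplified illustration, not the ART implementation: locations are plain integers, `kScratch` and `SequentializeNoSwap()` are hypothetical names, and it assumes every destination appears at most once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Move = std::pair<int, int>;  // (source, destination)
constexpr int kScratch = -1;       // hypothetical scratch location

// Orders parallel moves so no source is overwritten before it is read,
// breaking cycles with one scratch location instead of swaps.
std::vector<Move> SequentializeNoSwap(std::vector<Move> pending) {
  std::vector<Move> out;
  // Moves from a location to itself need no code.
  pending.erase(std::remove_if(pending.begin(), pending.end(),
                               [](const Move& m) { return m.first == m.second; }),
                pending.end());
  while (!pending.empty()) {
    bool progressed = false;
    for (size_t i = 0; i < pending.size() && !progressed; ++i) {
      // A move is blocked while another pending move still reads its destination.
      bool blocked = false;
      for (size_t j = 0; j < pending.size(); ++j) {
        if (j != i && pending[j].first == pending[i].second) {
          blocked = true;
          break;
        }
      }
      if (!blocked) {
        out.push_back(pending[i]);
        pending.erase(pending.begin() + i);
        progressed = true;
      }
    }
    if (!progressed) {
      // Only cycles remain: save one source to scratch and read it from there.
      const int saved = pending.back().first;
      out.emplace_back(saved, kScratch);
      for (Move& m : pending) {
        if (m.first == saved) m.first = kScratch;
      }
    }
  }
  return out;
}
```

For the cycle (0->1) (1->2) (2->0), this sketch emits (2->scratch) (1->2) (0->1) (scratch->0), which is N+1 = 4 moves for N = 3.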
|
|
My assumption was wrong. We actually can use same as first input with any,
*only if* the generated code does not clobber the first input. We use this
for, e.g., DivZeroCheck and NullCheck.
This reverts commit 95bf7547986e68d4ac93b0a529aaa8eb3c998c1f.
Change-Id: Ib72d73fe580f5bc707b41c651f2c8936bd4e2407
|
|
Having SameAsFirstInput for out, and first input Any does not
make sense currently. If it's stack, we are going to overwrite
it, potentially clobbering another local. And constant does not
make sense.
Change-Id: I0ce357137487ed3dcecf4efd9922a039a2a1a29d
|
|
Tweak the generated code to allow more use of constants and other small
changes:
- Use test vs. compare to 0
- EmitMove of 0.0 should use xorps
- VisitCompare kPrimLong can use constants
- cmp/add/sub/mul on x86_64 can use constants if in int32_t range
- long bit operations on x86 examine long constant high/low to optimize
- Use 3 operand imulq if constant is in int32_t range
Change-Id: I2dd4010fdffa129fe00905b0020590fe95f3f926
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
This reverts commit 154552e666347d41d95d7619c6ee56249ff4feca.
Change-Id: Idc726551c249a888b7ff5fde8508ae50e81b2e13
|
|
Few libcore failures.
This reverts commit b4ba354cf8d22b261205494875cc014f18587b50.
Change-Id: I4a28d853e730dff9b69aec9555505803cf2fcd63
|
|
Change-Id: I9006972a65a1f191c45691104a960366747f9d16
|
|
Moved arena pool into the runtime.
Motivation:
Allow GC to use arena allocators, recycle arena pool for linear alloc.
Bug: 19264997
Change-Id: I8ddbb6d55ee923a980b28fb656c758c5d7697c2f
|
|
Change-Id: Ia49dc5bf3e9a2bd481425bfe7fbeea9feb66c8e6
|
|
Change-Id: Ie2a540ffdb78f7f15d69c16a08ca2d3e794f65b9
|