|
... which targets invoke-virtual methods.
New entrypoint changes deliverException's offset, hence arm test
change.
Bug: 297147201
Test: ./art/test/testrunner/testrunner.py --host --64 -b --optimizing
Test: ./art/test.py --host -g
Change-Id: I636fc60c088bfdf9b695c92de47f1c539e3956f1
|
|
Both exception delivery (various methods calling
Thread::QuickDeliverException()) and deoptimization (via
artDeoptimizeImpl) use QuickExceptionHandler to find the target
context and do a long jump to it via
QuickExceptionHandler::DoLongJump. The long jump is done
directly from the C++ code, so the frames of the related C++ method
are still on the stack before the change of the pc. Note that all
those methods are marked as NO_RETURN to reflect that.
This patch changes the approach; instead of having the long jump
directly from the C++ methods related to exceptions and
deoptimization, those methods now only prepare the long jump
context and return. So their callers (mainly .S quick entry points
and stubs) now need to do a long jump explicitly; thus there will
be no C++ frames on the stack before the jump.
This approach makes it possible to support exceptions and
deoptimization in simulator mode; so we don't need to unwind
native (C++ methods' frames) and simulated stacks at the same time.
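The control-flow change described above can be sketched in standalone C++. This is only an illustration under stated assumptions: `setjmp`/`longjmp` stand in for ART's context and long-jump machinery, and `JumpContext`, `PrepareJump` and `RunWithHandler` are hypothetical names, not ART APIs.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical stand-in for the long jump context; not an ART type.
struct JumpContext {
  std::jmp_buf* target = nullptr;  // Where execution should resume.
  int value = 0;                   // Value to deliver at the target.
};

// New style: the helper only fills in the context and returns normally,
// so its own frame is already gone when the jump is performed.
inline void PrepareJump(std::jmp_buf* handler, int value, JumpContext* out) {
  out->target = handler;
  out->value = value;
}

// Plays the role of the assembly stub: it calls the helper and then
// performs the jump itself.
inline int RunWithHandler(int value) {
  std::jmp_buf handler;
  // volatile: written between setjmp and longjmp, read after the jump.
  volatile int delivered = 0;
  if (setjmp(handler) != 0) {
    return delivered;  // Resumed at the handler.
  }
  JumpContext context;
  PrepareJump(&handler, value, &context);
  delivered = context.value;
  std::longjmp(*context.target, 1);
}
```

In this sketch `RunWithHandler(42)` returns 42 via the handler path, and no frame of `PrepareJump` is live at the time of the jump.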
Authors: Artem Serov <artem.serov@linaro.org>,
Chris Jones <christopher.jones@arm.com>
Test: test.py --host --target
Change-Id: I5f90e6b5ba152fc2205728f1e814bbe3d609af9d
|
|
This is to support on-demand tracing. This is behind a flag that is
disabled by default. This is the initial CL that adds support to ART.
There will be a followup CL to add an API that can be used from the
frameworks to request a trace of dex methods that got executed. This is
different from method tracing in two ways:
1. Method tracing is precise whereas this traces on a best effort basis.
2. Unlike method tracing this uses a circular buffer so can only trace a
limited window into the past.
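The second point, a circular buffer that only keeps a limited window into the past, can be sketched as follows. `RingTrace` is a hypothetical class for illustration and is not the data structure added by this CL; entry format and synchronization are left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size ring of trace entries; not ART's implementation.
template <std::size_t kCapacity>
class RingTrace {
 public:
  // Records an entry, overwriting the oldest one once the ring is full.
  void Record(std::uint64_t entry) {
    entries_[next_ % kCapacity] = entry;
    ++next_;
  }

  // Returns the retained entries, oldest first.
  std::vector<std::uint64_t> Snapshot() const {
    std::vector<std::uint64_t> result;
    std::size_t count = next_ < kCapacity ? next_ : kCapacity;
    for (std::size_t i = next_ - count; i < next_; ++i) {
      result.push_back(entries_[i % kCapacity]);
    }
    return result;
  }

 private:
  std::array<std::uint64_t, kCapacity> entries_{};
  std::size_t next_ = 0;  // Total number of entries ever recorded.
};
```

With a capacity of 3, recording the values 1 through 5 leaves only 3, 4 and 5 in the snapshot; the older entries are lost, which is the best-effort behavior the message describes.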
Bug: 352518093
Test: art/test.py
Change-Id: I8d958dd2ccefe8205a6c05b4daf339ea71b5dbc4
|
|
buffer"
This reverts commit 44b5204a81e9263a612af65f426e66395ae9426b.
Reason for revert: Relanding after fix for failures. The curr_entry
should use a register that isn't rax or rdx on x86 and x86_64.
Change-Id: I9e19eae72b93b4c49c619a1b58a892040d975e3e
|
|
buffer"
This reverts commit b67495b6aa3f1253383938f2699da6eada1a0ead.
Bug: 259258187
Reason for revert: Fails on bot
Change-Id: If1f360a50d28e3166ca262463b80371c41f7404e
|
|
For the thread local method trace buffers we used to store the pointer
to the buffer and the current index of the next free space. We compute
the address of the location to store the entry in the generated JITed
code when storing the method trace entry. Instead we could use a pointer
to the entry to avoid computing the address in the generated code. This
doesn't have a noticeable impact for the regular method tracing but is
better for the upcoming changes that will introduce an experimental
feature for on-demand method tracing.
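The difference between the two layouts can be sketched as below. The struct names, the buffer size and the upward fill direction are assumptions made for illustration; overflow handling is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBufferSize = 4;  // Illustrative size only.

// Old layout: base pointer plus index; each store computes base + index.
struct IndexedBuffer {
  std::uintptr_t* base;
  std::size_t index;
  void Store(std::uintptr_t entry) { base[index++] = entry; }
};

// New layout: pointer to the next free entry; a store is a plain write
// through the pointer followed by a pointer bump.
struct PointerBuffer {
  std::uintptr_t* curr_entry;
  void Store(std::uintptr_t entry) { *curr_entry++ = entry; }
};
```

Both variants write the same entries; the pointer form simply removes the address computation from the store path, which is what generated code would otherwise have to emit.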
Bug: 259258187
Test: art/test.py
Change-Id: I7e39e686bee62ee91c651a1cb3ff242470010cb6
|
|
... with `ScopedAssertNoTransactionChecks`. The new check
is stronger than the old one but does not work during early
setup when we do not have a `Thread` object yet.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Change-Id: Iba5a5cda0d97993ff324b4d11de02cb07f770699
|
|
Bug: 260881207
Test: presubmit
Test: abtd app_compat_drm
Test: abtd app_compat_top_100
Test: abtd app_compat_banking
Change-Id: I0eca5c4fd64ec61388eded9b9e066091581e0c7e
|
|
This reverts commit 8bc6a58df7046b4d6f4b51eb274c7e60fea396ff.
PS1 is identical to
https://android-review.git.corp.google.com/c/platform/art/+/2746640
PS2 makes the following changes:
- Remove one DCHECK each from the two WaitForFlipFunction variants.
The DCHECK could fail if another GC was started in the interim.
- Break up the WaitForSuspendBarrier timeout into shorter ones,
so we don't time out as easily if our process is frozen.
- Include the thread name for ThreadSuspendByThreadIdWarning, since
we don't get complete tombstones for some failures.
Test: Treehugger, host tests.
Bug: 240742796
Bug: 203363895
Bug: 238032384
Bug: 253671779
Bug: 276660630
Bug: 295880862
Bug: 294334417
Bug: 301090887
Bug: 313347640
(and more)
Change-Id: I12c5c01b1e006baab4ee4148aadbc721723fb89e
|
|
This reverts commit c6371b52df0da31acc174a3526274417b7aac0a7.
Reason for revert: This seems to have two remaining issues:
1. The second DCHECK in WaitForFlipFunction is not completely guaranteed to hold, resulting in failures for 658-fp-read-barrier.
2. WaitForSuspendBarrier seems to time out occasionally, possibly spuriously so. We fail when the futex times out once. That's probably incompatible with the app freezer. We should retry a few times.
Change-Id: Ibd8909b31083fc29e6d4f1fcde003d08eb16fc0a
|
|
This reverts commit a43e67ea1a314e5c6faf77457ffc5ea39c24d4ca.
PS1 is identical to aosp/2725875 .
PS2 improves static and dynamic lock checking and makes the
documentation more precise. ThreadList::Unregister for a thread
that wasn't registered becomes fatal; I can't convince myself that
any reasonable recovery is possible, and we could otherwise turn an
obvious error into a very subtle and potentially dangerous one.
Perhaps controversially, we now REQUIRE thread_list_lock_ for
IncrementSuspendCount, even though that requires a kludge in the one
case in which we legitimately don't have it. But after thinking about
it, the extra checking and documentation outweighs the kludge, and we
may want to consider this elsewhere as well. Added FakeMutexLock
to enable the kludge here and elsewhere.
PS3 adds some documentation for thread lifetime rules and enforces
a sufficient, though in some cases overly strong, set of related
restrictions for EnsureFlipFunctionStarted. It ensures that callers
conform to this stronger restriction. This required a simplification
in StackUtil::GetThreadListStackTraces.
PS4 Fix lint issue. Add Thread lifetime DCHECK to WaitForFlipFunction.
Rebase to adjust for the fact that aosp/2813551 effectively merged
a small piece of this.
PS5 Add Thread::VerifyState(). We previously checked for kTerminated
in a couple of places. That doesn't make sense, since we have no
way to tell whether the thread has been deallocated and reallocated at
that point. Instead of checking that the state is not kTerminated,
just check that it is a sane value. Address some old reviewer
comments. Add more output for EnsureFlipFunctionStarted DCHECK.
RequestSynchronousCheckpoint now aborts rather than returning false
when invoked on a terminated thread. Seeing kTerminated would mean
the thread could have been destroyed, and thus this call was unsafe.
PS6 Add another VerifyState call to RequestSynchronousCheckpoint.
PS7 Rebase and add more ThreadExitFlag tests.
PS8 Rebase. Temporary workaround for compile error.
PS9 Remove PS8 workaround. Add a version of GetPeerFromOtherThread
that expects thread_list_lock_ to be initially held, and relies on
ThreadExitFlag to detect terminated threads. Modify several jvmti
clients to use this correctly. This effectively includes a fixed
version of aosp/2847246.
PS10 Work around the fact that GetReferenceKind in ti_heap.cc may
call GetPeerFromOtherThread with or without thread_list_lock_.
I think this is kind of benign, though it makes reasoning harder,
and weakens our debug checking.
PS11 Remove extra semicolon.
PS12 Add another DCHECK in UnregisterThreadExitFlag.
PS13 Minor tweaks to address reviewer comments.
PS14 More tweaks to address reviewer comments.
PS15 Do not report that a thread exited while its flip-function is
still running.
PS16 Fix comment typo.
Test: Treehugger
Bug: 240742796
Bug: 203363895
Bug: 238032384
Bug: 253671779
Bug: 276660630
Bug: 295880862
Bug: 294334417
Bug: 301090887
Bug: 313347640
(and more)
Change-Id: I44caa30a0a4da8ab105fedd4d2238f59efc1d675
|
|
This reverts commit f9fdd3ce0180972dc8d4f0c8410ea7702828a703.
Reason for revert: Very suspicious host-x86_64-debug failure on LUCI.
Change-Id: Ia01dd3df8d64d6bc0d12319b06a8380f64a46785
|
|
This reverts commit 2a52e8a50c3bc9011713a085f058130bb13fd6a6.
PS1 is identical to aosp/2710354.
PS2 fixes a serious bug in the ThreadExitFlag list handling code.
This didn't show up in presubmit because the list rarely contains
more than a single element. Added an explicit gtest, and a bunch
of DCHECKS around this.
PS3 Rebase and fix oat version. Once more.
PS4 Weaken CheckEmptyCheckpointFromWeakRefAccess check to allow
weak reference access in a checkpoint.
This happens via DumpJavaStack -> ... -> MonitorObjectsStackVisitor
-> ... -> FindResolvedMethod -> ... -> IsDexFileRegistered. I haven't
yet been able to convince myself that this is inherently broken,
though it is trickier than I would like.
PS5 Move cp_placeholder_mutex_ declaration higher in thread.h.
Test: m test-art-host-gtest
Change-Id: I66342ef1de27bfa0272702b5e1d3063ef8da7394
|
|
This reverts commit 996cbb566a5521ca3b0653007e7921469f58981a.
Reason for revert: Some new intermittent master-art-host buildbot failures look related and need investigation.
PS2: Fix oat.h merge conflict by not letting the revert touch it.
PS3: Correct PS2 to actually bump the version once more instead.
Change-Id: I70c46dc4494b585768f36e5074d34645d2fb562a
|
|
This reverts commit b6f3b439d4f12e89393ba8101eea8671c94ba237.
PS1: Identical to aosp/2652371 .
PS2: Introduce kSuspensionImmune to disable suspension of a thread
that is being relied upon to execute ResumeAll(). This replaces
the test in SuspendAll() to check whether the caller was being asked
to suspend itself. That test was deadlock-prone, since it could cause a
SuspendAll request from e.g. the GC to block, and GC progress might be required
to resume the thread running the GC.
Since SuspendAll() now only loops for a single reason, we no longer
need to track why we looped.
Reduce the number of iterations in each 129-ThreadGetId thread
dramatically.
PS3: Address reviewer comments, including fixing a newly introduced
bug in CheckSuspend(). Fix 129-GetThreadId by dramatically reducing
the iteration count when we appear to be running slowly, which is
normally the case for gcstress. Earlier versions of this CL were
apparently also failing on this test, but the failure was hidden by
other failures. This mostly undoes the PS2 change to this test, now
that the failure is better understood.
PS4: Rebase.
PS5: Fix 129-GetThreadId code formatting.
PS6: Address more reviewer comments related to 129-GetThreadId.
PS7: Remove DCHECK in EnsureFlipFunctionStarted. It was unsafe,
since the thread may no longer be around.
Test: Treehugger.
Bug: 240742796
Bug: 203363895
Bug: 238032384
Bug: 253671779
Bug: 276660630
Bug: 295880862
Bug: 294334417
(and more)
Change-Id: I99260fdc4feb9bcdc8b8b566e40912532f1a4937
|
|
This reverts commit 2caa640269faabd2455ec29cfe6ad330d442b715.
Reason for revert: It looks like there may be some new timeout failures on the master-art-host buildbot.
I'll go ahead and generate a revert. Please submit once there are enough failures to investigate.
Change-Id: I272e4ac5f4367a12a2eb027e456d789e8fd26ae6
|
|
This reverts commit 63af30b8fe8d4e1dc32db4dcb5e5dae1efdc7f31.
master (aosp/2530206) PS1 is identical to aosp/2377951 .
master (aosp/2530206) PS2 is a rebase.
At this point, master branch was replaced by main, and this CL moved.
PS1: Restructure documentation for the IncrementSuspendCount handshake
to install a suspend barrier.
Document a couple of additional mutator lock assumptions.
Add some DCHECKs to check that suspended threads really are
suspended.
Weaken seq_cst memory order in a couple of places where it really
didn't make sense here. Clearly not a correctness fix.
Includes a rebase and merge with aosp/2587606.
PS2: Another rebase.
Fix thumb assembler test to compensate for Thread structure layout
changes.
PS3: Messy rebase, primarily to handle aosp/2670108, which included both
new fixes around this and a few small snippets of this CL.
Call EnsureFlipFunctionStarted without a state-and-flags argument
only when we actually hold the mutator lock, as promised.
PS4: Minor rebase, some lint fixes.
PS5: Another minor lint fix that I had missed in PS4.
PS6: Fix for RunCheckpoint bug introduced around PS3. Fix expectations
in jni_cfi_test to compensate for thread structure layout changes.
PS7: In PS3+, EnsureFlipFunctionStarted could access a destroyed "this"
thread. Fix that, and make the function static to make this
constraint more explicit. (And running a method on a potentially
destroyed object just seemed unclean.)
PS8: Address reviewer comments. The major issue was that we released
the suspend_count_lock too early in FlipThreadRoots, potentially
allowing an intervening SuspendAll to block us. The fix involved
a very minor extension of the mutex API.
PS9: Comment typo fix.
PS10: Address new reviewer comments. Rebase.
MUST_SLEEP for 129_ThreadGetId debug output.
Test: Treehugger.
Bug: 240742796
Bug: 203363895
Bug: 238032384
Bug: 253671779
Bug: 276660630
(and more)
Change-Id: I0f2450e394c03c17eece3698286b2f3e45727967
|
|
This removes the cruft in creating static instances, and the need to
explicitly visit verifier roots.
Test: test.py
Change-Id: Ia0f0a82cbc66bb57f30610587f080e75d4d32e92
|
|
This reverts commit 221b6c5fcd66d4b6f2626c311d03bde2fb1589f9.
Reason for revert: Preemptive revert. Earlier versions have had a tendency to cause subtle breakage.
Please do not submit unless something breaks.
Change-Id: Iad2a7f920756f365789c422948632f5db5a28fd5
|
|
This reverts commit c85ae17f8267ac528e58892099dcefcc73bb8a26.
PS1: Identical to aosp/2266238
PS2: Address the lint failure that was the primary cause of the revert.
Don't print information about what caused a SuspendAll loop
unless we actually gathered the information.
Restructure thread flip to drop the assumption that certain threads
will shortly become Runnable. That added a lot of complexity and
was deadlock-prone. We now simply try to run the flip function both
in the target thread when it tries to become runnable again, and in
the requesting thread, without paying attention to the target
thread's state. The first attempt succeeds, which means the
originating thread will only succeed if the target is still
suspended.
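The "first attempt succeeds" property can be illustrated with an atomic exchange on a pending function pointer. This is one simple way to get that property and is not necessarily how ART implements it; `FlipSlot`, `TryRunFlip` and `CountFlip` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

using FlipFunction = void (*)(int* roots_flipped);

// Hypothetical per-thread slot holding a pending flip function.
struct FlipSlot {
  std::atomic<FlipFunction> pending{nullptr};
};

// Called by either the target thread or the requesting thread. Only the
// caller that observes the non-null pointer runs the function.
inline bool TryRunFlip(FlipSlot* slot, int* roots_flipped) {
  FlipFunction fn = slot->pending.exchange(nullptr, std::memory_order_acq_rel);
  if (fn == nullptr) {
    return false;  // Another caller already claimed the function.
  }
  fn(roots_flipped);
  return true;
}

// Example flip function that just counts how often it ran.
inline void CountFlip(int* roots_flipped) { ++*roots_flipped; }
```

Whichever side calls `TryRunFlip` first runs the function; the second call sees a null pointer and does nothing, so the flip runs exactly once.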
This adds some complexity to deal with threads terminating in the
meantime, but it avoids several issues:
1) RequestSynchronousCheckpoint blocked with thread_list_lock
and suspend_count_lock, while waiting for flip_function to become
non-null. This made it at best hard to reason about deadlock
freedom. Several other functions also had to wait for a null
flip_function, complicating the code.
2) Synchronization in FlipThreadRoots was questionable. In order to
tell when to treat a thread as previously runnable, it looked at
thread state bits that could change asynchronously. AFAICT, this
was probably correct under sequential consistency, but not with
the actual specified memory ordering. That code was deleted.
3) If a thread was intended to become runnable shortly after
the start of the flip, we paused it until all thread flips were
completed. This probably occasionally added latency that
escaped our measurements.
Weaken several assertions involving IsSuspended() to merely
claim the thread is not runnable, to be consistent with the
above change. The stronger assertion no longer holds in the
flip function.
Assert that we never suspend while running the GC, to ensure
the GC never acts on a thread suspension request, which would
likely result in deadlock.
Update mutator_gc_coord.md to reflect additional insights about
this code.
Change the last parameter of DecrementSuspendCount to be a boolean
rather than SuspendReason, since we mostly ignore the precise
SuspendReason.
Add NotifyOnThreadExit mechanism so that we can tell whether
a thread exited, even if we release thread_list_lock_.
Rewrite RequestSynchronousCheckpoint to take advantage of the
above. The new thread-flip code also uses it.
Remove now unnecessary checks that we do not suspend with a thread
flip in progress.
Various secondary changes and simplifications that follow from the
above.
Reduce DefaultThreadSuspendTimeout to something below ANR timeout.
Explicitly ensure that when FlipThreadRoots() returns, all thread
flips have completed. Previously that was mostly true, but actually
guaranteed by barrier code in the collector. Remove that code.
(The old version was hard to fix in light of potential exiting
threads.)
PS3: Rebase
PS4: Fix and complete PS2 changes.
PS5: Edit commit message.
PS6: Update entry_points_order_test, again.
PS7-8: Address many minor reviewer comments. Remove more dead code,
including all the IsTransitioningToRunnable stuff.
PS9: Slightly messy rebase
PS10: Address comments. Most notably:
SuspendAll now ensures that the caller is not left with a pending
flip function.
GetPeerFromOtherThread() sometimes runs the flip function instead
of calling mark. The old way would not work for CMC. This makes it
no longer const.
PS11: Fix a PS10 oversight, mostly in the CMC collector code.
PS12: Fix comment and documentation typos.
Test: Run host run tests. TreeHugger.
Bug: 240742796
Bug: 203363895
Bug: 238032384
Bug: 253671779
Change-Id: I81e366d4b739c5b48bd3c1509acb90d2a14e18d1
|
|
Remove the code to handle instrumentation stubs. We no longer use them.
Bug: 206029744
Test: art/test.py
Change-Id: I2b7eabf80bd34989314c0d2b299d7b1b35de0b85
|
|
This reverts commit fe9b34f845e8e439b4ae47ae999ef2cfdbd66462.
Reason for revert: Breaks full-eng build
Change-Id: I230b31809e274740b8fae9358c260787462efe4d
|
|
This reverts commit 0db9605ec8bd3158f4f0107a511dd09a889c9341.
PS1: Identical to aosp/2255904
PS2: Detect and report excessive waiting for a suspend-friendly state.
Add internal timeout to 129-ThreadGetId, so that we can print more
state information on exit.
We explicitly avoid suspending the HeapTaskDaemon to retrieve its
stack trace. Fix a race that allowed this to happen anyway (with
very low probability).
Includes a slightly nontrivial rebase.
PS3: Address a couple of minor comments.
PS4: Reformatted, as suggested by the upload script, except for
tls_ptr_sized_values, where it seemed too likely to cause
unnecessary merge conflicts.
PS5: SuspendAllInternal does not hold mutator lock, but may take a
long time with suspend_all_count_ = 1. Another thread waiting
for suspend_all_count_ could sleep many times. Explicitly wait
on a condition variable instead. This intentionally has a low
kMaxSuspendRetries so that we can see whether it is hit in
presubmit.
PS6: Adjust kMaxSuspendRetries to a bit lower than the PS3/PS4
version, but much higher than the PS5 debug version.
Test: Build and boot AOSP, Treehugger
Bug: 240742796
Bug: 203363895
Bug: 238032384
Bug: 253671779
Change-Id: I58d63f494a7454e00473b23864f8952abed7bf6f
|
|
Method tracing in streaming mode uses a global buffer to record method
entry / exit events and uses locks to synchronize across threads. Taking
a lock for each event is expensive and makes the method tracing slow.
This CL changes it to use a per-thread buffer so that each thread
accesses its own buffer. This also allows us to fast path method trace
events in JITed code in the future. The changes in this CL:
1. Add a per-thread buffer which is initialized lazily on the first
method trace event.
2. When the per-thread buffer is initialized we record the information
about the thread. This means we no longer need the bitmap we used
to record the thread info when a new thread is seen.
3. The data from the buffer is flushed to file:
1. When a thread detaches, so we can flush any recorded data
2. When the buffer is full
3. When we stop tracing.
The per-thread buffer is always accessed by the thread that owns it
except when we record the method enter events for on stack methods. It
is safe to access another thread's buffer since everything is suspended at
that point.
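The buffer lifecycle listed above (lazy initialization, flush when full, on detach and when tracing stops) can be sketched as follows. `ThreadTraceBuffer` is a hypothetical class, the `sink` vector stands in for the trace file, and thread detach is modelled by destruction; in a real runtime each thread would own one instance and a positive capacity is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-thread trace buffer; not ART's implementation.
class ThreadTraceBuffer {
 public:
  ThreadTraceBuffer(std::size_t capacity, std::vector<std::uint64_t>* sink)
      : capacity_(capacity), sink_(sink) {}

  // Flush any recorded data when the owning thread detaches.
  ~ThreadTraceBuffer() { Flush(); }

  void Record(std::uint64_t event) {
    if (entries_.capacity() == 0) {
      entries_.reserve(capacity_);  // Lazy allocation on the first event.
    }
    entries_.push_back(event);
    if (entries_.size() == capacity_) {
      Flush();  // Flush when the buffer is full.
    }
  }

  // Also meant to be called explicitly when tracing stops.
  void Flush() {
    sink_->insert(sink_->end(), entries_.begin(), entries_.end());
    entries_.clear();
  }

  std::size_t pending() const { return entries_.size(); }

 private:
  std::size_t capacity_;
  std::vector<std::uint64_t>* sink_;  // Must outlive the buffer.
  std::vector<std::uint64_t> entries_;
};
```

Because only the owning thread touches its buffer on the recording path, no lock is taken per event in this sketch; writes to a shared sink would still need synchronization in a multi-threaded setting.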
This CL also adds a test to check that the generated trace is in the
expected format.
Bug: 259258187
Test: art/testrunner.py -t 2246
Change-Id: I074bf2edb8c884dec0c9a7a9c37b4ef0ec7892a8
|
|
This reverts commit a23d325152c7cd81ccb426a407f6da280797e61d.
Reason for revert: Triggered failures in org.apache.harmony.jpda.tests.jdwp.Events_CombinedEventsTest#testCombinedEvents_05
Change-Id: I0604a60f73a983c92e29827222bfa6158ee043aa
|
|
This reverts commit ebd76406bf5fa74185998bc29f0f27c20fa2e683.
PS1 is identical to aosp/2216806.
PS2 in addition converts the RunCheckpoint call used from
StackUtil::GetAllStackTraces to RunCheckpointUnchecked to temporarily
work around another checkpoint Run() function lock ordering
issue.
PS3 is a nontrivial rebase.
Test: Build and boot AOSP, Treehugger
Bug: 240742796
Bug: 203363895
Bug: 238032384
Bug: 253671779
Change-Id: I38385e41392652cc30e5e74fd8b93e22088827a5
|
|
Avoid creating `Runtime` or create the `Runtime` with a boot
image to make the test setup faster.
Test: m test-art-host-gtest
Test: run-gtests.sh
Change-Id: I3f09de81491402442f1704d25bb06de995d8a3ca
|
|
This reverts commit fd20a745227aa7cae7a08728bb29e5bfce64ea87.
Reason for revert: Lots of libartd failures due to new checkpoint lock level check.
Change-Id: I0cf88ff893f8743a9a830a49489807d0921199a3
|
|
This reverts commit 7c8835df16147b9096dd4c9380ab4b5f700ea17d.
PS1 is identical to aosp/2171862 .
PS2 makes the following significant changes:
1) Avoid inflating locks from the thread dumping checkpoint Run()
method. This violates the repeatedly stated claim that
checkpoint Run() methods don't suspend threads. This requires
that we print object addresses in thread dumps in some cases in
which we were previously able to give hashcodes instead.
2) For debug builds, check that we do not acquire a higher
level lock in checkpoint Run() methods, thus enforcing that
previously stated property. (Lokesh suggested this, and I
think it's a great idea. But it requires changes 4-6 below.)
3) Add a bit more justification that RunCheckpoint cannot
result in circular suspend requests.
4) For now, allow an explicit override of (2) for ddms code in
which it would otherwise fail. This should be fixed later.
5) Raise the level of monitor locks, to correctly reflect
the fact that they may be held while much of the runtime
is executed.
6) (5) was in conflict with monitor deflation acquiring a
monitor lock after acquiring the monitor list lock. But this
failure is spurious, both because it is a TryLock acquisition
that can't possibly contribute to a deadlock, and secondly
because it conflates all monitor locks, and an actual deadlock
is probably not possible anyway. Leverage the former and add
a facility to avoid checking for safe TryLock calls.
(1) Should fix the one failure I managed to debug after the
last submission attempt. Hopefully it also accounts for the
others.
PS3, PS5, PS6: Trivial corrections and cleanups
PS4, PS7, PS8: Rebase
Test: Build and boot AOSP, Treehugger
Bug: 240742796
Bug: 203363895
Bug: 238032384
Change-Id: I80d441ebe21bb30b586131f7d22b7f2797f2c45f
|
|
1. String(byte[], byte) constructor is added.
2. StringFactory.newStringFromBytes(byte[], int, int, int) is allowlisted
to be called in the unstarted runtime.
Bug: 247773125
Test: art/test/testrunner/testrunner.py -b --host
Change-Id: I9386b3529a94a122654574e3110d08222be7f282
|
|
This reverts commit 7c39c86b17c91e651de1fcc0876fe5565e3f5204.
Reason for revert: We're seeing a number of new, somewhat rare, timeouts on multiple tests.
Change-Id: Ida9a4f80b64b6fedc16db176a8df9c2e985ef482
|
|
Have SuspendAll check that no other SuspendAlls are running, and have
it refuse to start if the invoking thread is also being suspended.
This limits us to a single SuspendAll call at a time,
but that was almost required anyway. It limits us to a single active
SuspendAll-related suspend barrier, a large simplification.
It appears to me that this avoids some cyclic suspension scenarios
that were previously still possible.
Move the deadlock-avoidance checks for flip_function to
ModifySuspendCount callers instead of failing there.
Make single-thread suspension use suspend barriers to avoid the
complexity of having to reaccess the thread data structure from another
thread while waiting for it to suspend. Add a new data structure to
remember the single thread suspension barriers without allocating
memory. This uses a linked list of stack allocated data structures,
as in MCS locks.
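The idea of remembering waiters in a linked list of stack-allocated nodes can be sketched as below. This is a single-threaded illustration only: `WaitNode` and `WaitList` are hypothetical names, and the locking and atomics that real code would need are omitted.

```cpp
#include <cassert>

// Hypothetical node that a waiting thread allocates in its own stack frame.
struct WaitNode {
  int pending = 1;           // Acknowledgements still awaited.
  WaitNode* next = nullptr;  // Next waiter registered on the same target.
};

// Hypothetical list head stored in the target thread's data.
struct WaitList {
  WaitNode* head = nullptr;

  // Links a caller-owned node; no heap allocation is needed.
  void Push(WaitNode* node) {
    node->next = head;
    head = node;
  }

  // Target side: acknowledge every registered waiter and empty the list.
  void PassAll() {
    WaitNode* node = head;
    head = nullptr;
    while (node != nullptr) {
      WaitNode* next = node->next;  // Read before acknowledging the waiter.
      --node->pending;
      node = next;
    }
  }
};
```

Since each waiter supplies its own node, the list can hold any number of waiters without allocating memory and without a fixed-size structure that could overflow.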
The combination of the above avoids a suspend_barrier data structure
that can overflow, and removes any possibility of ModifySuspendCount
needing to fail and retry. Recombine ModifySuspendCount and
ModifySuspendCountInternal.
Simplified barrier decrement in PassActiveSuspendBarriers.
Strengthened the relaxed memory order; it was probably wrong.
Fix the "ignored" logic in SuspendAllInternal. We only ever ignored
self, and ResumeAll didn't support anything else anyway.
Explicitly assume that the initiating thread, if not null, is
registered. Have SuspendAll and friends only ignore self, which was
the only actually used case anyway, and ResumeAll was otherwise wrong.
Make flip_function atomic<>, since it could be read while being cleared.
Remove the poorly used timed_out parameter from the SuspendThreadByX().
Make IsSuspended read with acquire semantics; we often count on having
the target thread suspended after that, including having its prior
effects on the Java state visible. The TransitionTo... functions already
use acquire/release.
Shrink the retry loop in RequestSynchronousCheckpoint. Retrying the
whole loop appeared to have no benefit over the smaller loop.
Clarify the behavior of RunCheckpoint with respect to the mutator
lock.
Split up ModifySuspendCount into IncrementSuspendCount and
DecrementSuspendCount for improved clarity. This is not quite a
semantic no-op since it eliminates some redundant work when
decrementing a suspend count to a nonzero value. (Thanks to
Mythri for the suggestion.)
I could not convince myself that RequestCheckpoint returned false
only if the target thread was already suspended; there seemed to
be no easy way to preclude the state_and_flags compare-exchange
failing for other reasons. Yet callers seemed to assume that property.
Change the implementation to make that property clearly true.
Various trivial cleanups.
This hopefully reduces thread suspension deadlocks in general.
We've seen a bunch of other bugs that may have been due to the cyclic
suspension issues. At least this should make bug diagnosis easier.
Test: ./art/test/testrunner/testrunner.py --host --64 -b
Test: Build and boot AOSP
Bug: 240742796
Bug: 203363895
Bug: 238032384
Change-Id: Ifc2358dd6489c0b92f4673886c98e45974134941
|
|
This reverts commit 1d1d25eea72cf22aed802352a82588d97403f7b6.
Reason for revert: Relanding after fix to failures:
https://android-review.googlesource.com/c/platform/cts/+/2145979
Bug: 206029744
Change-Id: Id3c7508c86f9aeb0ddfc1c4792ed54f003b88e77
|
|
debuggable""
This reverts commit 6fb0acc14459a856c35b642e3368aff853259260.
Reason for revert: Breaks android.jvmti.cts.JvmtiHostTest
https://buganizer.corp.google.com/issues/237991413
Change-Id: I00fb58080693ddebc03c7b62ea67c91150ef7a21
|
|
This reverts commit 5c9b55aa95295a287abd86f1e7fbe98c3f35ffd6.
Reason for revert: Relanding with fixes for failure
Fixes:
1. Arm64 needs to use 64-bit registers
2. We cannot deoptimize directly from GenericJniEndTrampoline since we
only have a refs and args frame. So call the method exit hooks from
art_quick_generic_jni_trampoline.
Change-Id: If1f08eca69626f60f42f10205b482a3764610846
|
|
This reverts commit 90f12677f80169dc3ef919c2067349f94b943e7f.
Reason for revert: Failures on device
https://ci.chromium.org/ui/p/art/builders/ci/angler-armv7-ndebug/3058/overview
https://ci.chromium.org/ui/p/art/builders/ci/angler-armv8-ndebug/3049/overview
Change-Id: I43f943f9180b8c76db02a2a5c228a209a2f18a82
|
|
Don't install instrumentation stubs for native methods in debuggable
runtimes. The GenericJniTrampoline is updated to call method entry /
exit hooks. When JITing JNI stubs in debuggable runtimes we also include
calls to method entry / exit hooks when required.
Bug: 206029744
Test: art/test.py
Change-Id: I1d92ddb1d03daed74d88f5c70d38427dc6055446
|
|
Golem results for art-opt-cc (higher is better):
linux-ia32 before after
NativeDowncallStaticNormal 46.766 51.016 (+9.086%)
NativeDowncallStaticNormal6 42.268 45.748 (+8.235%)
NativeDowncallStaticNormalRefs6 41.355 44.776 (+8.272%)
NativeDowncallVirtualNormal 46.361 52.527 (+13.30%)
NativeDowncallVirtualNormal6 41.812 45.206 (+8.118%)
NativeDowncallVirtualNormalRefs6 40.500 44.169 (+9.059%)
(The NativeDowncallVirtualNormal result for x86 is skewed
by one extra good run as Golem reports the best result in
the summary. Using the second best and most frequent
result 50.5, the improvement is only around 8.9%.)
linux-x64 before after
NativeDowncallStaticNormal 44.169 47.976 (+8.620%)
NativeDowncallStaticNormal6 43.198 46.836 (+8.423%)
NativeDowncallStaticNormalRefs6 38.481 44.687 (+16.13%)
NativeDowncallVirtualNormal 43.672 47.405 (+8.547%)
NativeDowncallVirtualNormal6 42.268 45.726 (+8.182%)
NativeDowncallVirtualNormalRefs6 41.355 44.687 (+8.057%)
(The NativeDowncallStaticNormalRefs6 result for x86-64 is
a bit inflated because recent results jump between ~38.5
and ~40.5. If we take the latter as the baseline, the
improvement is only around 10.3%.)
linux-armv7 before after
NativeDowncallStaticNormal 10.659 14.620 (+37.16%)
NativeDowncallStaticNormal6 9.8377 13.120 (+33.36%)
NativeDowncallStaticNormalRefs6 8.8714 11.454 (+29.11%)
NativeDowncallVirtualNormal 10.511 14.349 (+36.51%)
NativeDowncallVirtualNormal6 9.9701 13.347 (+33.87%)
NativeDowncallVirtualNormalRefs6 8.9241 11.454 (+28.35%)
linux-armv8 before after
NativeDowncallStaticNormal 10.608 16.329 (+53.93%)
NativeDowncallStaticNormal6 10.179 15.347 (+50.76%)
NativeDowncallStaticNormalRefs6 9.2457 13.705 (+48.23%)
NativeDowncallVirtualNormal 9.9850 14.903 (+49.25%)
NativeDowncallVirtualNormal6 9.9206 14.757 (+48.75%)
NativeDowncallVirtualNormalRefs6 8.8235 12.789 (+44.94%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 172332525
Change-Id: Ie144bc4f7f82be95790ea7d3123b81a3b6bfa603
|
|
Move JNI entrypoints to `jni_entrypoints_<arch>.S` and
shared helper macros to `asm_support_<arch>.S`. Introduce
some new macros to reduce code duplication. Fix x86-64
using ESP in the JNI lock slow path.
Rename JNI lock/unlock and read barrier entrypoints to pull
the "jni" to the front and drop "quick" from their names.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 172332525
Change-Id: I20d059b07b308283db6c4e36a508480d91ad07fc
|
|
This reverts commit 02e0eb7eef35b03ae9eed60f02c889a6be400de9.
Reason for revert: Fixed the arm64 UNLOCK_OBJECT_FAST_PATH
macro to use the correct label for one branch to slow path.
Change-Id: I311687e877c54229af1613db2928e47b3ef0b6f2
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 172332525
|
|
This reverts commit c17656bcf477e57d59ff051037c96994fd0ac8f2.
Reason for revert: Broke tests.
At least the arm64 macro UNLOCK_OBJECT_FAST_PATH uses
an incorrect label for one branch to slow path.
Bug: 172332525
Bug: 207408813
Change-Id: I6764dcfcba3b3d780fc13a66d6e676a3e3946a0f
|
|
Lock and unlock in dedicated entrypoints instead of the
`JniMethodStart*()` and `JniMethodEnd*()` entrypoints.
Update x86 and x86-64 lock/unlock entrypoints to use the
same checks as arm and arm64.
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 172332525
Change-Id: I82b5af211aa22479f8b0eec7f3a50bc92ec87eca
|
|
Add mutator lock pointer to `Thread`. This makes retrieving
the pointer faster on ARM and ARM64 and makes it accessible
for JNI stubs if we decide to inline `JniMethodStart()` and
`JniMethodEnd()`.
Pass the lock level `kMutatorLock` explicitly from the
`MutatorMutex` functions to let the compiler evaluate a lot
of the conditions statically and avoid unnecessary code.
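The effect of passing the lock level explicitly can be sketched in C++ as follows. This is a minimal model, not ART's code: `LockLevelSketch`, `LockThreadSketch`, `RegisterAsLockedSketch` and the other names are hypothetical. It only shows that a level passed as a compile-time constant lets the branch on that level be resolved statically.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical lock levels; only the mutator level matters for the sketch.
enum LockLevelSketch : uint8_t {
  kLoggingLockSketch,
  kMutatorLockSketch,
  kLockLevelCountSketch
};

// Hypothetical per-thread bookkeeping.
struct LockThreadSketch {
  int held_count[kLockLevelCountSketch] = {};
  int mutator_lock_count = 0;
};

// Generic helper: the level is a parameter rather than a field read from the
// mutex object, so a constant argument makes the branch foldable.
constexpr void RegisterAsLockedSketch(LockThreadSketch& self, LockLevelSketch level) {
  if (level == kMutatorLockSketch) {
    ++self.mutator_lock_count;
  } else {
    ++self.held_count[level];
  }
}

// The mutator-lock wrapper always passes the constant level.
constexpr void MutatorSharedLockSketch(LockThreadSketch& self) {
  RegisterAsLockedSketch(self, kMutatorLockSketch);
}

// Evaluated entirely at compile time, which demonstrates the folding.
constexpr int CountAfterOneSharedLockSketch() {
  LockThreadSketch t{};
  MutatorSharedLockSketch(t);
  return t.mutator_lock_count;
}

static_assert(CountAfterOneSharedLockSketch() == 1,
              "branch on the constant lock level is resolved statically");
```

Whether a real compiler drops the untaken branch also depends on inlining. The sketch uses `constexpr` only to make the static evaluation observable.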
Golem results for art-opt-cc (higher is better):
linux-armv7                      before after
NativeDowncallStaticNormal       6.3694 7.2394 (+13.66%)
NativeDowncallStaticNormal6      6.0663 6.8527 (+12.96%)
NativeDowncallStaticNormalRefs6  5.7061 6.3945 (+12.06%)
NativeDowncallVirtualNormal      5.7088 7.2081 (+26.26%)
NativeDowncallVirtualNormal6     5.4563 6.7929 (+24.49%)
NativeDowncallVirtualNormalRefs6 5.1595 6.3415 (+22.91%)
linux-armv8                      before after
NativeDowncallStaticNormal       6.4229 7.0423 (+9.642%)
NativeDowncallStaticNormal6      6.2651 6.8527 (+9.379%)
NativeDowncallStaticNormalRefs6  5.8824 6.3976 (+8.760%)
NativeDowncallVirtualNormal      6.2651 6.8527 (+9.379%)
NativeDowncallVirtualNormal6     6.0663 6.6163 (+9.066%)
NativeDowncallVirtualNormalRefs6 5.6630 6.1408 (+8.436%)
There does not seem to be a measurable difference for x86
and x86-64.
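The percentages in the table are presumably the relative change from "before" to "after". A small C++ helper (the name `PercentChange` is made up for this sketch) reproduces the first linux-armv7 row to within rounding:

```cpp
#include <cassert>
#include <cmath>

// Relative change from `before` to `after`, in percent.
// Assumes `before` is non-zero.
double PercentChange(double before, double after) {
  assert(before != 0.0);
  return (after / before - 1.0) * 100.0;
}
```

For example, `PercentChange(6.3694, 7.2394)` corresponds to the NativeDowncallStaticNormal row listed as +13.66%.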
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Bug: 172332525
Change-Id: I2ad511a2fe7bac250549c43789cf3fb5e2de9e25
|
|
This reverts commit 72be14ed06b76cd0e83392145cec9025ff43d174.
Reason for revert: A reland of
commit 2d4feeb67912d64b9e980e6687794826a5c22f9d with a fix for no-image
tests
Change-Id: I79f719f0d4d9b903db301a1636fde5689da35a29
|
|
This reverts commit 2ca0900e98d826644960eefeb8a21c84850c9e04.
Reason for revert: Fixed instrumentation for suspend check
from JNI stub, added a commented-out DCHECK() and a test.
The commented-out DCHECK() was correctly catching the bug
with the original submission but it also exposed deeper
issues with the instrumentation framework, so we cannot
fully enable it - bug 204766614 has been filed for this.
Original message:
Inline suspend check from `GoToRunnableFast()` to JNI stubs.
The only remaining code in `JniMethodFast{Start,End}()` is a
debug mode check that the method is @FastNative, so remove
the call altogether as we prefer better performance over the
debug mode check. Replace `JniMethodFastEndWithReference()`
with a simple `JniDecodeReferenceResult()`.
Golem results for art-opt-cc (higher is better):
linux-ia32                     before after
NativeDowncallStaticFast       149.00 226.77 (+52.20%)
NativeDowncallStaticFast6      107.39 140.29 (+30.63%)
NativeDowncallStaticFastRefs6  104.50 130.54 (+24.92%)
NativeDowncallVirtualFast      147.28 207.09 (+40.61%)
NativeDowncallVirtualFast6     106.39 136.93 (+28.70%)
NativeDowncallVirtualFastRefs6 104.50 130.54 (+24.92%)
linux-x64                      before after
NativeDowncallStaticFast       133.10 173.50 (+30.35%)
NativeDowncallStaticFast6      109.12 135.73 (+24.39%)
NativeDowncallStaticFastRefs6  105.29 127.18 (+20.79%)
NativeDowncallVirtualFast      127.74 167.66 (+31.25%)
NativeDowncallVirtualFast6     106.39 128.12 (+20.42%)
NativeDowncallVirtualFastRefs6 105.29 127.18 (+20.79%)
linux-armv7                    before after
NativeDowncallStaticFast       18.058 21.622 (+19.74%)
NativeDowncallStaticFast6      14.903 17.057 (+14.45%)
NativeDowncallStaticFastRefs6  13.006 14.620 (+12.41%)
NativeDowncallVirtualFast      17.848 21.027 (+17.81%)
NativeDowncallVirtualFast6     15.196 17.439 (+14.76%)
NativeDowncallVirtualFastRefs6 12.897 14.764 (+14.48%)
linux-armv8                    before after
NativeDowncallStaticFast       19.183 23.610 (+23.08%)
NativeDowncallStaticFast6      16.161 19.183 (+18.71%)
NativeDowncallStaticFastRefs6  13.235 15.041 (+13.64%)
NativeDowncallVirtualFast      17.839 20.741 (+16.26%)
NativeDowncallVirtualFast6     15.500 18.272 (+17.88%)
NativeDowncallVirtualFastRefs6 12.481 14.209 (+13.84%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Test: testrunner.py --host --jit --no-image
Test: testrunner.py --host --optimizing --debuggable -t 2005
Bug: 172332525
Bug: 204766614
Change-Id: I9cc7583fc11c457a53fe2d1a24a8befc0f36410d
|
|
This reverts commit 2d4feeb67912d64b9e980e6687794826a5c22f9d.
Reason for revert: This breaks no-image tests. Example failure: https://android-build.googleplex.com/builds/submitted/7871904/art-no-image/latest/view/logs/build_error.log
Change-Id: I0f97c672c2d48f125931171ee1041a7c1cf20127
|
|
The idea of this CL is to avoid maintaining the instrumentation stack
and manipulating the return addresses on the stack to call the entry /
exit hooks. This CL only addresses this for JITed code. In follow-up
CLs, we will extend this to others (native, nterp). Once we have
everything in place we could remove the complexity of the instrumentation
stack.
This CL introduces new nodes (HMethodEntry / HMethodExit(Void)) that
generate code to call the trace entry / exit hooks when
instrumentation_stubs are installed. Currently these are introduced for
JITed code in debuggable mode. The entry / exit hooks roughly do the
same thing as instrumentation entry / exit points.
We also extend the JITed frame slots by adding a ShouldDeoptimize slot.
This will be used to force deoptimization of frames when requested by
jvmti (e.g. structural redefinition).
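As a loose illustration of the entry / exit hook idea, the C++ sketch below models what code generated for the new nodes might do: call the hooks only when stubs are installed. All names (`RuntimeSketch`, `MethodEntryHookSketch`, `CompiledMethodSketch`, ...) are hypothetical, and the ShouldDeoptimize slot is not modelled because the message does not say how it is checked.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical runtime state; not ART's real instrumentation class.
struct RuntimeSketch {
  bool instrumentation_stubs_installed = false;
  std::vector<std::string> events;  // Records hook calls for inspection.
};

// Stand-ins for the trace entry / exit hooks.
void MethodEntryHookSketch(RuntimeSketch& rt) { rt.events.push_back("entry"); }
void MethodExitHookSketch(RuntimeSketch& rt) { rt.events.push_back("exit"); }

// Models a JITed method body: the conditional hook calls sit where the
// method-entry and method-exit nodes would be in the compiled graph.
int CompiledMethodSketch(RuntimeSketch& rt, int x) {
  if (rt.instrumentation_stubs_installed) {
    MethodEntryHookSketch(rt);
  }
  int result = x * 2;  // The method's actual work.
  if (rt.instrumentation_stubs_installed) {
    MethodExitHookSketch(rt);
  }
  return result;
}
```

In this model no return address is rewritten and no side stack is kept. The hooks are ordinary calls guarded by a flag.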
Test: art/testrunner.py
Change-Id: Id4aa439731d214a8d2b820a67e75415ca1d5424e
|
|
This reverts commit 64d6e187f19ed670429652020561887e6b220216.
Reason for revert: Breaks no-image JIT run tests (flaky).
Bug: 172332525
Change-Id: I7813d89283eff0f6266318d3fb02d1257471798d
|
|
Inline suspend check from `GoToRunnableFast()` to JNI stubs.
The only remaining code in `JniMethodFast{Start,End}()` is a
debug mode check that the method is @FastNative, so remove
the call altogether as we prefer better performance over the
debug mode check. Replace `JniMethodFastEndWithReference()`
with a simple `JniDecodeReferenceResult()`.
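The inlined suspend check can be pictured with the C++ sketch below. It is only a model under stated assumptions: `SuspendThreadSketch`, `kSuspendRequestSketch` and both functions are hypothetical names, and the real check is emitted as machine code in the JNI stub rather than written in C++.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag bit signalling a pending suspend request.
constexpr uint32_t kSuspendRequestSketch = 1u;

// Hypothetical thread state; not ART's real Thread layout.
struct SuspendThreadSketch {
  std::atomic<uint32_t> flags{0};
  int slow_path_calls = 0;
};

// Out-of-line slow path: handle the request and clear the flag bit.
void SuspendSlowPathSketch(SuspendThreadSketch& self) {
  ++self.slow_path_calls;
  self.flags.fetch_and(~kSuspendRequestSketch, std::memory_order_relaxed);
}

// What the stub does inline: one load and one compare on the common path,
// with a call only when some flag is set.
inline void InlinedSuspendCheckSketch(SuspendThreadSketch& self) {
  if (self.flags.load(std::memory_order_relaxed) != 0u) {
    SuspendSlowPathSketch(self);
  }
}
```

When no flag is set, the sketch never leaves the inline path, which is the case the change aims to make cheap.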
Golem results for art-opt-cc (higher is better):
linux-ia32                     before after
NativeDowncallStaticFast       149.00 226.77 (+52.20%)
NativeDowncallStaticFast6      107.39 140.29 (+30.63%)
NativeDowncallStaticFastRefs6  104.50 130.54 (+24.92%)
NativeDowncallVirtualFast      147.28 207.09 (+40.61%)
NativeDowncallVirtualFast6     106.39 136.93 (+28.70%)
NativeDowncallVirtualFastRefs6 104.50 130.54 (+24.92%)
linux-x64                      before after
NativeDowncallStaticFast       133.10 173.50 (+30.35%)
NativeDowncallStaticFast6      109.12 135.73 (+24.39%)
NativeDowncallStaticFastRefs6  105.29 127.18 (+20.79%)
NativeDowncallVirtualFast      127.74 167.66 (+31.25%)
NativeDowncallVirtualFast6     106.39 128.12 (+20.42%)
NativeDowncallVirtualFastRefs6 105.29 127.18 (+20.79%)
linux-armv7                    before after
NativeDowncallStaticFast       18.058 21.622 (+19.74%)
NativeDowncallStaticFast6      14.903 17.057 (+14.45%)
NativeDowncallStaticFastRefs6  13.006 14.620 (+12.41%)
NativeDowncallVirtualFast      17.848 21.027 (+17.81%)
NativeDowncallVirtualFast6     15.196 17.439 (+14.76%)
NativeDowncallVirtualFastRefs6 12.897 14.764 (+14.48%)
linux-armv8                    before after
NativeDowncallStaticFast       19.183 23.610 (+23.08%)
NativeDowncallStaticFast6      16.161 19.183 (+18.71%)
NativeDowncallStaticFastRefs6  13.235 15.041 (+13.64%)
NativeDowncallVirtualFast      17.839 20.741 (+16.26%)
NativeDowncallVirtualFast6     15.500 18.272 (+17.88%)
NativeDowncallVirtualFastRefs6 12.481 14.209 (+13.84%)
Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 172332525
Change-Id: I680aaeaa0c1a55796271328180e9d4ed7d89c0b8
|
|
Test: test.py
Change-Id: Iafc0be23eec86102844b127622be564f69c55eda
|