path: root/runtime/entrypoints_order_test.cc
Age  Commit message  Author
2024-08-28  x86_64: Add intrinsic for MethodHandle::invokeExact...  Almaz Mingaleev
... which targets invoke-virtual methods. New entrypoint changes deliverException's offset, hence arm test change. Bug: 297147201 Test: ./art/test/testrunner/testrunner.py --host --64 -b --optimizing Test: ./art/test.py --host -g Change-Id: I636fc60c088bfdf9b695c92de47f1c539e3956f1
2024-08-01  Rework exception delivery and deoptimization  Chris Jones
Both exception delivery (various methods calling Thread::QuickDeliverException()) and deoptimization (via artDeoptimizeImpl) use QuickExceptionHandler to find the target context and do a long jump to it via QuickExceptionHandler::DoLongJump. The long jump is done directly from the C++ code, so the frames of the related C++ method are still on the stack before the change of the pc. Note that all those methods are marked as NO_RETURN to reflect that. This patch changes the approach; instead of having the long jump directly from the C++ methods related to exceptions and deoptimization, those methods now only prepare the long jump context and return. So their callers (mainly .S quick entry points and stubs) now need to do a long jump explicitly; thus there will be no C++ frames on the stack before the jump. This approach makes it possible to support exceptions and deoptimization in simulator mode; so we don't need to unwind native (C++ methods' frames) and simulated stacks at the same time. Authors: Artem Serov <artem.serov@linaro.org>, Chris Jones <christopher.jones@arm.com> Test: test.py --host --target Change-Id: I5f90e6b5ba152fc2205728f1e814bbe3d609af9d
2024-07-31  Add support for the experimental on-demand tracing  Mythri Alle
This is to support on-demand tracing. This is behind a flag that is disabled by default. This is the initial CL that adds support to ART. There will be a followup CL to add an API that can be used from the frameworks to request a trace of dex methods that got executed. This is different from method tracing in two ways: 1. Method tracing is precise whereas this traces on a best effort basis. 2. Unlike method tracing this uses a circular buffer so can only trace a limited window into the past. Bug: 352518093 Test: art/test.py Change-Id: I8d958dd2ccefe8205a6c05b4daf339ea71b5dbc4
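The second difference listed in the message above, a circular buffer that can only hold a limited window into the past, can be illustrated with a minimal sketch. The class name, the capacity parameter, and the `uint64_t` entry type are assumptions made for illustration; they are not taken from the ART change.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size trace buffer; not ART's actual data structure.
template <size_t kCapacity>
class CircularTraceBuffer {
  static_assert(kCapacity > 0, "buffer needs at least one slot");

 public:
  // Records an entry, overwriting the oldest one once the buffer is full.
  void Record(uint64_t method_id) {
    entries_[next_ % kCapacity] = method_id;
    ++next_;
  }

  // Returns the retained entries, oldest first.
  std::vector<uint64_t> Snapshot() const {
    const size_t count = next_ < kCapacity ? next_ : kCapacity;
    std::vector<uint64_t> out;
    out.reserve(count);
    for (size_t i = next_ - count; i < next_; ++i) {
      out.push_back(entries_[i % kCapacity]);
    }
    assert(out.size() == count);
    return out;
  }

 private:
  std::array<uint64_t, kCapacity> entries_{};
  size_t next_ = 0;  // Total number of entries recorded so far.
};
```

With a capacity of three, recording five entries leaves only the last three available, which is the "limited window" trade-off the message describes.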
2024-07-15  Revert^2 "Use a current entry pointer instead of index for the method trace buffer"  Mythri Alle
This reverts commit 44b5204a81e9263a612af65f426e66395ae9426b. Reason for revert: Relanding after fix for failures. The curr_entry should use a register that isn't rax or rdx on x86 and x86_64. Change-Id: I9e19eae72b93b4c49c619a1b58a892040d975e3e
2024-07-11  Revert "Use a current entry pointer instead of index for the method trace buffer"  Nicolas Geoffray
This reverts commit b67495b6aa3f1253383938f2699da6eada1a0ead. Bug: 259258187 Reason for revert: Fails on bot Change-Id: If1f360a50d28e3166ca262463b80371c41f7404e
2024-07-11  Use a current entry pointer instead of index for the method trace buffer  Mythri Alle
For the thread local method trace buffers we used to store the pointer to the buffer and the current index of the next free space. We compute the address of the location to store the entry in the generated JITed code when storing the method trace entry. Instead we could use a pointer to the entry to avoid computing the address in the generated code. This doesn't have a noticeable impact for the regular method tracing but is better for the upcoming changes that will introduce an experimental feature for on-demand method tracing. Bug: 259258187 Test: art/test.py Change-Id: I7e39e686bee62ee91c651a1cb3ff242470010cb6
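The difference between the two schemes in the message above can be sketched as follows. This is an illustrative sketch only: the struct and function names are hypothetical and the layout is not ART's. The index variant has to form the slot address from the base pointer and the index on every store, while the pointer variant stores through the pointer it already holds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical "buffer pointer plus index" layout.
struct IndexedTraceBuffer {
  uintptr_t* buffer;  // Start of the buffer.
  size_t capacity;    // Number of slots.
  size_t index;       // Index of the next free slot.
};

// Hypothetical "current entry pointer" layout.
struct PointerTraceBuffer {
  uintptr_t* curr_entry;  // Next free slot.
  uintptr_t* end;         // One past the last slot.
};

// The slot address is computed as buffer + index * sizeof(uintptr_t).
inline void RecordWithIndex(IndexedTraceBuffer& b, uintptr_t entry) {
  assert(b.index < b.capacity);
  b.buffer[b.index] = entry;
  ++b.index;
}

// The slot address is already available; no address computation is needed.
inline void RecordWithPointer(PointerTraceBuffer& b, uintptr_t entry) {
  assert(b.curr_entry < b.end);
  *b.curr_entry = entry;
  ++b.curr_entry;
}
```

Both variants write the same data; the pointer form simply moves the address arithmetic out of the store path, which is the saving the message attributes to the generated code.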
2024-06-20  Replace `ScopedAssertNoNewTransactionRecords`...  Vladimir Marko
... with `ScopedAssertNoTransactionChecks`. The new check is stronger than the old one but does not work during early setup when we do not have a `Thread` object yet. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Change-Id: Iba5a5cda0d97993ff324b4d11de02cb07f770699
2024-01-18  Add visibility attributes in runtime/e*  Dmitrii Ishcheikin
Bug: 260881207 Test: presubmit Test: abtd app_compat_drm Test: abtd app_compat_top_100 Test: abtd app_compat_banking Change-Id: I0eca5c4fd64ec61388eded9b9e066091581e0c7e
2023-12-20  Revert^18 "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit 8bc6a58df7046b4d6f4b51eb274c7e60fea396ff.
PS1 is identical to https://android-review.git.corp.google.com/c/platform/art/+/2746640
PS2 makes the following changes:
- Remove one DCHECK each from the two WaitForFlipFunction variants. The DCHECK could fail if another GC was started in the interim.
- Break up the WaitForSuspendBarrier timeout into shorter ones, so we don't time out as easily if our process is frozen.
- Include the thread name for ThreadSuspendByThreadIdWarning, since we don't get complete tombstones for some failures.
Test: Treehugger, host tests. Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Bug: 276660630 Bug: 295880862 Bug: 294334417 Bug: 301090887 Bug: 313347640 (and more) Change-Id: I12c5c01b1e006baab4ee4148aadbc721723fb89e
2023-12-19  Revert^17 "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit c6371b52df0da31acc174a3526274417b7aac0a7. Reason for revert: This seems to have two remaining issues: 1. The second DCHECK in WaitForFlipFunction is not completely guaranteed to hold, resulting in failures for 658-fp-read-barrier. 2. WaitForSuspendBarrier seems to time out occasionally, possibly spuriously so. We fail when the futex times out once. That's probably incompatible with the app freezer. We should retry a few times. Change-Id: Ibd8909b31083fc29e6d4f1fcde003d08eb16fc0a
2023-12-19  Revert^16 "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit a43e67ea1a314e5c6faf77457ffc5ea39c24d4ca.
PS1 is identical to aosp/2725875.
PS2 improves static and dynamic lock checking and makes the documentation more precise. ThreadList::Unregister for a thread that wasn't registered becomes fatal; I can't convince myself that any reasonable recovery is possible, and we could otherwise turn an obvious error into a very subtle and potentially dangerous one. Perhaps controversially, we now REQUIRE thread_list_lock_ for IncrementSuspendCount, even though that requires a kludge in the one case in which we legitimately don't have it. But after thinking about it, the extra checking and documentation outweighs the kludge, and we may want to consider this elsewhere as well. Added FakeMutexLock to enable the kludge here and elsewhere.
PS3 adds some documentation for thread lifetime rules and enforces a sufficient, though in some cases overly strong, set of related restrictions for EnsureFlipFunctionStarted. It ensures that callers conform to this stronger restriction. This required a simplification in StackUtil::GetThreadListStackTraces.
PS4 Fix lint issue. Add Thread lifetime DCHECK to WaitForFlipFunction. Rebase to adjust for the fact that aosp/2813551 effectively merged a small piece of this.
PS5 Add Thread::VerifyState(). We previously checked for kTerminated in a couple of places. That doesn't make sense, since we have no way to tell whether the thread has been deallocated and reallocated at that point. Instead of checking that the state is not kTerminated, just check that it is a sane value. Address some old reviewer comments. Add more output for EnsureFlipFunctionStarted DCHECK. RequestSynchronousCheckpoint now aborts rather than returning false when invoked on a terminated thread. Seeing kTerminated would mean the thread could have been destroyed, and thus this call was unsafe.
PS6 Add another VerifyState call to RequestSynchronousCheckpoint.
PS7 Rebase and add more ThreadExitFlag tests.
PS8 Rebase. Temporary workaround for compile error.
PS9 Remove PS8 workaround. Add a version of GetPeerFromOtherThread that expects thread_list_lock_ to be initially held, and relies on ThreadExitFlag to detect terminated threads. Modify several jvmti clients to use this correctly. This effectively includes a fixed version of aosp/2847246.
PS10 Work around the fact that GetReferenceKind in ti_heap.cc may call GetPeerFromOtherThread with or without thread_list_lock_. I think this is kind of benign, though it makes reasoning harder, and weakens our debug checking.
PS11 Remove extra semicolon.
PS12 Add another DCHECK in UnregisterThreadExitFlag.
PS13 Minor tweaks to address reviewer comments.
PS14 More tweaks to address reviewer comments.
PS15 Do not report that a thread exited while its flip-function is still running.
PS16 Fix comment typo.
Test: Treehugger Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Bug: 276660630 Bug: 295880862 Bug: 294334417 Bug: 301090887 Bug: 313347640 (and more) Change-Id: I44caa30a0a4da8ab105fedd4d2238f59efc1d675
2023-09-09  Revert "Revert^14 "Thread suspension cleanup and deadlock fix""  Hans Boehm
This reverts commit f9fdd3ce0180972dc8d4f0c8410ea7702828a703. Reason for revert: Very suspicious host-x86_64-debug failure on LUCI. Change-Id: Ia01dd3df8d64d6bc0d12319b06a8380f64a46785
2023-09-09  Revert^14 "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit 2a52e8a50c3bc9011713a085f058130bb13fd6a6.
PS1 is identical to aosp/2710354.
PS2 fixes a serious bug in the ThreadExitFlag list handling code. This didn't show up in presubmit because the list rarely contains more than a single element. Added an explicit gtest, and a bunch of DCHECKS around this.
PS3 Rebase and fix oat version. Once more.
PS4 Weaken CheckEmptyCheckpointFromWeakRefAccess check to allow weak reference access in a checkpoint. This happens via DumpJavaStack -> ... -> MonitorObjectsStackVisitor -> ... -> FindResolvedMethod -> ... -> IsDexFileRegistered. I haven't yet been able to convince myself that this is inherently broken, though it is trickier than I would like.
PS5 Move cp_placeholder_mutex_ declaration higher in thread.h.
Test: m test-art-host-gtest Change-Id: I66342ef1de27bfa0272702b5e1d3063ef8da7394
2023-08-24  Revert "Revert^12 "Thread suspension cleanup and deadlock fix""  Hans Boehm
This reverts commit 996cbb566a5521ca3b0653007e7921469f58981a. Reason for revert: Some new intermittent master-art-host buildbot failures look related and need investigation. PS2: Fix oat.h merge conflict by not letting the revert touch it. PS3: Correct PS2 to actually bump the version once more instead. Change-Id: I70c46dc4494b585768f36e5074d34645d2fb562a
2023-08-23  Revert^12 "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit b6f3b439d4f12e89393ba8101eea8671c94ba237.
PS1: Identical to aosp/2652371.
PS2: Introduce kSuspensionImmune to disable suspension of a thread that is being relied upon to execute ResumeAll(). This replaces the test in SuspendAll() to check whether the caller was being asked to suspend itself. That test was deadlock-prone, since it could cause a SuspendAll request from e.g. the GC to block, and GC progress might be required to resume the thread running the GC. Since SuspendAll() now only loops for a single reason, we no longer need to track why we looped. Reduce the number of iterations in each 129-ThreadGetId thread dramatically.
PS3: Address reviewer comments, including fixing a newly introduced bug in CheckSuspend(). Fix 129-GetThreadId by dramatically reducing the iteration count when we appear to be running slowly, which is normally the case for gcstress. Earlier versions of this CL were apparently also failing on this test, but the failure was hidden by other failures. This mostly undoes the PS2 change to this test, now that the failure is better understood.
PS4: Rebase.
PS5: Fix 129-GetThreadId code formatting.
PS6: Address more reviewer comments related to 129-GetThreadId.
PS7: Remove DCHECK in EnsureFlipFunctionStarted. It was unsafe, since the thread may no longer be around.
Test: Treehugger. Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Bug: 276660630 Bug: 295880862 Bug: 294334417 (and more) Change-Id: I99260fdc4feb9bcdc8b8b566e40912532f1a4937
2023-08-15  Revert "Revert^10 "Thread suspension cleanup and deadlock fix""""  Hans Boehm
This reverts commit 2caa640269faabd2455ec29cfe6ad330d442b715. Reason for revert: It looks like there may be some new timeout failures on the master-art-host buildbot. I'll go ahead and generate a revert. Please submit once there are enough failures to investigate. Change-Id: I272e4ac5f4367a12a2eb027e456d789e8fd26ae6
2023-08-14  Revert^10 "Thread suspension cleanup and deadlock fix"""  Hans Boehm
This reverts commit 63af30b8fe8d4e1dc32db4dcb5e5dae1efdc7f31.
master (aosp/2530206) PS1 is identical to aosp/2377951.
master (aosp/2530206) PS2 is a rebase.
At this point, master branch was replaced by main, and this CL moved.
PS1: Restructure documentation for the IncrementSuspendCount handshake to install a suspend barrier. Document a couple of additional mutator lock assumptions. Add some DCHECKs to check that suspended threads really are suspended. Weaken seq_cst memory order in a couple of places where it really didn't make sense here. Clearly not a correctness fix. Includes a rebase and merge with aosp/2587606.
PS2: Another rebase. Fix thumb assembler test to compensate for Thread structure layout changes.
PS3: Messy rebase, primarily to handle aosp/2670108, which included both new fixes around this and a few small snippets of this CL. Call EnsureFlipFunctionStarted without a state-and-flags argument only when we actually hold the mutator lock, as promised.
PS4: Minor rebase, some lint fixes.
PS5: Another minor lint fix that I had missed in PS4.
PS6: Fix for RunCheckpoint bug introduced around PS3. Fix expectations in jni_cfi_test to compensate for thread structure layout changes.
PS7: In PS3+, EnsureFlipFunctionStarted could access a destroyed "this" thread. Fix that, and make the function static to make this constraint more explicit. (And running a method on a potentially destroyed object just seemed unclean.)
PS8: Address reviewer comments. The major issue was that we released the suspend_count_lock too early in FlipThreadRoots, potentially allowing an intervening SuspendAll to block us. The fix involved a very minor extension of the mutex API.
PS9: Comment typo fix.
PS10: Address new reviewer comments. Rebase. MUST_SLEEP for 129_ThreadGetId debug output.
Test: Treehugger. Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Bug: 276660630 (and more) Change-Id: I0f2450e394c03c17eece3698286b2f3e45727967
2023-06-14  Replace GcRoots in the verifier to use VariableSizedHandleScope.  Nicolas Geoffray
This removes the cruft in creating static instances, and the need to explicitly visit verifier roots. Test: test.py Change-Id: Ia0f0a82cbc66bb57f30610587f080e75d4d32e92
2023-03-29  Revert "Revert^8 "Thread suspension cleanup and deadlock fix""  Hans Boehm
This reverts commit 221b6c5fcd66d4b6f2626c311d03bde2fb1589f9. Reason for revert: Preemptive revert. Earlier versions have had a tendency to cause subtle breakage. Please do not submit unless something breaks. Change-Id: Iad2a7f920756f365789c422948632f5db5a28fd5
2023-03-29  Revert^8 "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit c85ae17f8267ac528e58892099dcefcc73bb8a26.
PS1: Identical to aosp/2266238
PS2: Address the lint failure that was the primary cause of the revert. Don't print information about what caused a SuspendAll loop unless we actually gathered the information. Restructure thread flip to drop the assumption that certain threads will shortly become Runnable. That added a lot of complexity and was deadlock-prone. We now simply try to run the flip function both in the target thread when it tries to become runnable again, and in the requesting thread, without paying attention to the target thread's state. The first attempt succeeds. Which means the originating thread will only succeed if the target is still suspended. This adds some complexity to deal with threads terminating in the meantime, but it avoids several issues:
1) RequestSynchronousCheckpoint blocked with thread_list_lock and suspend_count_lock, while waiting for flip_function to become non-null. This made it at best hard to reason about deadlock freedom. Several other functions also had to wait for a null flip_function, complicating the code.
2) Synchronization in FlipThreadRoots was questionable. In order to tell when to treat a thread as previously runnable, it looked at thread state bits that could change asynchronously. AFAICT, this was probably correct under sequential consistency, but not with the actual specified memory ordering. That code was deleted.
3) If a thread was intended to become runnable shortly after the start of the flip, we paused it until all thread flips were completed. This probably occasionally added latency that escaped our measurements.
Weaken several assertions involving IsSuspended() to merely claim the thread is not runnable, to be consistent with the above change. The stronger assertion no longer holds in the flip function. Assert that we never suspend while running the GC, to ensure the GC never acts on a thread suspension request, which would likely result in deadlock. Update mutator_gc_coord.md to reflect additional insights about this code. Change the last parameter of DecrementSuspendCount to be a boolean rather than SuspendReason, since we mostly ignore the precise SuspendReason. Add NotifyOnThreadExit mechanism so that we can tell whether a thread exited, even if we release thread_list_lock_. Rewrite RequestSynchronousCheckpoint to take advantage of the above. The new thread-flip code also uses it. Remove now unnecessary checks that we do not suspend with a thread flip in progress. Various secondary changes and simplifications that follow from the above. Reduce DefaultThreadSuspendTimeout to something below ANR timeout. Explicitly ensure that when FlipThreadRoots() returns, all thread flips have completed. Previously that was mostly true, but actually guaranteed by barrier code in the collector. Remove that code. (The old version was hard to fix in light of potential exiting threads.)
PS3: Rebase
PS4: Fix and complete PS2 changes.
PS5: Edit commit message.
PS6: Update entry_points_order_test, again.
PS7-8: Address many minor reviewer comments. Remove more dead code, including all the IsTransitioningToRunnable stuff.
PS9: Slightly messy rebase
PS10: Address comments. Most notably: SuspendAll now ensures that the caller is not left with a pending flip function. GetPeerFromOtherThread() sometimes runs the flip function instead of calling mark. The old way would not work for CMC. This makes it no longer const.
PS11: Fix a PS10 oversight, mostly in the CMC collector code.
PS12: Fix comment and documentation typos.
Test: Run host run tests. TreeHugger. Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Change-Id: I81e366d4b739c5b48bd3c1509acb90d2a14e18d1
2023-01-16  We no longer use instrumentation stubs remove the support code  Mythri Alle
Remove the code to handle instrumentation stubs. We no longer use them. Bug: 206029744 Test: art/test.py Change-Id: I2b7eabf80bd34989314c0d2b299d7b1b35de0b85
2023-01-06  Revert "Revert^6 "Thread suspension cleanup and deadlock fix""  Hans Boehm
This reverts commit fe9b34f845e8e439b4ae47ae999ef2cfdbd66462. Reason for revert: Breaks full-eng build Change-Id: I230b31809e274740b8fae9358c260787462efe4d
2023-01-06  Revert^6 "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit 0db9605ec8bd3158f4f0107a511dd09a889c9341. PS1: Identical to aosp/2255904 PS2: Detect and report excessive waiting for a suspend-friendly state. Add internal timeout to 129-ThreadGetId, so that we can print more state information on exit. We explicitly avoid suspending the HeapTaskDaemon to retrieve its stack trace. Fix a race that allowed this to happen anyway (with very low probability). Includes a slightly nontrivial rebase. PS3: Address a couple of minor comments. PS4: Reformatted, as suggested by the upload script, except for tls_ptr_sized_values, where it seemed too likely to cause unnecessary merge conflicts. PS5: SuspendAllInternal does not hold mutator lock, but may take a long time with suspend_all_count_ = 1. Another thread waiting for suspend_all_count_ could sleep many times. Explicitly wait on a condition variable instead. This intentionally has a low kMaxSuspendRetries so that we can see whether it is hit in presubmit. PS6: Adjust kMaxSuspendRetries to a bit lower than the PS3/PS4 version, but much higher than the PS5 debug version. Test: Build and boot AOSP, Treehugger Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Change-Id: I58d63f494a7454e00473b23864f8952abed7bf6f
2022-12-15  Update method tracing to use per-thread buffer  Mythri Alle
Method tracing in streaming mode uses a global buffer to record method entry / exit events and uses locks to synchronize across threads. Taking a lock for each event is expensive and makes the method tracing slow. This CL changes it to use a per-thread buffer so that each thread accesses its own buffer. This also allows us to fast path method trace events in JITed code in the future. The changes in this CL: 1. Add a per-thread buffer which is initialized lazily on the first method trace event. 2. When the per-thread buffer is initialized we record the information about the thread. This means we no longer need the bitmap we used to record the thread info when a new thread is seen. 3. The data from the buffer is flushed to file: 1. When a thread detaches, so we can flush any recorded data 2. When the buffer is full 3. When we stop tracing. The per-thread buffer is always accessed by the thread that owns it except when we record the method enter events for on stack methods. It is safe to access other thread's buffer since everything is suspended at that point. This CL also adds a test to check that the generated trace is in the expected format. Bug: 259258187 Test: art/testrunner.py -t 2246 Change-Id: I074bf2edb8c884dec0c9a7a9c37b4ef0ec7892a8
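The per-thread scheme in the message above (lazy initialization on the first event, flush when the buffer is full, flush when the thread detaches) might look roughly like the following. This is a sketch under stated assumptions: `TraceSink` stands in for the trace file, `PerThreadTraceBuffer` and all member names are hypothetical, and the destructor plays the role of thread detach. None of it is ART's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical shared destination for flushed events (stand-in for the trace file).
class TraceSink {
 public:
  void Write(const std::vector<uint64_t>& entries) {
    std::lock_guard<std::mutex> lock(mu_);  // Lock taken once per flush, not per event.
    data_.insert(data_.end(), entries.begin(), entries.end());
  }

  size_t Size() {
    std::lock_guard<std::mutex> lock(mu_);
    return data_.size();
  }

 private:
  std::mutex mu_;
  std::vector<uint64_t> data_;
};

// Hypothetical buffer owned by a single thread.
class PerThreadTraceBuffer {
 public:
  PerThreadTraceBuffer(TraceSink* sink, size_t capacity)
      : sink_(sink), capacity_(capacity) {
    assert(sink != nullptr && capacity > 0);
  }

  // Flush whatever is left when the owner goes away (analogous to thread detach).
  ~PerThreadTraceBuffer() { Flush(); }

  void Record(uint64_t event) {
    if (entries_ == nullptr) {  // Lazily allocate on the first trace event.
      entries_ = std::make_unique<std::vector<uint64_t>>();
      entries_->reserve(capacity_);
    }
    entries_->push_back(event);  // No lock: only the owning thread writes here.
    if (entries_->size() == capacity_) {
      Flush();  // Buffer is full.
    }
  }

  void Flush() {
    if (entries_ == nullptr || entries_->empty()) return;
    sink_->Write(*entries_);
    entries_->clear();
  }

 private:
  TraceSink* sink_;
  size_t capacity_;
  std::unique_ptr<std::vector<uint64_t>> entries_;
};
```

The point of the design is that the common path, recording one event, touches only thread-owned memory; synchronization cost is paid once per flush.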
2022-10-22  Revert "Revert^4 "Thread suspension cleanup and deadlock fix""  Hans Boehm
This reverts commit a23d325152c7cd81ccb426a407f6da280797e61d. Reason for revert: Triggered failures in org.apache.harmony.jpda.tests.jdwp.Events_CombinedEventsTest#testCombinedEvents_05 Change-Id: I0604a60f73a983c92e29827222bfa6158ee043aa
2022-10-21  Revert^4 "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit ebd76406bf5fa74185998bc29f0f27c20fa2e683. PS1 is identical to aosp/2216806. PS2 in addition converts the RunCheckpoint call used from StackUtil::GetAllStackTraces to RunCheckpointUnchecked to temporarily work around another checkpoint Run() function lock ordering issue. PS3 is a nontrivial rebase. Test: Build and boot AOSP, Treehugger Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Change-Id: I38385e41392652cc30e5e74fd8b93e22088827a5
2022-10-14  ART: Speed up some gtests.  Vladimir Marko
Avoid creating `Runtime` or create the `Runtime` with a boot image to make the test setup faster. Test: m test-art-host-gtest Test: run-gtests.sh Change-Id: I3f09de81491402442f1704d25bb06de995d8a3ca
2022-10-13  Revert "Revert^2 "Thread suspension cleanup and deadlock fix""  Hans Boehm
This reverts commit fd20a745227aa7cae7a08728bb29e5bfce64ea87. Reason for revert: Lots of libartd failures due to new checkpoint lock level check. Change-Id: I0cf88ff893f8743a9a830a49489807d0921199a3
2022-10-12  Revert^2 "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit 7c8835df16147b9096dd4c9380ab4b5f700ea17d.
PS1 is identical to aosp/2171862.
PS2 makes the following significant changes:
1) Avoid inflating locks from the thread dumping checkpoint Run() method. This violates the repeatedly stated claim that checkpoint Run() methods don't suspend threads. This requires that we print object addresses in thread dumps in some cases in which we were previously able to give hashcodes instead.
2) For debug builds, check that we do not acquire a higher level lock in checkpoint Run() methods, thus enforcing that previously stated property. (Lokesh suggested this, and I think it's a great idea. But it requires changes 4-6 below.)
3) Add a bit more justification that RunCheckpoint cannot result in circular suspend requests.
4) For now, allow an explicit override of (2) for ddms code in which it would otherwise fail. This should be fixed later.
5) Raise the level of monitor locks, to correctly reflect the fact that they may be held while much of the runtime is executed.
6) (5) was in conflict with monitor deflation acquiring a monitor lock after acquiring the monitor list lock. But this failure is spurious, both because it is a TryLock acquisition that can't possibly contribute to a deadlock, and secondly because it conflates all monitor locks, and an actual deadlock is probably not possible anyway. Leverage the former and add a facility to avoid checking for safe TryLock calls.
(1) Should fix the one failure I managed to debug after the last submission attempt. Hopefully it also accounts for the others.
PS3, PS5, PS6: Trivial corrections and cleanups
PS4, PS7, PS8: Rebase
Test: Build and boot AOSP, Treehugger Bug: 240742796 Bug: 203363895 Bug: 238032384 Change-Id: I80d441ebe21bb30b586131f7d22b7f2797f2c45f
2022-10-11  Update java.lang.String* from jdk-11.0.13-ga  Victor Chang
1. String(byte[], byte) constructor is added. 2. StringFactory.newStringFromBytes(byte[], int, int, int) is allowlisted to be called in the unstarted runtime. Bug: 247773125 Test: art/test/testrunner/testrunner.py -b --host Change-Id: I9386b3529a94a122654574e3110d08222be7f282
2022-09-10  Revert "Thread suspension cleanup and deadlock fix"  Hans Boehm
This reverts commit 7c39c86b17c91e651de1fcc0876fe5565e3f5204. Reason for revert: We're seeing a number of new, somewhat rare, timeouts on multiple tests. Change-Id: Ida9a4f80b64b6fedc16db176a8df9c2e985ef482
2022-09-09  Thread suspension cleanup and deadlock fix  Hans Boehm
Have SuspendAll check that no other SuspendAlls are running, and have it refuse to start if the invoking thread is also being suspended. This limits us to a single SuspendAll call at a time, but that was almost required anyway. It limits us to a single active SuspendAll-related suspend barrier, a large simplification. It appears to me that this avoids some cyclic suspension scenarios that were previously still possible. Move the deadlock-avoidance checks for flip_function to ModifySuspendCount callers instead of failing there. Make single-thread suspension use suspend barriers to avoid the complexity of having to reaccess the thread data structure from another thread while waiting for it to suspend. Add a new data structure to remember the single thread suspension barriers without allocating memory. This uses a linked list of stack allocated data structures, as in MCS locks. The combination of the above avoids a suspend_barrier data structure that can overflow, and removes any possibility of ModifySuspendCount needing to fail and retry. Recombine ModifySuspendCount and ModifySuspendCountInternal. Simplified barrier decrement in PassActiveSuspendBarriers. Strengthened the relaxed memory order, it was probably wrong. Fix the "ignored" logic in SuspendAllInternal. We only ever ignored self, and ResumeAll didn't support anything else anyway. Explicitly assume that the initiating thread, if not null, is registered. Have SuspendAll and friends only ignore self, which was the only actually used case anyway, and ResumeAll was otherwise wrong. Make flip_function atomic<>, since it could be read while being cleared. Remove the poorly used timed_out parameter from the SuspendThreadByX(). Make IsSuspended read with acquire semantics; we often count on having the target thread suspended after that, including having its prior effects on the Java state visible. The TransitionTo... functions already use acquire/release. Shrink the retry loop in RequestSynchronousCheckpoint.
Retrying the whole loop appeared to have no benefit over the smaller loop. Clarify the behavior of RunCheckpoint with respect to the mutator lock. Split up ModifySuspendCount into IncrementSuspendCount and DecrementSuspendCount for improved clarity. This is not quite a semantic no-op since it eliminates some redundant work when decrementing a suspend count to a nonzero value. (Thanks to Mythri for the suggestion.) I could not convince myself that RequestCheckpoint returned false only if the target thread was already suspended; there seemed to be no easy way to preclude the state_and_flags compare-exchange failing for other reasons. Yet callers seemed to assume that property. Change the implementation to make that property clearly true. Various trivial cleanups. This hopefully reduces thread suspension deadlocks in general. We've seen a bunch of other bugs that may have been due to the cyclic suspension issues. At least this should make bug diagnosis easier. Test: ./art/test/testrunner/testrunner.py --host --64 -b Test: Build and boot AOSP Bug: 240742796 Bug: 203363895 Bug: 238032384 Change-Id: Ifc2358dd6489c0b92f4673886c98e45974134941
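The "linked list of stack allocated data structures, as in MCS locks" mentioned in the message above can be sketched as an intrusive list whose nodes live in the requesting threads' stack frames, so registering a barrier never allocates. The type and function names below are hypothetical, and the locking that a real runtime would need around the list head is deliberately omitted; this is an illustration of the idea, not ART's implementation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical barrier node. The thread requesting a suspension declares one of
// these as a local variable and waits until `pending` drops to zero.
struct WrappedBarrier {
  std::atomic<int> pending{1};     // Threads that still have to pass the barrier.
  WrappedBarrier* next = nullptr;  // Intrusive link; no separate list node is allocated.
};

// Hypothetical per-target-thread list of barriers waiting on that thread.
// Assumption: callers hold whatever lock protects `head` (not modeled here).
struct BarrierList {
  WrappedBarrier* head = nullptr;

  // Link a stack-allocated barrier into the list. Never allocates, never fails.
  void Push(WrappedBarrier* barrier) {
    assert(barrier != nullptr);
    barrier->next = head;
    head = barrier;
  }

  // Run by the target thread once it has suspended: pass every pending barrier.
  void PassAll() {
    WrappedBarrier* barrier = head;
    head = nullptr;
    while (barrier != nullptr) {
      // Read the link first: once `pending` hits zero the waiter may return,
      // and its stack frame (which holds the node) may disappear.
      WrappedBarrier* next = barrier->next;
      barrier->pending.fetch_sub(1, std::memory_order_release);
      barrier = next;
    }
  }
};
```

Because each node is owned by the frame of the thread that waits on it, the list cannot overflow the way a fixed-size array of barriers could, which matches the motivation given in the message.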
2022-07-06  Reland^2 "Don't use instrumentation stubs for native methods in debuggable"  Mythri Alle
This reverts commit 1d1d25eea72cf22aed802352a82588d97403f7b6. Reason for revert: Relanding after fix to failures: https://android-review.googlesource.com/c/platform/cts/+/2145979 Bug: 206029744 Change-Id: Id3c7508c86f9aeb0ddfc1c4792ed54f003b88e77
2022-07-04  Revert "Reland "Don't use instrumentation stubs for native methods in debuggable""  Mythri Alle
This reverts commit 6fb0acc14459a856c35b642e3368aff853259260. Reason for revert: Breaks android.jvmti.cts.JvmtiHostTest https://buganizer.corp.google.com/issues/237991413 Change-Id: I00fb58080693ddebc03c7b62ea67c91150ef7a21
2022-07-04  Reland "Don't use instrumentation stubs for native methods in debuggable"  Mythri Alle
This reverts commit 5c9b55aa95295a287abd86f1e7fbe98c3f35ffd6. Reason for revert: Relanding with fixes for failure Fixes: 1. Arm64 needs to use 64-bit registers 2. We cannot deoptimize directly from GenericJniEndTrampoline since we only have a refs and args frame. So call the method exit hooks from art_quick_generic_jni_trampoline. Change-Id: If1f08eca69626f60f42f10205b482a3764610846
2022-06-24  Revert "Don't use instrumentation stubs for native methods in debuggable"  Mythri Alle
This reverts commit 90f12677f80169dc3ef919c2067349f94b943e7f. Reason for revert: Failures on device https://ci.chromium.org/ui/p/art/builders/ci/angler-armv7-ndebug/3058/overview https://ci.chromium.org/ui/p/art/builders/ci/angler-armv8-ndebug/3049/overview Change-Id: I43f943f9180b8c76db02a2a5c228a209a2f18a82
2022-06-24  Don't use instrumentation stubs for native methods in debuggable  Mythri Alle
Don't install instrumentation stubs for native methods in debuggable runtimes. The GenericJniTrampoline is updated to call method entry / exit hooks. When JITing JNI stubs in debuggable runtimes we also include calls to method entry / exit hooks when required. Bug: 206029744 Test: art/test.py Change-Id: I1d92ddb1d03daed74d88f5c70d38427dc6055446
2021-12-14  JNI: Inline fast-path for `JniMethodEnd()`.  Vladimir Marko
Golem results for art-opt-cc (higher is better):

linux-ia32                        before  after
NativeDowncallStaticNormal        46.766  51.016 (+9.086%)
NativeDowncallStaticNormal6       42.268  45.748 (+8.235%)
NativeDowncallStaticNormalRefs6   41.355  44.776 (+8.272%)
NativeDowncallVirtualNormal       46.361  52.527 (+13.30%)
NativeDowncallVirtualNormal6      41.812  45.206 (+8.118%)
NativeDowncallVirtualNormalRefs6  40.500  44.169 (+9.059%)

(The NativeDowncallVirtualNormal result for x86 is skewed by one extra good run as Golem reports the best result in the summary. Using the second best and most frequent result 50.5, the improvement is only around 8.9%.)

linux-x64                         before  after
NativeDowncallStaticNormal        44.169  47.976 (+8.620%)
NativeDowncallStaticNormal6       43.198  46.836 (+8.423%)
NativeDowncallStaticNormalRefs6   38.481  44.687 (+16.13%)
NativeDowncallVirtualNormal       43.672  47.405 (+8.547%)
NativeDowncallVirtualNormal6      42.268  45.726 (+8.182%)
NativeDowncallVirtualNormalRefs6  41.355  44.687 (+8.057%)

(The NativeDowncallStaticNormalRefs6 result for x86-64 is a bit inflated because recent results jump between ~38.5 and ~40.5. If we take the latter as the baseline, the improvement is only around 10.3%.)

linux-armv7                       before  after
NativeDowncallStaticNormal        10.659  14.620 (+37.16%)
NativeDowncallStaticNormal6       9.8377  13.120 (+33.36%)
NativeDowncallStaticNormalRefs6   8.8714  11.454 (+29.11%)
NativeDowncallVirtualNormal       10.511  14.349 (+36.51%)
NativeDowncallVirtualNormal6      9.9701  13.347 (+33.87%)
NativeDowncallVirtualNormalRefs6  8.9241  11.454 (+28.35%)

linux-armv8                       before  after
NativeDowncallStaticNormal        10.608  16.329 (+53.93%)
NativeDowncallStaticNormal6       10.179  15.347 (+50.76%)
NativeDowncallStaticNormalRefs6   9.2457  13.705 (+48.23%)
NativeDowncallVirtualNormal       9.9850  14.903 (+49.25%)
NativeDowncallVirtualNormal6      9.9206  14.757 (+48.75%)
NativeDowncallVirtualNormalRefs6  8.8235  12.789 (+44.94%)

Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Bug: 172332525 Change-Id: Ie144bc4f7f82be95790ea7d3123b81a3b6bfa603
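The percentages in the Golem figures above, including the adjusted values in the two parenthetical notes, appear to follow from a plain relative-gain computation. A minimal sketch of that arithmetic; the helper name `PercentGain` is hypothetical and is not part of ART or Golem:

```cpp
// Hypothetical helper (not part of ART or Golem): relative improvement of a
// "higher is better" score, expressed in percent.
constexpr double PercentGain(double before, double after) {
  return (after - before) / before * 100.0;
}

// linux-ia32 NativeDowncallStaticNormal: 46.766 -> 51.016, reported as +9.086%.
static_assert(PercentGain(46.766, 51.016) > 9.08 &&
              PercentGain(46.766, 51.016) < 9.09,
              "matches the reported gain to the printed precision");
```

With the second-best x86 result mentioned in the note, `PercentGain(46.361, 50.5)` comes out at roughly 8.9, and with the ~40.5 baseline for x86-64, `PercentGain(40.5, 44.687)` comes out at roughly 10.3.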
2021-11-26Clean up JNI entrypoint assembly. Vladimir Marko
Move JNI entrypoints to `jni_entrypoints_<arch>.S` and shared helper macros to `asm_support_<arch>.S`. Introduce some new macros to reduce code duplication. Fix x86-64 using ESP in the JNI lock slow path. Rename JNI lock/unlock and read barrier entrypoints to pull the "jni" to the front and drop "quick" from their names. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Bug: 172332525 Change-Id: I20d059b07b308283db6c4e36a508480d91ad07fc
2021-11-23Revert^2 "JNI: Rewrite locking for synchronized methods." Vladimir Marko
This reverts commit 02e0eb7eef35b03ae9eed60f02c889a6be400de9. Reason for revert: Fixed the arm64 UNLOCK_OBJECT_FAST_PATH macro to use the correct label for one branch to slow path. Change-Id: I311687e877c54229af1613db2928e47b3ef0b6f2 Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Bug: 172332525
2021-11-23Revert "JNI: Rewrite locking for synchronized methods." Vladimir Marko
This reverts commit c17656bcf477e57d59ff051037c96994fd0ac8f2. Reason for revert: Broke tests. At least the arm64 macro UNLOCK_OBJECT_FAST_PATH uses an incorrect label for one branch to slow path. Bug: 172332525 Bug: 207408813 Change-Id: I6764dcfcba3b3d780fc13a66d6e676a3e3946a0f
2021-11-22JNI: Rewrite locking for synchronized methods. Vladimir Marko
Lock and unlock in dedicated entrypoints instead of the `JniMethodStart*()` and `JniMethodEnd*()` entrypoints. Update x86 and x86-64 lock/unlock entrypoints to use the same checks as arm and arm64. Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Bug: 172332525 Change-Id: I82b5af211aa22479f8b0eec7f3a50bc92ec87eca
2021-11-17JNI: Faster mutator locking during transition. Vladimir Marko
Add mutator lock pointer to `Thread`. This makes retrieving the pointer faster on ARM and ARM64 and makes it accessible for JNI stubs if we decide to inline `JniMethodStart()` and `JniMethodEnd()`. Pass the lock level `kMutatorLock` explicitly from the `MutatorMutex` functions to let the compiler evaluate a lot of the conditions statically and avoid unnecessary code.

Golem results for art-opt-cc (higher is better):

linux-armv7                       before  after
NativeDowncallStaticNormal        6.3694  7.2394 (+13.66%)
NativeDowncallStaticNormal6       6.0663  6.8527 (+12.96%)
NativeDowncallStaticNormalRefs6   5.7061  6.3945 (+12.06%)
NativeDowncallVirtualNormal       5.7088  7.2081 (+26.26%)
NativeDowncallVirtualNormal6      5.4563  6.7929 (+24.49%)
NativeDowncallVirtualNormalRefs6  5.1595  6.3415 (+22.91%)

linux-armv8                       before  after
NativeDowncallStaticNormal        6.4229  7.0423 (+9.642%)
NativeDowncallStaticNormal6       6.2651  6.8527 (+9.379%)
NativeDowncallStaticNormalRefs6   5.8824  6.3976 (+8.760%)
NativeDowncallVirtualNormal       6.2651  6.8527 (+9.379%)
NativeDowncallVirtualNormal6      6.0663  6.6163 (+9.066%)
NativeDowncallVirtualNormalRefs6  5.6630  6.1408 (+8.436%)

There does not seem to be a measurable difference for x86 and x86-64.

Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Bug: 172332525 Change-Id: I2ad511a2fe7bac250549c43789cf3fb5e2de9e25
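The idea of passing `kMutatorLock` explicitly so that the compiler can resolve conditions statically can be illustrated with a small sketch. Everything here except the name `kMutatorLock` is hypothetical (the classes, the second lock level, and the bookkeeping counter are stand-ins, not ART's actual code):

```cpp
#include <cstdint>

// Hypothetical lock levels; only the name kMutatorLock comes from the commit message.
enum LockLevel : std::uint8_t { kOtherLockLevel, kMutatorLock };

class SketchMutex {
 public:
  explicit SketchMutex(LockLevel level) : level_(level) {}

  // Generic path: the level is loaded from the object, so the branch in the
  // overload below has to be evaluated at run time.
  void RegisterAsLocked() { RegisterAsLocked(level_); }

  int bookkeeping_calls() const { return bookkeeping_calls_; }

 protected:
  // The caller supplies the level. When this is inlined with a constant
  // argument, the compiler can resolve the branch at compile time.
  void RegisterAsLocked(LockLevel level) {
    if (level != kMutatorLock) {
      ++bookkeeping_calls_;  // Stand-in for work that only other levels need.
    }
  }

 private:
  LockLevel level_;
  int bookkeeping_calls_ = 0;
};

class SketchMutatorMutex : public SketchMutex {
 public:
  SketchMutatorMutex() : SketchMutex(kMutatorLock) {}

  // Passes the level as a literal constant instead of reading level_.
  void SharedLock() { RegisterAsLocked(kMutatorLock); }
};
```

In this sketch `SketchMutatorMutex::SharedLock()` never touches the counter, and an optimizing compiler is free to drop the comparison entirely because the argument is a compile-time constant.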
2021-11-09Revert^2 "Add support for calling entry / exit hooks directly from JIT code"" Mythri Alle
This reverts commit 72be14ed06b76cd0e83392145cec9025ff43d174. Reason for revert: A reland of commit 2d4feeb67912d64b9e980e6687794826a5c22f9d with a fix for no-image tests Change-Id: I79f719f0d4d9b903db301a1636fde5689da35a29
2021-11-02Revert^2 "JNI: Remove `JniMethodFast{Start,End}()`." Vladimir Marko
This reverts commit 2ca0900e98d826644960eefeb8a21c84850c9e04. Reason for revert: Fixed instrumentation for suspend check from JNI stub, added a commented-out DCHECK() and a test. The commented-out DCHECK() was correctly catching the bug with the original submission but it also exposed deeper issues with the instrumentation framework, so we cannot fully enable it - bug 204766614 has been filed for this.

Original message: Inline suspend check from `GoToRunnableFast()` to JNI stubs. The only remaining code in `JniMethodFast{Start,End}()` is a debug mode check that the method is @FastNative, so remove the call altogether as we prefer better performance over the debug mode check. Replace `JniMethodFastEndWithReference()` with a simple `JniDecodeReferenceResult()`.

Golem results for art-opt-cc (higher is better):

linux-ia32                      before  after
NativeDowncallStaticFast        149.00  226.77 (+52.20%)
NativeDowncallStaticFast6       107.39  140.29 (+30.63%)
NativeDowncallStaticFastRefs6   104.50  130.54 (+24.92%)
NativeDowncallVirtualFast       147.28  207.09 (+40.61%)
NativeDowncallVirtualFast6      106.39  136.93 (+28.70%)
NativeDowncallVirtualFastRefs6  104.50  130.54 (+24.92%)

linux-x64                       before  after
NativeDowncallStaticFast        133.10  173.50 (+30.35%)
NativeDowncallStaticFast6       109.12  135.73 (+24.39%)
NativeDowncallStaticFastRefs6   105.29  127.18 (+20.79%)
NativeDowncallVirtualFast       127.74  167.66 (+31.25%)
NativeDowncallVirtualFast6      106.39  128.12 (+20.42%)
NativeDowncallVirtualFastRefs6  105.29  127.18 (+20.79%)

linux-armv7                     before  after
NativeDowncallStaticFast        18.058  21.622 (+19.74%)
NativeDowncallStaticFast6       14.903  17.057 (+14.45%)
NativeDowncallStaticFastRefs6   13.006  14.620 (+12.41%)
NativeDowncallVirtualFast       17.848  21.027 (+17.81%)
NativeDowncallVirtualFast6      15.196  17.439 (+14.76%)
NativeDowncallVirtualFastRefs6  12.897  14.764 (+14.48%)

linux-armv8                     before  after
NativeDowncallStaticFast        19.183  23.610 (+23.08%)
NativeDowncallStaticFast6       16.161  19.183 (+18.71%)
NativeDowncallStaticFastRefs6   13.235  15.041 (+13.64%)
NativeDowncallVirtualFast       17.839  20.741 (+16.26%)
NativeDowncallVirtualFast6      15.500  18.272 (+17.88%)
NativeDowncallVirtualFastRefs6  12.481  14.209 (+13.84%)

Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Test: testrunner.py --host --jit --no-image Test: testrunner.py --host --optimizing --debuggable -t 2005 Bug: 172332525 Bug: 204766614 Change-Id: I9cc7583fc11c457a53fe2d1a24a8befc0f36410d
2021-11-01Revert "Add support for calling entry / exit hooks directly from JIT code" Mythri Alle
This reverts commit 2d4feeb67912d64b9e980e6687794826a5c22f9d. Reason for revert: This breaks no-image tests. Example failure: https://android-build.googleplex.com/builds/submitted/7871904/art-no-image/latest/view/logs/build_error.log Change-Id: I0f97c672c2d48f125931171ee1041a7c1cf20127
2021-11-01Add support for calling entry / exit hooks directly from JIT code Mythri Alle
The idea of this CL is to avoid maintaining the instrumentation stack and manipulating the return addresses on the stack to call the entry / exit hooks. This CL only addresses this for JITed code. In follow-up CLs, we will extend this to others (native, nterp). Once we have everything in place we could remove the complexity of the instrumentation stack.

This CL introduces new nodes (HMethodEntry / HMethodExit(Void)) that generate code to call the trace entry / exit hooks when instrumentation_stubs are installed. Currently these are introduced for JITed code in debuggable mode. The entry / exit hooks do roughly the same thing as the instrumentation entry / exit points.

We also extend the JITed frame slots by adding a ShouldDeoptimize slot. This will be used to force deoptimization of frames when requested by jvmti (for example, structural re-definition).

Test: art/testrunner.py Change-Id: Id4aa439731d214a8d2b820a67e75415ca1d5424e
2021-10-19Revert "JNI: Remove `JniMethodFast{Start,End}()`." Vladimir Marko
This reverts commit 64d6e187f19ed670429652020561887e6b220216. Reason for revert: Breaks no-image JIT run tests (flaky). Bug: 172332525 Change-Id: I7813d89283eff0f6266318d3fb02d1257471798d
2021-10-19JNI: Remove `JniMethodFast{Start,End}()`. Vladimir Marko
Inline suspend check from `GoToRunnableFast()` to JNI stubs. The only remaining code in `JniMethodFast{Start,End}()` is a debug mode check that the method is @FastNative, so remove the call altogether as we prefer better performance over the debug mode check. Replace `JniMethodFastEndWithReference()` with a simple `JniDecodeReferenceResult()`.

Golem results for art-opt-cc (higher is better):

linux-ia32                      before  after
NativeDowncallStaticFast        149.00  226.77 (+52.20%)
NativeDowncallStaticFast6       107.39  140.29 (+30.63%)
NativeDowncallStaticFastRefs6   104.50  130.54 (+24.92%)
NativeDowncallVirtualFast       147.28  207.09 (+40.61%)
NativeDowncallVirtualFast6      106.39  136.93 (+28.70%)
NativeDowncallVirtualFastRefs6  104.50  130.54 (+24.92%)

linux-x64                       before  after
NativeDowncallStaticFast        133.10  173.50 (+30.35%)
NativeDowncallStaticFast6       109.12  135.73 (+24.39%)
NativeDowncallStaticFastRefs6   105.29  127.18 (+20.79%)
NativeDowncallVirtualFast       127.74  167.66 (+31.25%)
NativeDowncallVirtualFast6      106.39  128.12 (+20.42%)
NativeDowncallVirtualFastRefs6  105.29  127.18 (+20.79%)

linux-armv7                     before  after
NativeDowncallStaticFast        18.058  21.622 (+19.74%)
NativeDowncallStaticFast6       14.903  17.057 (+14.45%)
NativeDowncallStaticFastRefs6   13.006  14.620 (+12.41%)
NativeDowncallVirtualFast       17.848  21.027 (+17.81%)
NativeDowncallVirtualFast6      15.196  17.439 (+14.76%)
NativeDowncallVirtualFastRefs6  12.897  14.764 (+14.48%)

linux-armv8                     before  after
NativeDowncallStaticFast        19.183  23.610 (+23.08%)
NativeDowncallStaticFast6       16.161  19.183 (+18.71%)
NativeDowncallStaticFastRefs6   13.235  15.041 (+13.64%)
NativeDowncallVirtualFast       17.839  20.741 (+16.26%)
NativeDowncallVirtualFast6      15.500  18.272 (+17.88%)
NativeDowncallVirtualFastRefs6  12.481  14.209 (+13.84%)

Test: m test-art-host-gtest Test: testrunner.py --host --optimizing Test: run-gtests.sh Test: testrunner.py --target --optimizing Bug: 172332525 Change-Id: I680aaeaa0c1a55796271328180e9d4ed7d89c0b8
2021-10-08Remove unused fields in Thread. Nicolas Geoffray
Test: test.py Change-Id: Iafc0be23eec86102844b127622be564f69c55eda
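Removing fields from `Thread` shifts the offsets of whatever follows them, which is presumably why a change like this appears in the history of a test that checks layout ordering. A minimal sketch of that kind of compile-time check, assuming a hypothetical struct, field names, and macro (none of these are ART's actual layout or helpers):

```cpp
#include <cstddef>

// Hypothetical table of entrypoint pointers; the field names are illustrative only.
struct SketchEntryPoints {
  void* alloc_entrypoint;
  void* lock_entrypoint;
  void* deliver_exception;
};

// Hypothetical macro: fails the build if `second` does not start exactly one
// pointer after `first`, i.e. if a field was added, removed, or reordered
// between them without updating the expectations.
#define SKETCH_EXPECT_ADJACENT(type, first, second)                          \
  static_assert(offsetof(type, first) + sizeof(void*) ==                     \
                    offsetof(type, second),                                  \
                #first " must be immediately followed by " #second)

SKETCH_EXPECT_ADJACENT(SketchEntryPoints, alloc_entrypoint, lock_entrypoint);
SKETCH_EXPECT_ADJACENT(SketchEntryPoints, lock_entrypoint, deliver_exception);
```

Because assembly stubs typically address such fields by fixed offsets, checking the order at compile time turns a silent layout change into a build failure.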