path: root/runtime/base/mutex-inl.h
Age Commit message Author
2024-06-07 SYS_futex is available in all our linux libcs. Elliott Hughes
Change-Id: I7e03fe1e693779680645ba53eb37d44871ff810c
2024-01-08 Add visibility attributes in runtime/base Dmitrii Ishcheikin
Bug: 260881207 Test: presubmit Test: abtd app_compat_drm Test: abtd app_compat_top_100 Test: abtd app_compat_banking Change-Id: I8f0d6548c890142ba03113095964dcc18abb9662
2023-12-20 Revert^18 "Thread suspension cleanup and deadlock fix" Hans Boehm
This reverts commit 8bc6a58df7046b4d6f4b51eb274c7e60fea396ff.
PS1 is identical to https://android-review.git.corp.google.com/c/platform/art/+/2746640
PS2 makes the following changes:
- Remove one DCHECK each from the two WaitForFlipFunction variants. The DCHECK could fail if another GC was started in the interim.
- Break up the WaitForSuspendBarrier timeout into shorter ones, so we don't time out as easily if our process is frozen.
- Include the thread name for ThreadSuspendByThreadIdWarning, since we don't get complete tombstones for some failures.
Test: Treehugger, host tests.
Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Bug: 276660630 Bug: 295880862 Bug: 294334417 Bug: 301090887 Bug: 313347640 (and more)
Change-Id: I12c5c01b1e006baab4ee4148aadbc721723fb89e
2023-12-19 Revert^17 "Thread suspension cleanup and deadlock fix" Hans Boehm
This reverts commit c6371b52df0da31acc174a3526274417b7aac0a7. Reason for revert: This seems to have two remaining issues: 1. The second DCHECK in WaitForFlipFunction is not completely guaranteed to hold, resulting in failures for 658-fp-read-barrier. 2. WaitForSuspendBarrier seems to time out occasionally, possibly spuriously so. We fail when the futex times out once. That's probably incompatible with the app freezer. We should retry a few times. Change-Id: Ibd8909b31083fc29e6d4f1fcde003d08eb16fc0a
2023-12-19 Revert^16 "Thread suspension cleanup and deadlock fix" Hans Boehm
This reverts commit a43e67ea1a314e5c6faf77457ffc5ea39c24d4ca.
PS1 is identical to aosp/2725875.
PS2 improves static and dynamic lock checking and makes the documentation more precise. ThreadList::Unregister for a thread that wasn't registered becomes fatal; I can't convince myself that any reasonable recovery is possible, and we could otherwise turn an obvious error into a very subtle and potentially dangerous one. Perhaps controversially, we now REQUIRE thread_list_lock_ for IncrementSuspendCount, even though that requires a kludge in the one case in which we legitimately don't have it. But after thinking about it, the extra checking and documentation outweigh the kludge, and we may want to consider this elsewhere as well. Added FakeMutexLock to enable the kludge here and elsewhere.
PS3 adds some documentation for thread lifetime rules and enforces a sufficient, though in some cases overly strong, set of related restrictions for EnsureFlipFunctionStarted. It ensures that callers conform to this stronger restriction. This required a simplification in StackUtil::GetThreadListStackTraces.
PS4 Fix lint issue. Add Thread lifetime DCHECK to WaitForFlipFunction. Rebase to adjust for the fact that aosp/2813551 effectively merged a small piece of this.
PS5 Add Thread::VerifyState(). We previously checked for kTerminated in a couple of places. That doesn't make sense, since we have no way to tell whether the thread has been deallocated and reallocated at that point. Instead of checking that the state is not kTerminated, just check that it is a sane value. Address some old reviewer comments. Add more output for EnsureFlipFunctionStarted DCHECK. RequestSynchronousCheckpoint now aborts rather than returning false when invoked on a terminated thread. Seeing kTerminated would mean the thread could have been destroyed, and thus this call was unsafe.
PS6 Add another VerifyState call to RequestSynchronousCheckpoint.
PS7 Rebase and add more ThreadExitFlag tests.
PS8 Rebase. Temporary workaround for compile error.
PS9 Remove PS8 workaround. Add a version of GetPeerFromOtherThread that expects thread_list_lock_ to be initially held, and relies on ThreadExitFlag to detect terminated threads. Modify several jvmti clients to use this correctly. This effectively includes a fixed version of aosp/2847246.
PS10 Work around the fact that GetReferenceKind in ti_heap.cc may call GetPeerFromOtherThread with or without thread_list_lock_. I think this is kind of benign, though it makes reasoning harder, and weakens our debug checking.
PS11 Remove extra semicolon.
PS12 Add another DCHECK in UnregisterThreadExitFlag.
PS13 Minor tweaks to address reviewer comments.
PS14 More tweaks to address reviewer comments.
PS15 Do not report that a thread exited while its flip-function is still running.
PS16 Fix comment typo.
Test: Treehugger
Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Bug: 276660630 Bug: 295880862 Bug: 294334417 Bug: 301090887 Bug: 313347640 (and more)
Change-Id: I44caa30a0a4da8ab105fedd4d2238f59efc1d675
2023-09-09 Revert "Revert^14 "Thread suspension cleanup and deadlock fix"" Hans Boehm
This reverts commit f9fdd3ce0180972dc8d4f0c8410ea7702828a703. Reason for revert: Very suspicious host-x86_64-debug failure on LUCI. Change-Id: Ia01dd3df8d64d6bc0d12319b06a8380f64a46785
2023-09-09 Revert^14 "Thread suspension cleanup and deadlock fix" Hans Boehm
This reverts commit 2a52e8a50c3bc9011713a085f058130bb13fd6a6. PS1 is identical to aosp/2710354. PS2 fixes a serious bug in the ThreadExitFlag list handling code. This didn't show up in presubmit because the list rarely contains more than a single element. Added an explicit gtest, and a bunch of DCHECKS around this. PS3 Rebase and fix oat version. Once more. PS4 Weaken CheckEmptyCheckpointFromWeakRefAccess check to allow weak reference access in a checkpoint. This happens via DumpLavaStack -> ... -> MonitorObjectsStackVisitor -> ... -> FindResolvedMethod -> ... -> IsDexFileRegistered. I haven't yet been able to convince myself that this is inherently broken, though it is trickier than I would like. PS5 Move cp_placeholder_mutex_ declaration higher in thread.h. Test: m test-art-host-gtest Change-Id: I66342ef1de27bfa0272702b5e1d3063ef8da7394
2023-08-24 Revert "Revert^12 "Thread suspension cleanup and deadlock fix"" Hans Boehm
This reverts commit 996cbb566a5521ca3b0653007e7921469f58981a. Reason for revert: Some new intermittent master-art-host buildbot failures look related and need investigation. PS2: Fix oat.h merge conflict by not letting the revert touch it. PS3: Correct PS2 to actually bump the version once more instead. Change-Id: I70c46dc4494b585768f36e5074d34645d2fb562a
2023-08-23 Revert^12 "Thread suspension cleanup and deadlock fix" Hans Boehm
This reverts commit b6f3b439d4f12e89393ba8101eea8671c94ba237.
PS1: Identical to aosp/2652371.
PS2: Introduce kSuspensionImmune to disable suspension of a thread that is being relied upon to execute ResumeAll(). This replaces the test in SuspendAll() to check whether the caller was being asked to suspend itself. That test was deadlock-prone, since it could cause a SuspendAll request from e.g. the GC to block, and GC progress might be required to resume the thread running the GC. Since SuspendAll() now only loops for a single reason, we no longer need to track why we looped. Reduce the number of iterations in each 129-ThreadGetId thread dramatically.
PS3: Address reviewer comments, including fixing a newly introduced bug in CheckSuspend(). Fix 129-ThreadGetId by dramatically reducing the iteration count when we appear to be running slowly, which is normally the case for gcstress. Earlier versions of this CL were apparently also failing on this test, but the failure was hidden by other failures. This mostly undoes the PS2 change to this test, now that the failure is better understood.
PS4: Rebase.
PS5: Fix 129-ThreadGetId code formatting.
PS6: Address more reviewer comments related to 129-ThreadGetId.
PS7: Remove DCHECK in EnsureFlipFunctionStarted. It was unsafe, since the thread may no longer be around.
Test: Treehugger.
Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Bug: 276660630 Bug: 295880862 Bug: 294334417 (and more)
Change-Id: I99260fdc4feb9bcdc8b8b566e40912532f1a4937
2023-08-15 Revert "Revert^10 "Thread suspension cleanup and deadlock fix"""" Hans Boehm
This reverts commit 2caa640269faabd2455ec29cfe6ad330d442b715. Reason for revert: It looks like there may be some new timeout failures on the master-art-host buildbot. I'll go ahead and generate a revert. Please submit once there are enough failures to investigate. Change-Id: I272e4ac5f4367a12a2eb027e456d789e8fd26ae6
2023-08-14 Revert^10 "Thread suspension cleanup and deadlock fix""" Hans Boehm
This reverts commit 63af30b8fe8d4e1dc32db4dcb5e5dae1efdc7f31.
master (aosp/2530206) PS1 is identical to aosp/2377951.
master (aosp/2530206) PS2 is a rebase.
At this point, master branch was replaced by main, and this CL moved.
PS1: Restructure documentation for the IncrementSuspendCount handshake to install a suspend barrier. Document a couple of additional mutator lock assumptions. Add some DCHECKs to check that suspended threads really are suspended. Weaken seq_cst memory order in a couple of places where it really didn't make sense here. Clearly not a correctness fix. Includes a rebase and merge with aosp/2587606.
PS2: Another rebase. Fix thumb assembler test to compensate for Thread structure layout changes.
PS3: Messy rebase, primarily to handle aosp/2670108, which included both new fixes around this and a few small snippets of this CL. Call EnsureFlipFunctionStarted without a state-and-flags argument only when we actually hold the mutator lock, as promised.
PS4: Minor rebase, some lint fixes.
PS5: Another minor lint fix that I had missed in PS4.
PS6: Fix for RunCheckpoint bug introduced around PS3. Fix expectations in jni_cfi_test to compensate for thread structure layout changes.
PS7: In PS3+, EnsureFlipFunctionStarted could access a destroyed "this" thread. Fix that, and make the function static to make this constraint more explicit. (And running a method on a potentially destroyed object just seemed unclean.)
PS8: Address reviewer comments. The major issue was that we released the suspend_count_lock too early in FlipThreadRoots, potentially allowing an intervening SuspendAll to block us. The fix involved a very minor extension of the mutex API.
PS9: Comment typo fix.
PS10: Address new reviewer comments. Rebase. MUST_SLEEP for 129_ThreadGetId debug output.
Test: Treehugger.
Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Bug: 276660630 (and more)
Change-Id: I0f2450e394c03c17eece3698286b2f3e45727967
2023-03-29 Revert "Revert^8 "Thread suspension cleanup and deadlock fix"" Hans Boehm
This reverts commit 221b6c5fcd66d4b6f2626c311d03bde2fb1589f9. Reason for revert: Preemptive revert. Earlier versions have had a tendency to cause subtle breakage. Please do not submit unless something breaks. Change-Id: Iad2a7f920756f365789c422948632f5db5a28fd5
2023-03-29 Revert^8 "Thread suspension cleanup and deadlock fix" Hans Boehm
This reverts commit c85ae17f8267ac528e58892099dcefcc73bb8a26.
PS1: Identical to aosp/2266238
PS2: Address the lint failure that was the primary cause of the revert. Don't print information about what caused a SuspendAll loop unless we actually gathered the information.
Restructure thread flip to drop the assumption that certain threads will shortly become Runnable. That added a lot of complexity and was deadlock-prone. We now simply try to run the flip function both in the target thread when it tries to become runnable again, and in the requesting thread, without paying attention to the target thread's state. The first attempt succeeds, which means the originating thread will only succeed if the target is still suspended. This adds some complexity to deal with threads terminating in the meantime, but it avoids several issues:
1) RequestSynchronousCheckpoint blocked with thread_list_lock and suspend_count_lock, while waiting for flip_function to become non-null. This made it at best hard to reason about deadlock freedom. Several other functions also had to wait for a null flip_function, complicating the code.
2) Synchronization in FlipThreadRoots was questionable. In order to tell when to treat a thread as previously runnable, it looked at thread state bits that could change asynchronously. AFAICT, this was probably correct under sequential consistency, but not with the actual specified memory ordering. That code was deleted.
3) If a thread was intended to become runnable shortly after the start of the flip, we paused it until all thread flips were completed. This probably occasionally added latency that escaped our measurements.
Weaken several assertions involving IsSuspended() to merely claim the thread is not runnable, to be consistent with the above change. The stronger assertion no longer holds in the flip function.
Assert that we never suspend while running the GC, to ensure the GC never acts on a thread suspension request, which would likely result in deadlock.
Update mutator_gc_coord.md to reflect additional insights about this code.
Change the last parameter of DecrementSuspendCount to be a boolean rather than SuspendReason, since we mostly ignore the precise SuspendReason.
Add NotifyOnThreadExit mechanism so that we can tell whether a thread exited, even if we release thread_list_lock_. Rewrite RequestSynchronousCheckpoint to take advantage of the above. The new thread-flip code also uses it.
Remove now unnecessary checks that we do not suspend with a thread flip in progress.
Various secondary changes and simplifications that follow from the above.
Reduce DefaultThreadSuspendTimeout to something below ANR timeout.
Explicitly ensure that when FlipThreadRoots() returns, all thread flips have completed. Previously that was mostly true, but actually guaranteed by barrier code in the collector. Remove that code. (The old version was hard to fix in light of potential exiting threads.)
PS3: Rebase
PS4: Fix and complete PS2 changes.
PS5: Edit commit message.
PS6: Update entry_points_order_test, again.
PS7-8: Address many minor reviewer comments. Remove more dead code, including all the IsTransitioningToRunnable stuff.
PS9: Slightly messy rebase
PS10: Address comments. Most notably: SuspendAll now ensures that the caller is not left with a pending flip function. GetPeerFromOtherThread() sometimes runs the flip function instead of calling mark. The old way would not work for CMC. This makes it no longer const.
PS11: Fix a PS10 oversight, mostly in the CMC collector code.
PS12: Fix comment and documentation typos.
Test: Run host run tests. TreeHugger.
Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779
Change-Id: I81e366d4b739c5b48bd3c1509acb90d2a14e18d1
2023-01-06 Revert "Revert^6 "Thread suspension cleanup and deadlock fix"" Hans Boehm
This reverts commit fe9b34f845e8e439b4ae47ae999ef2cfdbd66462. Reason for revert: Breaks full-eng build Change-Id: I230b31809e274740b8fae9358c260787462efe4d
2023-01-06 Revert^6 "Thread suspension cleanup and deadlock fix" Hans Boehm
This reverts commit 0db9605ec8bd3158f4f0107a511dd09a889c9341. PS1: Identical to aosp/2255904 PS2: Detect and report excessive waiting for a suspend-friendly state. Add internal timeout to 129-ThreadGetId, so that we can print more state information on exit. We explicitly avoid suspending the HeapTaskDaemon to retrieve its stack trace. Fix a race that allowed this to happen anyway (with very low probability). Includes a slightly nontrivial rebase. PS3: Address a couple of minor comments. PS4: Reformatted, as suggested by the upload script, except for tls_ptr_sized_values, where it seemed too likely to cause unnecessary merge conflicts. PS5: SuspendAllInternal does not hold mutator lock, but may take a long time with suspend_all_count_ = 1. Another thread waiting for suspend_all_count_ could sleep many times. Explicitly wait on a condition variable instead. This intentionally has a low kMaxSuspendRetries so that we can see whether it is hit in presubmit. PS6: Adjust kMaxSuspendRetries to a bit lower than the PS3/PS4 version, but much higher than the PS5 debug version. Test: Build and boot AOSP, Treehugger Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Change-Id: I58d63f494a7454e00473b23864f8952abed7bf6f
2022-10-22 Revert "Revert^4 "Thread suspension cleanup and deadlock fix"" Hans Boehm
This reverts commit a23d325152c7cd81ccb426a407f6da280797e61d. Reason for revert: Triggered failures in org.apache.harmony.jpda.tests.jdwp.Events_CombinedEventsTest#testCombinedEvents_05 Change-Id: I0604a60f73a983c92e29827222bfa6158ee043aa
2022-10-21 Revert^4 "Thread suspension cleanup and deadlock fix" Hans Boehm
This reverts commit ebd76406bf5fa74185998bc29f0f27c20fa2e683. PS1 is identical to aosp/2216806. PS2 in addition converts the RunCheckpoint call used from StackUtil::GetAllStackTraces to RunCheckpointUnchecked to temporarily work around another checkpoint Run() function lock ordering issue. PS3 is a nontrivial rebase. Test: Build and boot AOSP, Treehugger Bug: 240742796 Bug: 203363895 Bug: 238032384 Bug: 253671779 Change-Id: I38385e41392652cc30e5e74fd8b93e22088827a5
2022-10-13 Revert "Revert^2 "Thread suspension cleanup and deadlock fix"" Hans Boehm
This reverts commit fd20a745227aa7cae7a08728bb29e5bfce64ea87. Reason for revert: Lots of libartd failures due to new checkpoint lock level check. Change-Id: I0cf88ff893f8743a9a830a49489807d0921199a3
2022-10-12 Revert^2 "Thread suspension cleanup and deadlock fix" Hans Boehm
This reverts commit 7c8835df16147b9096dd4c9380ab4b5f700ea17d.
PS1 is identical to aosp/2171862.
PS2 makes the following significant changes:
1) Avoid inflating locks from the thread dumping checkpoint Run() method. This violates the repeatedly stated claim that checkpoint Run() methods don't suspend threads. This requires that we print object addresses in thread dumps in some cases in which we were previously able to give hashcodes instead.
2) For debug builds, check that we do not acquire a higher level lock in checkpoint Run() methods, thus enforcing that previously stated property. (Lokesh suggested this, and I think it's a great idea. But it requires changes 4-6 below.)
3) Add a bit more justification that RunCheckpoint cannot result in circular suspend requests.
4) For now, allow an explicit override of (2) for ddms code in which it would otherwise fail. This should be fixed later.
5) Raise the level of monitor locks, to correctly reflect the fact that they may be held while much of the runtime is executed.
6) (5) was in conflict with monitor deflation acquiring a monitor lock after acquiring the monitor list lock. But this failure is spurious, both because it is a TryLock acquisition that can't possibly contribute to a deadlock, and secondly because it conflates all monitor locks, and an actual deadlock is probably not possible anyway. Leverage the former and add a facility to avoid checking for safe TryLock calls.
(1) Should fix the one failure I managed to debug after the last submission attempt. Hopefully it also accounts for the others.
PS3, PS5, PS6: Trivial corrections and cleanups
PS4, PS7, PS8: Rebase
Test: Build and boot AOSP, Treehugger
Bug: 240742796 Bug: 203363895 Bug: 238032384
Change-Id: I80d441ebe21bb30b586131f7d22b7f2797f2c45f
2021-11-17 JNI: Faster mutator locking during transition. Vladimir Marko
Add mutator lock pointer to `Thread`. This makes retrieving the pointer faster on ARM and ARM64 and makes it accessible for JNI stubs if we decide to inline `JniMethodStart()` and `JniMethodEnd()`.
Pass the lock level `kMutatorLock` explicitly from the `MutatorMutex` functions to let the compiler evaluate a lot of the conditions statically and avoid unnecessary code.
Golem results for art-opt-cc (higher is better):
linux-armv7                        before  after
NativeDowncallStaticNormal         6.3694  7.2394 (+13.66%)
NativeDowncallStaticNormal6        6.0663  6.8527 (+12.96%)
NativeDowncallStaticNormalRefs6    5.7061  6.3945 (+12.06%)
NativeDowncallVirtualNormal        5.7088  7.2081 (+26.26%)
NativeDowncallVirtualNormal6       5.4563  6.7929 (+24.49%)
NativeDowncallVirtualNormalRefs6   5.1595  6.3415 (+22.91%)
linux-armv8                        before  after
NativeDowncallStaticNormal         6.4229  7.0423 (+9.642%)
NativeDowncallStaticNormal6        6.2651  6.8527 (+9.379%)
NativeDowncallStaticNormalRefs6    5.8824  6.3976 (+8.760%)
NativeDowncallVirtualNormal        6.2651  6.8527 (+9.379%)
NativeDowncallVirtualNormal6       6.0663  6.6163 (+9.066%)
NativeDowncallVirtualNormalRefs6   5.6630  6.1408 (+8.436%)
There does not seem to be a measurable difference for x86 and x86-64.
Test: m test-art-host-gtest Test: testrunner.py --host --optimizing
Bug: 172332525
Change-Id: I2ad511a2fe7bac250549c43789cf3fb5e2de9e25
2020-07-24 Update language to comply with Android’s inclusive language guidance Ian Pedowitz
See https://source.android.com/setup/contribute/respectful-code for reference Bug: 161896447 Bug: 161850439 Bug: 161336379 Test: m -j checkbuild cts docs tests Change-Id: I32d869c274a5d9a3dac63221e25874fe685d38c4
2019-06-14 ART: Correctly handle an abort from an unattached thread Andreas Gampe
Libbase may be shared with other platform components. In that case, if the aborting thread is attached to the runtime, ART will print its usual dump to be helpful. If the thread is unattached, this must not be done as it would violate mutex invariants. Bug: 135056249 Test: m test-art-host-gtest-runtime_test Change-Id: I61c3df5fdbc8ddaf279f39dc653738016986dcd9
2019-04-25 Speed up and slightly simplify Mutex Hans Boehm
Eliminate the separate sequentially consistent contention counter load in Mutex::ExclusiveUnlock by putting the contention counter in the Mutex's state word. Replace some CHECK_GE checks with CHECK_GT. We were checking quantities intended to be non-negative against >= 0 just before decrementing them. Remove a pointless volatile declaration. Introduce constants for the first FUTEX_WAKE argument. Remove all uses of -1 as that argument, everywhere in ART. It appears to work, but the documentation says it's wrong. This does not yet address the ReaderWriterMutex issue, which is handled in a different way in a separate CL. Benchmark runs with and without this CL weakly suggest a tiny, not statistically significant, improvement in both time and space with this CL. Bug: 111835365 Test: Build and boot AOSP. TreeHugger. Change-Id: Ie53c65f2ce774a8cb4d224e2c1b3a110eb880f0c
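The idea of keeping the contention counter in the state word can be sketched as follows. This is only an illustration under assumed names and an assumed bit layout (`SketchMutexState`, `kHeldMask`, `kContenderIncrement` are hypothetical, not ART's actual definitions): because the held flag and the contender count share one atomic word, the unlocking read-modify-write also reports whether anyone needs waking, so no second load is required.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical layout, not ART's actual one: bit 0 is the "held" flag,
// the remaining bits count contenders.
constexpr uint32_t kHeldMask = 1u;
constexpr uint32_t kContenderIncrement = 2u;

struct SketchMutexState {
  std::atomic<uint32_t> word{0};

  // Try to take the lock; returns false if it is already held.
  bool TryLock() {
    uint32_t cur = word.load(std::memory_order_relaxed);
    while ((cur & kHeldMask) == 0) {
      // On failure, compare_exchange_weak reloads cur and the loop rechecks it.
      if (word.compare_exchange_weak(cur, cur | kHeldMask,
                                     std::memory_order_acquire,
                                     std::memory_order_relaxed)) {
        return true;
      }
    }
    return false;
  }

  // A would-be waiter registers itself before it goes to sleep.
  void AddContender() {
    word.fetch_add(kContenderIncrement, std::memory_order_seq_cst);
  }

  void RemoveContender() {
    uint32_t old = word.fetch_sub(kContenderIncrement, std::memory_order_seq_cst);
    assert(old >= kContenderIncrement);  // The count must have been positive.
    (void)old;
  }

  // Release the lock. One atomic operation clears the held bit and returns
  // the old word, from which the contender count is read directly.
  bool UnlockNeedsWake() {
    uint32_t old = word.fetch_and(~kHeldMask, std::memory_order_seq_cst);
    assert((old & kHeldMask) != 0);  // The lock must have been held.
    return (old >> 1) > 0;
  }
};
```

A caller would issue a wakeup only when `UnlockNeedsWake()` returns true; the sketch leaves out the actual sleeping and waking.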
2019-04-24 Use single contention counter for rw mutexes Hans Boehm
This halves the number of sequentially consistent contention counter loads when we release a mutex. This is more expensive than keeping the contention counter as part of the state word. But schemes to do that either risk overflow, or seem more subtle than we would like at this stage. This gets us half-way to where we would like to be. And, since we were previously always looking at both counters anyway, it seems like a low risk change. Bug: 111835365 Test: Build and boot AOSP, TreeHugger Change-Id: I752859f6c51283de3ba5078233d2873a733ad4a1
2018-11-09 Revert^2 "Notify waiters when releasing the monitor" Charles Munger
This reverts commit 9cec9658ec0b7a6c715a154ec834faba853188e3. Reason for revert: Changed lock ordering to not require reacquiring the monitor lock while holding the wait lock. Tested: 1000 iterations of ThreadStress Bug: 117842465 Change-Id: I7b54943052c5eba367eac86da9646bfc81bc1163
2018-11-06 Revert "Notify waiters when releasing the monitor" Roland Levillain
This reverts commit 1ebb52ca700b6d9f9c27c3ee3e688ed17a43d358. Reason for revert: Break ART run-test ThreadStress by failing this assertion: dalvikvm32 E 11-06 09:43:27 27100 27851 mutex-inl.h:134] Lock level violation: holding "a thread wait mutex" (level ThreadWaitLock - 9) while locking "a monitor lock" (level MonitorLock - 54) dalvikvm32 F 11-06 09:43:27 27100 27851 mutex-inl.h:145] Check failed: !bad_mutexes_held Bug: 117842465 Change-Id: I888201bf5c252c8366618d9169a37e4a4cc29734
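The "Lock level violation" in the log above comes from a lock hierarchy check. A minimal sketch of such a check, assuming the rule that a thread may only acquire a lock whose level is strictly lower than every level it already holds, might look like this. The names and the two level values are hypothetical, copied from the log line for illustration; the real checker in mutex-inl.h has exceptions this sketch ignores.

```cpp
#include <cassert>
#include <vector>

// Hypothetical level values taken from the log line above; the real levels
// are an enum whose values change over time.
enum SketchLockLevel : int { kSketchThreadWaitLock = 9, kSketchMonitorLock = 54 };

// Levels held by the current thread, in acquisition order.
inline std::vector<int>& HeldLevels() {
  thread_local std::vector<int> held;
  return held;
}

// Returns true if acquiring `level` respects the hierarchy: every lock
// already held must have a strictly higher level than the new one.
inline bool CheckedAcquire(int level) {
  for (int held : HeldLevels()) {
    if (held <= level) {
      return false;  // Lock level violation; the lock is not recorded.
    }
  }
  HeldLevels().push_back(level);
  return true;
}

inline void Release(int level) {
  std::vector<int>& held = HeldLevels();
  assert(!held.empty() && held.back() == level);  // Sketch assumes LIFO release.
  held.pop_back();
}
```

With these values, taking the monitor lock (54) and then the thread wait lock (9) passes, while the reverse order is rejected, which matches the failure quoted in the commit message.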
2018-11-02 Notify waiters when releasing the monitor Charles Munger
This avoids a ping-pong thread scheduling issue, where a waiter immediately tries to acquire the monitor held by the notifier. Bug: 117842465 Change-Id: I33b91b066c9412b031fd6432bcb61273fb8d8fea
2018-10-31 Use _PRIVATE versions of futex ops. Charles Munger
This flag allows some performance optimizations in the kernel for futex words that are only used in one process. Tested: art$ grep FUTEX_ **/*.cc **/*.h Change-Id: I490b9592ca0f0ab5ab5431682e8b2104f5c917ca
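For readers unfamiliar with the `_PRIVATE` operations, here is a Linux-only sketch of calling them through the raw system call. The wrapper names (`SketchFutex`, `WakePrivate`, `WaitPrivate`) are hypothetical and this is not ART's code; `FUTEX_WAIT_PRIVATE`, `FUTEX_WAKE_PRIVATE` and `SYS_futex` are the standard Linux constants.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Thin wrapper over the raw system call. No timeout, second address or
// third value is used by the two operations below.
inline long SketchFutex(uint32_t* addr, int op, uint32_t val) {
  return syscall(SYS_futex, addr, op, val, nullptr, nullptr, 0);
}

// Wake at most `count` threads waiting on `addr` in this process.
// Returns the number of threads woken, or -1 on error.
inline long WakePrivate(uint32_t* addr, int count) {
  assert(count > 0);  // Pass an explicit positive count rather than -1.
  return SketchFutex(addr, FUTEX_WAKE_PRIVATE, static_cast<uint32_t>(count));
}

// Sleep until woken, but only if *addr still equals `expected`.
// Returns -1 with errno == EAGAIN if the value has already changed.
inline long WaitPrivate(uint32_t* addr, uint32_t expected) {
  return SketchFutex(addr, FUTEX_WAIT_PRIVATE, expected);
}
```

The private variants are only valid when every thread using the futex word lives in the same process, which is the case the commit message describes.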
2018-07-27 Ensure seq_cst memory ordering for num_contenders Hyangseok Chae
Problem. Mutexes and ReaderWriterMutexes can lose wakeups due to weak memory ordering. An unlocking thread may overlook waiters.
Thread A
0. ExclusiveLock
1. increase num_contenders with default ordering. (fetch_add, std::memory_order_seq_cst)
2. futex waiting ...permanently waiting
3. wakeup
4. decrease num_contenders
5. running
Thread B
0. Reset lock state to unlocked using seq_cst CAS.
1. load num_contenders with LoadRelaxed (std::memory_order_relaxed)
2. if num_contenders is bigger than 0, wakeup waiters.
Thread B's load of num_contenders may be reordered with the store in the preceding CAS (step 0). We can then get the following interleaving:
A.0 (fails: lock held.)
B.0a (CAS load acquire sees lock as held)
B.1 (sees num_contenders = 0)
A.1 num_contenders++;
A.2 futex starts waiting (state unchanged)
B.0b (CAS store release sets state to unlocked)
B.2 (does nothing since num_contenders was 0)
We observed this hang with state_ = 0, exclusive_owner_ = 0, num_contenders_ = 1
Indeed, the preceding comment strongly suggests that the num_contenders load should not be relaxed.
Test: test-art-host, test-art-target
Change-Id: I912bcd3a186d9c36fb3da8a41c1f9aa1f7b39be5
Signed-off-by: Hyangseok Chae <neo.chae@lge.com>
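The fix described above can be illustrated with a small sketch of the unlock path. The type and member names (`SketchLock`, `BeginWait`, `EndWait`) are hypothetical and the sleeping itself is left out; the point is only that the load of `num_contenders` after the releasing CAS uses `std::memory_order_seq_cst`, so it cannot be ordered before the store that unlocks the state.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct SketchLock {
  std::atomic<int32_t> state{0};           // 0 = unlocked, 1 = held exclusively.
  std::atomic<int32_t> num_contenders{0};  // Threads waiting or about to wait.

  bool TryLock() {
    int32_t expected = 0;
    return state.compare_exchange_strong(expected, 1, std::memory_order_acquire);
  }

  // Waiter side: announce the intent to wait before sleeping (step A.1),
  // and retract it after waking (step A.4).
  void BeginWait() { num_contenders.fetch_add(1, std::memory_order_seq_cst); }
  void EndWait() {
    int32_t old = num_contenders.fetch_sub(1, std::memory_order_seq_cst);
    assert(old > 0);
    (void)old;
  }

  // Unlocker side (steps B.0 to B.2). Returns true if a wakeup must be issued.
  bool Unlock() {
    int32_t expected = 1;
    bool released =
        state.compare_exchange_strong(expected, 0, std::memory_order_seq_cst);
    assert(released);  // The caller must hold the lock.
    (void)released;
    // seq_cst rather than relaxed: a relaxed load could be satisfied before
    // the store above becomes visible, which is the lost wakeup in the log.
    return num_contenders.load(std::memory_order_seq_cst) > 0;
  }
};
```

A single-threaded walk through the sketch: after `TryLock()` and `BeginWait()`, `Unlock()` returns true, telling the caller to wake a waiter.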
2018-06-25 ART: Use clang-tidy to warn on RAII issue Andreas Gampe
Remove the macro hack to detect likely-incorrect usage of RAII wrappers. Instead make the clang-tidy pattern bugprone-unused-raii fatal. Test: mmma art Change-Id: I9d0eb1c5c3f469b2907111af9d38d947b36c4878
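The kind of mistake this check targets can be shown with a small sketch. `SketchGuard` and the two helper functions are hypothetical stand-ins, not ART's `MutexLock`: an RAII guard created as an unnamed temporary is destroyed at the end of the statement, so the lock is released before the code it was meant to protect.

```cpp
#include <cassert>
#include <mutex>

// Number of SketchGuard objects currently alive; stands in for "lock is held".
inline int& LiveGuards() {
  static int n = 0;
  return n;
}

struct SketchGuard {
  explicit SketchGuard(std::mutex& m) : m_(m) {
    m_.lock();
    ++LiveGuards();
  }
  ~SketchGuard() {
    assert(LiveGuards() > 0);
    --LiveGuards();
    m_.unlock();
  }
  SketchGuard(const SketchGuard&) = delete;
  SketchGuard& operator=(const SketchGuard&) = delete;
  std::mutex& m_;
};

// Likely incorrect: the temporary dies at the end of its statement, so
// nothing is held by the time the next line runs.
inline int GuardsAliveWithTemporary(std::mutex& m) {
  SketchGuard{m};
  return LiveGuards();
}

// Intended usage: a named guard lives until the end of the scope.
inline int GuardsAliveWithNamed(std::mutex& m) {
  SketchGuard guard(m);
  return LiveGuards();
}
```

In the first function no guard is alive when the return value is computed; in the second, one is.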
2018-03-23 ART: Simplify atomic.h Orion Hodson
Prefer std::atomic operations over wrappers in atomic.h. Exceptions are cases that relate to the Java data memory operations and CAS operations. Bug: 71621075 Test: art/test.py --host -j32 Test: art/test.py --target --64 -j4 Change-Id: I9a157e9dede852c1b2aa67d22e3e604a68a9ef1c
2018-03-05 Move most of runtime/base to libartbase/base David Sehr
Enforce the layering that code in runtime/base should not depend on runtime by separating it into libartbase. Some of the code in runtime/base depends on the Runtime class, so it cannot be moved yet. Also, some of the tests depend on CommonRuntimeTest, which itself needs to be factored (in a subsequent CL). Bug: 22322814 Test: make -j 50 checkbuild make -j 50 test-art-host Change-Id: I8b096c1e2542f829eb456b4b057c71421b77d7e2
2018-01-03 ART: Rename Atomic::CompareExchange methods Orion Hodson
Renames Atomic::CompareExchange methods to Atomic::CompareAndSet equivalents. These methods return a boolean and do not get the witness value. This makes space for Atomic::CompareAndExchange methods in a later commit that will return a boolean and get the witness value. This is pre-work for VarHandle accessors which require both forms. Bug: 65872996 Test: art/test.py --host -j32 Change-Id: I9c691250e5556cbfde7811381b06d2920247f1a1
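The difference between the two forms can be sketched on top of `std::atomic`. The free functions below are hypothetical illustrations, not the `Atomic` class methods themselves: the first reports only success or failure, the second also hands back the witness value that was actually observed.

```cpp
#include <atomic>
#include <cassert>

// Returns only whether the swap happened; the observed value is discarded.
template <typename T>
bool SketchCompareAndSet(std::atomic<T>& a, T expected, T desired) {
  return a.compare_exchange_strong(expected, desired, std::memory_order_seq_cst);
}

// Returns whether the swap happened and, on failure, stores the value that
// was actually found (the witness) into *expected.
template <typename T>
bool SketchCompareAndExchange(std::atomic<T>& a, T* expected, T desired) {
  assert(expected != nullptr);
  return a.compare_exchange_strong(*expected, desired, std::memory_order_seq_cst);
}
```

A caller that retries in a loop benefits from the witness form, since it does not need a separate load to learn the current value after a failed attempt.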
2017-11-20 Revert "Revert "Make JVMTI DisposeEnvironment and GetEnv thread safe."" Alex Light
This reverts commit af9341087aab0146b8323ece156bde8130948465. We needed to allow TopLockLevel locks to be acquired when the mutator_lock_ is exclusively held. This is required for spec conformance. To ensure there are no deadlocks the mutator_lock_ is the only lock level with this exception and one cannot acquire the mutator_lock_ when one holds any kTopLockLevel locks. Reason for revert: Fixed issue causing test 913 failure in art-gc-gss-tlab Test: ART_DEFAULT_GC_TYPE=GSS \ ART_USE_TLAB=true \ ART_USE_READ_BARRIER=false ./test.py --host -j50 Bug: 69465262 Change-Id: Ic1a4d9bb3ff64382ba7ae22ba27a4f44628ed095
2017-11-20 Revert "Make JVMTI DisposeEnvironment and GetEnv thread safe." Alex Light
This reverts commit e5a2ae30bdbe379695dc886861b23dce57de0825. Reason for revert: fails art-gc-gss-tlab column. Test: None Bug: 69465262 Change-Id: I70af77297bc7870d281ed8ffb319d144ddb12838
2017-11-20 Make JVMTI DisposeEnvironment and GetEnv thread safe. Alex Light
Previously we were relying on the mutator lock to keep these safe but it turns out this was not sufficient. We give the list of active jvmtiEnv's its own lock to synchronize access. We also changed it so that during events we would collect all the environments and callbacks prior to actually calling any of them. This is required for making sure that we don't hold locks across user code or potentially miss any environments. This does have implications for when one is last able to prevent an environment from getting an event but since the spec is vague about this anyway this is not an issue. Doing this required a major re-write of our event-dispatch system. Test: ./test.py --host -j50 Test: ./art/tools/run-libjdwp-tests.sh --mode=host Bug: 69465262 Change-Id: I170950db6c6e43b5f3c8bdca1b8d087937070496
2017-09-13 Shrink ART Mutex exclusive_owner_ field to Atomic<pid_t> Hans Boehm
The old volatile uint64_t version had a data race, and was thus technically incorrect. Since it's unclear whether volatile uint64_t updates are actually atomic on 32-bit platforms, even the informal correctness argument here already effectively assumed that the upper 32 bits were zero. Don't store them. Explicitly complain if a pid_t might be too big to support lock-free atomic operations. Remove many explicit references to exclusive_owner to avoid littering the code with LoadRelaxed calls. The return convention for GetExclusiveOwnerTid() was unclear for the shared ownership case. It was previously treated inconsistently as 0 (pthread locks), (uint64_t)(-1U) and (uint64_t)(-1). Make it as consistent as easily possible, and document remaining weirdness. Bug: 65171052 Test: AOSP builds. Host tests pass. Change-Id: Ia99aca268952597a90b3c798b714cddbdc2c365e
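The "explicitly complain" part can be sketched with a compile-time check. This is an assumption about how one might express it with C++17's `is_always_lock_free`, not a copy of what ART does; `SketchOwner` and its members are hypothetical, and 0 is used here as the "no exclusive owner" value purely for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <sys/types.h>  // pid_t (POSIX)

// Fail the build if pid_t cannot be updated with lock-free atomics.
static_assert(std::atomic<pid_t>::is_always_lock_free,
              "pid_t is too big for lock-free atomic operations");

struct SketchOwner {
  // 0 means "no exclusive owner" in this sketch.
  std::atomic<pid_t> exclusive_owner{0};

  void SetOwner(pid_t tid) {
    assert(tid != 0);
    exclusive_owner.store(tid, std::memory_order_relaxed);
  }
  void ClearOwner() { exclusive_owner.store(0, std::memory_order_relaxed); }
  bool IsOwnedBy(pid_t tid) const {
    return exclusive_owner.load(std::memory_order_relaxed) == tid;
  }
};
```

Wrapping the accesses in small helpers like these is one way to avoid scattering explicit relaxed loads through the code, which the commit message mentions as a goal.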
2017-06-02 ART: Introduce thread-current-inl.h Andreas Gampe
Factor out Thread::Current() code into its own -inl file to remove transitive includes. This requires at the same time correcting mutex.h, i.e., moving some functions into mutex-inl.h. Test: m test-art-host Change-Id: I88f888b604e0897368d9b483edce6ce4332dd9c9
2017-04-17 ART: Make less lock-level noise on abort Andreas Gampe
The lock-level violations with the abort lock aren't really all that interesting. Test: m test-art-host Change-Id: I8a5fc687009db914ec8f60d86068d87e71f8a894
2016-12-15 ART: Move to libbase StringPrintf Andreas Gampe
Remove ART's StringPrintf implementation. Fix up clients. Add missing includes where necessary. Test: m test-art-host Change-Id: I564038d5868595ac3bb88d641af1000cea940e5a
2016-10-20 Remove mutex dependency on art::Runtime David Sehr
Breaks the cyclic dependency between mutex and the runtime. This allows the use of mutexes without instantiating a runtime. Bug: 22322814 Test: test-art Change-Id: Ia642e515937068d385e5bb1e10bbd3e50a6e36d2
2016-06-29 Special case the suspend to runnable transition when locking. Nicolas Geoffray
The runtime may be shutting down in parallel, and for daemons that could lead to failed locking assertions. bug:27378067 Change-Id: I53785cad537a3d4846661a7b0780543226ea3928
2015-07-13 ART: JNI thread state transition optimization Yu Li
This patch improves the JNI performance by removing the explicit acquiring and releasing the mutator lock when a thread state transits between suspended and runnable states. The functions responsible for changing the state were found to be the costliest part of the JNI. Originally, a thread needs to acquire a shared mutator lock by a CAS instruction when entering the runnable state and also needs to release the lock by a CAS when entering the native state from runnable. This patch removes these CAS operations when a thread state transits between suspended and runnable. A thread in the runnable state is considered to have shared ownership of the mutator lock and therefore transitions in and out of the runnable state have associated implication on the mutator lock ownership. Meanwhile, a barrier is added to control suspending all threads from running. JNI transition overhead was reduced by 25% on IA platform and by 17% on ARM platform by this patch, while it has little impact on GC pause time (measured with "suspend all histogram"). Change-Id: Icee95d8ffff1bbfc95309a41cc48836536fec689 Signed-off-by: Yu, Li <yu.l.li@intel.com> Signed-off-by: Haitao, Feng <haitao.feng@intel.com> Signed-off-by: Lei, Li <lei.l.li@intel.com>
2015-05-26 ART: Clean up arm64 kNumberOfXRegisters usage. Vladimir Marko
Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 Change-Id: I704884dab15b41ecf7a1c47d397ab1c3fc7ee0f7
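The undefined behavior mentioned above arises because shifting a 32-bit value by 32 or more is not defined in C++, so a loop that tests `mask & (1u << i)` with an upper bound above 32 is broken. A minimal sketch of a low-to-high bit walk that never forms such a shift is shown below; the function name is hypothetical and this is not the iterator added in bit_utils.h.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Collects the indices of the set bits in `bits`, lowest index first.
// The mask itself is shifted right by one each step, so no shift amount
// ever reaches the width of the type.
inline std::vector<int> LowToHighBitIndices(uint32_t bits) {
  std::vector<int> indices;
  int index = 0;
  while (bits != 0) {
    if ((bits & 1u) != 0) {
      indices.push_back(index);
    }
    bits >>= 1;  // A shift by 1 is always in range.
    ++index;
  }
  assert(index <= 32);
  return indices;
}
```

The loop also stops as soon as no set bits remain, so it does not depend on a separately maintained register count.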
2015-04-22 Replace NULL with nullptr Mathieu Chartier
Also fixed some lines that were too long, and a few other minor details. Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb
2014-12-09 Revert "Tidy gAborting." Nicolas Geoffray
Creates infinite loop: b/18674776. This reverts commit 015b137efb434528173779bc3ec8d72494456254. Change-Id: I67fe310d2e95ee2ec37bec842be06fb1123b6f4e
2014-12-04 Tidy gAborting. Ian Rogers
Reduce scope to Runtime::Abort and short-cut recursive case earlier. gAborting remains global to avoid two fatal errors in thread and the verifier. Change-Id: Ibc893f891ffee9a763c65cde9507d99083d47b3f
2014-11-21 Avoid some recursive aborting. Ian Rogers
Bug: 18469797 Change-Id: Ic1889a605a041bdec679ff54f8dce3842d85f2e1
2014-11-06 Mac host doesn't define ART_USE_FUTEXES. Ian Rogers
Change-Id: Ic2c23d267cfd56db58754f45154436a085eeaa78
2014-11-06 Move include of system headers outside namespace. Chih-Hung Hsieh
This happened to work with old system header files. But with newer glibc 2.15 header files, typedef names such as __u32 and __u64 are included into a namespace and could not be used in other system header files. BUG: 18275923 Change-Id: I7c61270d08a7b1c69cee55a6a23b00372f0f51c8