Revert^8 "Thread suspension cleanup and deadlock fix"
This reverts commit c85ae17f8267ac528e58892099dcefcc73bb8a26.
PS1: Identical to aosp/2266238
PS2: Address the lint failure that was the primary cause of the revert.
Don't print information about what caused a SuspendAll loop
unless we actually gathered the information.
Restructure thread flip to drop the assumption that certain threads
will shortly become Runnable. That added a lot fo complexity and
was deadlock-prone. We now simply try to run the flip function both
in the target thread when it tries to become runnable again, and in
the requesting thread, without paying attention to the target
thread's state. The first attempt succeeds. Which means the
originating thread will only succeed if the target is still
suspended.
This adds some complexity to deal with threads terminating in the
meantime, but it avoids several issues:
1) RequestSynchronousCheckpoint blocked with thread_list_lock
and suspend_count_lock, while waiting for flip_function to become
non-null. This made it at best hard to reason about deadlock
freedom. Several other functions also had to wait for a null
flip_function, complicating the code.
2) Synchronization in FlipThreadRoots was questionable. In order to
tell when to treat a thread as previously runnable, it looked at
thread state bits that could change asynchronously. AFAICT, this
was probably correct under sequential consistency, but not with
the actual specified memory ordering. That code was deleted.
3) If a thread was intended to become runnable shortly after
the start of the flip, we paused it until all thread flips were
completed. This probably occasionally added latency that
escaped our measurements.
Weaken several assertions involving IsSuspended() to merely
claim the thread is not runnable, to be consistent with the
above change. The stringer assertion no longer holds in the
flip function.
Assert that we never suspend while running the GC, to ensure
the GC never acts on a thread suspension request, which would
likely result in deadlock.
Update mutator_gc_coord.md to reflect additional insights about
this code.
Change the last parameter of DecrementSuspendCount to be a boolean
rather than SuspendReason, since we mostly ignore the precise
SuspendReason.
Add NotifyOnThreadExit mechanism so that we can tell whether
a thread exited, even if we release thread_list_lock_.
Rewrite RequestSynchronousCheckpoint to take advantage of the
above. The new thread-flip code also uses it.
Remove now unnecessary checks that we do not suspend with a thread
flip in progress.
Various secondary changes and simplifications that follow from the
above.
Reduce DefaultThreadSuspendTimeout to something below ANR timeout.
Explicitly ensure that when FlipThreadRoots() returns, all thread
flips have completed. Previously that was mostly true, but actually
guaranteed by barrier code in the collector. Remove that code.
(The old version was hard to fix in light of potential exiting
threads.)
PS3: Rebase
PS4: Fix and complete PS2 changes.
PS5: Edit commit message.
PS6: Update entry_points_order_test, again.
PS7-8: Address many minor reviewer comments. Remove more dead code,
including all the IsTransitioningToRunnable stuff.
PS9: Slightly messy rebase
PS10: Address comments. Most notably:
SuspendAll now ensures that the caller is not left with a pending
flip function.
GetPeerFromOtherThread() sometimes runs the flip function instead
of calling mark. The old way would not work for CMC. This makes it
no longer const.
PS11: Fix a PS10 oversight, mostly in the CMC collector code.
PS12: Fix comment and documentation typos.
Test: Run host run tests. TreeHugger.
Bug: 240742796
Bug: 203363895
Bug: 238032384
Bug: 253671779
Change-Id: I81e366d4b739c5b48bd3c1509acb90d2a14e18d1
29 files changed