Wait sooner for non-daemon threads

When the main thread returns, we attempt to shut down the runtime.
As required, we have always waited for non-daemon threads to complete
at some point during that process. Previously, however, we only did so
after the runtime was partially shut down, which could cause the
remaining threads to deadlock.

This change explicitly waits for those threads before we start
destroying the runtime.

Add a test to make sure that a long-running child thread finishes
properly.
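
For illustration only, the scenario the test covers looks roughly like
the sketch below. This is not the test added by this change; the class
name, method names, and sleep durations are hypothetical. It starts a
non-daemon child thread and lets main return without joining it, so
the runtime has to wait for the child before tearing itself down.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not the actual test from this change.
public class LongRunningChild {
    // Set by the child once its (simulated) long-running work is done.
    static final AtomicBoolean finished = new AtomicBoolean(false);

    // Starts a non-daemon thread that sleeps for sleepMillis and then
    // records that it completed.
    static Thread startChild(long sleepMillis) {
        Thread child = new Thread(() -> {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            finished.set(true);
            System.out.println("Child finished");
        });
        // Non-daemon is what obliges the runtime to wait at shutdown.
        child.setDaemon(false);
        child.start();
        return child;
    }

    // Helper for checking the child in isolation: starts it, joins it,
    // and reports whether it ran to completion.
    static boolean childCompletes(long sleepMillis) {
        finished.set(false);
        Thread child = startChild(sleepMillis);
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return finished.get();
    }

    public static void main(String[] args) {
        startChild(500);
        System.out.println("Main returning");
        // No join here: main returns while the child is still running,
        // so shutdown must wait for the non-daemon thread.
    }
}
```

With the explicit wait in place, the child is expected to run to
completion and the process to exit normally rather than hang.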

Bug: 148126377
Bug: 147619421
Test: New test fails without the waiting call, passes with it.
Change-Id: Ic60d695c8a03543b51d8532156f19fff00a58edc
6 files changed