Do fewer GCs shortly after zygote fork

After zygote fork, raise the heap limit (max_allowed_footprint_) to the
growth limit and the concurrent GC threshold (concurrent_start_bytes_)
to half of the heap limit for 2s. As a result, most apps will most
likely see no GCs during launch. This should reduce app startup time
and save some power. The relaxed limits apply only when the CC collector
is in use and the runtime is not in low-memory mode.

After the 2s window ends, a concurrent GC is requested to free up RAM
and adjust the counters back to normal, if no GC has taken place so
far.
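
The threshold handling described above can be sketched in isolation as
follows. This is a simplified model under stated assumptions, not the ART
implementation: HeapModel, RelaxAfterFork, NeedsGcAtWindowEnd and
AdjustAfterGc are hypothetical names, and the post-GC adjustment policy is
an illustrative stand-in for whatever the collector actually computes.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical, simplified stand-in for the heap counters; not the ART Heap class.
struct HeapModel {
  std::size_t growth_limit;            // Largest footprint the heap may grow to.
  std::size_t max_allowed_footprint;   // Current heap limit.
  std::size_t concurrent_start_bytes;  // Allocation level that starts a concurrent GC.
  std::size_t bytes_allocated;         // Bytes currently allocated.
};

// Post-fork step: raise the heap limit to the growth limit and move the
// concurrent GC threshold to half of it (never below the current allocation).
inline void RelaxAfterFork(HeapModel& heap) {
  heap.max_allowed_footprint = heap.growth_limit;
  heap.concurrent_start_bytes =
      std::max(heap.max_allowed_footprint / 2, heap.bytes_allocated);
}

// End-of-window check: a GC is needed only if the limit is still at the
// relaxed value, meaning no GC has readjusted it yet.
inline bool NeedsGcAtWindowEnd(const HeapModel& heap) {
  return heap.max_allowed_footprint == heap.growth_limit;
}

// Assumed stand-in for a GC bringing the limit back to a normal level.
inline void AdjustAfterGc(HeapModel& heap, std::size_t new_limit) {
  heap.max_allowed_footprint = std::min(new_limit, heap.growth_limit);
  heap.concurrent_start_bytes =
      std::min(heap.concurrent_start_bytes, heap.max_allowed_footprint);
}
```

In this model, a heap with a 256MB growth limit ends up with a 128MB
concurrent GC threshold after RelaxAfterFork, and NeedsGcAtWindowEnd stays
true until some GC lowers the limit again.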

Not measured: Boot time, transient RAM usage increase.

-------------------------------------------------------
App     | Avg launch time | Heap size       | GC count
-------------------------------------------------------
Camera  |  567ms /  588ms |  4.3MB / 2.7MB  | 2 / 4
-------------------------------------------------------
Chrome  |  350ms /  394ms |  2.5MB / 1.5MB  | 0 / 2
-------------------------------------------------------
Photos  |  447ms /  516ms |    6MB / 4MB    | 0 / 3
-------------------------------------------------------
Maps    | 1419ms / 1440ms | 19.5MB / 11MB   | 0 / 5
-------------------------------------------------------
Gmail   |  148ms /  156ms |  3.5MB / 2.5MB  | 0 / 1
-------------------------------------------------------
YouTube |  721ms /  761ms |    8MB / 4.5MB  | 0 / 3
-------------------------------------------------------
Notes:
1) Format: <with change / without change>
2) For the Camera app, the 2 GCs are caused by native allocations
3) Launch time is averaged over 100 runs
4) Heap size is measured at the end of launch

Bug: 36727951
Test: test-art-host
Change-Id: I4ca9b5be7433097851560f8738fbc8cae733d85e
diff --git a/runtime/gc/heap.cc b/runtime/gc/heap.cc
index b1932d1..9c21cba 100644
--- a/runtime/gc/heap.cc
+++ b/runtime/gc/heap.cc
@@ -143,6 +143,10 @@
 static constexpr size_t kPartialTlabSize = 16 * KB;
 static constexpr bool kUsePartialTlabs = true;
 
+// Use max heap for 2 seconds. This is shorter than the usual 5s window since we don't want to
+// allocate with relaxed ergonomics for that long.
+static constexpr size_t kPostForkMaxHeapDurationMS = 2000;
+
 #if defined(__LP64__) || !defined(ADDRESS_SANITIZER)
 // 300 MB (0x12c00000) - (default non-moving space capacity).
 static uint8_t* const kPreferredAllocSpaceBegin =
@@ -3523,6 +3527,12 @@
 }
 
 void Heap::ClearGrowthLimit() {
+  if (max_allowed_footprint_ == growth_limit_ && growth_limit_ < capacity_) {
+    max_allowed_footprint_ = capacity_;
+    concurrent_start_bytes_ =
+         std::max(max_allowed_footprint_, kMinConcurrentRemainingBytes) -
+         kMinConcurrentRemainingBytes;
+  }
   growth_limit_ = capacity_;
   ScopedObjectAccess soa(Thread::Current());
   for (const auto& space : continuous_spaces_) {
@@ -4094,5 +4104,32 @@
              << PrettySize(new_footprint) << " for a " << PrettySize(alloc_size) << " allocation";
 }
 
+class Heap::TriggerPostForkCCGcTask : public HeapTask {
+ public:
+  explicit TriggerPostForkCCGcTask(uint64_t target_time) : HeapTask(target_time) {}
+  void Run(Thread* self) OVERRIDE {
+    gc::Heap* heap = Runtime::Current()->GetHeap();
+    // Trigger a GC, if not already done. The first GC after fork, whenever it
+    // takes place, will adjust the thresholds to normal levels.
+    if (heap->max_allowed_footprint_ == heap->growth_limit_) {
+      heap->RequestConcurrentGC(self, kGcCauseBackground, false);
+    }
+  }
+};
+
+void Heap::PostForkChildAction(Thread* self) {
+  // Temporarily increase max_allowed_footprint_ and concurrent_start_bytes_ to
+  // max values to avoid GC during app launch.
+  if (collector_type_ == kCollectorTypeCC && !IsLowMemoryMode()) {
+    // Set max_allowed_footprint_ to the largest allowed value.
+    SetIdealFootprint(growth_limit_);
+    // Set concurrent_start_bytes_ to half of the heap size.
+    concurrent_start_bytes_ = std::max(max_allowed_footprint_ / 2, GetBytesAllocated());
+
+    GetTaskProcessor()->AddTask(
+        self, new TriggerPostForkCCGcTask(NanoTime() + MsToNs(kPostForkMaxHeapDurationMS)));
+  }
+}
+
 }  // namespace gc
 }  // namespace art