Faster allocation fast path

Added a new object size field to mirror::Class. The field holds the
aligned object size if the class is initialized and its instances are
not finalizable. If the class is finalizable or uninitialized, the
field is set to a large value that forces the ASM allocators onto the
slow path.
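
A minimal sketch of why one field is enough to cover all three cases.
FakeClass, FakeTlab and TryFastAlloc are hypothetical stand-ins, not
ART types, and the real fast path lives in assembly stubs. The point
is only that a sentinel size larger than any TLAB makes the ordinary
"does it fit" check fail, so no separate initialized/finalizable test
is needed:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Sentinel used for uninitialized or finalizable classes (assumed to be
// larger than any thread-local buffer could ever be).
constexpr uint32_t kSlowPathSentinel =
    static_cast<uint32_t>(std::numeric_limits<int32_t>::max());

// Hypothetical stand-in for the new class field.
struct FakeClass {
  uint32_t object_size_alloc_fast_path = kSlowPathSentinel;
};

// Hypothetical bump-pointer thread-local allocation buffer.
struct FakeTlab {
  uint8_t* pos;
  uint8_t* end;
};

// Returns the new object address, or nullptr when the caller must take
// the slow path. The size comparison is the only check performed.
inline void* TryFastAlloc(FakeTlab& tlab, const FakeClass& klass) {
  assert(tlab.pos <= tlab.end);
  const size_t size = klass.object_size_alloc_fast_path;
  const size_t remaining = static_cast<size_t>(tlab.end - tlab.pos);
  if (size > remaining) {
    return nullptr;  // Too big, or sentinel: go slow path.
  }
  void* result = tlab.pos;
  tlab.pos += size;  // Bump the pointer past the new object.
  return result;
}
```

A class whose field still holds the sentinel always fails the size
check and leaves the buffer untouched.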

Only implemented for region/normal TLAB for now; will add this to the
RosAlloc stubs soon.

CC N6P MemAllocTest: 1067 -> 1039 (25 samples)
CC N6P EAAC: 1281 -> 1260 (25 samples)

RAM overhead is technically 0 since mirror::Class was not 8 byte
aligned previously. Since the allocators require 8 byte alignment,
there would have been 1 word of padding at the end of the class. If
there were actually 4 extra bytes per class, the system overhead would
be 36000 * 4 bytes = 144KB based on old N6P numbers for the number of
loaded classes after boot.
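
The padding argument can be checked with a small sketch. kOldSize is a
hypothetical unaligned size chosen for illustration (the real
sizeof(mirror::Class) is not stated here), and RoundUpTo is a local
helper, not the ART RoundUp:

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kAlignment = 8;  // Allocator alignment from the text.

// Rounds x up to the next multiple of n.
constexpr size_t RoundUpTo(size_t x, size_t n) {
  return (x + n - 1) / n * n;
}

// Hypothetical old class size: a multiple of 4 but not of 8.
constexpr size_t kOldSize = 124;
constexpr size_t kNewSize = kOldSize + sizeof(uint32_t);

// The new 4-byte field fills what used to be padding, so the allocated
// size does not change.
static_assert(RoundUpTo(kOldSize, kAlignment) ==
                  RoundUpTo(kNewSize, kAlignment),
              "new field should fit in the old padding");

// Worst case if every class really grew by 4 bytes.
constexpr size_t kLoadedClasses = 36000;
constexpr size_t kWorstCaseBytes = kLoadedClasses * sizeof(uint32_t);
```

With these numbers kWorstCaseBytes is 144000 bytes.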

Bug: 9986565

Test: test-art-host CC baker, N6P phone boot and EAAC runs.

Change-Id: I119a87b8cc6c980bff980a0c62f42610dab5e531
diff --git a/runtime/mirror/class.cc b/runtime/mirror/class.cc
index 96b3345..b60c573 100644
--- a/runtime/mirror/class.cc
+++ b/runtime/mirror/class.cc
@@ -100,9 +100,21 @@
   }
   static_assert(sizeof(Status) == sizeof(uint32_t), "Size of status not equal to uint32");
   if (Runtime::Current()->IsActiveTransaction()) {
-    h_this->SetField32Volatile<true>(OFFSET_OF_OBJECT_MEMBER(Class, status_), new_status);
+    h_this->SetField32Volatile<true>(StatusOffset(), new_status);
   } else {
-    h_this->SetField32Volatile<false>(OFFSET_OF_OBJECT_MEMBER(Class, status_), new_status);
+    h_this->SetField32Volatile<false>(StatusOffset(), new_status);
+  }
+
+  // Setting the object size alloc fast path needs to be after the status write so that if the
+  // alloc path sees a valid object size, we would know that it's initialized as long as it has a
+  // load-acquire/fake dependency.
+  if (new_status == kStatusInitialized && !h_this->IsVariableSize()) {
+    uint32_t object_size = RoundUp(h_this->GetObjectSize(), kObjectAlignment);
+    if (h_this->IsFinalizable()) {
+      // Finalizable objects must always go slow path.
+      object_size = std::numeric_limits<int32_t>::max();
+    }
+    h_this->SetObjectSizeAllocFastPath(object_size);
   }
 
   if (!class_linker_initialized) {
@@ -1209,5 +1221,13 @@
   return flags;
 }
 
+void Class::SetObjectSizeAllocFastPath(uint32_t new_object_size) {
+  if (Runtime::Current()->IsActiveTransaction()) {
+    SetField32Volatile<true>(ObjectSizeAllocFastPathOffset(), new_object_size);
+  } else {
+    SetField32Volatile<false>(ObjectSizeAllocFastPathOffset(), new_object_size);
+  }
+}
+
 }  // namespace mirror
 }  // namespace art