optimizing: Build HConstructorFence for HNewArray/HNewInstance nodes

Also fixes:
* LSE and code_sinking, so they keep optimizing new-instance where they did before
* Various tests to expect constructor fences after new-instance

Side note: new-instance String does not get a ConstructorFence; the
special StringFactory calls are assumed to be self-fencing.

Metric changes on go/lem:
* CodeSize -0.262% in ART-Compile (ARMv8)
* RunTime -0.747% for all (linux-armv8)

(No changes expected on x86, where constructor fences are no-ops.)

The RunTime regression is temporary until art_quick_alloc_* entrypoints
have their DMBs removed in a follow-up CL.

Test: art/test.py
Bug: 36656456
Change-Id: I6a936a6e51c623e1c6b5b22eee5c3c72bebbed35
diff --git a/compiler/optimizing/nodes.cc b/compiler/optimizing/nodes.cc
index f250c1a..1460b26 100644
--- a/compiler/optimizing/nodes.cc
+++ b/compiler/optimizing/nodes.cc
@@ -1234,6 +1234,20 @@
   }
 }
 
+HInstruction* HConstructorFence::GetAssociatedAllocation() {
+  HInstruction* new_instance_inst = GetPrevious();
+  // Check if the immediately preceding instruction is a new-instance/new-array.
+  // Otherwise this fence is for protecting final fields.
+  if (new_instance_inst != nullptr &&
+      (new_instance_inst->IsNewInstance() || new_instance_inst->IsNewArray())) {
+    // TODO: Need to update this code to handle multiple inputs.
+    DCHECK_EQ(InputCount(), 1u);
+    return new_instance_inst;
+  } else {
+    return nullptr;
+  }
+}
+
 #define DEFINE_ACCEPT(name, super)                                             \
 void H##name::Accept(HGraphVisitor* visitor) {                                 \
   visitor->Visit##name(this);                                                  \