optimizing: add block-scoped constructor fence merging pass
Introduce a new "Constructor Fence Redundancy Elimination" pass.
The pass currently performs local optimization only, i.e. within instructions
in the same basic block.
All constructor fences preceding a publish (e.g. store, invoke) get
merged into one instruction.
==============
OptStat#ConstructorFenceGeneratedNew: 43825
OptStat#ConstructorFenceGeneratedFinal: 17631 <+++
OptStat#ConstructorFenceRemovedLSE: 164
OptStat#ConstructorFenceRemovedPFRA: 9391
OptStat#ConstructorFenceRemovedCFRE: 16133 <---
Removes ~91.5% of the 'final' constructor fences in RitzBenchmark:
(We do not distinguish the exact reason that a fence was created, so
it's possible some "new" fences were also removed.)
==============
Test: art/test/run-test --host --optimizing 476-checker-ctor-fence-redun-elim
Bug: 36656456
Change-Id: I8020217b448ad96ce9b7640aa312ae784690ad99
diff --git a/compiler/optimizing/nodes.h b/compiler/optimizing/nodes.h
index 29be8ac..3e4928b 100644
--- a/compiler/optimizing/nodes.h
+++ b/compiler/optimizing/nodes.h
@@ -6634,13 +6634,24 @@
// Returns how many HConstructorFence instructions were removed from graph.
static size_t RemoveConstructorFences(HInstruction* instruction);
+ // Combine all inputs of `this` and `other` instruction and remove
+ // `other` from the graph.
+ //
+ // Inputs are unique after the merge.
+ //
+ // Requirement: `this` must not be the same as `other.
+ void Merge(HConstructorFence* other);
+
// Check if this constructor fence is protecting
// an HNewInstance or HNewArray that is also the immediate
// predecessor of `this`.
//
+ // If `ignore_inputs` is true, then the immediate predecessor doesn't need
+ // to be one of the inputs of `this`.
+ //
// Returns the associated HNewArray or HNewInstance,
// or null otherwise.
- HInstruction* GetAssociatedAllocation();
+ HInstruction* GetAssociatedAllocation(bool ignore_inputs = false);
DECLARE_INSTRUCTION(ConstructorFence);