ART: Simplify HRem to reuse existing HDiv
A pattern seen in libcore and SPECjvm2008 workloads is an HRem/HDiv pair
with the same dividend and divisor. The code generator processes the two
instructions separately and emits the division sequence twice.
This CL adds detection of this pattern to the instruction simplifier so that
the HRem reuses the result of the existing HDiv.
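
The arithmetic identity that makes such reuse possible can be sketched in plain C++, outside ART's IR. This is an illustration only, not the code in this CL: the struct and function names are hypothetical, and the sketch assumes a non-zero divisor and excludes the (INT32_MIN, -1) input, which is undefined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result holder, for illustration only.
struct DivRem {
  int32_t quotient;
  int32_t remainder;
};

// Baseline: the quotient and the remainder are computed independently,
// mirroring an HDiv and an HRem that are each lowered on their own.
inline DivRem DivRemSeparate(int32_t dividend, int32_t divisor) {
  assert(divisor != 0);
  return {dividend / divisor, dividend % divisor};
}

// Reuse: the remainder is derived from the quotient that already exists,
// using rem = dividend - quotient * divisor (truncating division).
inline DivRem DivRemReusingDiv(int32_t dividend, int32_t divisor) {
  assert(divisor != 0);
  int32_t quotient = dividend / divisor;
  int32_t remainder = dividend - quotient * divisor;
  return {quotient, remainder};
}
```

Both functions return the same pair for any input allowed by the stated assumptions; the second one performs a single division.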
This optimization interferes with HInductionVarAnalysis and HLoopOptimization
and can prevent some loop optimizations. To avoid this, the instruction
simplifier has a loop_friendly mode in which HRems inside a loop are not optimized.
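
The gating described above can be pictured with a toy model. This is a sketch under stated assumptions: ToyRem and ShouldReuseDiv are hypothetical names invented for illustration and do not correspond to ART classes or functions.

```cpp
#include <cassert>

// Toy stand-in for an HRem instruction; not ART's class.
struct ToyRem {
  bool in_loop;           // The instruction's block belongs to a loop.
  bool has_matching_div;  // An HDiv with the same dividend and divisor exists.
};

// Hypothetical gate: decides whether the HRem may be rewritten to reuse the HDiv.
inline bool ShouldReuseDiv(const ToyRem& rem, bool loop_friendly) {
  if (!rem.has_matching_div) {
    return false;  // Nothing to reuse.
  }
  if (loop_friendly && rem.in_loop) {
    return false;  // Leave the HRem intact for the loop analyses.
  }
  return true;
}
```

In this model, a simplifier run in loop_friendly mode skips HRems in loops, while a run without that mode rewrites every HRem that has a matching HDiv.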
A microbenchmark run on Pixel 3 shows the following improvements:
            | little cores | big cores
arm32 Int32 | +21%         | +40%
arm32 Int64 | +46%         | +44%
arm64 Int32 | +27%         | +14%
arm64 Int64 | +33%         | +27%
Test: 411-checker-instruct-simplifier-hrem
Test: test.py --host --optimizing --jit --gtest --interpreter
Test: test.py --target --optimizing --jit --interpreter
Test: run-gtests.sh
Change-Id: I376a1bd299d7fe10acad46771236edd5f85dfe56
diff --git a/compiler/optimizing/optimizing_compiler.cc b/compiler/optimizing/optimizing_compiler.cc
index 45d31ba..8d4aa9f 100644
--- a/compiler/optimizing/optimizing_compiler.cc
+++ b/compiler/optimizing/optimizing_compiler.cc
@@ -643,7 +643,7 @@
// Simplification.
OptDef(OptimizationPass::kConstantFolding,
"constant_folding$after_bce"),
- OptDef(OptimizationPass::kInstructionSimplifier,
+ OptDef(OptimizationPass::kAggressiveInstructionSimplifier,
"instruction_simplifier$after_bce"),
// Other high-level optimizations.
OptDef(OptimizationPass::kSideEffectsAnalysis,
@@ -656,7 +656,7 @@
// The codegen has a few assumptions that only the instruction simplifier
// can satisfy. For example, the code generator does not expect to see a
// HTypeConversion from a type to the same type.
- OptDef(OptimizationPass::kInstructionSimplifier,
+ OptDef(OptimizationPass::kAggressiveInstructionSimplifier,
"instruction_simplifier$before_codegen"),
// Eliminate constructor fences after code sinking to avoid
// complicated sinking logic to split a fence with many inputs.