ARM64: Combine LSR+ADD into ADD_shift for Int32 HDiv/HRem
HDiv/HRem with a constant divisor are optimized by multiplying the
dividend by a magic number that acts as a reciprocal of the divisor.
For Int32 the multiplication is done into a 64-bit register, of which
only the high 32 bits are used.
The multiplication result might need an ADD/SUB correction.
Currently this is done by extracting the high 32 bits with LSR and then
applying ADD/SUB. However, the correcting ADD/SUB can be applied directly
to the high 32 bits of the 64-bit result, and those bits can then be
extracted by the final right shift. This eliminates the extracting LSR
instruction.
This CL implements this optimization.
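The equivalence of the two sequences can be sketched in plain C++. This is
only an illustration, not the code generator itself: the divisor 7, its
magic number and shift, the function names, and the instruction mnemonics in
the comments are assumptions chosen for the example. It also assumes that
right-shifting a negative signed value is an arithmetic shift (guaranteed
since C++20, common practice before).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants for this sketch: the commonly used signed magic
// number (0x92492493 as a signed 32-bit value) and shift for dividing by 7.
constexpr int32_t kMagic = -1840700269;
constexpr int kShift = 2;

// Previous sequence: extract the high 32 bits first, then correct and shift.
int32_t DivBy7WithSeparateExtract(int32_t n) {
  int64_t product = static_cast<int64_t>(n) * kMagic;  // 32x32 -> 64 multiply
  int32_t high = static_cast<int32_t>(product >> 32);  // extract high 32 bits
  int32_t corrected = high + n;       // ADD: divisor > 0 and magic < 0
  int32_t q = corrected >> kShift;    // final right shift
  return q + (q < 0 ? 1 : 0);         // increment a negative result by one
}

// New sequence: correct the high 32 bits in place, then let the final right
// shift (32 + kShift) extract them. No separate extracting shift is needed.
int32_t DivBy7WithShiftedAdd(int32_t n) {
  int64_t product = static_cast<int64_t>(n) * kMagic;  // 32x32 -> 64 multiply
  // Equivalent of ADD with the dividend shifted left by 32. Multiplication
  // is used instead of << to stay well defined for negative n.
  int64_t corrected = product + static_cast<int64_t>(n) * (INT64_C(1) << 32);
  int32_t q = static_cast<int32_t>(corrected >> (32 + kShift));
  return q + (q < 0 ? 1 : 0);         // increment a negative result by one
}

// Checks both sequences against the built-in truncating division.
void CheckAgainstReference(int32_t n) {
  assert(DivBy7WithSeparateExtract(n) == n / 7);
  assert(DivBy7WithShiftedAdd(n) == n / 7);
}
```

Calling CheckAgainstReference over a range of dividends, including
INT32_MIN and INT32_MAX, exercises both variants.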
Test: test.py --host --optimizing --jit
Test: test.py --target --optimizing --jit
Change-Id: I5ba557aa283291fd76d61ac0eb733cf6ea975116
diff --git a/compiler/optimizing/code_generator_arm64.h b/compiler/optimizing/code_generator_arm64.h
index 1e1c2a9..8349732 100644
--- a/compiler/optimizing/code_generator_arm64.h
+++ b/compiler/optimizing/code_generator_arm64.h
@@ -342,22 +342,14 @@
vixl::aarch64::Label* false_target);
void DivRemOneOrMinusOne(HBinaryOperation* instruction);
void DivRemByPowerOfTwo(HBinaryOperation* instruction);
-
- // Helper to generate code producing the final result of HDiv/HRem with a constant divisor.
- // 'temp_result' holds the result of multiplication of the dividend by a sort of reciprocal
- // of the divisor (magic_number). Based on magic_number and divisor, temp_result might need
- // to be corrected before applying final_right_shift.
- // If the code is generated for HRem the final temp_result is used for producing the
- // remainder.
- void GenerateResultDivRemWithAnyConstant(bool is_rem,
- int final_right_shift,
- int64_t magic_number,
- int64_t divisor,
- vixl::aarch64::Register dividend,
- vixl::aarch64::Register temp_result,
- vixl::aarch64::Register out,
- // This function may acquire a scratch register.
- vixl::aarch64::UseScratchRegisterScope* temps_scope);
+ void GenerateIncrementNegativeByOne(vixl::aarch64::Register out,
+ vixl::aarch64::Register in, bool use_cond_inc);
+ void GenerateResultRemWithAnyConstant(vixl::aarch64::Register out,
+ vixl::aarch64::Register dividend,
+ vixl::aarch64::Register quotient,
+ int64_t divisor,
+ // This function may acquire a scratch register.
+ vixl::aarch64::UseScratchRegisterScope* temps_scope);
void GenerateInt64DivRemWithAnyConstant(HBinaryOperation* instruction);
void GenerateInt32DivRemWithAnyConstant(HBinaryOperation* instruction);
void GenerateDivRemWithAnyConstant(HBinaryOperation* instruction);