ARM64: FP16.compare() intrinsic for ARMv8
This CL implements an intrinsic for the FP16.compare() method using
ARMv8.2 FP16 instructions.
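
For reviewers, the ordering an intrinsic for compare() has to reproduce
can be sketched in plain Java as below. This is an illustrative sketch,
not the libcore source: the class name Fp16CompareSketch and its helpers
are hypothetical, and it assumes compare() follows the same total order
as Float.compare (-0.0 sorts before +0.0, every NaN is treated as equal
to every other NaN and greater than any other value).

```java
// Hypothetical sketch of a half-precision compare(); not the libcore code.
final class Fp16CompareSketch {
    // IEEE 754 binary16 bit patterns.
    static final short NAN = (short) 0x7e00;         // canonical quiet NaN
    static final int SIGN_MASK = 0x8000;
    static final int EXP_SIGNIFICAND_MASK = 0x7fff;  // all bits except the sign
    static final int POSITIVE_INFINITY = 0x7c00;

    // A value is NaN when its magnitude bits exceed those of +infinity.
    static boolean isNaN(short h) {
        return (h & EXP_SIGNIFICAND_MASK) > POSITIVE_INFINITY;
    }

    // Maps half bits to an int that orders like the numeric value.
    // Negative values become negated magnitudes, so -0.0 and +0.0 both map to 0.
    static int toOrderedInt(short h) {
        int bits = h & 0xffff;
        return (bits & SIGN_MASK) != 0 ? 0x8000 - bits : bits;
    }

    // Numeric "less than"; false if either operand is NaN.
    static boolean less(short x, short y) {
        if (isNaN(x) || isNaN(y)) return false;
        return toOrderedInt(x) < toOrderedInt(y);
    }

    // Numeric "greater than"; false if either operand is NaN.
    static boolean greater(short x, short y) {
        if (isNaN(x) || isNaN(y)) return false;
        return toOrderedInt(x) > toOrderedInt(y);
    }

    // Total-order comparison (assumed semantics, see lead-in).
    static int compare(short x, short y) {
        if (less(x, y)) return -1;
        if (greater(x, y)) return 1;
        // Here x and y are numerically equal or at least one is NaN.
        // Collapse all NaNs to one pattern, then compare as signed shorts so
        // that -0.0 (0x8000) sorts before +0.0 and NaN sorts after +infinity.
        short xBits = isNaN(x) ? NAN : x;
        short yBits = isNaN(y) ? NAN : y;
        return xBits == yBits ? 0 : (xBits < yBits ? -1 : 1);
    }
}
```

The NaN and signed-zero cases are the ones a raw floating-point compare
does not order by itself, so they are the cases worth checking against
the Java path.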
Performance improvement measured with the timeCompareFP16 benchmark
from the FP16Intrinsic micro benchmark on Pixel 4:
- Java implementation libcore.util.FP16.compare:
  - big cluster only: 742
  - little cluster only: 2286
- arm64 compare intrinsic implementation:
  - big cluster only: 492 (~34% faster)
  - little cluster only: 1535 (~33% faster)
The benchmark can be found in the following patch:
https://android-review.linaro.org/c/linaro/art-testing/+/21039
Authors: Usama Arif, Edward Pickup, Joel Goddard
Test: 580-checker-fp16
Test: art/test/testrunner/run_build_test_target.py -j80 art-test-javac
Change-Id: Idbe9f56f964f044e6d725bd696459fb04d2ac76c
9 files changed