ART: ARM64: Support DotProd SIMD idiom.

Implement support for vectorization idiom which performs dot
product of two vectors and adds the result to wider precision
components in the accumulator.

viz. DOT_PRODUCT([ a1, .. , am], [ x1, .. , xn ], [ y1, .. , yn ]) =
                 [ a1 + sum(xi * yi), .. , am + sum(xj * yj) ],
     for m <= n, non-overlapping sums,
     for either both signed or both unsigned operands x, y.

The patch shows up to 7x performance improvement on a micro
benchmark on Cortex-A57.

Test: 684-checker-simd-dotprod.
Test: test-art-host, test-art-target.

Change-Id: Ibab0d51f537fdecd1d84033197be3ebf5ec4e455
18 files changed