Allow LSA to run with acquire/release operations

Load-store analysis (LSA) now runs on graphs that contain acquire
operations (monitor enter and volatile loads) and release operations
(monitor exit and volatile stores).
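
For illustration only, a minimal Java sketch of the kind of method
this affects. The class, field, and method names are hypothetical and
not taken from the ART sources or tests, and the comments describe
what a conservative compiler may assume, not what LSE is guaranteed
to do with this exact code.

```java
// Hypothetical example; not part of the ART sources or tests.
public final class Counter {
    private int base;                // plain (non-volatile) field
    private volatile boolean ready;  // volatile load = acquire, volatile store = release
    private final Object lock = new Object();

    public Counter(int base) {
        this.base = base;
    }

    // Mixes plain field accesses with a volatile store and a volatile load.
    public int publishAndRead() {
        int a = base;       // plain load
        int b = base;       // same field, no acquire in between: a candidate
                            // for reusing the value already loaded into `a`
        ready = true;       // volatile store (release)
        boolean r = ready;  // volatile load (acquire)
        int c = base;       // plain load after an acquire: another thread may
                            // have written `base`, so a conservative compiler
                            // reloads it here
        return r ? a + b + c : 0;
    }

    // Mixes plain field accesses with monitor enter (acquire) and
    // monitor exit (release).
    public int lockedSum() {
        synchronized (lock) {
            return base + base;
        }
    }
}
```

Single-threaded, new Counter(2).publishAndRead() returns 6 and
new Counter(3).lockedSum() returns 6, whether or not any load is
eliminated.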

This helps both load-store elimination (LSE) and the scheduler, and
reduces code size and memory use. For example, memory use drops by
~40KB (~0.1%) when compiling the Android framework for armv8.

Code size reductions (measured locally on a Pixel 5 with AOSP):
  Android Google Search App (AGSA): 209KB
  System server: 44KB
  System UI: 20KB
Each is a ~0.1% reduction for that compile.

Bug: 227283233
Test: art/test/testrunner/testrunner.py --host --64 --optimizing -b
Change-Id: I9ac79cf2324348414186f95e531c98b4215b28ea
8 files changed