Make suspend check test specific flags.

Make 20 bits in `Thread.tls32_.state_and_flags` available
for new uses.
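
To illustrate why testing specific flags frees up bits, here is a
minimal sketch. It assumes the old check compared the flags against
zero. The flag names, bit positions and helper functions are
hypothetical, chosen for illustration, and are not the actual
layout of `state_and_flags`.

```cpp
#include <cstdint>

// Hypothetical bit assignments, for illustration only.
constexpr uint32_t kSuspendRequest = 1u << 0;
constexpr uint32_t kCheckpointRequest = 1u << 1;
constexpr uint32_t kSuspendCheckMask = kSuspendRequest | kCheckpointRequest;

// A bit that some new, unrelated use might claim (hypothetical).
constexpr uint32_t kSomeNewFlag = 1u << 12;

// Compare-against-zero style: any set bit sends the thread to the slow path,
// so no bit in the tested field can be reused for other purposes.
constexpr bool NeedsSlowPathCompareZero(uint32_t state_and_flags) {
  return state_and_flags != 0u;
}

// Test-specific-flags style: only the masked bits matter, so the remaining
// bits can carry unrelated information without triggering the slow path.
constexpr bool NeedsSlowPathTestMask(uint32_t state_and_flags) {
  return (state_and_flags & kSuspendCheckMask) != 0u;
}

// Both styles agree when a suspend request is pending.
static_assert(NeedsSlowPathCompareZero(kSuspendRequest));
static_assert(NeedsSlowPathTestMask(kSuspendRequest));

// They differ once an unrelated bit is set: only the masked test ignores it.
static_assert(NeedsSlowPathCompareZero(kSomeNewFlag));
static_assert(!NeedsSlowPathTestMask(kSomeNewFlag));
```

The masked form needs an explicit mask operand, which is where the
per-architecture size differences listed below come from.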

Code size changes per suspend check:
  - x86/x86-64: +3B (CMP r/m32, imm8 -> TEST r/m32, imm32)
  - arm: none (CMP -> TST, both 32-bit with high register)
  - arm64: +4B (CBNZ/CBZ -> TST + B.NE/B.EQ)

Note: Using implicit suspend checks on arm64 would sidestep
this code size increase entirely.

Test: m test-art-host-gtest
Test: testrunner.py --host --optimizing
Test: run-gtests.sh
Test: testrunner.py --target --optimizing
Bug: 172332525
Change-Id: If5b0be0183efba3f397596b22e03a8b7afb87f85
13 files changed