runtime: Bitstring implementation for subtype checking (4/4).

Integrate the previous CLs into ART Runtime. Subsequent CLs to add
optimizing compiler support.

Use spare 24-bits from "Class#status_" field to
implement faster subtype checking in the runtime. Does not incur any extra memory overhead,
and (when in compiled code) this is always as fast or faster than the original check.

The new subtype checking is O(1) of the form:

  src <: target :=
    (*src).status >> #imm_target_mask == #imm_target_shifted

Based on the original prototype CL by Zhengkai Wu:
https://android-review.googlesource.com/#/c/platform/art/+/440996/

Test: art/test.py -b -j32 --host
Bug: 64692057
Change-Id: Iec3c54af529055a7f6147eebe5611d9ecd46942b
diff --git a/compiler/optimizing/code_generator_mips.cc b/compiler/optimizing/code_generator_mips.cc
index 9d0826f..8517711 100644
--- a/compiler/optimizing/code_generator_mips.cc
+++ b/compiler/optimizing/code_generator_mips.cc
@@ -1983,7 +1983,7 @@
 
 void InstructionCodeGeneratorMIPS::GenerateClassInitializationCheck(SlowPathCodeMIPS* slow_path,
                                                                     Register class_reg) {
-  __ LoadFromOffset(kLoadWord, TMP, class_reg, mirror::Class::StatusOffset().Int32Value());
+  __ LoadFromOffset(kLoadSignedByte, TMP, class_reg, mirror::Class::StatusOffset().Int32Value());
   __ LoadConst32(AT, mirror::Class::kStatusInitialized);
   __ Blt(TMP, AT, slow_path->GetEntryLabel());
   // Even if the initialized flag is set, we need to ensure consistent memory ordering.