Fast Art interpreter

Add a Dalvik-style fast interpreter to Art.
Three primary deficiencies in the existing Art interpreter
will be addressed:

1.  Structural inefficiencies (primarily the bloated
    fetch/decode/execute overhead of the C++ interpreter
    implementation).
2.  Stack memory wastage.  Each managed-language invoke
    adds a full copy of the interpreter's compiler-generated
    locals on the shared stack.  We're at the mercy of
    the compiler now in how much memory is wasted here.  An
    assembly based interpreter can manage memory usage more
    effectively.
3.  Shadow frame model, which not only spends twice the memory
    to store the Dalvik virtual registers, but causes vreg stores
    to happen twice.

This CL mostly deals with #1 (but does provide some stack memory
savings).  Subsequent CLs will address the other issues.

Current status:
   Passes all run-tests.
   Phone boots interpret-only.
   2.5x faster than Clang-compiled Art goto interpreter on fetch/decode/execute
       microbenchmark, 5x faster than gcc-compiled goto interpreter.
   1.6x faster than Clang goto on Caffeinemark overall
   2.0x faster than Clang switch on Caffeinemark overall
   68% of Dalvik interpreter performance on Caffeinemark (still much slower,
       primarily because of poor invoke performance and lack of execute-inline)
   Still nearly an order of magnitude slower than Dalvik on invokes
       (but slightly better than Art Clang goto interpreter.
   Importantly, saves ~200 bytes of stack memory per invoke (but still
       wastes ~400 relative to Dalvik).

What's needed:
   Remove the (large quantity of) bring-up hackery in place.
   Integrate into the build mechanism.  I'm still using the old Dalvik manual
       build step to generate assembly code from the stub files.
   Remove the suspend check hack.  For bring-up purposes, I'm using an explicit
       suspend check (like the other Art interpreters).  However, we should be
       doing a Dalvik style suspend check via the table base switch mechanism.
       This should be done during the alternative interpreter activation.
   General cleanup.
   Add CFI info.
   Update the new target bring-up README documentation.
   Add other targets.

In later CLs:
   Consolidate mterp handlers for expensive operations (such as new-instance) with
       the code used by the switch interpreter.  No need to duplicate the code for
       heavyweight operations (but will need some refactoring to align).
   Tuning - some fast paths needs to be moved down to the assembly handlers,
       rather than being dealt with in the out-of-line code.
   JIT profiling.  Currently, the fast interpreter is used only in the fast
       case - no instrumentation, no transactions and no access checks. We
       will want to implement fast + JIT-profiling as the alternate fast
       interpreter.  All other cases can still fall back to the reference
       interpreter.
   Improve invoke performance.  We're nearly an order of magnitude slower than
       Dalvik here.  Some of that is unavoidable, but I suspect we can do
       better.
   Add support for our other targets.

Change-Id: I43e25dc3d786fb87245705ac74a87274ad34fedc
diff --git a/runtime/interpreter/mterp/arm/alt_stub.S b/runtime/interpreter/mterp/arm/alt_stub.S
new file mode 100644
index 0000000..92ae0c6
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/alt_stub.S
@@ -0,0 +1,12 @@
+/*
+ * Inter-instruction transfer stub.  Call out to MterpCheckBefore to handle
+ * any interesting requests and then jump to the real instruction
+ * handler.  Note that the call to MterpCheckBefore is done as a tail call.
+ */
+    .extern MterpCheckBefore
+    EXPORT_PC
+    ldr    rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET]            @ refresh IBASE.
+    adrl   lr, artMterpAsmInstructionStart + (${opnum} * 128)       @ Addr of primary handler.
+    mov    r0, rSELF
+    add    r1, rFP, #OFF_FP_SHADOWFRAME
+    b      MterpCheckBefore     @ (self, shadow_frame)              @ Tail call.
diff --git a/runtime/interpreter/mterp/arm/bincmp.S b/runtime/interpreter/mterp/arm/bincmp.S
new file mode 100644
index 0000000..474bc3c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/bincmp.S
@@ -0,0 +1,36 @@
+    /*
+     * Generic two-operand compare-and-branch operation.  Provide a "revcmp"
+     * fragment that specifies the *reverse* comparison to perform, e.g.
+     * for "if-le" you would use "gt".
+     *
+     * For: if-eq, if-ne, if-lt, if-ge, if-gt, if-le
+     */
+    /* if-cmp vA, vB, +CCCC */
+#if MTERP_SUSPEND
+    mov     r1, rINST, lsr #12          @ r1<- B
+    ubfx    r0, rINST, #8, #4           @ r0<- A
+    GET_VREG r3, r1                     @ r3<- vB
+    GET_VREG r2, r0                     @ r2<- vA
+    FETCH_S r1, 1                       @ r1<- branch offset, in code units
+    cmp     r2, r3                      @ compare (vA, vB)
+    mov${revcmp} r1, #2                 @ r1<- BYTE branch dist for not-taken
+    adds    r2, r1, r1                  @ convert to bytes, check sign
+    FETCH_ADVANCE_INST_RB r2            @ update rPC, load rINST
+    ldrmi   rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET]  @ refresh rIBASE
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#else
+    mov     r1, rINST, lsr #12          @ r1<- B
+    ubfx    r0, rINST, #8, #4           @ r0<- A
+    GET_VREG r3, r1                     @ r3<- vB
+    GET_VREG r2, r0                     @ r2<- vA
+    ldr     lr, [rSELF, #THREAD_FLAGS_OFFSET]
+    FETCH_S r1, 1                       @ r1<- branch offset, in code units
+    cmp     r2, r3                      @ compare (vA, vB)
+    mov${revcmp} r1, #2                 @ r1<- BYTE branch dist for not-taken
+    adds    r2, r1, r1                  @ convert to bytes, check sign
+    FETCH_ADVANCE_INST_RB r2            @ update rPC, load rINST
+    bmi     MterpCheckSuspendAndContinue
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#endif
diff --git a/runtime/interpreter/mterp/arm/binop.S b/runtime/interpreter/mterp/arm/binop.S
new file mode 100644
index 0000000..eeb72ef
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/binop.S
@@ -0,0 +1,35 @@
+%default {"preinstr":"", "result":"r0", "chkzero":"0"}
+    /*
+     * Generic 32-bit binary operation.  Provide an "instr" line that
+     * specifies an instruction that performs "result = r0 op r1".
+     * This could be an ARM instruction or a function call.  (If the result
+     * comes back in a register other than r0, you can override "result".)
+     *
+     * If "chkzero" is set to 1, we perform a divide-by-zero check on
+     * vCC (r1).  Useful for integer division and modulus.  Note that we
+     * *don't* check for (INT_MIN / -1) here, because the ARM math lib
+     * handles it correctly.
+     *
+     * For: add-int, sub-int, mul-int, div-int, rem-int, and-int, or-int,
+     *      xor-int, shl-int, shr-int, ushr-int, add-float, sub-float,
+     *      mul-float, div-float, rem-float
+     */
+    /* binop vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    mov     r3, r0, lsr #8              @ r3<- CC
+    and     r2, r0, #255                @ r2<- BB
+    GET_VREG r1, r3                     @ r1<- vCC
+    GET_VREG r0, r2                     @ r0<- vBB
+    .if $chkzero
+    cmp     r1, #0                      @ is second operand zero?
+    beq     common_errDivideByZero
+    .endif
+
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    $preinstr                           @ optional op; may set condition codes
+    $instr                              @ $result<- op, r0-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG $result, r9                @ vAA<- $result
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 11-14 instructions */
diff --git a/runtime/interpreter/mterp/arm/binop2addr.S b/runtime/interpreter/mterp/arm/binop2addr.S
new file mode 100644
index 0000000..d09a43a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/binop2addr.S
@@ -0,0 +1,32 @@
+%default {"preinstr":"", "result":"r0", "chkzero":"0"}
+    /*
+     * Generic 32-bit "/2addr" binary operation.  Provide an "instr" line
+     * that specifies an instruction that performs "result = r0 op r1".
+     * This could be an ARM instruction or a function call.  (If the result
+     * comes back in a register other than r0, you can override "result".)
+     *
+     * If "chkzero" is set to 1, we perform a divide-by-zero check on
+     * vCC (r1).  Useful for integer division and modulus.
+     *
+     * For: add-int/2addr, sub-int/2addr, mul-int/2addr, div-int/2addr,
+     *      rem-int/2addr, and-int/2addr, or-int/2addr, xor-int/2addr,
+     *      shl-int/2addr, shr-int/2addr, ushr-int/2addr, add-float/2addr,
+     *      sub-float/2addr, mul-float/2addr, div-float/2addr, rem-float/2addr
+     */
+    /* binop/2addr vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r1, r3                     @ r1<- vB
+    GET_VREG r0, r9                     @ r0<- vA
+    .if $chkzero
+    cmp     r1, #0                      @ is second operand zero?
+    beq     common_errDivideByZero
+    .endif
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+
+    $preinstr                           @ optional op; may set condition codes
+    $instr                              @ $result<- op, r0-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG $result, r9                @ vAA<- $result
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-13 instructions */
diff --git a/runtime/interpreter/mterp/arm/binopLit16.S b/runtime/interpreter/mterp/arm/binopLit16.S
new file mode 100644
index 0000000..065394e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/binopLit16.S
@@ -0,0 +1,29 @@
+%default {"result":"r0", "chkzero":"0"}
+    /*
+     * Generic 32-bit "lit16" binary operation.  Provide an "instr" line
+     * that specifies an instruction that performs "result = r0 op r1".
+     * This could be an ARM instruction or a function call.  (If the result
+     * comes back in a register other than r0, you can override "result".)
+     *
+     * If "chkzero" is set to 1, we perform a divide-by-zero check on
+     * vCC (r1).  Useful for integer division and modulus.
+     *
+     * For: add-int/lit16, rsub-int, mul-int/lit16, div-int/lit16,
+     *      rem-int/lit16, and-int/lit16, or-int/lit16, xor-int/lit16
+     */
+    /* binop/lit16 vA, vB, #+CCCC */
+    FETCH_S r1, 1                       @ r1<- ssssCCCC (sign-extended)
+    mov     r2, rINST, lsr #12          @ r2<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r0, r2                     @ r0<- vB
+    .if $chkzero
+    cmp     r1, #0                      @ is second operand zero?
+    beq     common_errDivideByZero
+    .endif
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+
+    $instr                              @ $result<- op, r0-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG $result, r9                @ vAA<- $result
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-13 instructions */
diff --git a/runtime/interpreter/mterp/arm/binopLit8.S b/runtime/interpreter/mterp/arm/binopLit8.S
new file mode 100644
index 0000000..ec0b3c4
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/binopLit8.S
@@ -0,0 +1,32 @@
+%default {"preinstr":"", "result":"r0", "chkzero":"0"}
+    /*
+     * Generic 32-bit "lit8" binary operation.  Provide an "instr" line
+     * that specifies an instruction that performs "result = r0 op r1".
+     * This could be an ARM instruction or a function call.  (If the result
+     * comes back in a register other than r0, you can override "result".)
+     *
+     * If "chkzero" is set to 1, we perform a divide-by-zero check on
+     * vCC (r1).  Useful for integer division and modulus.
+     *
+     * For: add-int/lit8, rsub-int/lit8, mul-int/lit8, div-int/lit8,
+     *      rem-int/lit8, and-int/lit8, or-int/lit8, xor-int/lit8,
+     *      shl-int/lit8, shr-int/lit8, ushr-int/lit8
+     */
+    /* binop/lit8 vAA, vBB, #+CC */
+    FETCH_S r3, 1                       @ r3<- ssssCCBB (sign-extended for CC
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r3, #255                @ r2<- BB
+    GET_VREG r0, r2                     @ r0<- vBB
+    movs    r1, r3, asr #8              @ r1<- ssssssCC (sign extended)
+    .if $chkzero
+    @cmp     r1, #0                     @ is second operand zero?
+    beq     common_errDivideByZero
+    .endif
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+
+    $preinstr                           @ optional op; may set condition codes
+    $instr                              @ $result<- op, r0-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG $result, r9                @ vAA<- $result
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-12 instructions */
diff --git a/runtime/interpreter/mterp/arm/binopWide.S b/runtime/interpreter/mterp/arm/binopWide.S
new file mode 100644
index 0000000..57d43c6
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/binopWide.S
@@ -0,0 +1,38 @@
+%default {"preinstr":"", "result0":"r0", "result1":"r1", "chkzero":"0"}
+    /*
+     * Generic 64-bit binary operation.  Provide an "instr" line that
+     * specifies an instruction that performs "result = r0-r1 op r2-r3".
+     * This could be an ARM instruction or a function call.  (If the result
+     * comes back in a register other than r0, you can override "result".)
+     *
+     * If "chkzero" is set to 1, we perform a divide-by-zero check on
+     * vCC (r1).  Useful for integer division and modulus.
+     *
+     * for: add-long, sub-long, div-long, rem-long, and-long, or-long,
+     *      xor-long, add-double, sub-double, mul-double, div-double,
+     *      rem-double
+     *
+     * IMPORTANT: you may specify "chkzero" or "preinstr" but not both.
+     */
+    /* binop vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r0, #255                @ r2<- BB
+    mov     r3, r0, lsr #8              @ r3<- CC
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[AA]
+    add     r2, rFP, r2, lsl #2         @ r2<- &fp[BB]
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[CC]
+    ldmia   r2, {r0-r1}                 @ r0/r1<- vBB/vBB+1
+    ldmia   r3, {r2-r3}                 @ r2/r3<- vCC/vCC+1
+    .if $chkzero
+    orrs    ip, r2, r3                  @ second arg (r2-r3) is zero?
+    beq     common_errDivideByZero
+    .endif
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+
+    $preinstr                           @ optional op; may set condition codes
+    $instr                              @ result<- op, r0-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {$result0,$result1}     @ vAA/vAA+1<- $result0/$result1
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 14-17 instructions */
diff --git a/runtime/interpreter/mterp/arm/binopWide2addr.S b/runtime/interpreter/mterp/arm/binopWide2addr.S
new file mode 100644
index 0000000..4e855f2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/binopWide2addr.S
@@ -0,0 +1,34 @@
+%default {"preinstr":"", "result0":"r0", "result1":"r1", "chkzero":"0"}
+    /*
+     * Generic 64-bit "/2addr" binary operation.  Provide an "instr" line
+     * that specifies an instruction that performs "result = r0-r1 op r2-r3".
+     * This could be an ARM instruction or a function call.  (If the result
+     * comes back in a register other than r0, you can override "result".)
+     *
+     * If "chkzero" is set to 1, we perform a divide-by-zero check on
+     * vCC (r1).  Useful for integer division and modulus.
+     *
+     * For: add-long/2addr, sub-long/2addr, div-long/2addr, rem-long/2addr,
+     *      and-long/2addr, or-long/2addr, xor-long/2addr, add-double/2addr,
+     *      sub-double/2addr, mul-double/2addr, div-double/2addr,
+     *      rem-double/2addr
+     */
+    /* binop/2addr vA, vB */
+    mov     r1, rINST, lsr #12          @ r1<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    add     r1, rFP, r1, lsl #2         @ r1<- &fp[B]
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[A]
+    ldmia   r1, {r2-r3}                 @ r2/r3<- vBB/vBB+1
+    ldmia   r9, {r0-r1}                 @ r0/r1<- vAA/vAA+1
+    .if $chkzero
+    orrs    ip, r2, r3                  @ second arg (r2-r3) is zero?
+    beq     common_errDivideByZero
+    .endif
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+
+    $preinstr                           @ optional op; may set condition codes
+    $instr                              @ result<- op, r0-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {$result0,$result1}     @ vAA/vAA+1<- $result0/$result1
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 12-15 instructions */
diff --git a/runtime/interpreter/mterp/arm/entry.S b/runtime/interpreter/mterp/arm/entry.S
new file mode 100644
index 0000000..4c5ffc5
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/entry.S
@@ -0,0 +1,64 @@
+/*
+ * Copyright (C) 2016 The Android Open Source Project
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+/*
+ * Interpreter entry point.
+ */
+
+    .text
+    .align  2
+    .global ExecuteMterpImpl
+    .type   ExecuteMterpImpl, %function
+
+/*
+ * On entry:
+ *  r0  Thread* self/
+ *  r1  code_item
+ *  r2  ShadowFrame
+ *  r3  JValue* result_register
+ *
+ */
+
+ExecuteMterpImpl:
+    .fnstart
+    .save {r4-r10,fp,lr}
+    stmfd   sp!, {r4-r10,fp,lr}         @ save 9 regs
+    .pad    #4
+    sub     sp, sp, #4                  @ align 64
+
+    /* Remember the return register */
+    str     r3, [r2, #SHADOWFRAME_RESULT_REGISTER_OFFSET]
+
+    /* Remember the code_item */
+    str     r1, [r2, #SHADOWFRAME_CODE_ITEM_OFFSET]
+
+    /* set up "named" registers */
+    mov     rSELF, r0
+    ldr     r0, [r2, #SHADOWFRAME_NUMBER_OF_VREGS_OFFSET]
+    add     rFP, r2, #SHADOWFRAME_VREGS_OFFSET     @ point to insns[] (i.e. - the dalivk byte code).
+    add     rREFS, rFP, r0, lsl #2                 @ point to reference array in shadow frame
+    ldr     r0, [r2, #SHADOWFRAME_DEX_PC_OFFSET]   @ Get starting dex_pc.
+    add     rPC, r1, #CODEITEM_INSNS_OFFSET        @ Point to base of insns[]
+    add     rPC, rPC, r0, lsl #1                   @ Create direct pointer to 1st dex opcode
+    EXPORT_PC
+
+    /* Starting ibase */
+    ldr     rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET]
+
+    /* start executing the instruction at rPC */
+    FETCH_INST                          @ load rINST from rPC
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* NOTE: no fallthrough */
diff --git a/runtime/interpreter/mterp/arm/fallback.S b/runtime/interpreter/mterp/arm/fallback.S
new file mode 100644
index 0000000..44e7e12
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/fallback.S
@@ -0,0 +1,3 @@
+/* Transfer stub to alternate interpreter */
+    b    MterpFallback
+
diff --git a/runtime/interpreter/mterp/arm/fbinop.S b/runtime/interpreter/mterp/arm/fbinop.S
new file mode 100644
index 0000000..594ee03
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/fbinop.S
@@ -0,0 +1,23 @@
+    /*
+     * Generic 32-bit floating-point operation.  Provide an "instr" line that
+     * specifies an instruction that performs "s2 = s0 op s1".  Because we
+     * use the "softfp" ABI, this must be an instruction, not a function call.
+     *
+     * For: add-float, sub-float, mul-float, div-float
+     */
+    /* floatop vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    mov     r3, r0, lsr #8              @ r3<- CC
+    and     r2, r0, #255                @ r2<- BB
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vCC
+    VREG_INDEX_TO_ADDR r2, r2           @ r2<- &vBB
+    flds    s1, [r3]                    @ s1<- vCC
+    flds    s0, [r2]                    @ s0<- vBB
+
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    $instr                              @ s2<- op
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    VREG_INDEX_TO_ADDR r9, r9           @ r9<- &vAA
+    fsts    s2, [r9]                    @ vAA<- s2
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/fbinop2addr.S b/runtime/interpreter/mterp/arm/fbinop2addr.S
new file mode 100644
index 0000000..b052a29
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/fbinop2addr.S
@@ -0,0 +1,21 @@
+    /*
+     * Generic 32-bit floating point "/2addr" binary operation.  Provide
+     * an "instr" line that specifies an instruction that performs
+     * "s2 = s0 op s1".
+     *
+     * For: add-float/2addr, sub-float/2addr, mul-float/2addr, div-float/2addr
+     */
+    /* binop/2addr vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    mov     r9, rINST, lsr #8           @ r9<- A+
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vB
+    and     r9, r9, #15                 @ r9<- A
+    flds    s1, [r3]                    @ s1<- vB
+    VREG_INDEX_TO_ADDR r9, r9           @ r9<- &vA
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    flds    s0, [r9]                    @ s0<- vA
+
+    $instr                              @ s2<- op
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    fsts    s2, [r9]                    @ vAA<- s2
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/fbinopWide.S b/runtime/interpreter/mterp/arm/fbinopWide.S
new file mode 100644
index 0000000..1bed817
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/fbinopWide.S
@@ -0,0 +1,23 @@
+    /*
+     * Generic 64-bit double-precision floating point binary operation.
+     * Provide an "instr" line that specifies an instruction that performs
+     * "d2 = d0 op d1".
+     *
+     * for: add-double, sub-double, mul-double, div-double
+     */
+    /* doubleop vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    mov     r3, r0, lsr #8              @ r3<- CC
+    and     r2, r0, #255                @ r2<- BB
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vCC
+    VREG_INDEX_TO_ADDR r2, r2           @ r2<- &vBB
+    fldd    d1, [r3]                    @ d1<- vCC
+    fldd    d0, [r2]                    @ d0<- vBB
+
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    $instr                              @ s2<- op
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    VREG_INDEX_TO_ADDR r9, r9           @ r9<- &vAA
+    fstd    d2, [r9]                    @ vAA<- d2
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/fbinopWide2addr.S b/runtime/interpreter/mterp/arm/fbinopWide2addr.S
new file mode 100644
index 0000000..9f56986
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/fbinopWide2addr.S
@@ -0,0 +1,22 @@
+    /*
+     * Generic 64-bit floating point "/2addr" binary operation.  Provide
+     * an "instr" line that specifies an instruction that performs
+     * "d2 = d0 op d1".
+     *
+     * For: add-double/2addr, sub-double/2addr, mul-double/2addr,
+     *      div-double/2addr
+     */
+    /* binop/2addr vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    mov     r9, rINST, lsr #8           @ r9<- A+
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vB
+    and     r9, r9, #15                 @ r9<- A
+    fldd    d1, [r3]                    @ d1<- vB
+    VREG_INDEX_TO_ADDR r9, r9           @ r9<- &vA
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    fldd    d0, [r9]                    @ d0<- vA
+
+    $instr                              @ d2<- op
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    fstd    d2, [r9]                    @ vAA<- d2
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/footer.S b/runtime/interpreter/mterp/arm/footer.S
new file mode 100644
index 0000000..75e0037
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/footer.S
@@ -0,0 +1,168 @@
+/*
+ * ===========================================================================
+ *  Common subroutines and data
+ * ===========================================================================
+ */
+
+    .text
+    .align  2
+
+/*
+ * We've detected a condition that will result in an exception, but the exception
+ * has not yet been thrown.  Just bail out to the reference interpreter to deal with it.
+ * TUNING: for consistency, we may want to just go ahead and handle these here.
+ */
+#define MTERP_LOGGING 0
+common_errDivideByZero:
+    EXPORT_PC
+#if MTERP_LOGGING
+    mov  r0, rSELF
+    add  r1, rFP, #OFF_FP_SHADOWFRAME
+    bl MterpLogDivideByZeroException
+#endif
+    b MterpCommonFallback
+
+common_errArrayIndex:
+    EXPORT_PC
+#if MTERP_LOGGING
+    mov  r0, rSELF
+    add  r1, rFP, #OFF_FP_SHADOWFRAME
+    bl MterpLogArrayIndexException
+#endif
+    b MterpCommonFallback
+
+common_errNegativeArraySize:
+    EXPORT_PC
+#if MTERP_LOGGING
+    mov  r0, rSELF
+    add  r1, rFP, #OFF_FP_SHADOWFRAME
+    bl MterpLogNegativeArraySizeException
+#endif
+    b MterpCommonFallback
+
+common_errNoSuchMethod:
+    EXPORT_PC
+#if MTERP_LOGGING
+    mov  r0, rSELF
+    add  r1, rFP, #OFF_FP_SHADOWFRAME
+    bl MterpLogNoSuchMethodException
+#endif
+    b MterpCommonFallback
+
+common_errNullObject:
+    EXPORT_PC
+#if MTERP_LOGGING
+    mov  r0, rSELF
+    add  r1, rFP, #OFF_FP_SHADOWFRAME
+    bl MterpLogNullObjectException
+#endif
+    b MterpCommonFallback
+
+common_exceptionThrown:
+    EXPORT_PC
+#if MTERP_LOGGING
+    mov  r0, rSELF
+    add  r1, rFP, #OFF_FP_SHADOWFRAME
+    bl MterpLogExceptionThrownException
+#endif
+    b MterpCommonFallback
+
+MterpSuspendFallback:
+    EXPORT_PC
+#if MTERP_LOGGING
+    mov  r0, rSELF
+    add  r1, rFP, #OFF_FP_SHADOWFRAME
+    ldr  r2, [rSELF, #THREAD_FLAGS_OFFSET]
+    bl MterpLogSuspendFallback
+#endif
+    b MterpCommonFallback
+
+/*
+ * If we're here, something is out of the ordinary.  If there is a pending
+ * exception, handle it.  Otherwise, roll back and retry with the reference
+ * interpreter.
+ */
+MterpPossibleException:
+    ldr     r0, [rSELF, #THREAD_EXCEPTION_OFFSET]
+    cmp     r0, #0                                  @ Exception pending?
+    beq     MterpFallback                           @ If not, fall back to reference interpreter.
+    /* intentional fallthrough - handle pending exception. */
+/*
+ * On return from a runtime helper routine, we've found a pending exception.
+ * Can we handle it here - or need to bail out to caller?
+ *
+ */
+MterpException:
+    mov     r0, rSELF
+    add     r1, rFP, #OFF_FP_SHADOWFRAME
+    bl      MterpHandleException                    @ (self, shadow_frame)
+    cmp     r0, #0
+    beq     MterpExceptionReturn                    @ no local catch, back to caller.
+    ldr     r0, [rFP, #OFF_FP_CODE_ITEM]
+    ldr     r1, [rFP, #OFF_FP_DEX_PC]
+    ldr     rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET]
+    add     rPC, r0, #CODEITEM_INSNS_OFFSET
+    add     rPC, rPC, r1, lsl #1                    @ generate new dex_pc_ptr
+    str     rPC, [rFP, #OFF_FP_DEX_PC_PTR]
+    /* resume execution at catch block */
+    FETCH_INST
+    GET_INST_OPCODE ip
+    GOTO_OPCODE ip
+    /* NOTE: no fallthrough */
+
+/*
+ * Check for suspend check request.  Assumes rINST already loaded, rPC advanced and
+ * still needs to get the opcode and branch to it, and flags are in lr.
+ */
+MterpCheckSuspendAndContinue:
+    ldr     rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET]  @ refresh rIBASE
+    EXPORT_PC
+    mov     r0, rSELF
+    ands    lr, #(THREAD_SUSPEND_REQUEST | THREAD_CHECKPOINT_REQUEST)
+    blne    MterpSuspendCheck           @ (self)
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+
+/*
+ * Bail out to reference interpreter.
+ */
+MterpFallback:
+    EXPORT_PC
+    mov  r0, rSELF
+    add  r1, rFP, #OFF_FP_SHADOWFRAME
+    bl MterpLogFallback
+MterpCommonFallback:
+    mov     r0, #0                                  @ signal retry with reference interpreter.
+    b       MterpDone
+
+/*
+ * We pushed some registers on the stack in ExecuteMterpImpl, then saved
+ * SP and LR.  Here we restore SP, restore the registers, and then restore
+ * LR to PC.
+ *
+ * On entry:
+ *  uint32_t* rFP  (should still be live, pointer to base of vregs)
+ */
+MterpExceptionReturn:
+    ldr     r2, [rFP, #OFF_FP_RESULT_REGISTER]
+    str     r0, [r2]
+    str     r1, [r2, #4]
+    mov     r0, #1                                  @ signal return to caller.
+    b MterpDone
+MterpReturn:
+    ldr     r2, [rFP, #OFF_FP_RESULT_REGISTER]
+    ldr     lr, [rSELF, #THREAD_FLAGS_OFFSET]
+    str     r0, [r2]
+    str     r1, [r2, #4]
+    mov     r0, rSELF
+    ands    lr, #(THREAD_SUSPEND_REQUEST | THREAD_CHECKPOINT_REQUEST)
+    blne    MterpSuspendCheck                       @ (self)
+    mov     r0, #1                                  @ signal return to caller.
+MterpDone:
+    add     sp, sp, #4                              @ un-align 64
+    ldmfd   sp!, {r4-r10,fp,pc}                     @ restore 9 regs and return
+
+
+    .fnend
+    .size   ExecuteMterpImpl, .-ExecuteMterpImpl
+
diff --git a/runtime/interpreter/mterp/arm/funop.S b/runtime/interpreter/mterp/arm/funop.S
new file mode 100644
index 0000000..d7a0859
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/funop.S
@@ -0,0 +1,18 @@
+    /*
+     * Generic 32-bit unary floating-point operation.  Provide an "instr"
+     * line that specifies an instruction that performs "s1 = op s0".
+     *
+     * for: int-to-float, float-to-int
+     */
+    /* unop vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    mov     r9, rINST, lsr #8           @ r9<- A+
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vB
+    flds    s0, [r3]                    @ s0<- vB
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    and     r9, r9, #15                 @ r9<- A
+    $instr                              @ s1<- op
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    VREG_INDEX_TO_ADDR r9, r9           @ r9<- &vA
+    fsts    s1, [r9]                    @ vA<- s1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/funopNarrower.S b/runtime/interpreter/mterp/arm/funopNarrower.S
new file mode 100644
index 0000000..9daec28
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/funopNarrower.S
@@ -0,0 +1,18 @@
+    /*
+     * Generic 64bit-to-32bit unary floating point operation.  Provide an
+     * "instr" line that specifies an instruction that performs "s0 = op d0".
+     *
+     * For: double-to-int, double-to-float
+     */
+    /* unop vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    mov     r9, rINST, lsr #8           @ r9<- A+
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vB
+    fldd    d0, [r3]                    @ d0<- vB
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    and     r9, r9, #15                 @ r9<- A
+    $instr                              @ s0<- op
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    VREG_INDEX_TO_ADDR r9, r9           @ r9<- &vA
+    fsts    s0, [r9]                    @ vA<- s0
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/funopWider.S b/runtime/interpreter/mterp/arm/funopWider.S
new file mode 100644
index 0000000..087a1f2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/funopWider.S
@@ -0,0 +1,18 @@
+    /*
+     * Generic 32bit-to-64bit floating point unary operation.  Provide an
+     * "instr" line that specifies an instruction that performs "d0 = op s0".
+     *
+     * For: int-to-double, float-to-double
+     */
+    /* unop vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    mov     r9, rINST, lsr #8           @ r9<- A+
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vB
+    flds    s0, [r3]                    @ s0<- vB
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    and     r9, r9, #15                 @ r9<- A
+    $instr                              @ d0<- op
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    VREG_INDEX_TO_ADDR r9, r9           @ r9<- &vA
+    fstd    d0, [r9]                    @ vA<- d0
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/header.S b/runtime/interpreter/mterp/arm/header.S
new file mode 100644
index 0000000..14319d9
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/header.S
@@ -0,0 +1,279 @@
+/*
+ * Copyright (C) 2016 The Android Open Source Project
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+  Art assembly interpreter notes:
+
+  First validate assembly code by implementing ExecuteXXXImpl() style body (doesn't
+  handle invoke, allows higher-level code to create frame & shadow frame.
+
+  Once that's working, support direct entry code & eliminate shadow frame (and
+  excess locals allocation.
+
+  Some (hopefully) temporary ugliness.  We'll treat rFP as pointing to the
+  base of the vreg array within the shadow frame.  Access the other fields,
+  dex_pc_, method_ and number_of_vregs_ via negative offsets.  For now, we'll continue
+  the shadow frame mechanism of double-storing object references - via rFP &
+  number_of_vregs_.
+
+ */
+
+/*
+ARM EABI general notes:
+
+r0-r3 hold first 4 args to a method; they are not preserved across method calls
+r4-r8 are available for general use
+r9 is given special treatment in some situations, but not for us
+r10 (sl) seems to be generally available
+r11 (fp) is used by gcc (unless -fomit-frame-pointer is set)
+r12 (ip) is scratch -- not preserved across method calls
+r13 (sp) should be managed carefully in case a signal arrives
+r14 (lr) must be preserved
+r15 (pc) can be tinkered with directly
+
+r0 holds returns of <= 4 bytes
+r0-r1 hold returns of 8 bytes, low word in r0
+
+Callee must save/restore r4+ (except r12) if it modifies them.  If VFP
+is present, registers s16-s31 (a/k/a d8-d15, a/k/a q4-q7) must be preserved,
+s0-s15 (d0-d7, q0-a3) do not need to be.
+
+Stack is "full descending".  Only the arguments that don't fit in the first 4
+registers are placed on the stack.  "sp" points at the first stacked argument
+(i.e. the 5th arg).
+
+VFP: single-precision results in s0, double-precision results in d0.
+
+In the EABI, "sp" must be 64-bit aligned on entry to a function, and any
+64-bit quantities (long long, double) must be 64-bit aligned.
+*/
+
+/*
+Mterp and ARM notes:
+
+The following registers have fixed assignments:
+
+  reg nick      purpose
+  r4  rPC       interpreted program counter, used for fetching instructions
+  r5  rFP       interpreted frame pointer, used for accessing locals and args
+  r6  rSELF     self (Thread) pointer
+  r7  rINST     first 16-bit code unit of current instruction
+  r8  rIBASE    interpreted instruction base pointer, used for computed goto
+  r11 rREFS	base of object references in shadow frame  (ideally, we'll get rid of this later).
+
+Macros are provided for common operations.  Each macro MUST emit only
+one instruction to make instruction-counting easier.  They MUST NOT alter
+unspecified registers or condition codes.
+*/
+
+/*
+ * This is a #include, not a %include, because we want the C pre-processor
+ * to expand the macros into assembler assignment statements.
+ */
+#include "asm_support.h"
+
+/* During bringup, we'll use the shadow frame model instead of rFP */
+/* single-purpose registers, given names for clarity */
+#define rPC     r4
+#define rFP     r5
+#define rSELF   r6
+#define rINST   r7
+#define rIBASE  r8
+#define rREFS   r11
+
+/*
+ * Instead of holding a pointer to the shadow frame, we keep rFP at the base of the vregs.  So,
+ * to access other shadow frame fields, we need to use a backwards offset.  Define those here.
+ */
+#define OFF_FP(a) (a - SHADOWFRAME_VREGS_OFFSET)
+#define OFF_FP_NUMBER_OF_VREGS OFF_FP(SHADOWFRAME_NUMBER_OF_VREGS_OFFSET)
+#define OFF_FP_DEX_PC OFF_FP(SHADOWFRAME_DEX_PC_OFFSET)
+#define OFF_FP_LINK OFF_FP(SHADOWFRAME_LINK_OFFSET)
+#define OFF_FP_METHOD OFF_FP(SHADOWFRAME_METHOD_OFFSET)
+#define OFF_FP_RESULT_REGISTER OFF_FP(SHADOWFRAME_RESULT_REGISTER_OFFSET)
+#define OFF_FP_DEX_PC_PTR OFF_FP(SHADOWFRAME_DEX_PC_PTR_OFFSET)
+#define OFF_FP_CODE_ITEM OFF_FP(SHADOWFRAME_CODE_ITEM_OFFSET)
+#define OFF_FP_SHADOWFRAME (-SHADOWFRAME_VREGS_OFFSET)
+
+/*
+ *
+ * The reference interpreter performs explicit suspect checks, which is somewhat wasteful.
+ * Dalvik's interpreter folded suspend checks into the jump table mechanism, and eventually
+ * mterp should do so as well.
+ */
+#define MTERP_SUSPEND 0
+
+/*
+ * "export" the PC to dex_pc field in the shadow frame, f/b/o future exception objects.  Must
+ * be done *before* something throws.
+ *
+ * It's okay to do this more than once.
+ *
+ * NOTE: the fast interpreter keeps track of dex pc as a direct pointer to the mapped
+ * dex byte codes.  However, the rest of the runtime expects dex pc to be an instruction
+ * offset into the code_items_[] array.  For effiency, we will "export" the
+ * current dex pc as a direct pointer using the EXPORT_PC macro, and rely on GetDexPC
+ * to convert to a dex pc when needed.
+ */
+.macro EXPORT_PC
+    str  rPC, [rFP, #OFF_FP_DEX_PC_PTR]
+.endm
+
+.macro EXPORT_DEX_PC tmp
+    ldr  \tmp, [rFP, #OFF_FP_CODE_ITEM]
+    str  rPC, [rFP, #OFF_FP_DEX_PC_PTR]
+    add  \tmp, #CODEITEM_INSNS_OFFSET
+    sub  \tmp, rPC, \tmp
+    asr  \tmp, #1
+    str  \tmp, [rFP, #OFF_FP_DEX_PC]
+.endm
+
+/*
+ * Fetch the next instruction from rPC into rINST.  Does not advance rPC.
+ */
+.macro FETCH_INST
+    ldrh    rINST, [rPC]
+.endm
+
+/*
+ * Fetch the next instruction from the specified offset.  Advances rPC
+ * to point to the next instruction.  "_count" is in 16-bit code units.
+ *
+ * Because of the limited size of immediate constants on ARM, this is only
+ * suitable for small forward movements (i.e. don't try to implement "goto"
+ * with this).
+ *
+ * This must come AFTER anything that can throw an exception, or the
+ * exception catch may miss.  (This also implies that it must come after
+ * EXPORT_PC.)
+ */
+.macro FETCH_ADVANCE_INST count
+    ldrh    rINST, [rPC, #((\count)*2)]!
+.endm
+
+/*
+ * The operation performed here is similar to FETCH_ADVANCE_INST, except the
+ * src and dest registers are parameterized (not hard-wired to rPC and rINST).
+ */
+.macro PREFETCH_ADVANCE_INST dreg, sreg, count
+    ldrh    \dreg, [\sreg, #((\count)*2)]!
+.endm
+
+/*
+ * Similar to FETCH_ADVANCE_INST, but does not update rPC.  Used to load
+ * rINST ahead of possible exception point.  Be sure to manually advance rPC
+ * later.
+ */
+.macro PREFETCH_INST count
+    ldrh    rINST, [rPC, #((\count)*2)]
+.endm
+
+/* Advance rPC by some number of code units. */
+.macro ADVANCE count
+  add  rPC, #((\count)*2)
+.endm
+
+/*
+ * Fetch the next instruction from an offset specified by _reg.  Updates
+ * rPC to point to the next instruction.  "_reg" must specify the distance
+ * in bytes, *not* 16-bit code units, and may be a signed value.
+ *
+ * We want to write "ldrh rINST, [rPC, _reg, lsl #1]!", but some of the
+ * bits that hold the shift distance are used for the half/byte/sign flags.
+ * In some cases we can pre-double _reg for free, so we require a byte offset
+ * here.
+ */
+.macro FETCH_ADVANCE_INST_RB reg
+    ldrh    rINST, [rPC, \reg]!
+.endm
+
+/*
+ * Fetch a half-word code unit from an offset past the current PC.  The
+ * "_count" value is in 16-bit code units.  Does not advance rPC.
+ *
+ * The "_S" variant works the same but treats the value as signed.
+ */
+.macro FETCH reg, count
+    ldrh    \reg, [rPC, #((\count)*2)]
+.endm
+
+.macro FETCH_S reg, count
+    ldrsh   \reg, [rPC, #((\count)*2)]
+.endm
+
+/*
+ * Fetch one byte from an offset past the current PC.  Pass in the same
+ * "_count" as you would for FETCH, and an additional 0/1 indicating which
+ * byte of the halfword you want (lo/hi).
+ */
+.macro FETCH_B reg, count, byte
+    ldrb     \reg, [rPC, #((\count)*2+(\byte))]
+.endm
+
+/*
+ * Put the instruction's opcode field into the specified register.
+ */
+.macro GET_INST_OPCODE reg
+    and     \reg, rINST, #255
+.endm
+
+/*
+ * Put the prefetched instruction's opcode field into the specified register.
+ */
+.macro GET_PREFETCHED_OPCODE oreg, ireg
+    and     \oreg, \ireg, #255
+.endm
+
+/*
+ * Begin executing the opcode in _reg.  Because this only jumps within the
+ * interpreter, we don't have to worry about pre-ARMv5 THUMB interwork.
+ */
+.macro GOTO_OPCODE reg
+    add     pc, rIBASE, \reg, lsl #${handler_size_bits}
+.endm
+.macro GOTO_OPCODE_BASE base,reg
+    add     pc, \base, \reg, lsl #${handler_size_bits}
+.endm
+
+/*
+ * Get/set the 32-bit value from a Dalvik register.
+ */
+.macro GET_VREG reg, vreg
+    ldr     \reg, [rFP, \vreg, lsl #2]
+.endm
+.macro SET_VREG reg, vreg
+    str     \reg, [rFP, \vreg, lsl #2]
+    mov     \reg, #0
+    str     \reg, [rREFS, \vreg, lsl #2]
+.endm
+.macro SET_VREG_OBJECT reg, vreg, tmpreg
+    str     \reg, [rFP, \vreg, lsl #2]
+    str     \reg, [rREFS, \vreg, lsl #2]
+.endm
+
+/*
+ * Convert a virtual register index into an address.
+ */
+.macro VREG_INDEX_TO_ADDR reg, vreg
+    add     \reg, rFP, \vreg, lsl #2   /* WARNING/FIXME: handle shadow frame vreg zero if store */
+.endm
+
+/*
+ * Refresh handler table.
+ */
+.macro REFRESH_IBASE
+  ldr     rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET]
+.endm
diff --git a/runtime/interpreter/mterp/arm/invoke.S b/runtime/interpreter/mterp/arm/invoke.S
new file mode 100644
index 0000000..7575865
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/invoke.S
@@ -0,0 +1,19 @@
+%default { "helper":"UndefinedInvokeHandler" }
+    /*
+     * Generic invoke handler wrapper.
+     */
+    /* op vB, {vD, vE, vF, vG, vA}, class@CCCC */
+    /* op {vCCCC..v(CCCC+AA-1)}, meth@BBBB */
+    .extern $helper
+    EXPORT_PC
+    mov     r0, rSELF
+    add     r1, rFP, #OFF_FP_SHADOWFRAME
+    mov     r2, rPC
+    mov     r3, rINST
+    bl      $helper
+    cmp     r0, #0
+    beq     MterpException
+    FETCH_ADVANCE_INST 3
+    GET_INST_OPCODE ip
+    GOTO_OPCODE ip
+
diff --git a/runtime/interpreter/mterp/arm/op_add_double.S b/runtime/interpreter/mterp/arm/op_add_double.S
new file mode 100644
index 0000000..9332bf2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_double.S
@@ -0,0 +1 @@
+%include "arm/fbinopWide.S" {"instr":"faddd   d2, d0, d1"}
diff --git a/runtime/interpreter/mterp/arm/op_add_double_2addr.S b/runtime/interpreter/mterp/arm/op_add_double_2addr.S
new file mode 100644
index 0000000..3242c53
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_double_2addr.S
@@ -0,0 +1 @@
+%include "arm/fbinopWide2addr.S" {"instr":"faddd   d2, d0, d1"}
diff --git a/runtime/interpreter/mterp/arm/op_add_float.S b/runtime/interpreter/mterp/arm/op_add_float.S
new file mode 100644
index 0000000..afb7967
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_float.S
@@ -0,0 +1 @@
+%include "arm/fbinop.S" {"instr":"fadds   s2, s0, s1"}
diff --git a/runtime/interpreter/mterp/arm/op_add_float_2addr.S b/runtime/interpreter/mterp/arm/op_add_float_2addr.S
new file mode 100644
index 0000000..0067b6a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_float_2addr.S
@@ -0,0 +1 @@
+%include "arm/fbinop2addr.S" {"instr":"fadds   s2, s0, s1"}
diff --git a/runtime/interpreter/mterp/arm/op_add_int.S b/runtime/interpreter/mterp/arm/op_add_int.S
new file mode 100644
index 0000000..1dcae7e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_int.S
@@ -0,0 +1 @@
+%include "arm/binop.S" {"instr":"add     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_add_int_2addr.S b/runtime/interpreter/mterp/arm/op_add_int_2addr.S
new file mode 100644
index 0000000..9ea98f1
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_int_2addr.S
@@ -0,0 +1 @@
+%include "arm/binop2addr.S" {"instr":"add     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_add_int_lit16.S b/runtime/interpreter/mterp/arm/op_add_int_lit16.S
new file mode 100644
index 0000000..5763ab8
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_int_lit16.S
@@ -0,0 +1 @@
+%include "arm/binopLit16.S" {"instr":"add     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_add_int_lit8.S b/runtime/interpreter/mterp/arm/op_add_int_lit8.S
new file mode 100644
index 0000000..b84684a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_int_lit8.S
@@ -0,0 +1 @@
+%include "arm/binopLit8.S" {"instr":"add     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_add_long.S b/runtime/interpreter/mterp/arm/op_add_long.S
new file mode 100644
index 0000000..093223e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_long.S
@@ -0,0 +1 @@
+%include "arm/binopWide.S" {"preinstr":"adds    r0, r0, r2", "instr":"adc     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/op_add_long_2addr.S b/runtime/interpreter/mterp/arm/op_add_long_2addr.S
new file mode 100644
index 0000000..c11e0af
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_add_long_2addr.S
@@ -0,0 +1 @@
+%include "arm/binopWide2addr.S" {"preinstr":"adds    r0, r0, r2", "instr":"adc     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/op_aget.S b/runtime/interpreter/mterp/arm/op_aget.S
new file mode 100644
index 0000000..2cc4d66
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aget.S
@@ -0,0 +1,33 @@
+%default { "load":"ldr", "shift":"2", "is_object":"0", "data_offset":"MIRROR_INT_ARRAY_DATA_OFFSET" }
+    /*
+     * Array get, 32 bits or less.  vAA <- vBB[vCC].
+     *
+     * Note: using the usual FETCH/and/shift stuff, this fits in exactly 17
+     * instructions.  We use a pair of FETCH_Bs instead.
+     *
+     * for: aget, aget-object, aget-boolean, aget-byte, aget-char, aget-short
+     *
+     * NOTE: assumes data offset for arrays is the same for all non-wide types.
+     * If this changes, specialize.
+     */
+    /* op vAA, vBB, vCC */
+    FETCH_B r2, 1, 0                    @ r2<- BB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    FETCH_B r3, 1, 1                    @ r3<- CC
+    GET_VREG r0, r2                     @ r0<- vBB (array object)
+    GET_VREG r1, r3                     @ r1<- vCC (requested index)
+    cmp     r0, #0                      @ null array object?
+    beq     common_errNullObject        @ yes, bail
+    ldr     r3, [r0, #MIRROR_ARRAY_LENGTH_OFFSET]    @ r3<- arrayObj->length
+    add     r0, r0, r1, lsl #$shift     @ r0<- arrayObj + index*width
+    cmp     r1, r3                      @ compare unsigned index, length
+    bcs     common_errArrayIndex        @ index >= length, bail
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    $load   r2, [r0, #$data_offset]     @ r2<- vBB[vCC]
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    .if $is_object
+    SET_VREG_OBJECT r2, r9              @ vAA<- r2
+    .else
+    SET_VREG r2, r9                     @ vAA<- r2
+    .endif
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_aget_boolean.S b/runtime/interpreter/mterp/arm/op_aget_boolean.S
new file mode 100644
index 0000000..8f678dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aget_boolean.S
@@ -0,0 +1 @@
+%include "arm/op_aget.S" { "load":"ldrb", "shift":"0", "data_offset":"MIRROR_BOOLEAN_ARRAY_DATA_OFFSET" }
diff --git a/runtime/interpreter/mterp/arm/op_aget_byte.S b/runtime/interpreter/mterp/arm/op_aget_byte.S
new file mode 100644
index 0000000..a304650
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aget_byte.S
@@ -0,0 +1 @@
+%include "arm/op_aget.S" { "load":"ldrsb", "shift":"0", "data_offset":"MIRROR_BYTE_ARRAY_DATA_OFFSET" }
diff --git a/runtime/interpreter/mterp/arm/op_aget_char.S b/runtime/interpreter/mterp/arm/op_aget_char.S
new file mode 100644
index 0000000..4908306
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aget_char.S
@@ -0,0 +1 @@
+%include "arm/op_aget.S" { "load":"ldrh", "shift":"1", "data_offset":"MIRROR_CHAR_ARRAY_DATA_OFFSET" }
diff --git a/runtime/interpreter/mterp/arm/op_aget_object.S b/runtime/interpreter/mterp/arm/op_aget_object.S
new file mode 100644
index 0000000..4e0aab5
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aget_object.S
@@ -0,0 +1,21 @@
+    /*
+     * Array object get.  vAA <- vBB[vCC].
+     *
+     * for: aget-object
+     */
+    /* op vAA, vBB, vCC */
+    FETCH_B r2, 1, 0                    @ r2<- BB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    FETCH_B r3, 1, 1                    @ r3<- CC
+    EXPORT_PC
+    GET_VREG r0, r2                     @ r0<- vBB (array object)
+    GET_VREG r1, r3                     @ r1<- vCC (requested index)
+    bl       artAGetObjectFromMterp     @ (array, index)
+    ldr      r1, [rSELF, #THREAD_EXCEPTION_OFFSET]
+    PREFETCH_INST 2
+    cmp      r1, #0
+    bne      MterpException
+    SET_VREG_OBJECT r0, r9
+    ADVANCE 2
+    GET_INST_OPCODE ip
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_aget_short.S b/runtime/interpreter/mterp/arm/op_aget_short.S
new file mode 100644
index 0000000..b71e659
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aget_short.S
@@ -0,0 +1 @@
+%include "arm/op_aget.S" { "load":"ldrsh", "shift":"1", "data_offset":"MIRROR_SHORT_ARRAY_DATA_OFFSET" }
diff --git a/runtime/interpreter/mterp/arm/op_aget_wide.S b/runtime/interpreter/mterp/arm/op_aget_wide.S
new file mode 100644
index 0000000..caaec71
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aget_wide.S
@@ -0,0 +1,24 @@
+    /*
+     * Array get, 64 bits.  vAA <- vBB[vCC].
+     *
+     * Arrays of long/double are 64-bit aligned, so it's okay to use LDRD.
+     */
+    /* aget-wide vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r0, #255                @ r2<- BB
+    mov     r3, r0, lsr #8              @ r3<- CC
+    GET_VREG r0, r2                     @ r0<- vBB (array object)
+    GET_VREG r1, r3                     @ r1<- vCC (requested index)
+    cmp     r0, #0                      @ null array object?
+    beq     common_errNullObject        @ yes, bail
+    ldr     r3, [r0, #MIRROR_ARRAY_LENGTH_OFFSET]    @ r3<- arrayObj->length
+    add     r0, r0, r1, lsl #3          @ r0<- arrayObj + index*width
+    cmp     r1, r3                      @ compare unsigned index, length
+    bcs     common_errArrayIndex        @ index >= length, bail
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    ldrd    r2, [r0, #MIRROR_WIDE_ARRAY_DATA_OFFSET]  @ r2/r3<- vBB[vCC]
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[AA]
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r2-r3}                 @ vAA/vAA+1<- r2/r3
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_and_int.S b/runtime/interpreter/mterp/arm/op_and_int.S
new file mode 100644
index 0000000..7c16d37
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_and_int.S
@@ -0,0 +1 @@
+%include "arm/binop.S" {"instr":"and     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_and_int_2addr.S b/runtime/interpreter/mterp/arm/op_and_int_2addr.S
new file mode 100644
index 0000000..0fbab02
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_and_int_2addr.S
@@ -0,0 +1 @@
+%include "arm/binop2addr.S" {"instr":"and     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_and_int_lit16.S b/runtime/interpreter/mterp/arm/op_and_int_lit16.S
new file mode 100644
index 0000000..541e9b7
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_and_int_lit16.S
@@ -0,0 +1 @@
+%include "arm/binopLit16.S" {"instr":"and     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_and_int_lit8.S b/runtime/interpreter/mterp/arm/op_and_int_lit8.S
new file mode 100644
index 0000000..d5783e5
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_and_int_lit8.S
@@ -0,0 +1 @@
+%include "arm/binopLit8.S" {"instr":"and     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_and_long.S b/runtime/interpreter/mterp/arm/op_and_long.S
new file mode 100644
index 0000000..4ad5158
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_and_long.S
@@ -0,0 +1 @@
+%include "arm/binopWide.S" {"preinstr":"and     r0, r0, r2", "instr":"and     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/op_and_long_2addr.S b/runtime/interpreter/mterp/arm/op_and_long_2addr.S
new file mode 100644
index 0000000..e23ea44
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_and_long_2addr.S
@@ -0,0 +1 @@
+%include "arm/binopWide2addr.S" {"preinstr":"and     r0, r0, r2", "instr":"and     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/op_aput.S b/runtime/interpreter/mterp/arm/op_aput.S
new file mode 100644
index 0000000..a511fa5
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aput.S
@@ -0,0 +1,29 @@
+%default { "store":"str", "shift":"2", "data_offset":"MIRROR_INT_ARRAY_DATA_OFFSET" }
+    /*
+     * Array put, 32 bits or less.  vBB[vCC] <- vAA.
+     *
+     * Note: using the usual FETCH/and/shift stuff, this fits in exactly 17
+     * instructions.  We use a pair of FETCH_Bs instead.
+     *
+     * for: aput, aput-boolean, aput-byte, aput-char, aput-short
+     *
+     * NOTE: this assumes data offset for arrays is the same for all non-wide types.
+     * If this changes, specialize.
+     */
+    /* op vAA, vBB, vCC */
+    FETCH_B r2, 1, 0                    @ r2<- BB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    FETCH_B r3, 1, 1                    @ r3<- CC
+    GET_VREG r0, r2                     @ r0<- vBB (array object)
+    GET_VREG r1, r3                     @ r1<- vCC (requested index)
+    cmp     r0, #0                      @ null array object?
+    beq     common_errNullObject        @ yes, bail
+    ldr     r3, [r0, #MIRROR_ARRAY_LENGTH_OFFSET]     @ r3<- arrayObj->length
+    add     r0, r0, r1, lsl #$shift     @ r0<- arrayObj + index*width
+    cmp     r1, r3                      @ compare unsigned index, length
+    bcs     common_errArrayIndex        @ index >= length, bail
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    GET_VREG r2, r9                     @ r2<- vAA
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    $store  r2, [r0, #$data_offset]     @ vBB[vCC]<- r2
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_aput_boolean.S b/runtime/interpreter/mterp/arm/op_aput_boolean.S
new file mode 100644
index 0000000..e86663f
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aput_boolean.S
@@ -0,0 +1 @@
+%include "arm/op_aput.S" { "store":"strb", "shift":"0", "data_offset":"MIRROR_BOOLEAN_ARRAY_DATA_OFFSET" }
diff --git a/runtime/interpreter/mterp/arm/op_aput_byte.S b/runtime/interpreter/mterp/arm/op_aput_byte.S
new file mode 100644
index 0000000..83694b7
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aput_byte.S
@@ -0,0 +1 @@
+%include "arm/op_aput.S" { "store":"strb", "shift":"0", "data_offset":"MIRROR_BYTE_ARRAY_DATA_OFFSET" }
diff --git a/runtime/interpreter/mterp/arm/op_aput_char.S b/runtime/interpreter/mterp/arm/op_aput_char.S
new file mode 100644
index 0000000..3551cac
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aput_char.S
@@ -0,0 +1 @@
+%include "arm/op_aput.S" { "store":"strh", "shift":"1", "data_offset":"MIRROR_CHAR_ARRAY_DATA_OFFSET" }
diff --git a/runtime/interpreter/mterp/arm/op_aput_object.S b/runtime/interpreter/mterp/arm/op_aput_object.S
new file mode 100644
index 0000000..c539916
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aput_object.S
@@ -0,0 +1,14 @@
+    /*
+     * Store an object into an array.  vBB[vCC] <- vAA.
+     */
+    /* op vAA, vBB, vCC */
+    EXPORT_PC
+    add     r0, rFP, #OFF_FP_SHADOWFRAME
+    mov     r1, rPC
+    mov     r2, rINST
+    bl      MterpAputObject
+    cmp     r0, #0
+    beq     MterpPossibleException
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_aput_short.S b/runtime/interpreter/mterp/arm/op_aput_short.S
new file mode 100644
index 0000000..0a0590e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aput_short.S
@@ -0,0 +1 @@
+%include "arm/op_aput.S" { "store":"strh", "shift":"1", "data_offset":"MIRROR_SHORT_ARRAY_DATA_OFFSET" }
diff --git a/runtime/interpreter/mterp/arm/op_aput_wide.S b/runtime/interpreter/mterp/arm/op_aput_wide.S
new file mode 100644
index 0000000..49839d1
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_aput_wide.S
@@ -0,0 +1,24 @@
+    /*
+     * Array put, 64 bits.  vBB[vCC] <- vAA.
+     *
+     * Arrays of long/double are 64-bit aligned, so it's okay to use STRD.
+     */
+    /* aput-wide vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r0, #255                @ r2<- BB
+    mov     r3, r0, lsr #8              @ r3<- CC
+    GET_VREG r0, r2                     @ r0<- vBB (array object)
+    GET_VREG r1, r3                     @ r1<- vCC (requested index)
+    cmp     r0, #0                      @ null array object?
+    beq     common_errNullObject        @ yes, bail
+    ldr     r3, [r0, #MIRROR_ARRAY_LENGTH_OFFSET]    @ r3<- arrayObj->length
+    add     r0, r0, r1, lsl #3          @ r0<- arrayObj + index*width
+    cmp     r1, r3                      @ compare unsigned index, length
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[AA]
+    bcs     common_errArrayIndex        @ index >= length, bail
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    ldmia   r9, {r2-r3}                 @ r2/r3<- vAA/vAA+1
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    strd    r2, [r0, #MIRROR_WIDE_ARRAY_DATA_OFFSET]  @ r2/r3<- vBB[vCC]
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_array_length.S b/runtime/interpreter/mterp/arm/op_array_length.S
new file mode 100644
index 0000000..43b1682
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_array_length.S
@@ -0,0 +1,13 @@
+    /*
+     * Return the length of an array.
+     */
+    mov     r1, rINST, lsr #12          @ r1<- B
+    ubfx    r2, rINST, #8, #4           @ r2<- A
+    GET_VREG r0, r1                     @ r0<- vB (object ref)
+    cmp     r0, #0                      @ is object null?
+    beq     common_errNullObject        @ yup, fail
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    ldr     r3, [r0, #MIRROR_ARRAY_LENGTH_OFFSET]    @ r3<- array length
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r3, r2                     @ vB<- length
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_check_cast.S b/runtime/interpreter/mterp/arm/op_check_cast.S
new file mode 100644
index 0000000..3e3ac70
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_check_cast.S
@@ -0,0 +1,17 @@
+    /*
+     * Check to see if a cast from one class to another is allowed.
+     */
+    /* check-cast vAA, class@BBBB */
+    EXPORT_PC
+    FETCH    r0, 1                      @ r0<- BBBB
+    mov      r1, rINST, lsr #8          @ r1<- AA
+    GET_VREG r1, r1                     @ r1<- object
+    ldr      r2, [rFP, #OFF_FP_METHOD]  @ r2<- method
+    mov      r3, rSELF                  @ r3<- self
+    bl       MterpCheckCast             @ (index, obj, method, self)
+    PREFETCH_INST 2
+    cmp      r0, #0
+    bne      MterpPossibleException
+    ADVANCE  2
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_cmp_long.S b/runtime/interpreter/mterp/arm/op_cmp_long.S
new file mode 100644
index 0000000..2b4c0ea
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_cmp_long.S
@@ -0,0 +1,56 @@
+    /*
+     * Compare two 64-bit values.  Puts 0, 1, or -1 into the destination
+     * register based on the results of the comparison.
+     *
+     * We load the full values with LDM, but in practice many values could
+     * be resolved by only looking at the high word.  This could be made
+     * faster or slower by splitting the LDM into a pair of LDRs.
+     *
+     * If we just wanted to set condition flags, we could do this:
+     *  subs    ip, r0, r2
+     *  sbcs    ip, r1, r3
+     *  subeqs  ip, r0, r2
+     * Leaving { <0, 0, >0 } in ip.  However, we have to set it to a specific
+     * integer value, which we can do with 2 conditional mov/mvn instructions
+     * (set 1, set -1; if they're equal we already have 0 in ip), giving
+     * us a constant 5-cycle path plus a branch at the end to the
+     * instruction epilogue code.  The multi-compare approach below needs
+     * 2 or 3 cycles + branch if the high word doesn't match, 6 + branch
+     * in the worst case (the 64-bit values are equal).
+     */
+    /* cmp-long vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r0, #255                @ r2<- BB
+    mov     r3, r0, lsr #8              @ r3<- CC
+    add     r2, rFP, r2, lsl #2         @ r2<- &fp[BB]
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[CC]
+    ldmia   r2, {r0-r1}                 @ r0/r1<- vBB/vBB+1
+    ldmia   r3, {r2-r3}                 @ r2/r3<- vCC/vCC+1
+    cmp     r1, r3                      @ compare (vBB+1, vCC+1)
+    blt     .L${opcode}_less            @ signed compare on high part
+    bgt     .L${opcode}_greater
+    subs    r1, r0, r2                  @ r1<- r0 - r2
+    bhi     .L${opcode}_greater         @ unsigned compare on low part
+    bne     .L${opcode}_less
+    b       .L${opcode}_finish          @ equal; r1 already holds 0
+%break
+
+.L${opcode}_less:
+    mvn     r1, #0                      @ r1<- -1
+    @ Want to cond code the next mov so we can avoid branch, but don't see it;
+    @ instead, we just replicate the tail end.
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    SET_VREG r1, r9                     @ vAA<- r1
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+
+.L${opcode}_greater:
+    mov     r1, #1                      @ r1<- 1
+    @ fall through to _finish
+
+.L${opcode}_finish:
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    SET_VREG r1, r9                     @ vAA<- r1
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_cmpg_double.S b/runtime/interpreter/mterp/arm/op_cmpg_double.S
new file mode 100644
index 0000000..4b05c44
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_cmpg_double.S
@@ -0,0 +1,34 @@
+    /*
+     * Compare two floating-point values.  Puts 0, 1, or -1 into the
+     * destination register based on the results of the comparison.
+     *
+     * int compare(x, y) {
+     *     if (x == y) {
+     *         return 0;
+     *     } else if (x < y) {
+     *         return -1;
+     *     } else if (x > y) {
+     *         return 1;
+     *     } else {
+     *         return 1;
+     *     }
+     * }
+     */
+    /* op vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r0, #255                @ r2<- BB
+    mov     r3, r0, lsr #8              @ r3<- CC
+    VREG_INDEX_TO_ADDR r2, r2           @ r2<- &vBB
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vCC
+    fldd    d0, [r2]                    @ d0<- vBB
+    fldd    d1, [r3]                    @ d1<- vCC
+    fcmped  d0, d1                      @ compare (vBB, vCC)
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    mov     r0, #1                      @ r0<- 1 (default)
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    fmstat                              @ export status flags
+    mvnmi   r0, #0                      @ (less than) r1<- -1
+    moveq   r0, #0                      @ (equal) r1<- 0
+    SET_VREG r0, r9                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_cmpg_float.S b/runtime/interpreter/mterp/arm/op_cmpg_float.S
new file mode 100644
index 0000000..d5d2df2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_cmpg_float.S
@@ -0,0 +1,34 @@
+    /*
+     * Compare two floating-point values.  Puts 0, 1, or -1 into the
+     * destination register based on the results of the comparison.
+     *
+     * int compare(x, y) {
+     *     if (x == y) {
+     *         return 0;
+     *     } else if (x < y) {
+     *         return -1;
+     *     } else if (x > y) {
+     *         return 1;
+     *     } else {
+     *         return 1;
+     *     }
+     * }
+     */
+    /* op vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r0, #255                @ r2<- BB
+    mov     r3, r0, lsr #8              @ r3<- CC
+    VREG_INDEX_TO_ADDR r2, r2           @ r2<- &vBB
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vCC
+    flds    s0, [r2]                    @ s0<- vBB
+    flds    s1, [r3]                    @ s1<- vCC
+    fcmpes  s0, s1                      @ compare (vBB, vCC)
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    mov     r0, #1                      @ r0<- 1 (default)
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    fmstat                              @ export status flags
+    mvnmi   r0, #0                      @ (less than) r1<- -1
+    moveq   r0, #0                      @ (equal) r1<- 0
+    SET_VREG r0, r9                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_cmpl_double.S b/runtime/interpreter/mterp/arm/op_cmpl_double.S
new file mode 100644
index 0000000..6ee53b3
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_cmpl_double.S
@@ -0,0 +1,34 @@
+    /*
+     * Compare two floating-point values.  Puts 0, 1, or -1 into the
+     * destination register based on the results of the comparison.
+     *
+     * int compare(x, y) {
+     *     if (x == y) {
+     *         return 0;
+     *     } else if (x > y) {
+     *         return 1;
+     *     } else if (x < y) {
+     *         return -1;
+     *     } else {
+     *         return -1;
+     *     }
+     * }
+     */
+    /* op vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r0, #255                @ r2<- BB
+    mov     r3, r0, lsr #8              @ r3<- CC
+    VREG_INDEX_TO_ADDR r2, r2           @ r2<- &vBB
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vCC
+    fldd    d0, [r2]                    @ d0<- vBB
+    fldd    d1, [r3]                    @ d1<- vCC
+    fcmped  d0, d1                      @ compare (vBB, vCC)
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    mvn     r0, #0                      @ r0<- -1 (default)
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    fmstat                              @ export status flags
+    movgt   r0, #1                      @ (greater than) r1<- 1
+    moveq   r0, #0                      @ (equal) r1<- 0
+    SET_VREG r0, r9                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_cmpl_float.S b/runtime/interpreter/mterp/arm/op_cmpl_float.S
new file mode 100644
index 0000000..64535b6
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_cmpl_float.S
@@ -0,0 +1,34 @@
+    /*
+     * Compare two floating-point values.  Puts 0, 1, or -1 into the
+     * destination register based on the results of the comparison.
+     *
+     * int compare(x, y) {
+     *     if (x == y) {
+     *         return 0;
+     *     } else if (x > y) {
+     *         return 1;
+     *     } else if (x < y) {
+     *         return -1;
+     *     } else {
+     *         return -1;
+     *     }
+     * }
+     */
+    /* op vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r0, #255                @ r2<- BB
+    mov     r3, r0, lsr #8              @ r3<- CC
+    VREG_INDEX_TO_ADDR r2, r2           @ r2<- &vBB
+    VREG_INDEX_TO_ADDR r3, r3           @ r3<- &vCC
+    flds    s0, [r2]                    @ s0<- vBB
+    flds    s1, [r3]                    @ s1<- vCC
+    fcmpes  s0, s1                      @ compare (vBB, vCC)
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    mvn     r0, #0                      @ r0<- -1 (default)
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    fmstat                              @ export status flags
+    movgt   r0, #1                      @ (greater than) r1<- 1
+    moveq   r0, #0                      @ (equal) r1<- 0
+    SET_VREG r0, r9                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const.S b/runtime/interpreter/mterp/arm/op_const.S
new file mode 100644
index 0000000..de3e3c3
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const.S
@@ -0,0 +1,9 @@
+    /* const vAA, #+BBBBbbbb */
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    FETCH r0, 1                         @ r0<- bbbb (low
+    FETCH r1, 2                         @ r1<- BBBB (high
+    FETCH_ADVANCE_INST 3                @ advance rPC, load rINST
+    orr     r0, r0, r1, lsl #16         @ r0<- BBBBbbbb
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r0, r3                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_16.S b/runtime/interpreter/mterp/arm/op_const_16.S
new file mode 100644
index 0000000..59c6dac
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_16.S
@@ -0,0 +1,7 @@
+    /* const/16 vAA, #+BBBB */
+    FETCH_S r0, 1                       @ r0<- ssssBBBB (sign-extended
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    SET_VREG r0, r3                     @ vAA<- r0
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_4.S b/runtime/interpreter/mterp/arm/op_const_4.S
new file mode 100644
index 0000000..c177bb9
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_4.S
@@ -0,0 +1,8 @@
+    /* const/4 vA, #+B */
+    mov     r1, rINST, lsl #16          @ r1<- Bxxx0000
+    ubfx    r0, rINST, #8, #4           @ r0<- A
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    mov     r1, r1, asr #28             @ r1<- sssssssB (sign-extended)
+    GET_INST_OPCODE ip                  @ ip<- opcode from rINST
+    SET_VREG r1, r0                     @ fp[A]<- r1
+    GOTO_OPCODE ip                      @ execute next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_class.S b/runtime/interpreter/mterp/arm/op_const_class.S
new file mode 100644
index 0000000..0b111f4
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_class.S
@@ -0,0 +1,13 @@
+    /* const/class vAA, Class@BBBB */
+    EXPORT_PC
+    FETCH   r0, 1                       @ r0<- BBBB
+    mov     r1, rINST, lsr #8           @ r1<- AA
+    add     r2, rFP, #OFF_FP_SHADOWFRAME
+    mov     r3, rSELF
+    bl      MterpConstClass             @ (index, tgt_reg, shadow_frame, self)
+    PREFETCH_INST 2
+    cmp     r0, #0
+    bne     MterpPossibleException
+    ADVANCE 2
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_high16.S b/runtime/interpreter/mterp/arm/op_const_high16.S
new file mode 100644
index 0000000..460d546
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_high16.S
@@ -0,0 +1,8 @@
+    /* const/high16 vAA, #+BBBB0000 */
+    FETCH r0, 1                         @ r0<- 0000BBBB (zero-extended
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    mov     r0, r0, lsl #16             @ r0<- BBBB0000
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    SET_VREG r0, r3                     @ vAA<- r0
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_string.S b/runtime/interpreter/mterp/arm/op_const_string.S
new file mode 100644
index 0000000..4b8302a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_string.S
@@ -0,0 +1,13 @@
+    /* const/string vAA, String@BBBB */
+    EXPORT_PC
+    FETCH r0, 1                         @ r0<- BBBB
+    mov     r1, rINST, lsr #8           @ r1<- AA
+    add     r2, rFP, #OFF_FP_SHADOWFRAME
+    mov     r3, rSELF
+    bl      MterpConstString            @ (index, tgt_reg, shadow_frame, self)
+    PREFETCH_INST 2                     @ load rINST
+    cmp     r0, #0                      @ fail?
+    bne     MterpPossibleException      @ let reference interpreter deal with it.
+    ADVANCE 2                           @ advance rPC
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_string_jumbo.S b/runtime/interpreter/mterp/arm/op_const_string_jumbo.S
new file mode 100644
index 0000000..1a3d0b2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_string_jumbo.S
@@ -0,0 +1,15 @@
+    /* const/string vAA, String@BBBBBBBB */
+    EXPORT_PC
+    FETCH r0, 1                         @ r0<- bbbb (low
+    FETCH r2, 2                         @ r2<- BBBB (high
+    mov     r1, rINST, lsr #8           @ r1<- AA
+    orr     r0, r0, r2, lsl #16         @ r1<- BBBBbbbb
+    add     r2, rFP, #OFF_FP_SHADOWFRAME
+    mov     r3, rSELF
+    bl      MterpConstString            @ (index, tgt_reg, shadow_frame, self)
+    PREFETCH_INST 3                     @ advance rPC
+    cmp     r0, #0                      @ fail?
+    bne     MterpPossibleException      @ let reference interpreter deal with it.
+    ADVANCE 3                           @ advance rPC
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_wide.S b/runtime/interpreter/mterp/arm/op_const_wide.S
new file mode 100644
index 0000000..2cdc426
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_wide.S
@@ -0,0 +1,13 @@
+    /* const-wide vAA, #+HHHHhhhhBBBBbbbb */
+    FETCH r0, 1                         @ r0<- bbbb (low)
+    FETCH r1, 2                         @ r1<- BBBB (low middle)
+    FETCH r2, 3                         @ r2<- hhhh (high middle)
+    orr     r0, r0, r1, lsl #16         @ r0<- BBBBbbbb (low word)
+    FETCH r3, 4                         @ r3<- HHHH (high)
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    orr     r1, r2, r3, lsl #16         @ r1<- HHHHhhhh (high word)
+    FETCH_ADVANCE_INST 5                @ advance rPC, load rINST
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[AA]
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r0-r1}                 @ vAA<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_wide_16.S b/runtime/interpreter/mterp/arm/op_const_wide_16.S
new file mode 100644
index 0000000..56bfc17
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_wide_16.S
@@ -0,0 +1,9 @@
+    /* const-wide/16 vAA, #+BBBB */
+    FETCH_S r0, 1                       @ r0<- ssssBBBB (sign-extended
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    mov     r1, r0, asr #31             @ r1<- ssssssss
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[AA]
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r3, {r0-r1}                 @ vAA<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_wide_32.S b/runtime/interpreter/mterp/arm/op_const_wide_32.S
new file mode 100644
index 0000000..36d4628
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_wide_32.S
@@ -0,0 +1,11 @@
+    /* const-wide/32 vAA, #+BBBBbbbb */
+    FETCH r0, 1                         @ r0<- 0000bbbb (low)
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    FETCH_S r2, 2                       @ r2<- ssssBBBB (high)
+    FETCH_ADVANCE_INST 3                @ advance rPC, load rINST
+    orr     r0, r0, r2, lsl #16         @ r0<- BBBBbbbb
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[AA]
+    mov     r1, r0, asr #31             @ r1<- ssssssss
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r3, {r0-r1}                 @ vAA<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_const_wide_high16.S b/runtime/interpreter/mterp/arm/op_const_wide_high16.S
new file mode 100644
index 0000000..bee592d
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_const_wide_high16.S
@@ -0,0 +1,10 @@
+    /* const-wide/high16 vAA, #+BBBB000000000000 */
+    FETCH r1, 1                         @ r1<- 0000BBBB (zero-extended)
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    mov     r0, #0                      @ r0<- 00000000
+    mov     r1, r1, lsl #16             @ r1<- BBBB0000
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[AA]
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r3, {r0-r1}                 @ vAA<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_div_double.S b/runtime/interpreter/mterp/arm/op_div_double.S
new file mode 100644
index 0000000..5147550
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_double.S
@@ -0,0 +1 @@
+%include "arm/fbinopWide.S" {"instr":"fdivd   d2, d0, d1"}
diff --git a/runtime/interpreter/mterp/arm/op_div_double_2addr.S b/runtime/interpreter/mterp/arm/op_div_double_2addr.S
new file mode 100644
index 0000000..b812f17
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_double_2addr.S
@@ -0,0 +1 @@
+%include "arm/fbinopWide2addr.S" {"instr":"fdivd   d2, d0, d1"}
diff --git a/runtime/interpreter/mterp/arm/op_div_float.S b/runtime/interpreter/mterp/arm/op_div_float.S
new file mode 100644
index 0000000..0f24d11
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_float.S
@@ -0,0 +1 @@
+%include "arm/fbinop.S" {"instr":"fdivs   s2, s0, s1"}
diff --git a/runtime/interpreter/mterp/arm/op_div_float_2addr.S b/runtime/interpreter/mterp/arm/op_div_float_2addr.S
new file mode 100644
index 0000000..a1dbf01
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_float_2addr.S
@@ -0,0 +1 @@
+%include "arm/fbinop2addr.S" {"instr":"fdivs   s2, s0, s1"}
diff --git a/runtime/interpreter/mterp/arm/op_div_int.S b/runtime/interpreter/mterp/arm/op_div_int.S
new file mode 100644
index 0000000..251064b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_int.S
@@ -0,0 +1,30 @@
+%default {}
+    /*
+     * Specialized 32-bit binary operation
+     *
+     * Performs "r0 = r0 div r1". The selection between sdiv or the gcc helper
+     * depends on the compile time value of __ARM_ARCH_EXT_IDIV__ (defined for
+     * ARMv7 CPUs that have hardware division support).
+     *
+     * div-int
+     *
+     */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    mov     r3, r0, lsr #8              @ r3<- CC
+    and     r2, r0, #255                @ r2<- BB
+    GET_VREG r1, r3                     @ r1<- vCC
+    GET_VREG r0, r2                     @ r0<- vBB
+    cmp     r1, #0                      @ is second operand zero?
+    beq     common_errDivideByZero
+
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+#ifdef __ARM_ARCH_EXT_IDIV__
+    sdiv    r0, r0, r1                  @ r0<- op
+#else
+    bl    __aeabi_idiv                  @ r0<- op, r0-r3 changed
+#endif
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r0, r9                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 11-14 instructions */
diff --git a/runtime/interpreter/mterp/arm/op_div_int_2addr.S b/runtime/interpreter/mterp/arm/op_div_int_2addr.S
new file mode 100644
index 0000000..9be4cd8
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_int_2addr.S
@@ -0,0 +1,29 @@
+%default {}
+    /*
+     * Specialized 32-bit binary operation
+     *
+     * Performs "r0 = r0 div r1". The selection between sdiv or the gcc helper
+     * depends on the compile time value of __ARM_ARCH_EXT_IDIV__ (defined for
+     * ARMv7 CPUs that have hardware division support).
+     *
+     * div-int/2addr
+     *
+     */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r1, r3                     @ r1<- vB
+    GET_VREG r0, r9                     @ r0<- vA
+    cmp     r1, #0                      @ is second operand zero?
+    beq     common_errDivideByZero
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+
+#ifdef __ARM_ARCH_EXT_IDIV__
+    sdiv    r0, r0, r1                  @ r0<- op
+#else
+    bl       __aeabi_idiv               @ r0<- op, r0-r3 changed
+#endif
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r0, r9                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-13 instructions */
+
diff --git a/runtime/interpreter/mterp/arm/op_div_int_lit16.S b/runtime/interpreter/mterp/arm/op_div_int_lit16.S
new file mode 100644
index 0000000..d9bc7d6
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_int_lit16.S
@@ -0,0 +1,28 @@
+%default {}
+    /*
+     * Specialized 32-bit binary operation
+     *
+     * Performs "r0 = r0 div r1". The selection between sdiv or the gcc helper
+     * depends on the compile time value of __ARM_ARCH_EXT_IDIV__ (defined for
+     * ARMv7 CPUs that have hardware division support).
+     *
+     * div-int/lit16
+     *
+     */
+    FETCH_S r1, 1                       @ r1<- ssssCCCC (sign-extended)
+    mov     r2, rINST, lsr #12          @ r2<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r0, r2                     @ r0<- vB
+    cmp     r1, #0                      @ is second operand zero?
+    beq     common_errDivideByZero
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+
+#ifdef __ARM_ARCH_EXT_IDIV__
+    sdiv    r0, r0, r1                  @ r0<- op
+#else
+    bl       __aeabi_idiv               @ r0<- op, r0-r3 changed
+#endif
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r0, r9                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-13 instructions */
diff --git a/runtime/interpreter/mterp/arm/op_div_int_lit8.S b/runtime/interpreter/mterp/arm/op_div_int_lit8.S
new file mode 100644
index 0000000..5d2dbd3
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_int_lit8.S
@@ -0,0 +1,29 @@
+%default {}
+    /*
+     * Specialized 32-bit binary operation
+     *
+     * Performs "r0 = r0 div r1". The selection between sdiv or the gcc helper
+     * depends on the compile time value of __ARM_ARCH_EXT_IDIV__ (defined for
+     * ARMv7 CPUs that have hardware division support).
+     *
+     * div-int/lit8
+     *
+     */
+    FETCH_S r3, 1                       @ r3<- ssssCCBB (sign-extended for CC
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r3, #255                @ r2<- BB
+    GET_VREG r0, r2                     @ r0<- vBB
+    movs    r1, r3, asr #8              @ r1<- ssssssCC (sign extended)
+    @cmp     r1, #0                     @ is second operand zero?
+    beq     common_errDivideByZero
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+
+#ifdef __ARM_ARCH_EXT_IDIV__
+    sdiv    r0, r0, r1                  @ r0<- op
+#else
+    bl   __aeabi_idiv                   @ r0<- op, r0-r3 changed
+#endif
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r0, r9                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-12 instructions */
diff --git a/runtime/interpreter/mterp/arm/op_div_long.S b/runtime/interpreter/mterp/arm/op_div_long.S
new file mode 100644
index 0000000..0f21a84
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_long.S
@@ -0,0 +1 @@
+%include "arm/binopWide.S" {"instr":"bl      __aeabi_ldivmod", "chkzero":"1"}
diff --git a/runtime/interpreter/mterp/arm/op_div_long_2addr.S b/runtime/interpreter/mterp/arm/op_div_long_2addr.S
new file mode 100644
index 0000000..e172b29
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_div_long_2addr.S
@@ -0,0 +1 @@
+%include "arm/binopWide2addr.S" {"instr":"bl      __aeabi_ldivmod", "chkzero":"1"}
diff --git a/runtime/interpreter/mterp/arm/op_double_to_float.S b/runtime/interpreter/mterp/arm/op_double_to_float.S
new file mode 100644
index 0000000..e327000
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_double_to_float.S
@@ -0,0 +1 @@
+%include "arm/funopNarrower.S" {"instr":"fcvtsd  s0, d0"}
diff --git a/runtime/interpreter/mterp/arm/op_double_to_int.S b/runtime/interpreter/mterp/arm/op_double_to_int.S
new file mode 100644
index 0000000..aa035de
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_double_to_int.S
@@ -0,0 +1 @@
+%include "arm/funopNarrower.S" {"instr":"ftosizd  s0, d0"}
diff --git a/runtime/interpreter/mterp/arm/op_double_to_long.S b/runtime/interpreter/mterp/arm/op_double_to_long.S
new file mode 100644
index 0000000..b100810
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_double_to_long.S
@@ -0,0 +1,52 @@
+@include "arm/unopWide.S" {"instr":"bl      __aeabi_d2lz"}
+%include "arm/unopWide.S" {"instr":"bl      d2l_doconv"}
+
+%break
+/*
+ * Convert the double in r0/r1 to a long in r0/r1.
+ *
+ * We have to clip values to long min/max per the specification.  The
+ * expected common case is a "reasonable" value that converts directly
+ * to modest integer.  The EABI convert function isn't doing this for us.
+ */
+d2l_doconv:
+    stmfd   sp!, {r4, r5, lr}           @ save regs
+    mov     r3, #0x43000000             @ maxlong, as a double (high word)
+    add     r3, #0x00e00000             @  0x43e00000
+    mov     r2, #0                      @ maxlong, as a double (low word)
+    sub     sp, sp, #4                  @ align for EABI
+    mov     r4, r0                      @ save a copy of r0
+    mov     r5, r1                      @  and r1
+    bl      __aeabi_dcmpge              @ is arg >= maxlong?
+    cmp     r0, #0                      @ nonzero == yes
+    mvnne   r0, #0                      @ return maxlong (7fffffffffffffff)
+    mvnne   r1, #0x80000000
+    bne     1f
+
+    mov     r0, r4                      @ recover arg
+    mov     r1, r5
+    mov     r3, #0xc3000000             @ minlong, as a double (high word)
+    add     r3, #0x00e00000             @  0xc3e00000
+    mov     r2, #0                      @ minlong, as a double (low word)
+    bl      __aeabi_dcmple              @ is arg <= minlong?
+    cmp     r0, #0                      @ nonzero == yes
+    movne   r0, #0                      @ return minlong (8000000000000000)
+    movne   r1, #0x80000000
+    bne     1f
+
+    mov     r0, r4                      @ recover arg
+    mov     r1, r5
+    mov     r2, r4                      @ compare against self
+    mov     r3, r5
+    bl      __aeabi_dcmpeq              @ is arg == self?
+    cmp     r0, #0                      @ zero == no
+    moveq   r1, #0                      @ return zero for NaN
+    beq     1f
+
+    mov     r0, r4                      @ recover arg
+    mov     r1, r5
+    bl      __aeabi_d2lz                @ convert double to long
+
+1:
+    add     sp, sp, #4
+    ldmfd   sp!, {r4, r5, pc}
diff --git a/runtime/interpreter/mterp/arm/op_fill_array_data.S b/runtime/interpreter/mterp/arm/op_fill_array_data.S
new file mode 100644
index 0000000..e1ca85c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_fill_array_data.S
@@ -0,0 +1,14 @@
+    /* fill-array-data vAA, +BBBBBBBB */
+    EXPORT_PC
+    FETCH r0, 1                         @ r0<- bbbb (lo)
+    FETCH r1, 2                         @ r1<- BBBB (hi)
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    orr     r1, r0, r1, lsl #16         @ r1<- BBBBbbbb
+    GET_VREG r0, r3                     @ r0<- vAA (array object)
+    add     r1, rPC, r1, lsl #1         @ r1<- PC + BBBBbbbb*2 (array data off.)
+    bl      MterpFillArrayData          @ (obj, payload)
+    cmp     r0, #0                      @ 0 means an exception is thrown
+    beq     MterpPossibleException      @ exception?
+    FETCH_ADVANCE_INST 3                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_filled_new_array.S b/runtime/interpreter/mterp/arm/op_filled_new_array.S
new file mode 100644
index 0000000..1075f0c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_filled_new_array.S
@@ -0,0 +1,19 @@
+%default { "helper":"MterpFilledNewArray" }
+    /*
+     * Create a new array with elements filled from registers.
+     *
+     * for: filled-new-array, filled-new-array/range
+     */
+    /* op vB, {vD, vE, vF, vG, vA}, class@CCCC */
+    /* op {vCCCC..v(CCCC+AA-1)}, type@BBBB */
+    .extern $helper
+    EXPORT_PC
+    add     r0, rFP, #OFF_FP_SHADOWFRAME
+    mov     r1, rPC
+    mov     r2, rSELF
+    bl      $helper
+    cmp     r0, #0
+    beq     MterpPossibleException
+    FETCH_ADVANCE_INST 3                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_filled_new_array_range.S b/runtime/interpreter/mterp/arm/op_filled_new_array_range.S
new file mode 100644
index 0000000..16567af
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_filled_new_array_range.S
@@ -0,0 +1 @@
+%include "arm/op_filled_new_array.S" { "helper":"MterpFilledNewArrayRange" }
diff --git a/runtime/interpreter/mterp/arm/op_float_to_double.S b/runtime/interpreter/mterp/arm/op_float_to_double.S
new file mode 100644
index 0000000..fb1892b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_float_to_double.S
@@ -0,0 +1 @@
+%include "arm/funopWider.S" {"instr":"fcvtds  d0, s0"}
diff --git a/runtime/interpreter/mterp/arm/op_float_to_int.S b/runtime/interpreter/mterp/arm/op_float_to_int.S
new file mode 100644
index 0000000..aab8716
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_float_to_int.S
@@ -0,0 +1 @@
+%include "arm/funop.S" {"instr":"ftosizs s1, s0"}
diff --git a/runtime/interpreter/mterp/arm/op_float_to_long.S b/runtime/interpreter/mterp/arm/op_float_to_long.S
new file mode 100644
index 0000000..24416d3
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_float_to_long.S
@@ -0,0 +1,39 @@
+@include "arm/unopWider.S" {"instr":"bl      __aeabi_f2lz"}
+%include "arm/unopWider.S" {"instr":"bl      f2l_doconv"}
+
+%break
+/*
+ * Convert the float in r0 to a long in r0/r1.
+ *
+ * We have to clip values to long min/max per the specification.  The
+ * expected common case is a "reasonable" value that converts directly
+ * to modest integer.  The EABI convert function isn't doing this for us.
+ */
+f2l_doconv:
+    stmfd   sp!, {r4, lr}
+    mov     r1, #0x5f000000             @ (float)maxlong
+    mov     r4, r0
+    bl      __aeabi_fcmpge              @ is arg >= maxlong?
+    cmp     r0, #0                      @ nonzero == yes
+    mvnne   r0, #0                      @ return maxlong (7fffffff)
+    mvnne   r1, #0x80000000
+    ldmnefd sp!, {r4, pc}
+
+    mov     r0, r4                      @ recover arg
+    mov     r1, #0xdf000000             @ (float)minlong
+    bl      __aeabi_fcmple              @ is arg <= minlong?
+    cmp     r0, #0                      @ nonzero == yes
+    movne   r0, #0                      @ return minlong (80000000)
+    movne   r1, #0x80000000
+    ldmnefd sp!, {r4, pc}
+
+    mov     r0, r4                      @ recover arg
+    mov     r1, r4
+    bl      __aeabi_fcmpeq              @ is arg == self?
+    cmp     r0, #0                      @ zero == no
+    moveq   r1, #0                      @ return zero for NaN
+    ldmeqfd sp!, {r4, pc}
+
+    mov     r0, r4                      @ recover arg
+    bl      __aeabi_f2lz                @ convert float to long
+    ldmfd   sp!, {r4, pc}
diff --git a/runtime/interpreter/mterp/arm/op_goto.S b/runtime/interpreter/mterp/arm/op_goto.S
new file mode 100644
index 0000000..9b3632a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_goto.S
@@ -0,0 +1,28 @@
+    /*
+     * Unconditional branch, 8-bit offset.
+     *
+     * The branch distance is a signed code-unit offset, which we need to
+     * double to get a byte offset.
+     */
+    /* goto +AA */
+    /* tuning: use sbfx for 6t2+ targets */
+#if MTERP_SUSPEND
+    mov     r0, rINST, lsl #16          @ r0<- AAxx0000
+    movs    r1, r0, asr #24             @ r1<- ssssssAA (sign-extended)
+    add     r2, r1, r1                  @ r2<- byte offset, set flags
+       @ If backwards branch refresh rIBASE
+    ldrmi   rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET] @ refresh handler base
+    FETCH_ADVANCE_INST_RB r2            @ update rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#else
+    ldr     lr, [rSELF, #THREAD_FLAGS_OFFSET]
+    mov     r0, rINST, lsl #16          @ r0<- AAxx0000
+    movs    r1, r0, asr #24             @ r1<- ssssssAA (sign-extended)
+    add     r2, r1, r1                  @ r2<- byte offset, set flags
+    FETCH_ADVANCE_INST_RB r2            @ update rPC, load rINST
+       @ If backwards branch refresh rIBASE
+    bmi     MterpCheckSuspendAndContinue
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#endif
diff --git a/runtime/interpreter/mterp/arm/op_goto_16.S b/runtime/interpreter/mterp/arm/op_goto_16.S
new file mode 100644
index 0000000..2231acd
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_goto_16.S
@@ -0,0 +1,23 @@
+    /*
+     * Unconditional branch, 16-bit offset.
+     *
+     * The branch distance is a signed code-unit offset, which we need to
+     * double to get a byte offset.
+     */
+    /* goto/16 +AAAA */
+#if MTERP_SUSPEND
+    FETCH_S r0, 1                       @ r0<- ssssAAAA (sign-extended)
+    adds    r1, r0, r0                  @ r1<- byte offset, flags set
+    FETCH_ADVANCE_INST_RB r1            @ update rPC, load rINST
+    ldrmi   rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET] @ refresh handler base
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#else
+    FETCH_S r0, 1                       @ r0<- ssssAAAA (sign-extended)
+    ldr     lr, [rSELF, #THREAD_FLAGS_OFFSET]
+    adds    r1, r0, r0                  @ r1<- byte offset, flags set
+    FETCH_ADVANCE_INST_RB r1            @ update rPC, load rINST
+    bmi     MterpCheckSuspendAndContinue
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#endif
diff --git a/runtime/interpreter/mterp/arm/op_goto_32.S b/runtime/interpreter/mterp/arm/op_goto_32.S
new file mode 100644
index 0000000..6b72ff5
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_goto_32.S
@@ -0,0 +1,32 @@
+    /*
+     * Unconditional branch, 32-bit offset.
+     *
+     * The branch distance is a signed code-unit offset, which we need to
+     * double to get a byte offset.
+     *
+     * Unlike most opcodes, this one is allowed to branch to itself, so
+     * our "backward branch" test must be "<=0" instead of "<0".  Because
+     * we need the V bit set, we'll use an adds to convert from Dalvik
+     * offset to byte offset.
+     */
+    /* goto/32 +AAAAAAAA */
+#if MTERP_SUSPEND
+    FETCH r0, 1                         @ r0<- aaaa (lo)
+    FETCH r1, 2                         @ r1<- AAAA (hi)
+    orr     r0, r0, r1, lsl #16         @ r0<- AAAAaaaa
+    adds    r1, r0, r0                  @ r1<- byte offset
+    FETCH_ADVANCE_INST_RB r1            @ update rPC, load rINST
+    ldrle   rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET] @ refresh handler base
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#else
+    FETCH r0, 1                         @ r0<- aaaa (lo)
+    FETCH r1, 2                         @ r1<- AAAA (hi)
+    ldr     lr, [rSELF, #THREAD_FLAGS_OFFSET]
+    orr     r0, r0, r1, lsl #16         @ r0<- AAAAaaaa
+    adds    r1, r0, r0                  @ r1<- byte offset
+    FETCH_ADVANCE_INST_RB r1            @ update rPC, load rINST
+    ble     MterpCheckSuspendAndContinue
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#endif
diff --git a/runtime/interpreter/mterp/arm/op_if_eq.S b/runtime/interpreter/mterp/arm/op_if_eq.S
new file mode 100644
index 0000000..5685686
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_eq.S
@@ -0,0 +1 @@
+%include "arm/bincmp.S" { "revcmp":"ne" }
diff --git a/runtime/interpreter/mterp/arm/op_if_eqz.S b/runtime/interpreter/mterp/arm/op_if_eqz.S
new file mode 100644
index 0000000..2a9c0f9
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_eqz.S
@@ -0,0 +1 @@
+%include "arm/zcmp.S" { "revcmp":"ne" }
diff --git a/runtime/interpreter/mterp/arm/op_if_ge.S b/runtime/interpreter/mterp/arm/op_if_ge.S
new file mode 100644
index 0000000..60a0307
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_ge.S
@@ -0,0 +1 @@
+%include "arm/bincmp.S" { "revcmp":"lt" }
diff --git a/runtime/interpreter/mterp/arm/op_if_gez.S b/runtime/interpreter/mterp/arm/op_if_gez.S
new file mode 100644
index 0000000..981cdec
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_gez.S
@@ -0,0 +1 @@
+%include "arm/zcmp.S" { "revcmp":"lt" }
diff --git a/runtime/interpreter/mterp/arm/op_if_gt.S b/runtime/interpreter/mterp/arm/op_if_gt.S
new file mode 100644
index 0000000..ca50cd7
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_gt.S
@@ -0,0 +1 @@
+%include "arm/bincmp.S" { "revcmp":"le" }
diff --git a/runtime/interpreter/mterp/arm/op_if_gtz.S b/runtime/interpreter/mterp/arm/op_if_gtz.S
new file mode 100644
index 0000000..c621812
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_gtz.S
@@ -0,0 +1 @@
+%include "arm/zcmp.S" { "revcmp":"le" }
diff --git a/runtime/interpreter/mterp/arm/op_if_le.S b/runtime/interpreter/mterp/arm/op_if_le.S
new file mode 100644
index 0000000..7e060f2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_le.S
@@ -0,0 +1 @@
+%include "arm/bincmp.S" { "revcmp":"gt" }
diff --git a/runtime/interpreter/mterp/arm/op_if_lez.S b/runtime/interpreter/mterp/arm/op_if_lez.S
new file mode 100644
index 0000000..f92be23
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_lez.S
@@ -0,0 +1 @@
+%include "arm/zcmp.S" { "revcmp":"gt" }
diff --git a/runtime/interpreter/mterp/arm/op_if_lt.S b/runtime/interpreter/mterp/arm/op_if_lt.S
new file mode 100644
index 0000000..213344d
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_lt.S
@@ -0,0 +1 @@
+%include "arm/bincmp.S" { "revcmp":"ge" }
diff --git a/runtime/interpreter/mterp/arm/op_if_ltz.S b/runtime/interpreter/mterp/arm/op_if_ltz.S
new file mode 100644
index 0000000..dfd4e44
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_ltz.S
@@ -0,0 +1 @@
+%include "arm/zcmp.S" { "revcmp":"ge" }
diff --git a/runtime/interpreter/mterp/arm/op_if_ne.S b/runtime/interpreter/mterp/arm/op_if_ne.S
new file mode 100644
index 0000000..4a58b4a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_ne.S
@@ -0,0 +1 @@
+%include "arm/bincmp.S" { "revcmp":"eq" }
diff --git a/runtime/interpreter/mterp/arm/op_if_nez.S b/runtime/interpreter/mterp/arm/op_if_nez.S
new file mode 100644
index 0000000..d864ef4
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_if_nez.S
@@ -0,0 +1 @@
+%include "arm/zcmp.S" { "revcmp":"eq" }
diff --git a/runtime/interpreter/mterp/arm/op_iget.S b/runtime/interpreter/mterp/arm/op_iget.S
new file mode 100644
index 0000000..c7f777b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget.S
@@ -0,0 +1,26 @@
+%default { "is_object":"0", "helper":"artGet32InstanceFromCode"}
+    /*
+     * General instance field get.
+     *
+     * for: iget, iget-object, iget-boolean, iget-byte, iget-char, iget-short
+     */
+    EXPORT_PC
+    FETCH    r0, 1                         @ r0<- field ref CCCC
+    mov      r1, rINST, lsr #12            @ r1<- B
+    GET_VREG r1, r1                        @ r1<- fp[B], the object pointer
+    ldr      r2, [rFP, #OFF_FP_METHOD]     @ r2<- referrer
+    mov      r3, rSELF                     @ r3<- self
+    bl       $helper
+    ldr      r3, [rSELF, #THREAD_EXCEPTION_OFFSET]
+    ubfx     r2, rINST, #8, #4             @ r2<- A
+    PREFETCH_INST 2
+    cmp      r3, #0
+    bne      MterpPossibleException        @ bail out
+    .if $is_object
+    SET_VREG_OBJECT r0, r2                 @ fp[A]<- r0
+    .else
+    SET_VREG r0, r2                        @ fp[A]<- r0
+    .endif
+    ADVANCE 2
+    GET_INST_OPCODE ip                     @ extract opcode from rINST
+    GOTO_OPCODE ip                         @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_iget_boolean.S b/runtime/interpreter/mterp/arm/op_iget_boolean.S
new file mode 100644
index 0000000..628f40a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_boolean.S
@@ -0,0 +1 @@
+%include "arm/op_iget.S" { "helper":"artGetBooleanInstanceFromCode" }
diff --git a/runtime/interpreter/mterp/arm/op_iget_boolean_quick.S b/runtime/interpreter/mterp/arm/op_iget_boolean_quick.S
new file mode 100644
index 0000000..0ae4843
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_boolean_quick.S
@@ -0,0 +1 @@
+%include "arm/op_iget_quick.S" { "load":"ldrb" }
diff --git a/runtime/interpreter/mterp/arm/op_iget_byte.S b/runtime/interpreter/mterp/arm/op_iget_byte.S
new file mode 100644
index 0000000..c4e08e2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_byte.S
@@ -0,0 +1 @@
+%include "arm/op_iget.S" { "helper":"artGetByteInstanceFromCode" }
diff --git a/runtime/interpreter/mterp/arm/op_iget_byte_quick.S b/runtime/interpreter/mterp/arm/op_iget_byte_quick.S
new file mode 100644
index 0000000..e1b3083
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_byte_quick.S
@@ -0,0 +1 @@
+%include "arm/op_iget_quick.S" { "load":"ldrsb" }
diff --git a/runtime/interpreter/mterp/arm/op_iget_char.S b/runtime/interpreter/mterp/arm/op_iget_char.S
new file mode 100644
index 0000000..5e8da66
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_char.S
@@ -0,0 +1 @@
+%include "arm/op_iget.S" { "helper":"artGetCharInstanceFromCode" }
diff --git a/runtime/interpreter/mterp/arm/op_iget_char_quick.S b/runtime/interpreter/mterp/arm/op_iget_char_quick.S
new file mode 100644
index 0000000..b44d8f1
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_char_quick.S
@@ -0,0 +1 @@
+%include "arm/op_iget_quick.S" { "load":"ldrh" }
diff --git a/runtime/interpreter/mterp/arm/op_iget_object.S b/runtime/interpreter/mterp/arm/op_iget_object.S
new file mode 100644
index 0000000..1cf2e3c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_object.S
@@ -0,0 +1 @@
+%include "arm/op_iget.S" { "is_object":"1", "helper":"artGetObjInstanceFromCode" }
diff --git a/runtime/interpreter/mterp/arm/op_iget_object_quick.S b/runtime/interpreter/mterp/arm/op_iget_object_quick.S
new file mode 100644
index 0000000..1f8dc5a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_object_quick.S
@@ -0,0 +1 @@
+%include "arm/op_iget_quick.S" {"is_object":"1"}
diff --git a/runtime/interpreter/mterp/arm/op_iget_quick.S b/runtime/interpreter/mterp/arm/op_iget_quick.S
new file mode 100644
index 0000000..9229afc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_quick.S
@@ -0,0 +1,18 @@
+%default { "load":"ldr", "is_object":"0" }
+    /* For: iget-quick, iget-object-quick */
+    /* op vA, vB, offset@CCCC */
+    mov     r2, rINST, lsr #12          @ r2<- B
+    FETCH r1, 1                         @ r1<- field byte offset
+    GET_VREG r3, r2                     @ r3<- object we're operating on
+    ubfx    r2, rINST, #8, #4           @ r2<- A
+    cmp     r3, #0                      @ check object for null
+    beq     common_errNullObject        @ object was null
+    $load   r0, [r3, r1]                @ r0<- obj.field
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    .if $is_object
+    SET_VREG_OBJECT r0, r2              @ fp[A]<- r0
+    .else
+    SET_VREG r0, r2                     @ fp[A]<- r0
+    .endif
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_iget_short.S b/runtime/interpreter/mterp/arm/op_iget_short.S
new file mode 100644
index 0000000..460f045
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_short.S
@@ -0,0 +1 @@
+%include "arm/op_iget.S" { "helper":"artGetShortInstanceFromCode" }
diff --git a/runtime/interpreter/mterp/arm/op_iget_short_quick.S b/runtime/interpreter/mterp/arm/op_iget_short_quick.S
new file mode 100644
index 0000000..1831b99
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_short_quick.S
@@ -0,0 +1 @@
+%include "arm/op_iget_quick.S" { "load":"ldrsh" }
diff --git a/runtime/interpreter/mterp/arm/op_iget_wide.S b/runtime/interpreter/mterp/arm/op_iget_wide.S
new file mode 100644
index 0000000..f8d2f41
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_wide.S
@@ -0,0 +1,22 @@
+    /*
+     * 64-bit instance field get.
+     *
+     * for: iget-wide
+     */
+    EXPORT_PC
+    FETCH    r0, 1                         @ r0<- field ref CCCC
+    mov      r1, rINST, lsr #12            @ r1<- B
+    GET_VREG r1, r1                        @ r1<- fp[B], the object pointer
+    ldr      r2, [rFP, #OFF_FP_METHOD]     @ r2<- referrer
+    mov      r3, rSELF                     @ r3<- self
+    bl       artGet64InstanceFromCode
+    ldr      r3, [rSELF, #THREAD_EXCEPTION_OFFSET]
+    ubfx     r2, rINST, #8, #4             @ r2<- A
+    PREFETCH_INST 2
+    cmp      r3, #0
+    bne      MterpException                @ bail out
+    add     r3, rFP, r2, lsl #2            @ r3<- &fp[A]
+    stmia   r3, {r0-r1}                    @ fp[A]<- r0/r1
+    ADVANCE 2
+    GET_INST_OPCODE ip                     @ extract opcode from rINST
+    GOTO_OPCODE ip                         @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_iget_wide_quick.S b/runtime/interpreter/mterp/arm/op_iget_wide_quick.S
new file mode 100644
index 0000000..4d6976e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iget_wide_quick.S
@@ -0,0 +1,13 @@
+    /* iget-wide-quick vA, vB, offset@CCCC */
+    mov     r2, rINST, lsr #12          @ r2<- B
+    FETCH ip, 1                         @ ip<- field byte offset
+    GET_VREG r3, r2                     @ r3<- object we're operating on
+    ubfx    r2, rINST, #8, #4           @ r2<- A
+    cmp     r3, #0                      @ check object for null
+    beq     common_errNullObject        @ object was null
+    ldrd    r0, [r3, ip]                @ r0<- obj.field (64 bits, aligned)
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    add     r3, rFP, r2, lsl #2         @ r3<- &fp[A]
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r3, {r0-r1}                 @ fp[A]<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_instance_of.S b/runtime/interpreter/mterp/arm/op_instance_of.S
new file mode 100644
index 0000000..e94108c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_instance_of.S
@@ -0,0 +1,24 @@
+    /*
+     * Check to see if an object reference is an instance of a class.
+     *
+     * Most common situation is a non-null object, being compared against
+     * an already-resolved class.
+     */
+    /* instance-of vA, vB, class@CCCC */
+    EXPORT_PC
+    FETCH     r0, 1                     @ r0<- CCCC
+    mov       r1, rINST, lsr #12        @ r1<- B
+    GET_VREG  r1, r1                    @ r1<- vB (object)
+    ldr       r2, [rFP, #OFF_FP_METHOD] @ r2<- method
+    mov       r3, rSELF                 @ r3<- self
+    mov       r9, rINST, lsr #8         @ r9<- A+
+    and       r9, r9, #15               @ r9<- A
+    bl        MterpInstanceOf           @ (index, obj, method, self)
+    ldr       r1, [rSELF, #THREAD_EXCEPTION_OFFSET]
+    PREFETCH_INST 2
+    cmp       r1, #0                    @ exception pending?
+    bne       MterpException
+    ADVANCE 2                           @ advance rPC
+    SET_VREG r0, r9                     @ vA<- r0
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_int_to_byte.S b/runtime/interpreter/mterp/arm/op_int_to_byte.S
new file mode 100644
index 0000000..059d5c2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_int_to_byte.S
@@ -0,0 +1 @@
+%include "arm/unop.S" {"instr":"sxtb    r0, r0"}
diff --git a/runtime/interpreter/mterp/arm/op_int_to_char.S b/runtime/interpreter/mterp/arm/op_int_to_char.S
new file mode 100644
index 0000000..83a0c19
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_int_to_char.S
@@ -0,0 +1 @@
+%include "arm/unop.S" {"instr":"uxth    r0, r0"}
diff --git a/runtime/interpreter/mterp/arm/op_int_to_double.S b/runtime/interpreter/mterp/arm/op_int_to_double.S
new file mode 100644
index 0000000..810c2e4
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_int_to_double.S
@@ -0,0 +1 @@
+%include "arm/funopWider.S" {"instr":"fsitod  d0, s0"}
diff --git a/runtime/interpreter/mterp/arm/op_int_to_float.S b/runtime/interpreter/mterp/arm/op_int_to_float.S
new file mode 100644
index 0000000..f41654c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_int_to_float.S
@@ -0,0 +1 @@
+%include "arm/funop.S" {"instr":"fsitos  s1, s0"}
diff --git a/runtime/interpreter/mterp/arm/op_int_to_long.S b/runtime/interpreter/mterp/arm/op_int_to_long.S
new file mode 100644
index 0000000..b5aed8e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_int_to_long.S
@@ -0,0 +1 @@
+%include "arm/unopWider.S" {"instr":"mov     r1, r0, asr #31"}
diff --git a/runtime/interpreter/mterp/arm/op_int_to_short.S b/runtime/interpreter/mterp/arm/op_int_to_short.S
new file mode 100644
index 0000000..717bd96
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_int_to_short.S
@@ -0,0 +1 @@
+%include "arm/unop.S" {"instr":"sxth    r0, r0"}
diff --git a/runtime/interpreter/mterp/arm/op_invoke_direct.S b/runtime/interpreter/mterp/arm/op_invoke_direct.S
new file mode 100644
index 0000000..1edf221
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_direct.S
@@ -0,0 +1 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeDirect" }
diff --git a/runtime/interpreter/mterp/arm/op_invoke_direct_range.S b/runtime/interpreter/mterp/arm/op_invoke_direct_range.S
new file mode 100644
index 0000000..3097b8e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_direct_range.S
@@ -0,0 +1 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeDirectRange" }
diff --git a/runtime/interpreter/mterp/arm/op_invoke_interface.S b/runtime/interpreter/mterp/arm/op_invoke_interface.S
new file mode 100644
index 0000000..f6d565b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_interface.S
@@ -0,0 +1,8 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeInterface" }
+    /*
+     * Handle an interface method call.
+     *
+     * for: invoke-interface, invoke-interface/range
+     */
+    /* op vB, {vD, vE, vF, vG, vA}, class@CCCC */
+    /* op {vCCCC..v(CCCC+AA-1)}, meth@BBBB */
diff --git a/runtime/interpreter/mterp/arm/op_invoke_interface_range.S b/runtime/interpreter/mterp/arm/op_invoke_interface_range.S
new file mode 100644
index 0000000..c8443b0
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_interface_range.S
@@ -0,0 +1 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeInterfaceRange" }
diff --git a/runtime/interpreter/mterp/arm/op_invoke_static.S b/runtime/interpreter/mterp/arm/op_invoke_static.S
new file mode 100644
index 0000000..c3cefcf
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_static.S
@@ -0,0 +1,2 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeStatic" }
+
diff --git a/runtime/interpreter/mterp/arm/op_invoke_static_range.S b/runtime/interpreter/mterp/arm/op_invoke_static_range.S
new file mode 100644
index 0000000..dd60d7b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_static_range.S
@@ -0,0 +1 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeStaticRange" }
diff --git a/runtime/interpreter/mterp/arm/op_invoke_super.S b/runtime/interpreter/mterp/arm/op_invoke_super.S
new file mode 100644
index 0000000..92ef2a4
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_super.S
@@ -0,0 +1,8 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeSuper" }
+    /*
+     * Handle a "super" method call.
+     *
+     * for: invoke-super, invoke-super/range
+     */
+    /* op vB, {vD, vE, vF, vG, vA}, class@CCCC */
+    /* op vAA, {vCCCC..v(CCCC+AA-1)}, meth@BBBB */
diff --git a/runtime/interpreter/mterp/arm/op_invoke_super_range.S b/runtime/interpreter/mterp/arm/op_invoke_super_range.S
new file mode 100644
index 0000000..9e4fb1c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_super_range.S
@@ -0,0 +1 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeSuperRange" }
diff --git a/runtime/interpreter/mterp/arm/op_invoke_virtual.S b/runtime/interpreter/mterp/arm/op_invoke_virtual.S
new file mode 100644
index 0000000..5b893ff
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_virtual.S
@@ -0,0 +1,8 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeVirtual" }
+    /*
+     * Handle a virtual method call.
+     *
+     * for: invoke-virtual, invoke-virtual/range
+     */
+    /* op vB, {vD, vE, vF, vG, vA}, class@CCCC */
+    /* op vAA, {vCCCC..v(CCCC+AA-1)}, meth@BBBB */
diff --git a/runtime/interpreter/mterp/arm/op_invoke_virtual_quick.S b/runtime/interpreter/mterp/arm/op_invoke_virtual_quick.S
new file mode 100644
index 0000000..020e8b8
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_virtual_quick.S
@@ -0,0 +1 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeVirtualQuick" }
diff --git a/runtime/interpreter/mterp/arm/op_invoke_virtual_range.S b/runtime/interpreter/mterp/arm/op_invoke_virtual_range.S
new file mode 100644
index 0000000..2b42a78
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_virtual_range.S
@@ -0,0 +1 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeVirtualRange" }
diff --git a/runtime/interpreter/mterp/arm/op_invoke_virtual_range_quick.S b/runtime/interpreter/mterp/arm/op_invoke_virtual_range_quick.S
new file mode 100644
index 0000000..42f2ded
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_invoke_virtual_range_quick.S
@@ -0,0 +1 @@
+%include "arm/invoke.S" { "helper":"MterpInvokeVirtualQuickRange" }
diff --git a/runtime/interpreter/mterp/arm/op_iput.S b/runtime/interpreter/mterp/arm/op_iput.S
new file mode 100644
index 0000000..d224cd8
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput.S
@@ -0,0 +1,22 @@
+%default { "is_object":"0", "handler":"artSet32InstanceFromMterp" }
+    /*
+     * General 32-bit instance field put.
+     *
+     * for: iput, iput-object, iput-boolean, iput-byte, iput-char, iput-short
+     */
+    /* op vA, vB, field@CCCC */
+    .extern $handler
+    EXPORT_PC
+    FETCH    r0, 1                      @ r0<- field ref CCCC
+    mov      r1, rINST, lsr #12         @ r1<- B
+    GET_VREG r1, r1                     @ r1<- fp[B], the object pointer
+    ubfx     r2, rINST, #8, #4          @ r2<- A
+    GET_VREG r2, r2                     @ r2<- fp[A]
+    ldr      r3, [rFP, #OFF_FP_METHOD]  @ r3<- referrer
+    PREFETCH_INST 2
+    bl       $handler
+    cmp      r0, #0
+    bne      MterpPossibleException
+    ADVANCE  2                          @ advance rPC
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_iput_boolean.S b/runtime/interpreter/mterp/arm/op_iput_boolean.S
new file mode 100644
index 0000000..c9e8589
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_boolean.S
@@ -0,0 +1 @@
+%include "arm/op_iput.S" { "handler":"artSet8InstanceFromMterp" }
diff --git a/runtime/interpreter/mterp/arm/op_iput_boolean_quick.S b/runtime/interpreter/mterp/arm/op_iput_boolean_quick.S
new file mode 100644
index 0000000..f0a2777
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_boolean_quick.S
@@ -0,0 +1 @@
+%include "arm/op_iput_quick.S" { "store":"strb" }
diff --git a/runtime/interpreter/mterp/arm/op_iput_byte.S b/runtime/interpreter/mterp/arm/op_iput_byte.S
new file mode 100644
index 0000000..c9e8589
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_byte.S
@@ -0,0 +1 @@
+%include "arm/op_iput.S" { "handler":"artSet8InstanceFromMterp" }
diff --git a/runtime/interpreter/mterp/arm/op_iput_byte_quick.S b/runtime/interpreter/mterp/arm/op_iput_byte_quick.S
new file mode 100644
index 0000000..f0a2777
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_byte_quick.S
@@ -0,0 +1 @@
+%include "arm/op_iput_quick.S" { "store":"strb" }
diff --git a/runtime/interpreter/mterp/arm/op_iput_char.S b/runtime/interpreter/mterp/arm/op_iput_char.S
new file mode 100644
index 0000000..5046f6b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_char.S
@@ -0,0 +1 @@
+%include "arm/op_iput.S" { "handler":"artSet16InstanceFromMterp" }
diff --git a/runtime/interpreter/mterp/arm/op_iput_char_quick.S b/runtime/interpreter/mterp/arm/op_iput_char_quick.S
new file mode 100644
index 0000000..5212fc3
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_char_quick.S
@@ -0,0 +1 @@
+%include "arm/op_iput_quick.S" { "store":"strh" }
diff --git a/runtime/interpreter/mterp/arm/op_iput_object.S b/runtime/interpreter/mterp/arm/op_iput_object.S
new file mode 100644
index 0000000..d942e84
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_object.S
@@ -0,0 +1,11 @@
+    EXPORT_PC
+    add     r0, rFP, #OFF_FP_SHADOWFRAME
+    mov     r1, rPC
+    mov     r2, rINST
+    mov     r3, rSELF
+    bl      MterpIputObject
+    cmp     r0, #0
+    beq     MterpException
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_iput_object_quick.S b/runtime/interpreter/mterp/arm/op_iput_object_quick.S
new file mode 100644
index 0000000..876b3da
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_object_quick.S
@@ -0,0 +1,10 @@
+    EXPORT_PC
+    add     r0, rFP, #OFF_FP_SHADOWFRAME
+    mov     r1, rPC
+    mov     r2, rINST
+    bl      MterpIputObjectQuick
+    cmp     r0, #0
+    beq     MterpException
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_iput_quick.S b/runtime/interpreter/mterp/arm/op_iput_quick.S
new file mode 100644
index 0000000..98c8150
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_quick.S
@@ -0,0 +1,14 @@
+%default { "store":"str" }
+    /* For: iput-quick, iput-object-quick */
+    /* op vA, vB, offset@CCCC */
+    mov     r2, rINST, lsr #12          @ r2<- B
+    FETCH r1, 1                         @ r1<- field byte offset
+    GET_VREG r3, r2                     @ r3<- fp[B], the object pointer
+    ubfx    r2, rINST, #8, #4           @ r2<- A
+    cmp     r3, #0                      @ check object for null
+    beq     common_errNullObject        @ object was null
+    GET_VREG r0, r2                     @ r0<- fp[A]
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    $store     r0, [r3, r1]             @ obj.field<- r0
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_iput_short.S b/runtime/interpreter/mterp/arm/op_iput_short.S
new file mode 100644
index 0000000..5046f6b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_short.S
@@ -0,0 +1 @@
+%include "arm/op_iput.S" { "handler":"artSet16InstanceFromMterp" }
diff --git a/runtime/interpreter/mterp/arm/op_iput_short_quick.S b/runtime/interpreter/mterp/arm/op_iput_short_quick.S
new file mode 100644
index 0000000..5212fc3
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_short_quick.S
@@ -0,0 +1 @@
+%include "arm/op_iput_quick.S" { "store":"strh" }
diff --git a/runtime/interpreter/mterp/arm/op_iput_wide.S b/runtime/interpreter/mterp/arm/op_iput_wide.S
new file mode 100644
index 0000000..8bbd63e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_wide.S
@@ -0,0 +1,16 @@
+    /* iput-wide vA, vB, field@CCCC */
+    .extern artSet64InstanceFromMterp
+    EXPORT_PC
+    FETCH    r0, 1                      @ r0<- field ref CCCC
+    mov      r1, rINST, lsr #12         @ r1<- B
+    GET_VREG r1, r1                     @ r1<- fp[B], the object pointer
+    ubfx     r2, rINST, #8, #4          @ r2<- A
+    add      r2, rFP, r2, lsl #2        @ r2<- &fp[A]
+    ldr      r3, [rFP, #OFF_FP_METHOD]  @ r3<- referrer
+    PREFETCH_INST 2
+    bl       artSet64InstanceFromMterp
+    cmp      r0, #0
+    bne      MterpPossibleException
+    ADVANCE  2                          @ advance rPC
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_iput_wide_quick.S b/runtime/interpreter/mterp/arm/op_iput_wide_quick.S
new file mode 100644
index 0000000..a2fc9e1
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_iput_wide_quick.S
@@ -0,0 +1,13 @@
+    /* iput-wide-quick vA, vB, offset@CCCC */
+    mov     r2, rINST, lsr #12          @ r2<- B
+    FETCH r3, 1                         @ r3<- field byte offset
+    GET_VREG r2, r2                     @ r2<- fp[B], the object pointer
+    ubfx    r0, rINST, #8, #4           @ r0<- A
+    cmp     r2, #0                      @ check object for null
+    beq     common_errNullObject        @ object was null
+    add     r0, rFP, r0, lsl #2         @ r0<- &fp[A]
+    ldmia   r0, {r0-r1}                 @ r0/r1<- fp[A]/fp[A+1]
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    strd    r0, [r2, r3]                @ obj.field<- r0/r1
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_long_to_double.S b/runtime/interpreter/mterp/arm/op_long_to_double.S
new file mode 100644
index 0000000..1d48a2a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_long_to_double.S
@@ -0,0 +1,27 @@
+%default {}
+    /*
+     * Specialised 64-bit floating point operation.
+     *
+     * Note: The result will be returned in d2.
+     *
+     * For: long-to-double
+     */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[B]
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[A]
+    vldr    d0, [r3]                    @ d0<- vAA
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+
+    vcvt.f64.s32    d1, s1              @ d1<- (double)(vAAh)
+    vcvt.f64.u32    d2, s0              @ d2<- (double)(vAAl)
+    vldr            d3, constval$opcode
+    vmla.f64        d2, d1, d3          @ d2<- vAAh*2^32 + vAAl
+
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    vstr.64 d2, [r9]                    @ vAA<- d2
+    GOTO_OPCODE ip                      @ jump to next instruction
+
+    /* literal pool helper */
+constval${opcode}:
+    .8byte          0x41f0000000000000
diff --git a/runtime/interpreter/mterp/arm/op_long_to_float.S b/runtime/interpreter/mterp/arm/op_long_to_float.S
new file mode 100644
index 0000000..efa5a66
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_long_to_float.S
@@ -0,0 +1 @@
+%include "arm/unopNarrower.S" {"instr":"bl      __aeabi_l2f"}
diff --git a/runtime/interpreter/mterp/arm/op_long_to_int.S b/runtime/interpreter/mterp/arm/op_long_to_int.S
new file mode 100644
index 0000000..3e91f23
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_long_to_int.S
@@ -0,0 +1,2 @@
+/* we ignore the high word, making this equivalent to a 32-bit reg move */
+%include "arm/op_move.S"
diff --git a/runtime/interpreter/mterp/arm/op_monitor_enter.S b/runtime/interpreter/mterp/arm/op_monitor_enter.S
new file mode 100644
index 0000000..3c34f75
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_monitor_enter.S
@@ -0,0 +1,14 @@
+    /*
+     * Synchronize on an object.
+     */
+    /* monitor-enter vAA */
+    EXPORT_PC
+    mov      r2, rINST, lsr #8           @ r2<- AA
+    GET_VREG r0, r2                      @ r0<- vAA (object)
+    mov      r1, rSELF                   @ r1<- self
+    bl       artLockObjectFromCode
+    cmp      r0, #0
+    bne      MterpException
+    FETCH_ADVANCE_INST 1
+    GET_INST_OPCODE ip                   @ extract opcode from rINST
+    GOTO_OPCODE ip                       @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_monitor_exit.S b/runtime/interpreter/mterp/arm/op_monitor_exit.S
new file mode 100644
index 0000000..fc7cef5
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_monitor_exit.S
@@ -0,0 +1,18 @@
+    /*
+     * Unlock an object.
+     *
+     * Exceptions that occur when unlocking a monitor need to appear as
+     * if they happened at the following instruction.  See the Dalvik
+     * instruction spec.
+     */
+    /* monitor-exit vAA */
+    EXPORT_PC
+    mov      r2, rINST, lsr #8          @ r2<- AA
+    GET_VREG r0, r2                     @ r0<- vAA (object)
+    mov      r1, rSELF                  @ r0<- self
+    bl       artUnlockObjectFromCode    @ r0<- success for unlock(self, obj)
+    cmp     r0, #0                      @ failed?
+    bne     MterpException
+    FETCH_ADVANCE_INST 1                @ before throw: advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_move.S b/runtime/interpreter/mterp/arm/op_move.S
new file mode 100644
index 0000000..dfecc24
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move.S
@@ -0,0 +1,14 @@
+%default { "is_object":"0" }
+    /* for move, move-object, long-to-int */
+    /* op vA, vB */
+    mov     r1, rINST, lsr #12          @ r1<- B from 15:12
+    ubfx    r0, rINST, #8, #4           @ r0<- A from 11:8
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    GET_VREG r2, r1                     @ r2<- fp[B]
+    GET_INST_OPCODE ip                  @ ip<- opcode from rINST
+    .if $is_object
+    SET_VREG_OBJECT r2, r0              @ fp[A]<- r2
+    .else
+    SET_VREG r2, r0                     @ fp[A]<- r2
+    .endif
+    GOTO_OPCODE ip                      @ execute next instruction
diff --git a/runtime/interpreter/mterp/arm/op_move_16.S b/runtime/interpreter/mterp/arm/op_move_16.S
new file mode 100644
index 0000000..78138a2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_16.S
@@ -0,0 +1,14 @@
+%default { "is_object":"0" }
+    /* for: move/16, move-object/16 */
+    /* op vAAAA, vBBBB */
+    FETCH r1, 2                         @ r1<- BBBB
+    FETCH r0, 1                         @ r0<- AAAA
+    FETCH_ADVANCE_INST 3                @ advance rPC, load rINST
+    GET_VREG r2, r1                     @ r2<- fp[BBBB]
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    .if $is_object
+    SET_VREG_OBJECT r2, r0              @ fp[AAAA]<- r2
+    .else
+    SET_VREG r2, r0                     @ fp[AAAA]<- r2
+    .endif
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_move_exception.S b/runtime/interpreter/mterp/arm/op_move_exception.S
new file mode 100644
index 0000000..0242e26
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_exception.S
@@ -0,0 +1,9 @@
+    /* move-exception vAA */
+    mov     r2, rINST, lsr #8           @ r2<- AA
+    ldr     r3, [rSELF, #THREAD_EXCEPTION_OFFSET]
+    mov     r1, #0                      @ r1<- 0
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    SET_VREG_OBJECT r3, r2              @ fp[AA]<- exception obj
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    str     r1, [rSELF, #THREAD_EXCEPTION_OFFSET]  @ clear exception
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_move_from16.S b/runtime/interpreter/mterp/arm/op_move_from16.S
new file mode 100644
index 0000000..3e79417
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_from16.S
@@ -0,0 +1,14 @@
+%default { "is_object":"0" }
+    /* for: move/from16, move-object/from16 */
+    /* op vAA, vBBBB */
+    FETCH r1, 1                         @ r1<- BBBB
+    mov     r0, rINST, lsr #8           @ r0<- AA
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    GET_VREG r2, r1                     @ r2<- fp[BBBB]
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    .if $is_object
+    SET_VREG_OBJECT r2, r0              @ fp[AA]<- r2
+    .else
+    SET_VREG r2, r0                     @ fp[AA]<- r2
+    .endif
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_move_object.S b/runtime/interpreter/mterp/arm/op_move_object.S
new file mode 100644
index 0000000..16de57b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_object.S
@@ -0,0 +1 @@
+%include "arm/op_move.S" {"is_object":"1"}
diff --git a/runtime/interpreter/mterp/arm/op_move_object_16.S b/runtime/interpreter/mterp/arm/op_move_object_16.S
new file mode 100644
index 0000000..2534300
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_object_16.S
@@ -0,0 +1 @@
+%include "arm/op_move_16.S" {"is_object":"1"}
diff --git a/runtime/interpreter/mterp/arm/op_move_object_from16.S b/runtime/interpreter/mterp/arm/op_move_object_from16.S
new file mode 100644
index 0000000..9e0cf02
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_object_from16.S
@@ -0,0 +1 @@
+%include "arm/op_move_from16.S" {"is_object":"1"}
diff --git a/runtime/interpreter/mterp/arm/op_move_result.S b/runtime/interpreter/mterp/arm/op_move_result.S
new file mode 100644
index 0000000..f2586a0
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_result.S
@@ -0,0 +1,14 @@
+%default { "is_object":"0" }
+    /* for: move-result, move-result-object */
+    /* op vAA */
+    mov     r2, rINST, lsr #8           @ r2<- AA
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    ldr     r0, [rFP, #OFF_FP_RESULT_REGISTER]  @ get pointer to result JType.
+    ldr     r0, [r0]                    @ r0 <- result.i.
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    .if $is_object
+    SET_VREG_OBJECT r0, r2, r1          @ fp[AA]<- r0
+    .else
+    SET_VREG r0, r2                     @ fp[AA]<- r0
+    .endif
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_move_result_object.S b/runtime/interpreter/mterp/arm/op_move_result_object.S
new file mode 100644
index 0000000..643296a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_result_object.S
@@ -0,0 +1 @@
+%include "arm/op_move_result.S" {"is_object":"1"}
diff --git a/runtime/interpreter/mterp/arm/op_move_result_wide.S b/runtime/interpreter/mterp/arm/op_move_result_wide.S
new file mode 100644
index 0000000..c64103c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_result_wide.S
@@ -0,0 +1,9 @@
+    /* move-result-wide vAA */
+    mov     r2, rINST, lsr #8           @ r2<- AA
+    ldr     r3, [rFP, #OFF_FP_RESULT_REGISTER]
+    add     r2, rFP, r2, lsl #2         @ r2<- &fp[AA]
+    ldmia   r3, {r0-r1}                 @ r0/r1<- retval.j
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    stmia   r2, {r0-r1}                 @ fp[AA]<- r0/r1
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_move_wide.S b/runtime/interpreter/mterp/arm/op_move_wide.S
new file mode 100644
index 0000000..1345b95
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_wide.S
@@ -0,0 +1,11 @@
+    /* move-wide vA, vB */
+    /* NOTE: regs can overlap, e.g. "move v6,v7" or "move v7,v6" */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r2, rINST, #8, #4           @ r2<- A
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[B]
+    add     r2, rFP, r2, lsl #2         @ r2<- &fp[A]
+    ldmia   r3, {r0-r1}                 @ r0/r1<- fp[B]
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r2, {r0-r1}                 @ fp[A]<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_move_wide_16.S b/runtime/interpreter/mterp/arm/op_move_wide_16.S
new file mode 100644
index 0000000..133a4c3
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_wide_16.S
@@ -0,0 +1,11 @@
+    /* move-wide/16 vAAAA, vBBBB */
+    /* NOTE: regs can overlap, e.g. "move v6,v7" or "move v7,v6" */
+    FETCH r3, 2                         @ r3<- BBBB
+    FETCH r2, 1                         @ r2<- AAAA
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[BBBB]
+    add     r2, rFP, r2, lsl #2         @ r2<- &fp[AAAA]
+    ldmia   r3, {r0-r1}                 @ r0/r1<- fp[BBBB]
+    FETCH_ADVANCE_INST 3                @ advance rPC, load rINST
+    stmia   r2, {r0-r1}                 @ fp[AAAA]<- r0/r1
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_move_wide_from16.S b/runtime/interpreter/mterp/arm/op_move_wide_from16.S
new file mode 100644
index 0000000..f2ae785
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_move_wide_from16.S
@@ -0,0 +1,11 @@
+    /* move-wide/from16 vAA, vBBBB */
+    /* NOTE: regs can overlap, e.g. "move v6,v7" or "move v7,v6" */
+    FETCH r3, 1                         @ r3<- BBBB
+    mov     r2, rINST, lsr #8           @ r2<- AA
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[BBBB]
+    add     r2, rFP, r2, lsl #2         @ r2<- &fp[AA]
+    ldmia   r3, {r0-r1}                 @ r0/r1<- fp[BBBB]
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r2, {r0-r1}                 @ fp[AA]<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_mul_double.S b/runtime/interpreter/mterp/arm/op_mul_double.S
new file mode 100644
index 0000000..530e85a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_double.S
@@ -0,0 +1 @@
+%include "arm/fbinopWide.S" {"instr":"fmuld   d2, d0, d1"}
diff --git a/runtime/interpreter/mterp/arm/op_mul_double_2addr.S b/runtime/interpreter/mterp/arm/op_mul_double_2addr.S
new file mode 100644
index 0000000..da1abc6
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_double_2addr.S
@@ -0,0 +1 @@
+%include "arm/fbinopWide2addr.S" {"instr":"fmuld   d2, d0, d1"}
diff --git a/runtime/interpreter/mterp/arm/op_mul_float.S b/runtime/interpreter/mterp/arm/op_mul_float.S
new file mode 100644
index 0000000..6a72e6f
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_float.S
@@ -0,0 +1 @@
+%include "arm/fbinop.S" {"instr":"fmuls   s2, s0, s1"}
diff --git a/runtime/interpreter/mterp/arm/op_mul_float_2addr.S b/runtime/interpreter/mterp/arm/op_mul_float_2addr.S
new file mode 100644
index 0000000..edb5101
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_float_2addr.S
@@ -0,0 +1 @@
+%include "arm/fbinop2addr.S" {"instr":"fmuls   s2, s0, s1"}
diff --git a/runtime/interpreter/mterp/arm/op_mul_int.S b/runtime/interpreter/mterp/arm/op_mul_int.S
new file mode 100644
index 0000000..d6151d4
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_int.S
@@ -0,0 +1,2 @@
+/* must be "mul r0, r1, r0" -- "r0, r0, r1" is illegal */
+%include "arm/binop.S" {"instr":"mul     r0, r1, r0"}
diff --git a/runtime/interpreter/mterp/arm/op_mul_int_2addr.S b/runtime/interpreter/mterp/arm/op_mul_int_2addr.S
new file mode 100644
index 0000000..66a797d
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_int_2addr.S
@@ -0,0 +1,2 @@
+/* must be "mul r0, r1, r0" -- "r0, r0, r1" is illegal */
+%include "arm/binop2addr.S" {"instr":"mul     r0, r1, r0"}
diff --git a/runtime/interpreter/mterp/arm/op_mul_int_lit16.S b/runtime/interpreter/mterp/arm/op_mul_int_lit16.S
new file mode 100644
index 0000000..4e40c43
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_int_lit16.S
@@ -0,0 +1,2 @@
+/* must be "mul r0, r1, r0" -- "r0, r0, r1" is illegal */
+%include "arm/binopLit16.S" {"instr":"mul     r0, r1, r0"}
diff --git a/runtime/interpreter/mterp/arm/op_mul_int_lit8.S b/runtime/interpreter/mterp/arm/op_mul_int_lit8.S
new file mode 100644
index 0000000..dbafae9
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_int_lit8.S
@@ -0,0 +1,2 @@
+/* must be "mul r0, r1, r0" -- "r0, r0, r1" is illegal */
+%include "arm/binopLit8.S" {"instr":"mul     r0, r1, r0"}
diff --git a/runtime/interpreter/mterp/arm/op_mul_long.S b/runtime/interpreter/mterp/arm/op_mul_long.S
new file mode 100644
index 0000000..9e83778
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_long.S
@@ -0,0 +1,36 @@
+    /*
+     * Signed 64-bit integer multiply.
+     *
+     * Consider WXxYZ (r1r0 x r3r2) with a long multiply:
+     *        WX
+     *      x YZ
+     *  --------
+     *     ZW ZX
+     *  YW YX
+     *
+     * The low word of the result holds ZX, the high word holds
+     * (ZW+YX) + (the high overflow from ZX).  YW doesn't matter because
+     * it doesn't fit in the low 64 bits.
+     *
+     * Unlike most ARM math operations, multiply instructions have
+     * restrictions on using the same register more than once (Rd and Rm
+     * cannot be the same).
+     */
+    /* mul-long vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    and     r2, r0, #255                @ r2<- BB
+    mov     r3, r0, lsr #8              @ r3<- CC
+    add     r2, rFP, r2, lsl #2         @ r2<- &fp[BB]
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[CC]
+    ldmia   r2, {r0-r1}                 @ r0/r1<- vBB/vBB+1
+    ldmia   r3, {r2-r3}                 @ r2/r3<- vCC/vCC+1
+    mul     ip, r2, r1                  @  ip<- ZxW
+    umull   r9, r10, r2, r0             @  r9/r10 <- ZxX
+    mla     r2, r0, r3, ip              @  r2<- YxX + (ZxW)
+    mov     r0, rINST, lsr #8           @ r0<- AA
+    add     r10, r2, r10                @  r10<- r10 + low(ZxW + (YxX))
+    add     r0, rFP, r0, lsl #2         @ r0<- &fp[AA]
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r0, {r9-r10}                @ vAA/vAA+1<- r9/r10
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_mul_long_2addr.S b/runtime/interpreter/mterp/arm/op_mul_long_2addr.S
new file mode 100644
index 0000000..789dbd3
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_mul_long_2addr.S
@@ -0,0 +1,24 @@
+    /*
+     * Signed 64-bit integer multiply, "/2addr" version.
+     *
+     * See op_mul_long for an explanation.
+     *
+     * We get a little tight on registers, so to avoid looking up &fp[A]
+     * again we stuff it into rINST.
+     */
+    /* mul-long/2addr vA, vB */
+    mov     r1, rINST, lsr #12          @ r1<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    add     r1, rFP, r1, lsl #2         @ r1<- &fp[B]
+    add     rINST, rFP, r9, lsl #2      @ rINST<- &fp[A]
+    ldmia   r1, {r2-r3}                 @ r2/r3<- vBB/vBB+1
+    ldmia   rINST, {r0-r1}              @ r0/r1<- vAA/vAA+1
+    mul     ip, r2, r1                  @  ip<- ZxW
+    umull   r9, r10, r2, r0             @  r9/r10 <- ZxX
+    mla     r2, r0, r3, ip              @  r2<- YxX + (ZxW)
+    mov     r0, rINST                   @ r0<- &fp[A] (free up rINST)
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    add     r10, r2, r10                @  r10<- r10 + low(ZxW + (YxX))
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r0, {r9-r10}                @ vAA/vAA+1<- r9/r10
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_neg_double.S b/runtime/interpreter/mterp/arm/op_neg_double.S
new file mode 100644
index 0000000..33e609c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_neg_double.S
@@ -0,0 +1 @@
+%include "arm/unopWide.S" {"instr":"add     r1, r1, #0x80000000"}
diff --git a/runtime/interpreter/mterp/arm/op_neg_float.S b/runtime/interpreter/mterp/arm/op_neg_float.S
new file mode 100644
index 0000000..993583f
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_neg_float.S
@@ -0,0 +1 @@
+%include "arm/unop.S" {"instr":"add     r0, r0, #0x80000000"}
diff --git a/runtime/interpreter/mterp/arm/op_neg_int.S b/runtime/interpreter/mterp/arm/op_neg_int.S
new file mode 100644
index 0000000..ec0b253
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_neg_int.S
@@ -0,0 +1 @@
+%include "arm/unop.S" {"instr":"rsb     r0, r0, #0"}
diff --git a/runtime/interpreter/mterp/arm/op_neg_long.S b/runtime/interpreter/mterp/arm/op_neg_long.S
new file mode 100644
index 0000000..dab2eb4
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_neg_long.S
@@ -0,0 +1 @@
+%include "arm/unopWide.S" {"preinstr":"rsbs    r0, r0, #0", "instr":"rsc     r1, r1, #0"}
diff --git a/runtime/interpreter/mterp/arm/op_new_array.S b/runtime/interpreter/mterp/arm/op_new_array.S
new file mode 100644
index 0000000..8bb792c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_new_array.S
@@ -0,0 +1,19 @@
+    /*
+     * Allocate an array of objects, specified with the array class
+     * and a count.
+     *
+     * The verifier guarantees that this is an array class, so we don't
+     * check for it here.
+     */
+    /* new-array vA, vB, class@CCCC */
+    EXPORT_PC
+    add     r0, rFP, #OFF_FP_SHADOWFRAME
+    mov     r1, rPC
+    mov     r2, rINST
+    mov     r3, rSELF
+    bl      MterpNewArray
+    cmp     r0, #0
+    beq     MterpPossibleException
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_new_instance.S b/runtime/interpreter/mterp/arm/op_new_instance.S
new file mode 100644
index 0000000..95d4be8
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_new_instance.S
@@ -0,0 +1,14 @@
+    /*
+     * Create a new instance of a class.
+     */
+    /* new-instance vAA, class@BBBB */
+    EXPORT_PC
+    add     r0, rFP, #OFF_FP_SHADOWFRAME
+    mov     r1, rSELF
+    mov     r2, rINST
+    bl      MterpNewInstance           @ (shadow_frame, self, inst_data)
+    cmp     r0, #0
+    beq     MterpPossibleException
+    FETCH_ADVANCE_INST 2               @ advance rPC, load rINST
+    GET_INST_OPCODE ip                 @ extract opcode from rINST
+    GOTO_OPCODE ip                     @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_nop.S b/runtime/interpreter/mterp/arm/op_nop.S
new file mode 100644
index 0000000..af0f88f
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_nop.S
@@ -0,0 +1,3 @@
+    FETCH_ADVANCE_INST 1                @ advance to next instr, load rINST
+    GET_INST_OPCODE ip                  @ ip<- opcode from rINST
+    GOTO_OPCODE ip                      @ execute it
diff --git a/runtime/interpreter/mterp/arm/op_not_int.S b/runtime/interpreter/mterp/arm/op_not_int.S
new file mode 100644
index 0000000..816485a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_not_int.S
@@ -0,0 +1 @@
+%include "arm/unop.S" {"instr":"mvn     r0, r0"}
diff --git a/runtime/interpreter/mterp/arm/op_not_long.S b/runtime/interpreter/mterp/arm/op_not_long.S
new file mode 100644
index 0000000..49a5905
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_not_long.S
@@ -0,0 +1 @@
+%include "arm/unopWide.S" {"preinstr":"mvn     r0, r0", "instr":"mvn     r1, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_or_int.S b/runtime/interpreter/mterp/arm/op_or_int.S
new file mode 100644
index 0000000..b046e8d
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_or_int.S
@@ -0,0 +1 @@
+%include "arm/binop.S" {"instr":"orr     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_or_int_2addr.S b/runtime/interpreter/mterp/arm/op_or_int_2addr.S
new file mode 100644
index 0000000..493c59f
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_or_int_2addr.S
@@ -0,0 +1 @@
+%include "arm/binop2addr.S" {"instr":"orr     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_or_int_lit16.S b/runtime/interpreter/mterp/arm/op_or_int_lit16.S
new file mode 100644
index 0000000..0a01db8
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_or_int_lit16.S
@@ -0,0 +1 @@
+%include "arm/binopLit16.S" {"instr":"orr     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_or_int_lit8.S b/runtime/interpreter/mterp/arm/op_or_int_lit8.S
new file mode 100644
index 0000000..2d85038
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_or_int_lit8.S
@@ -0,0 +1 @@
+%include "arm/binopLit8.S" {"instr":"orr     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_or_long.S b/runtime/interpreter/mterp/arm/op_or_long.S
new file mode 100644
index 0000000..048c45c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_or_long.S
@@ -0,0 +1 @@
+%include "arm/binopWide.S" {"preinstr":"orr     r0, r0, r2", "instr":"orr     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/op_or_long_2addr.S b/runtime/interpreter/mterp/arm/op_or_long_2addr.S
new file mode 100644
index 0000000..9395346
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_or_long_2addr.S
@@ -0,0 +1 @@
+%include "arm/binopWide2addr.S" {"preinstr":"orr     r0, r0, r2", "instr":"orr     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/op_packed_switch.S b/runtime/interpreter/mterp/arm/op_packed_switch.S
new file mode 100644
index 0000000..1e3370e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_packed_switch.S
@@ -0,0 +1,39 @@
+%default { "func":"MterpDoPackedSwitch" }
+    /*
+     * Handle a packed-switch or sparse-switch instruction.  In both cases
+     * we decode it and hand it off to a helper function.
+     *
+     * We don't really expect backward branches in a switch statement, but
+     * they're perfectly legal, so we check for them here.
+     *
+     * for: packed-switch, sparse-switch
+     */
+    /* op vAA, +BBBB */
+#if MTERP_SUSPEND
+    FETCH r0, 1                         @ r0<- bbbb (lo)
+    FETCH r1, 2                         @ r1<- BBBB (hi)
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    orr     r0, r0, r1, lsl #16         @ r0<- BBBBbbbb
+    GET_VREG r1, r3                     @ r1<- vAA
+    add     r0, rPC, r0, lsl #1         @ r0<- PC + BBBBbbbb*2
+    bl      $func                       @ r0<- code-unit branch offset
+    adds    r1, r0, r0                  @ r1<- byte offset; clear V
+    ldrle   rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET] @ refresh handler base
+    FETCH_ADVANCE_INST_RB r1            @ update rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#else
+    FETCH r0, 1                         @ r0<- bbbb (lo)
+    FETCH r1, 2                         @ r1<- BBBB (hi)
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    orr     r0, r0, r1, lsl #16         @ r0<- BBBBbbbb
+    GET_VREG r1, r3                     @ r1<- vAA
+    add     r0, rPC, r0, lsl #1         @ r0<- PC + BBBBbbbb*2
+    bl      $func                       @ r0<- code-unit branch offset
+    ldr     lr, [rSELF, #THREAD_FLAGS_OFFSET]
+    adds    r1, r0, r0                  @ r1<- byte offset; clear V
+    FETCH_ADVANCE_INST_RB r1            @ update rPC, load rINST
+    ble     MterpCheckSuspendAndContinue
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#endif
diff --git a/runtime/interpreter/mterp/arm/op_rem_double.S b/runtime/interpreter/mterp/arm/op_rem_double.S
new file mode 100644
index 0000000..b539221
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_double.S
@@ -0,0 +1,2 @@
+/* EABI doesn't define a double remainder function, but libm does */
+%include "arm/binopWide.S" {"instr":"bl      fmod"}
diff --git a/runtime/interpreter/mterp/arm/op_rem_double_2addr.S b/runtime/interpreter/mterp/arm/op_rem_double_2addr.S
new file mode 100644
index 0000000..372ef1d
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_double_2addr.S
@@ -0,0 +1,2 @@
+/* EABI doesn't define a double remainder function, but libm does */
+%include "arm/binopWide2addr.S" {"instr":"bl      fmod"}
diff --git a/runtime/interpreter/mterp/arm/op_rem_float.S b/runtime/interpreter/mterp/arm/op_rem_float.S
new file mode 100644
index 0000000..7bd10de
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_float.S
@@ -0,0 +1,2 @@
+/* EABI doesn't define a float remainder function, but libm does */
+%include "arm/binop.S" {"instr":"bl      fmodf"}
diff --git a/runtime/interpreter/mterp/arm/op_rem_float_2addr.S b/runtime/interpreter/mterp/arm/op_rem_float_2addr.S
new file mode 100644
index 0000000..93c5fae
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_float_2addr.S
@@ -0,0 +1,2 @@
+/* EABI doesn't define a float remainder function, but libm does */
+%include "arm/binop2addr.S" {"instr":"bl      fmodf"}
diff --git a/runtime/interpreter/mterp/arm/op_rem_int.S b/runtime/interpreter/mterp/arm/op_rem_int.S
new file mode 100644
index 0000000..ff62573
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_int.S
@@ -0,0 +1,33 @@
+%default {}
+    /*
+     * Specialized 32-bit binary operation
+     *
+     * Performs "r1 = r0 rem r1". The selection between sdiv block or the gcc helper
+     * depends on the compile time value of __ARM_ARCH_EXT_IDIV__ (defined for
+     * ARMv7 CPUs that have hardware division support).
+     *
+     * NOTE: idivmod returns quotient in r0 and remainder in r1
+     *
+     * rem-int
+     *
+     */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    mov     r3, r0, lsr #8              @ r3<- CC
+    and     r2, r0, #255                @ r2<- BB
+    GET_VREG r1, r3                     @ r1<- vCC
+    GET_VREG r0, r2                     @ r0<- vBB
+    cmp     r1, #0                      @ is second operand zero?
+    beq     common_errDivideByZero
+
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+#ifdef __ARM_ARCH_EXT_IDIV__
+    sdiv    r2, r0, r1
+    mls  r1, r1, r2, r0                 @ r1<- op, r0-r2 changed
+#else
+    bl   __aeabi_idivmod                @ r1<- op, r0-r3 changed
+#endif
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r1, r9                     @ vAA<- r1
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 11-14 instructions */
diff --git a/runtime/interpreter/mterp/arm/op_rem_int_2addr.S b/runtime/interpreter/mterp/arm/op_rem_int_2addr.S
new file mode 100644
index 0000000..ba5751a
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_int_2addr.S
@@ -0,0 +1,32 @@
+%default {}
+    /*
+     * Specialized 32-bit binary operation
+     *
+     * Performs "r1 = r0 rem r1". The selection between sdiv block or the gcc helper
+     * depends on the compile time value of __ARM_ARCH_EXT_IDIV__ (defined for
+     * ARMv7 CPUs that have hardware division support).
+     *
+     * NOTE: idivmod returns quotient in r0 and remainder in r1
+     *
+     * rem-int/2addr
+     *
+     */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r1, r3                     @ r1<- vB
+    GET_VREG r0, r9                     @ r0<- vA
+    cmp     r1, #0                      @ is second operand zero?
+    beq     common_errDivideByZero
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+
+#ifdef __ARM_ARCH_EXT_IDIV__
+    sdiv    r2, r0, r1
+    mls     r1, r1, r2, r0              @ r1<- op
+#else
+    bl      __aeabi_idivmod             @ r1<- op, r0-r3 changed
+#endif
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r1, r9                     @ vAA<- r1
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-13 instructions */
+
diff --git a/runtime/interpreter/mterp/arm/op_rem_int_lit16.S b/runtime/interpreter/mterp/arm/op_rem_int_lit16.S
new file mode 100644
index 0000000..4edb187
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_int_lit16.S
@@ -0,0 +1,31 @@
+%default {}
+    /*
+     * Specialized 32-bit binary operation
+     *
+     * Performs "r1 = r0 rem r1". The selection between sdiv block or the gcc helper
+     * depends on the compile time value of __ARM_ARCH_EXT_IDIV__ (defined for
+     * ARMv7 CPUs that have hardware division support).
+     *
+     * NOTE: idivmod returns quotient in r0 and remainder in r1
+     *
+     * rem-int/lit16
+     *
+     */
+    FETCH_S r1, 1                       @ r1<- ssssCCCC (sign-extended)
+    mov     r2, rINST, lsr #12          @ r2<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r0, r2                     @ r0<- vB
+    cmp     r1, #0                      @ is second operand zero?
+    beq     common_errDivideByZero
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+
+#ifdef __ARM_ARCH_EXT_IDIV__
+    sdiv    r2, r0, r1
+    mls     r1, r1, r2, r0              @ r1<- op
+#else
+    bl     __aeabi_idivmod              @ r1<- op, r0-r3 changed
+#endif
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r1, r9                     @ vAA<- r1
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-13 instructions */
diff --git a/runtime/interpreter/mterp/arm/op_rem_int_lit8.S b/runtime/interpreter/mterp/arm/op_rem_int_lit8.S
new file mode 100644
index 0000000..3888361
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_int_lit8.S
@@ -0,0 +1,32 @@
+%default {}
+    /*
+     * Specialized 32-bit binary operation
+     *
+     * Performs "r1 = r0 rem r1". The selection between sdiv block or the gcc helper
+     * depends on the compile time value of __ARM_ARCH_EXT_IDIV__ (defined for
+     * ARMv7 CPUs that have hardware division support).
+     *
+     * NOTE: idivmod returns quotient in r0 and remainder in r1
+     *
+     * rem-int/lit8
+     *
+     */
+    FETCH_S r3, 1                       @ r3<- ssssCCBB (sign-extended for CC)
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r2, r3, #255                @ r2<- BB
+    GET_VREG r0, r2                     @ r0<- vBB
+    movs    r1, r3, asr #8              @ r1<- ssssssCC (sign extended)
+    @cmp     r1, #0                     @ is second operand zero?
+    beq     common_errDivideByZero
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+
+#ifdef __ARM_ARCH_EXT_IDIV__
+    sdiv    r2, r0, r1
+    mls     r1, r1, r2, r0              @ r1<- op
+#else
+    bl       __aeabi_idivmod            @ r1<- op, r0-r3 changed
+#endif
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r1, r9                     @ vAA<- r1
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-12 instructions */
diff --git a/runtime/interpreter/mterp/arm/op_rem_long.S b/runtime/interpreter/mterp/arm/op_rem_long.S
new file mode 100644
index 0000000..b2b1c24
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_long.S
@@ -0,0 +1,2 @@
+/* ldivmod returns quotient in r0/r1 and remainder in r2/r3 */
+%include "arm/binopWide.S" {"instr":"bl      __aeabi_ldivmod", "result0":"r2", "result1":"r3", "chkzero":"1"}
diff --git a/runtime/interpreter/mterp/arm/op_rem_long_2addr.S b/runtime/interpreter/mterp/arm/op_rem_long_2addr.S
new file mode 100644
index 0000000..f87d493
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rem_long_2addr.S
@@ -0,0 +1,2 @@
+/* ldivmod returns quotient in r0/r1 and remainder in r2/r3 */
+%include "arm/binopWide2addr.S" {"instr":"bl      __aeabi_ldivmod", "result0":"r2", "result1":"r3", "chkzero":"1"}
diff --git a/runtime/interpreter/mterp/arm/op_return.S b/runtime/interpreter/mterp/arm/op_return.S
new file mode 100644
index 0000000..a4ffd04
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_return.S
@@ -0,0 +1,12 @@
+    /*
+     * Return a 32-bit value.
+     *
+     * for: return, return-object
+     */
+    /* op vAA */
+    .extern MterpThreadFenceForConstructor
+    bl      MterpThreadFenceForConstructor
+    mov     r2, rINST, lsr #8           @ r2<- AA
+    GET_VREG r0, r2                     @ r0<- vAA
+    mov     r1, #0
+    b       MterpReturn
diff --git a/runtime/interpreter/mterp/arm/op_return_object.S b/runtime/interpreter/mterp/arm/op_return_object.S
new file mode 100644
index 0000000..c490730
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_return_object.S
@@ -0,0 +1 @@
+%include "arm/op_return.S"
diff --git a/runtime/interpreter/mterp/arm/op_return_void.S b/runtime/interpreter/mterp/arm/op_return_void.S
new file mode 100644
index 0000000..f6dfd99
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_return_void.S
@@ -0,0 +1,5 @@
+    .extern MterpThreadFenceForConstructor
+    bl      MterpThreadFenceForConstructor
+    mov    r0, #0
+    mov    r1, #0
+    b      MterpReturn
diff --git a/runtime/interpreter/mterp/arm/op_return_void_no_barrier.S b/runtime/interpreter/mterp/arm/op_return_void_no_barrier.S
new file mode 100644
index 0000000..7322940
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_return_void_no_barrier.S
@@ -0,0 +1,3 @@
+    mov    r0, #0
+    mov    r1, #0
+    b      MterpReturn
diff --git a/runtime/interpreter/mterp/arm/op_return_wide.S b/runtime/interpreter/mterp/arm/op_return_wide.S
new file mode 100644
index 0000000..2881c87
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_return_wide.S
@@ -0,0 +1,10 @@
+    /*
+     * Return a 64-bit value.
+     */
+    /* return-wide vAA */
+    .extern MterpThreadFenceForConstructor
+    bl      MterpThreadFenceForConstructor
+    mov     r2, rINST, lsr #8           @ r2<- AA
+    add     r2, rFP, r2, lsl #2         @ r2<- &fp[AA]
+    ldmia   r2, {r0-r1}                 @ r0/r1 <- vAA/vAA+1
+    b       MterpReturn
diff --git a/runtime/interpreter/mterp/arm/op_rsub_int.S b/runtime/interpreter/mterp/arm/op_rsub_int.S
new file mode 100644
index 0000000..1508dd4
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rsub_int.S
@@ -0,0 +1,2 @@
+/* this op is "rsub-int", but can be thought of as "rsub-int/lit16" */
+%include "arm/binopLit16.S" {"instr":"rsb     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_rsub_int_lit8.S b/runtime/interpreter/mterp/arm/op_rsub_int_lit8.S
new file mode 100644
index 0000000..2ee11e1
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_rsub_int_lit8.S
@@ -0,0 +1 @@
+%include "arm/binopLit8.S" {"instr":"rsb     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_sget.S b/runtime/interpreter/mterp/arm/op_sget.S
new file mode 100644
index 0000000..2b81f50
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sget.S
@@ -0,0 +1,27 @@
+%default { "is_object":"0", "helper":"artGet32StaticFromCode" }
+    /*
+     * General SGET handler wrapper.
+     *
+     * for: sget, sget-object, sget-boolean, sget-byte, sget-char, sget-short
+     */
+    /* op vAA, field@BBBB */
+
+    .extern $helper
+    EXPORT_PC
+    FETCH r0, 1                         @ r0<- field ref BBBB
+    ldr   r1, [rFP, #OFF_FP_METHOD]
+    mov   r2, rSELF
+    bl    $helper
+    ldr   r3, [rSELF, #THREAD_EXCEPTION_OFFSET]
+    mov   r2, rINST, lsr #8             @ r2<- AA
+    PREFETCH_INST 2
+    cmp   r3, #0                        @ Fail to resolve?
+    bne   MterpException                @ bail out
+.if $is_object
+    SET_VREG_OBJECT r0, r2              @ fp[AA]<- r0
+.else
+    SET_VREG r0, r2                     @ fp[AA]<- r0
+.endif
+    ADVANCE 2
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip
diff --git a/runtime/interpreter/mterp/arm/op_sget_boolean.S b/runtime/interpreter/mterp/arm/op_sget_boolean.S
new file mode 100644
index 0000000..ebfb44c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sget_boolean.S
@@ -0,0 +1 @@
+%include "arm/op_sget.S" {"helper":"artGetBooleanStaticFromCode"}
diff --git a/runtime/interpreter/mterp/arm/op_sget_byte.S b/runtime/interpreter/mterp/arm/op_sget_byte.S
new file mode 100644
index 0000000..d76862e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sget_byte.S
@@ -0,0 +1 @@
+%include "arm/op_sget.S" {"helper":"artGetByteStaticFromCode"}
diff --git a/runtime/interpreter/mterp/arm/op_sget_char.S b/runtime/interpreter/mterp/arm/op_sget_char.S
new file mode 100644
index 0000000..b7fcfc2
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sget_char.S
@@ -0,0 +1 @@
+%include "arm/op_sget.S" {"helper":"artGetCharStaticFromCode"}
diff --git a/runtime/interpreter/mterp/arm/op_sget_object.S b/runtime/interpreter/mterp/arm/op_sget_object.S
new file mode 100644
index 0000000..8e7d075
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sget_object.S
@@ -0,0 +1 @@
+%include "arm/op_sget.S" {"is_object":"1", "helper":"artGetObjStaticFromCode"}
diff --git a/runtime/interpreter/mterp/arm/op_sget_short.S b/runtime/interpreter/mterp/arm/op_sget_short.S
new file mode 100644
index 0000000..3e80f0d
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sget_short.S
@@ -0,0 +1 @@
+%include "arm/op_sget.S" {"helper":"artGetShortStaticFromCode"}
diff --git a/runtime/interpreter/mterp/arm/op_sget_wide.S b/runtime/interpreter/mterp/arm/op_sget_wide.S
new file mode 100644
index 0000000..97db05f
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sget_wide.S
@@ -0,0 +1,21 @@
+    /*
+     * SGET_WIDE handler wrapper.
+     *
+     */
+    /* sget-wide vAA, field@BBBB */
+
+    .extern artGet64StaticFromCode
+    EXPORT_PC
+    FETCH r0, 1                         @ r0<- field ref BBBB
+    ldr   r1, [rFP, #OFF_FP_METHOD]
+    mov   r2, rSELF
+    bl    artGet64StaticFromCode
+    ldr   r3, [rSELF, #THREAD_EXCEPTION_OFFSET]
+    mov   r9, rINST, lsr #8             @ r9<- AA
+    add   r9, rFP, r9, lsl #2           @ r9<- &fp[AA]
+    cmp   r3, #0                        @ Fail to resolve?
+    bne   MterpException                @ bail out
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    stmia   r9, {r0-r1}                 @ vAA/vAA+1<- r0/r1
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_shl_int.S b/runtime/interpreter/mterp/arm/op_shl_int.S
new file mode 100644
index 0000000..7e4c768
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shl_int.S
@@ -0,0 +1 @@
+%include "arm/binop.S" {"preinstr":"and     r1, r1, #31", "instr":"mov     r0, r0, asl r1"}
diff --git a/runtime/interpreter/mterp/arm/op_shl_int_2addr.S b/runtime/interpreter/mterp/arm/op_shl_int_2addr.S
new file mode 100644
index 0000000..4286577
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shl_int_2addr.S
@@ -0,0 +1 @@
+%include "arm/binop2addr.S" {"preinstr":"and     r1, r1, #31", "instr":"mov     r0, r0, asl r1"}
diff --git a/runtime/interpreter/mterp/arm/op_shl_int_lit8.S b/runtime/interpreter/mterp/arm/op_shl_int_lit8.S
new file mode 100644
index 0000000..6a48bfc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shl_int_lit8.S
@@ -0,0 +1 @@
+%include "arm/binopLit8.S" {"preinstr":"and     r1, r1, #31", "instr":"mov     r0, r0, asl r1"}
diff --git a/runtime/interpreter/mterp/arm/op_shl_long.S b/runtime/interpreter/mterp/arm/op_shl_long.S
new file mode 100644
index 0000000..dc8a679
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shl_long.S
@@ -0,0 +1,27 @@
+    /*
+     * Long integer shift.  This is different from the generic 32/64-bit
+     * binary operations because vAA/vBB are 64-bit but vCC (the shift
+     * distance) is 32-bit.  Also, Dalvik requires us to mask off the low
+     * 6 bits of the shift distance.
+     */
+    /* shl-long vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r3, r0, #255                @ r3<- BB
+    mov     r0, r0, lsr #8              @ r0<- CC
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[BB]
+    GET_VREG r2, r0                     @ r2<- vCC
+    ldmia   r3, {r0-r1}                 @ r0/r1<- vBB/vBB+1
+    and     r2, r2, #63                 @ r2<- r2 & 0x3f
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[AA]
+
+    mov     r1, r1, asl r2              @  r1<- r1 << r2
+    rsb     r3, r2, #32                 @  r3<- 32 - r2
+    orr     r1, r1, r0, lsr r3          @  r1<- r1 | (r0 << (32-r2))
+    subs    ip, r2, #32                 @  ip<- r2 - 32
+    movpl   r1, r0, asl ip              @  if r2 >= 32, r1<- r0 << (r2-32)
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    mov     r0, r0, asl r2              @  r0<- r0 << r2
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r0-r1}                 @ vAA/vAA+1<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_shl_long_2addr.S b/runtime/interpreter/mterp/arm/op_shl_long_2addr.S
new file mode 100644
index 0000000..fd7668d
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shl_long_2addr.S
@@ -0,0 +1,22 @@
+    /*
+     * Long integer shift, 2addr version.  vA is 64-bit value/result, vB is
+     * 32-bit shift distance.
+     */
+    /* shl-long/2addr vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r2, r3                     @ r2<- vB
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[A]
+    and     r2, r2, #63                 @ r2<- r2 & 0x3f
+    ldmia   r9, {r0-r1}                 @ r0/r1<- vAA/vAA+1
+
+    mov     r1, r1, asl r2              @  r1<- r1 << r2
+    rsb     r3, r2, #32                 @  r3<- 32 - r2
+    orr     r1, r1, r0, lsr r3          @  r1<- r1 | (r0 << (32-r2))
+    subs    ip, r2, #32                 @  ip<- r2 - 32
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    movpl   r1, r0, asl ip              @  if r2 >= 32, r1<- r0 << (r2-32)
+    mov     r0, r0, asl r2              @  r0<- r0 << r2
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r0-r1}                 @ vAA/vAA+1<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_shr_int.S b/runtime/interpreter/mterp/arm/op_shr_int.S
new file mode 100644
index 0000000..6317605
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shr_int.S
@@ -0,0 +1 @@
+%include "arm/binop.S" {"preinstr":"and     r1, r1, #31", "instr":"mov     r0, r0, asr r1"}
diff --git a/runtime/interpreter/mterp/arm/op_shr_int_2addr.S b/runtime/interpreter/mterp/arm/op_shr_int_2addr.S
new file mode 100644
index 0000000..cc8632f
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shr_int_2addr.S
@@ -0,0 +1 @@
+%include "arm/binop2addr.S" {"preinstr":"and     r1, r1, #31", "instr":"mov     r0, r0, asr r1"}
diff --git a/runtime/interpreter/mterp/arm/op_shr_int_lit8.S b/runtime/interpreter/mterp/arm/op_shr_int_lit8.S
new file mode 100644
index 0000000..60fe5fc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shr_int_lit8.S
@@ -0,0 +1 @@
+%include "arm/binopLit8.S" {"preinstr":"and     r1, r1, #31", "instr":"mov     r0, r0, asr r1"}
diff --git a/runtime/interpreter/mterp/arm/op_shr_long.S b/runtime/interpreter/mterp/arm/op_shr_long.S
new file mode 100644
index 0000000..c0edf90
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shr_long.S
@@ -0,0 +1,27 @@
+    /*
+     * Long integer shift.  This is different from the generic 32/64-bit
+     * binary operations because vAA/vBB are 64-bit but vCC (the shift
+     * distance) is 32-bit.  Also, Dalvik requires us to mask off the low
+     * 6 bits of the shift distance.
+     */
+    /* shr-long vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r3, r0, #255                @ r3<- BB
+    mov     r0, r0, lsr #8              @ r0<- CC
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[BB]
+    GET_VREG r2, r0                     @ r2<- vCC
+    ldmia   r3, {r0-r1}                 @ r0/r1<- vBB/vBB+1
+    and     r2, r2, #63                 @ r0<- r0 & 0x3f
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[AA]
+
+    mov     r0, r0, lsr r2              @  r0<- r2 >> r2
+    rsb     r3, r2, #32                 @  r3<- 32 - r2
+    orr     r0, r0, r1, asl r3          @  r0<- r0 | (r1 << (32-r2))
+    subs    ip, r2, #32                 @  ip<- r2 - 32
+    movpl   r0, r1, asr ip              @  if r2 >= 32, r0<-r1 >> (r2-32)
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    mov     r1, r1, asr r2              @  r1<- r1 >> r2
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r0-r1}                 @ vAA/vAA+1<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_shr_long_2addr.S b/runtime/interpreter/mterp/arm/op_shr_long_2addr.S
new file mode 100644
index 0000000..ffeaf9c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_shr_long_2addr.S
@@ -0,0 +1,22 @@
+    /*
+     * Long integer shift, 2addr version.  vA is 64-bit value/result, vB is
+     * 32-bit shift distance.
+     */
+    /* shr-long/2addr vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r2, r3                     @ r2<- vB
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[A]
+    and     r2, r2, #63                 @ r2<- r2 & 0x3f
+    ldmia   r9, {r0-r1}                 @ r0/r1<- vAA/vAA+1
+
+    mov     r0, r0, lsr r2              @  r0<- r2 >> r2
+    rsb     r3, r2, #32                 @  r3<- 32 - r2
+    orr     r0, r0, r1, asl r3          @  r0<- r0 | (r1 << (32-r2))
+    subs    ip, r2, #32                 @  ip<- r2 - 32
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    movpl   r0, r1, asr ip              @  if r2 >= 32, r0<-r1 >> (r2-32)
+    mov     r1, r1, asr r2              @  r1<- r1 >> r2
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r0-r1}                 @ vAA/vAA+1<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_sparse_switch.S b/runtime/interpreter/mterp/arm/op_sparse_switch.S
new file mode 100644
index 0000000..9f7a42b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sparse_switch.S
@@ -0,0 +1 @@
+%include "arm/op_packed_switch.S" { "func":"MterpDoSparseSwitch" }
diff --git a/runtime/interpreter/mterp/arm/op_sput.S b/runtime/interpreter/mterp/arm/op_sput.S
new file mode 100644
index 0000000..7e0c1a6
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sput.S
@@ -0,0 +1,20 @@
+%default { "helper":"artSet32StaticFromCode"}
+    /*
+     * General SPUT handler wrapper.
+     *
+     * for: sput, sput-boolean, sput-byte, sput-char, sput-short
+     */
+    /* op vAA, field@BBBB */
+    EXPORT_PC
+    FETCH   r0, 1                       @ r0<- field ref BBBB
+    mov     r3, rINST, lsr #8           @ r3<- AA
+    GET_VREG r1, r3                     @ r1<= fp[AA]
+    ldr     r2, [rFP, #OFF_FP_METHOD]
+    mov     r3, rSELF
+    PREFETCH_INST 2                     @ Get next inst, but don't advance rPC
+    bl      $helper
+    cmp     r0, #0                      @ 0 on success, -1 on failure
+    bne     MterpException
+    ADVANCE 2                           @ Past exception point - now advance rPC
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_sput_boolean.S b/runtime/interpreter/mterp/arm/op_sput_boolean.S
new file mode 100644
index 0000000..e3bbf2b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sput_boolean.S
@@ -0,0 +1 @@
+%include "arm/op_sput.S" {"helper":"artSet8StaticFromCode"}
diff --git a/runtime/interpreter/mterp/arm/op_sput_byte.S b/runtime/interpreter/mterp/arm/op_sput_byte.S
new file mode 100644
index 0000000..e3bbf2b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sput_byte.S
@@ -0,0 +1 @@
+%include "arm/op_sput.S" {"helper":"artSet8StaticFromCode"}
diff --git a/runtime/interpreter/mterp/arm/op_sput_char.S b/runtime/interpreter/mterp/arm/op_sput_char.S
new file mode 100644
index 0000000..d8d65cb
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sput_char.S
@@ -0,0 +1 @@
+%include "arm/op_sput.S" {"helper":"artSet16StaticFromCode"}
diff --git a/runtime/interpreter/mterp/arm/op_sput_object.S b/runtime/interpreter/mterp/arm/op_sput_object.S
new file mode 100644
index 0000000..6d3a9a7
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sput_object.S
@@ -0,0 +1,11 @@
+    EXPORT_PC
+    add     r0, rFP, #OFF_FP_SHADOWFRAME
+    mov     r1, rPC
+    mov     r2, rINST
+    mov     r3, rSELF
+    bl      MterpSputObject
+    cmp     r0, #0
+    beq     MterpException
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_sput_short.S b/runtime/interpreter/mterp/arm/op_sput_short.S
new file mode 100644
index 0000000..d8d65cb
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sput_short.S
@@ -0,0 +1 @@
+%include "arm/op_sput.S" {"helper":"artSet16StaticFromCode"}
diff --git a/runtime/interpreter/mterp/arm/op_sput_wide.S b/runtime/interpreter/mterp/arm/op_sput_wide.S
new file mode 100644
index 0000000..adbcffa
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sput_wide.S
@@ -0,0 +1,19 @@
+    /*
+     * SPUT_WIDE handler wrapper.
+     *
+     */
+    /* sput-wide vAA, field@BBBB */
+    .extern artSet64IndirectStaticFromMterp
+    EXPORT_PC
+    FETCH   r0, 1                       @ r0<- field ref BBBB
+    ldr     r1, [rFP, #OFF_FP_METHOD]
+    mov     r2, rINST, lsr #8           @ r3<- AA
+    add     r2, rFP, r2, lsl #2
+    mov     r3, rSELF
+    PREFETCH_INST 2                     @ Get next inst, but don't advance rPC
+    bl      artSet64IndirectStaticFromMterp
+    cmp     r0, #0                      @ 0 on success, -1 on failure
+    bne     MterpException
+    ADVANCE 2                           @ Past exception point - now advance rPC
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_sub_double.S b/runtime/interpreter/mterp/arm/op_sub_double.S
new file mode 100644
index 0000000..69bcc67
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sub_double.S
@@ -0,0 +1 @@
+%include "arm/fbinopWide.S" {"instr":"fsubd   d2, d0, d1"}
diff --git a/runtime/interpreter/mterp/arm/op_sub_double_2addr.S b/runtime/interpreter/mterp/arm/op_sub_double_2addr.S
new file mode 100644
index 0000000..2ea59fe
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sub_double_2addr.S
@@ -0,0 +1 @@
+%include "arm/fbinopWide2addr.S" {"instr":"fsubd   d2, d0, d1"}
diff --git a/runtime/interpreter/mterp/arm/op_sub_float.S b/runtime/interpreter/mterp/arm/op_sub_float.S
new file mode 100644
index 0000000..3f17a0d
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sub_float.S
@@ -0,0 +1 @@
+%include "arm/fbinop.S" {"instr":"fsubs   s2, s0, s1"}
diff --git a/runtime/interpreter/mterp/arm/op_sub_float_2addr.S b/runtime/interpreter/mterp/arm/op_sub_float_2addr.S
new file mode 100644
index 0000000..2f4aac4
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sub_float_2addr.S
@@ -0,0 +1 @@
+%include "arm/fbinop2addr.S" {"instr":"fsubs   s2, s0, s1"}
diff --git a/runtime/interpreter/mterp/arm/op_sub_int.S b/runtime/interpreter/mterp/arm/op_sub_int.S
new file mode 100644
index 0000000..efb9e10
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sub_int.S
@@ -0,0 +1 @@
+%include "arm/binop.S" {"instr":"sub     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_sub_int_2addr.S b/runtime/interpreter/mterp/arm/op_sub_int_2addr.S
new file mode 100644
index 0000000..4d3036b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sub_int_2addr.S
@@ -0,0 +1 @@
+%include "arm/binop2addr.S" {"instr":"sub     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_sub_long.S b/runtime/interpreter/mterp/arm/op_sub_long.S
new file mode 100644
index 0000000..6f1eb6e
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sub_long.S
@@ -0,0 +1 @@
+%include "arm/binopWide.S" {"preinstr":"subs    r0, r0, r2", "instr":"sbc     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/op_sub_long_2addr.S b/runtime/interpreter/mterp/arm/op_sub_long_2addr.S
new file mode 100644
index 0000000..8e9da05
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_sub_long_2addr.S
@@ -0,0 +1 @@
+%include "arm/binopWide2addr.S" {"preinstr":"subs    r0, r0, r2", "instr":"sbc     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/op_throw.S b/runtime/interpreter/mterp/arm/op_throw.S
new file mode 100644
index 0000000..be49ada
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_throw.S
@@ -0,0 +1,11 @@
+    /*
+     * Throw an exception object in the current thread.
+     */
+    /* throw vAA */
+    EXPORT_PC
+    mov      r2, rINST, lsr #8           @ r2<- AA
+    GET_VREG r1, r2                      @ r1<- vAA (exception object)
+    cmp      r1, #0                      @ null object?
+    beq      common_errNullObject        @ yes, throw an NPE instead
+    str      r1, [rSELF, #THREAD_EXCEPTION_OFFSET]  @ thread->exception<- obj
+    b        MterpException
diff --git a/runtime/interpreter/mterp/arm/op_unused_3e.S b/runtime/interpreter/mterp/arm/op_unused_3e.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_3e.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_3f.S b/runtime/interpreter/mterp/arm/op_unused_3f.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_3f.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_40.S b/runtime/interpreter/mterp/arm/op_unused_40.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_40.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_41.S b/runtime/interpreter/mterp/arm/op_unused_41.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_41.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_42.S b/runtime/interpreter/mterp/arm/op_unused_42.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_42.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_43.S b/runtime/interpreter/mterp/arm/op_unused_43.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_43.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_73.S b/runtime/interpreter/mterp/arm/op_unused_73.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_73.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_79.S b/runtime/interpreter/mterp/arm/op_unused_79.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_79.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_7a.S b/runtime/interpreter/mterp/arm/op_unused_7a.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_7a.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_f3.S b/runtime/interpreter/mterp/arm/op_unused_f3.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_f3.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_f4.S b/runtime/interpreter/mterp/arm/op_unused_f4.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_f4.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_f5.S b/runtime/interpreter/mterp/arm/op_unused_f5.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_f5.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_f6.S b/runtime/interpreter/mterp/arm/op_unused_f6.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_f6.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_f7.S b/runtime/interpreter/mterp/arm/op_unused_f7.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_f7.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_f8.S b/runtime/interpreter/mterp/arm/op_unused_f8.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_f8.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_f9.S b/runtime/interpreter/mterp/arm/op_unused_f9.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_f9.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_fa.S b/runtime/interpreter/mterp/arm/op_unused_fa.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_fa.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_fb.S b/runtime/interpreter/mterp/arm/op_unused_fb.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_fb.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_fc.S b/runtime/interpreter/mterp/arm/op_unused_fc.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_fc.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_fd.S b/runtime/interpreter/mterp/arm/op_unused_fd.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_fd.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_fe.S b/runtime/interpreter/mterp/arm/op_unused_fe.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_fe.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_unused_ff.S b/runtime/interpreter/mterp/arm/op_unused_ff.S
new file mode 100644
index 0000000..10948dc
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_unused_ff.S
@@ -0,0 +1 @@
+%include "arm/unused.S"
diff --git a/runtime/interpreter/mterp/arm/op_ushr_int.S b/runtime/interpreter/mterp/arm/op_ushr_int.S
new file mode 100644
index 0000000..a74361b
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_ushr_int.S
@@ -0,0 +1 @@
+%include "arm/binop.S" {"preinstr":"and     r1, r1, #31", "instr":"mov     r0, r0, lsr r1"}
diff --git a/runtime/interpreter/mterp/arm/op_ushr_int_2addr.S b/runtime/interpreter/mterp/arm/op_ushr_int_2addr.S
new file mode 100644
index 0000000..f2d1d13
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_ushr_int_2addr.S
@@ -0,0 +1 @@
+%include "arm/binop2addr.S" {"preinstr":"and     r1, r1, #31", "instr":"mov     r0, r0, lsr r1"}
diff --git a/runtime/interpreter/mterp/arm/op_ushr_int_lit8.S b/runtime/interpreter/mterp/arm/op_ushr_int_lit8.S
new file mode 100644
index 0000000..40a4435
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_ushr_int_lit8.S
@@ -0,0 +1 @@
+%include "arm/binopLit8.S" {"preinstr":"and     r1, r1, #31", "instr":"mov     r0, r0, lsr r1"}
diff --git a/runtime/interpreter/mterp/arm/op_ushr_long.S b/runtime/interpreter/mterp/arm/op_ushr_long.S
new file mode 100644
index 0000000..f64c861
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_ushr_long.S
@@ -0,0 +1,27 @@
+    /*
+     * Long integer shift.  This is different from the generic 32/64-bit
+     * binary operations because vAA/vBB are 64-bit but vCC (the shift
+     * distance) is 32-bit.  Also, Dalvik requires us to mask off the low
+     * 6 bits of the shift distance.
+     */
+    /* ushr-long vAA, vBB, vCC */
+    FETCH r0, 1                         @ r0<- CCBB
+    mov     r9, rINST, lsr #8           @ r9<- AA
+    and     r3, r0, #255                @ r3<- BB
+    mov     r0, r0, lsr #8              @ r0<- CC
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[BB]
+    GET_VREG r2, r0                     @ r2<- vCC
+    ldmia   r3, {r0-r1}                 @ r0/r1<- vBB/vBB+1
+    and     r2, r2, #63                 @ r0<- r0 & 0x3f
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[AA]
+
+    mov     r0, r0, lsr r2              @  r0<- r2 >> r2
+    rsb     r3, r2, #32                 @  r3<- 32 - r2
+    orr     r0, r0, r1, asl r3          @  r0<- r0 | (r1 << (32-r2))
+    subs    ip, r2, #32                 @  ip<- r2 - 32
+    movpl   r0, r1, lsr ip              @  if r2 >= 32, r0<-r1 >>> (r2-32)
+    FETCH_ADVANCE_INST 2                @ advance rPC, load rINST
+    mov     r1, r1, lsr r2              @  r1<- r1 >>> r2
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r0-r1}                 @ vAA/vAA+1<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_ushr_long_2addr.S b/runtime/interpreter/mterp/arm/op_ushr_long_2addr.S
new file mode 100644
index 0000000..dbab08d
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_ushr_long_2addr.S
@@ -0,0 +1,22 @@
+    /*
+     * Long integer shift, 2addr version.  vA is 64-bit value/result, vB is
+     * 32-bit shift distance.
+     */
+    /* ushr-long/2addr vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r2, r3                     @ r2<- vB
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[A]
+    and     r2, r2, #63                 @ r2<- r2 & 0x3f
+    ldmia   r9, {r0-r1}                 @ r0/r1<- vAA/vAA+1
+
+    mov     r0, r0, lsr r2              @  r0<- r2 >> r2
+    rsb     r3, r2, #32                 @  r3<- 32 - r2
+    orr     r0, r0, r1, asl r3          @  r0<- r0 | (r1 << (32-r2))
+    subs    ip, r2, #32                 @  ip<- r2 - 32
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    movpl   r0, r1, lsr ip              @  if r2 >= 32, r0<-r1 >>> (r2-32)
+    mov     r1, r1, lsr r2              @  r1<- r1 >>> r2
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r0-r1}                 @ vAA/vAA+1<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
diff --git a/runtime/interpreter/mterp/arm/op_xor_int.S b/runtime/interpreter/mterp/arm/op_xor_int.S
new file mode 100644
index 0000000..fd7a4b7
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_xor_int.S
@@ -0,0 +1 @@
+%include "arm/binop.S" {"instr":"eor     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_xor_int_2addr.S b/runtime/interpreter/mterp/arm/op_xor_int_2addr.S
new file mode 100644
index 0000000..196a665
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_xor_int_2addr.S
@@ -0,0 +1 @@
+%include "arm/binop2addr.S" {"instr":"eor     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_xor_int_lit16.S b/runtime/interpreter/mterp/arm/op_xor_int_lit16.S
new file mode 100644
index 0000000..39f2a47
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_xor_int_lit16.S
@@ -0,0 +1 @@
+%include "arm/binopLit16.S" {"instr":"eor     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_xor_int_lit8.S b/runtime/interpreter/mterp/arm/op_xor_int_lit8.S
new file mode 100644
index 0000000..46bb712
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_xor_int_lit8.S
@@ -0,0 +1 @@
+%include "arm/binopLit8.S" {"instr":"eor     r0, r0, r1"}
diff --git a/runtime/interpreter/mterp/arm/op_xor_long.S b/runtime/interpreter/mterp/arm/op_xor_long.S
new file mode 100644
index 0000000..4f830d0
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_xor_long.S
@@ -0,0 +1 @@
+%include "arm/binopWide.S" {"preinstr":"eor     r0, r0, r2", "instr":"eor     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/op_xor_long_2addr.S b/runtime/interpreter/mterp/arm/op_xor_long_2addr.S
new file mode 100644
index 0000000..5b5ed88
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/op_xor_long_2addr.S
@@ -0,0 +1 @@
+%include "arm/binopWide2addr.S" {"preinstr":"eor     r0, r0, r2", "instr":"eor     r1, r1, r3"}
diff --git a/runtime/interpreter/mterp/arm/unop.S b/runtime/interpreter/mterp/arm/unop.S
new file mode 100644
index 0000000..56518b5
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/unop.S
@@ -0,0 +1,20 @@
+%default {"preinstr":""}
+    /*
+     * Generic 32-bit unary operation.  Provide an "instr" line that
+     * specifies an instruction that performs "result = op r0".
+     * This could be an ARM instruction or a function call.
+     *
+     * for: neg-int, not-int, neg-float, int-to-float, float-to-int,
+     *      int-to-byte, int-to-char, int-to-short
+     */
+    /* unop vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r0, r3                     @ r0<- vB
+    $preinstr                           @ optional op; may set condition codes
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    $instr                              @ r0<- op, r0-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r0, r9                     @ vAA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 8-9 instructions */
diff --git a/runtime/interpreter/mterp/arm/unopNarrower.S b/runtime/interpreter/mterp/arm/unopNarrower.S
new file mode 100644
index 0000000..a5fc027
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/unopNarrower.S
@@ -0,0 +1,23 @@
+%default {"preinstr":""}
+    /*
+     * Generic 64bit-to-32bit unary operation.  Provide an "instr" line
+     * that specifies an instruction that performs "result = op r0/r1", where
+     * "result" is a 32-bit quantity in r0.
+     *
+     * For: long-to-float, double-to-int, double-to-float
+     *
+     * (This would work for long-to-int, but that instruction is actually
+     * an exact match for op_move.)
+     */
+    /* unop vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[B]
+    ldmia   r3, {r0-r1}                 @ r0/r1<- vB/vB+1
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    $preinstr                           @ optional op; may set condition codes
+    $instr                              @ r0<- op, r0-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    SET_VREG r0, r9                     @ vA<- r0
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 9-10 instructions */
diff --git a/runtime/interpreter/mterp/arm/unopWide.S b/runtime/interpreter/mterp/arm/unopWide.S
new file mode 100644
index 0000000..7b8739c
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/unopWide.S
@@ -0,0 +1,21 @@
+%default {"preinstr":""}
+    /*
+     * Generic 64-bit unary operation.  Provide an "instr" line that
+     * specifies an instruction that performs "result = op r0/r1".
+     * This could be an ARM instruction or a function call.
+     *
+     * For: neg-long, not-long, neg-double, long-to-double, double-to-long
+     */
+    /* unop vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    add     r3, rFP, r3, lsl #2         @ r3<- &fp[B]
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[A]
+    ldmia   r3, {r0-r1}                 @ r0/r1<- vAA
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    $preinstr                           @ optional op; may set condition codes
+    $instr                              @ r0/r1<- op, r2-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r0-r1}                 @ vAA<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 10-11 instructions */
diff --git a/runtime/interpreter/mterp/arm/unopWider.S b/runtime/interpreter/mterp/arm/unopWider.S
new file mode 100644
index 0000000..657a395
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/unopWider.S
@@ -0,0 +1,20 @@
+%default {"preinstr":""}
+    /*
+     * Generic 32bit-to-64bit unary operation.  Provide an "instr" line
+     * that specifies an instruction that performs "result = op r0", where
+     * "result" is a 64-bit quantity in r0/r1.
+     *
+     * For: int-to-long, int-to-double, float-to-long, float-to-double
+     */
+    /* unop vA, vB */
+    mov     r3, rINST, lsr #12          @ r3<- B
+    ubfx    r9, rINST, #8, #4           @ r9<- A
+    GET_VREG r0, r3                     @ r0<- vB
+    add     r9, rFP, r9, lsl #2         @ r9<- &fp[A]
+    $preinstr                           @ optional op; may set condition codes
+    FETCH_ADVANCE_INST 1                @ advance rPC, load rINST
+    $instr                              @ r0<- op, r0-r3 changed
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    stmia   r9, {r0-r1}                 @ vA/vA+1<- r0/r1
+    GOTO_OPCODE ip                      @ jump to next instruction
+    /* 9-10 instructions */
diff --git a/runtime/interpreter/mterp/arm/unused.S b/runtime/interpreter/mterp/arm/unused.S
new file mode 100644
index 0000000..ffa00be
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/unused.S
@@ -0,0 +1,4 @@
+/*
+ * Bail to reference interpreter to throw.
+ */
+  b MterpFallback
diff --git a/runtime/interpreter/mterp/arm/zcmp.S b/runtime/interpreter/mterp/arm/zcmp.S
new file mode 100644
index 0000000..6e9ef55
--- /dev/null
+++ b/runtime/interpreter/mterp/arm/zcmp.S
@@ -0,0 +1,32 @@
+    /*
+     * Generic one-operand compare-and-branch operation.  Provide a "revcmp"
+     * fragment that specifies the *reverse* comparison to perform, e.g.
+     * for "if-le" you would use "gt".
+     *
+     * for: if-eqz, if-nez, if-ltz, if-gez, if-gtz, if-lez
+     */
+    /* if-cmp vAA, +BBBB */
+#if MTERP_SUSPEND
+    mov     r0, rINST, lsr #8           @ r0<- AA
+    GET_VREG r2, r0                     @ r2<- vAA
+    FETCH_S r1, 1                       @ r1<- branch offset, in code units
+    cmp     r2, #0                      @ compare (vA, 0)
+    mov${revcmp} r1, #2                 @ r1<- inst branch dist for not-taken
+    adds    r1, r1, r1                  @ convert to bytes & set flags
+    FETCH_ADVANCE_INST_RB r1            @ update rPC, load rINST
+    ldrmi   rIBASE, [rSELF, #THREAD_CURRENT_IBASE_OFFSET]   @ refresh table base
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#else
+    mov     r0, rINST, lsr #8           @ r0<- AA
+    GET_VREG r2, r0                     @ r2<- vAA
+    FETCH_S r1, 1                       @ r1<- branch offset, in code units
+    ldr     lr, [rSELF, #THREAD_FLAGS_OFFSET]
+    cmp     r2, #0                      @ compare (vA, 0)
+    mov${revcmp} r1, #2                 @ r1<- inst branch dist for not-taken
+    adds    r1, r1, r1                  @ convert to bytes & set flags
+    FETCH_ADVANCE_INST_RB r1            @ update rPC, load rINST
+    bmi     MterpCheckSuspendAndContinue
+    GET_INST_OPCODE ip                  @ extract opcode from rINST
+    GOTO_OPCODE ip                      @ jump to next instruction
+#endif